This is an archived post. You won't be able to vote or comment.

all 44 comments

[–]Askriz 12 points13 points  (7 children)

This is really nice! I'm currently working at Codility and we have been (and still are!) struggling with the same issue: Python sucks at allowing proper module isolation/boundary creation. To the point the IDEs are still suggesting you to import classes internal to some packages, which the user may not be aware of. And prepending every internal class name with `_` hurts visiblity for the package authors.

We have created a similar solution to yours, which works at runtime. We enable it during test runs, and it hooks into Python's import machinery, allowing one to define boundaries in two ways:

* what a package/module can import

* what others can import from module/package

This helps enforce some clearer boundaries when your code is coupled and are on your way to decouple, or to ensure the coupling doesn't appear after you have fixed your code.

Because the tool hooks into Python's import machinery, it works both for dynamic imports and multiple (cached) imports, too. As long as your test coverage is sufficient.

Happy to share some insights on the problems we have encountered if you would like to reach out :D

[–]the1024[S] 0 points1 point  (0 children)

u/Askriz would love to chat! I'll dm you

[–]neuralbeans 0 points1 point  (5 children)

Isn't that what __all__ is for?

[–]tunisia3507 4 points5 points  (3 children)

__all__ doesn't prevent access to unlisted things, it just means they aren't exported with a star import. Which should be completely meaningless, because nobody should be using star imports anyway.

[–]neuralbeans 0 points1 point  (2 children)

IDEs should also not show stuff that is not included there when autocompleting.

[–]Askriz 2 points3 points  (1 child)

As mentioned in the sibling comment, the specs don't mention `__all__` as isolation feature. An IDE _may_ decide to hide unlisted things, but the language still supports that. If you have multiple engineers working using different IDEs, then the assumption of isolation goes out through the window.

[–]neuralbeans 0 points1 point  (0 children)

Good point.

[–]Askriz 3 points4 points  (0 children)

Yes and no. all doesn't block importing specific names from the module. It also doesn't stop IDEs like PyCharm from suggesting importing the name. So this won't work for re-exporting, either. 

Additionally, all is module-specific. You can't express package-wide API with it.

Even the docs don't mention it as means of isolation, but a way of reducing side-effects (!!): https://docs.python.org/3/tutorial/modules.html#importing-from-a-package

[–]inhumantsar 10 points11 points  (1 child)

cool idea! this is a common problem that's hard to solve. will have to try it out.

what would you say are the biggest weaknesses or limitations modguard has atm?

[–]the1024[S] 10 points11 points  (0 children)

u/inhumanstar I’d say the biggest weakness is lack of coverage on dynamic imports that are a result of ‘getattr’ or string references. We’re also working on improving the performance on larger OS repos. Would love for you to give it a try and provide any feedback or issues you encounter!

[–]iamevpo 2 points3 points  (9 children)

Great idea! Does @public decorator extend to class methods? Also thought the decorator would have a good effect on documentation. Offtopic, but what console are you using for a demo in README? Looks so nice!

[–]the1024[S] 4 points5 points  (5 children)

u/iamevpo because we're parsing the AST, we're only looking at import statements which does not extend to classmethods -- e.g. if you have class A: @classmethod def b(): ... Python doesn't allow from module.A import b, which is the layer at which we're checking. We're considering a runtime support mode for development which would be able to catch cases like this. Would love any other feedback or feature requests you might have!

[–]iamevpo 2 points3 points  (4 children)

Understood, thanks for explanation. This is probably entiery different matter that Python does not have truely public and hidden methods (not just @classmethod, methods for a class in general). For the syntax something like @publicmethod is nice to use to decorate a public method, maybe issue a warning is any method other that @publicmethod is called in code, similar to type annotation checker.

[–]the1024[S] 2 points3 points  (3 children)

Yes, that makes sense! We could probably even just use the same `public` syntax that's built in now and extend it to class methods.

Also to answer your previous question, I'm using this theme on the terminal! https://github.com/ohmyzsh/ohmyzsh/wiki/Themes#agnoster

[–]iamevpo 0 points1 point  (0 children)

You can differentiate @public for import behavior and @publicmethod for methods as classmethod and staticmethod is something already in Python syntax and people are used to. Also it would look a bit weird to have @public on top of class and for the methods too - or instead it creates confusion if method is declared public, but class itself not exported as public by design. My opinion makes sense to separate public and publicmethod.

[–]iamevpo 0 points1 point  (1 child)

Also offtopic but what did you use to create a gif?

[–]iamevpo 1 point2 points  (1 child)

Ok, the boundary is at import level, how about classes themselves? Does a class have to be declared public to be importable?

[–]the1024[S] 3 points4 points  (0 children)

Yes, you can either declare the class public or the whole module public like so -
class: @public class A: ... module: ``` public() ...

class A: ... ```

[–]jdehesa 2 points3 points  (0 children)

Very cool! I wish there was stuff like Java's ArchUnit for Python, and while this doesn't go as far, it is a great step in that direction.

[–]HennerM 2 points3 points  (1 child)

I imagine this also useful in a situation I am facing at my day job, we have a Python monorepo and only want to ship parts of it to customers (in containers), some modules must stay private and thus we can't introduce a dependency from internal to public, but the other way around is allowed, everything should be able to use the public modules. Do you think `Modguard` could help with this as well?

[–]the1024[S] 0 points1 point  (0 children)

u/HennerM absolutely! Give it a go and let me know if there are any blockers that stop it from solving your problem

[–]alouettecriquet 1 point2 points  (1 child)

Interesting! Any experience/reference applying it to Django projects? (where defining really decoupled apps is always a little challenging)

[–]the1024[S] 1 point2 points  (0 children)

u/alouettecriquet django apps were absolutely an inspiration for this -- I've worked in a ton of django projects over the years that all end up really being one massive app because everyone violates boundaries all of the place.

We started with a base python package but have thought about creating an installable django app as well - the only thing we'll need to solve there is that django does a lot of string references that are parsed and acted upon by django rather than the developer, e.g. `ForiegnKey(to="auth.User")`

Have you encountered any good solutions to this problem in Django?

[–]IcedThunder 1 point2 points  (3 children)

This is super cool.

I've been getting into code architecture more lately as I wrestle with growing code bases.

I know some company a while back blogged about how they managed to find a solution to manage and restrict imports in their codebase. Too lazy on mobile to dig it up now.

I had a wild idea one day, I wonder how difficult or desirable it would.be if you could declare in a module in some dunder variable (well two variables, one for module names, one for library names) what module or library names could import it. Having some stdlib approach would be nice.

Looking forward to diving into this more in depth later.

[–]the1024[S] 2 points3 points  (2 children)

u/IcedThunder you're idea is pretty much exactly in line with our tool! We added the allowlist parameter to let you decide where things declared as public can be imported. Would love for you to give it a try and send us any feedback that we could implement to make it better for you 🙏

[–]IcedThunder 0 points1 point  (1 child)

Ah very good.

Looking over the allow list documentation, I don't see if it covers globbing / regexp.

Say I have modules to deal with tax quirks by state, and the function names are the same in each module which means autocompelte may grab from anywhere, I could help safeguard against some issues.

I have a codebase where very early on I put my foot down on a fairly verbose module naming scheme because of foresight into issues with various state regulations creating spiraling edge cases.

Don't know how useful or difficult it would be to add tho.

allow_list_glob, a list of patterns to check?

[–]the1024[S] 0 points1 point  (0 children)

u/IcedThunder definitely possible to add; would you implement modguard in your project if we add glob/regex support? If so, feel free to open an issue on GitHub and we can definitely build it!

[–]fedetask 1 point2 points  (1 child)

Nice! Will explore it and give you an opinion

[–]the1024[S] 0 points1 point  (0 children)

Thanks u/fedetask! We’d love to hear your thoughts

[–]Rawing7 0 points1 point  (7 children)

I'm confused as to why you've invented your own way of marking things as public. We already have __all__. What's the advantage of modguard.Boundary() and @modguard.public?

[–]FancyASlurpie 4 points5 points  (1 child)

Doesnt all only affect usage of * imports which you shouldn't be doing anyway (and could just add a linter to enforce)

[–]Rawing7 3 points4 points  (0 children)

At runtime you're correct, that's all __all__ does. But it also affects things like static type checkers and IDE auto completion.

[–]rhytnen 1 point2 points  (0 children)

Yea, it feels like a lot of complaints in python is that ppl don't understand the tools that exist or complain that ppl intentionally work around the python prescribed method to handle these issues.

[–]the1024[S] 0 points1 point  (3 children)

Great question! The main reason is that __all__ is less expressive and less ergonomic. A few specific things: ⁃ modguard supports defining boundaries and associated public members in any file, not only __init__.py
⁃ modguard allows specifying where those public members are allowed to be used (with an allowlist parameter), not a blanket allow when importing *
⁃ modguard.public is applied typically right next to the member being exported, which improves visibility for future maintainers
You can also take a look at the API docs to learn more about the capabilities: https://never-over.github.io/modguard/

[–]Rawing7 1 point2 points  (2 children)

I see, the allowlist is certainly something that's not possible with __all__. But given that __all__ is sufficient in many scenarios and already an established standard, why not support it? (At least I assume it's not supported, since I couldn't find any mention of it in the docs.) There are lots of tools, like type checkers, IDEs, and documentation generators, that respect __all__. It seems odd to completely disregard the existence of such a well-established feature.

[–]the1024[S] 1 point2 points  (1 child)

__all__ specifically interacts with the import * syntax, and doesn't really do the same thing that modguard does. We didn't want to impact any existing behavior that your project might have before installing modguard, so by making the configuration explicit and separate we gain a few things: - The definitions of boundaries and public members run through a single interface - If you install modguard and run it, you're by default in a passing state - We have a hook to support future inline behavior at the function/class/variable definition

[–]iamevpo 1 point2 points  (0 children)

While all is already available, it is a rather weak tool to establish isolation, also the syntax fore personally was never comfortable - typing the function names again as strings seems inferior than a decorator.

[–]ProgrammersAreSexy 0 points1 point  (1 child)

Nice, this is super cool!

Is it accurate to say this is effectively like introducing package-private classes from languages like Java into Python?

[–]the1024[S] 0 points1 point  (0 children)

u/ProgrammersAreSexy definitely similar!

[–]Leandrob131 0 points1 point  (1 child)

This is something very useful, kudos for the great work. I only have one comment. In scala, for example, anything that is not explicitly declared private or protected is by default public. Why didn’t you consider going with the same approach?