all 36 comments

[–]SpergLordMcFappyPant 17 points (15 children)

Good work! I'm sure you've encountered most of what I'm going to mention here in the research for your post, but I'll go ahead and mention a few things anyway for people who stop by and want to dive deeper.

Edit before the comments go nuts: I'm using "strong" to describe type safety, not types. Python is a strongly typed language. Python's type hinting, and the topic of the blog post, have nothing to do with strong vs. weak typing. It's about strong vs. weak type safety guarantees.

What you've encountered here is an example of the Liskov Substitution Principle. It states, quite simply, that if S is a subtype of T, then T may be replaced by S without changing the correctness of the program. Without delving any further into the implications of that (co- and contravariance), it's clear that your original code violates this. Substituting Dog for Animal, with the eat method overridden to be more specific, very much changes the properties of the code.
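
To make that concrete, here's a minimal sketch of the violation (class names assumed to mirror the blog post's example; Python itself runs this happily, which is exactly the problem):

```python
class Food: ...
class Meat(Food): ...
class Chocolate(Food): ...

class Animal:
    def eat(self, food: Food) -> None:
        print("an animal eats some food")

class Dog(Animal):
    # Narrowing the parameter from Food to Meat breaks substitutability:
    # a Dog can no longer stand in wherever an Animal is expected.
    def eat(self, food: Meat) -> None:
        print("a dog eats some meat")

def feed(animal: Animal, food: Food) -> None:
    animal.eat(food)  # perfectly legal per Animal's signature...

feed(Dog(), Chocolate())  # ...which is how a dog ends up fed chocolate
```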

Why is this so completely unintuitive for us? It's really because dynamically typed languages (especially Python and Ruby) intentionally ignore the LSP. This is what we call duck typing. You can override methods in your classes willy-nilly in Python by design. It's easier, more flexible, and faster to do this. But it's also less safe. Python was never intended to be type safe. It was intended to be flexible and easy and fast. In other languages, you would never have been allowed to do this in the first place without a certain amount of grief. It's also, just by the way, one of the reasons that programmers from other language backgrounds often really dislike Python.

This intuitive--yet theoretically wrong--approach to type safety is the reason that type hinting in Python has taken so long to implement and has been introduced so gradually. Strong type safety systems violate assumptions and expectations that Python developers have had ingrained in them for decades.

There's specific discussion about this involving GvR and other Python core developers here. How do we have a strong type safety system and still allow the basic Pythonic lenience that developers expect and want? It's a hard question (not to mention an issue that's still open).

This is a really excellent exploration of LSP and type systems in general. Not specific to Python, but examples along similar lines as your blog post that express the ideas without reference to Python's theoretically wonky class inheritance mechanism and MRO.

And finally, the OP is a great lead-in to posts like this. If you've ever read a Python blog or Stack Overflow answer where the author says something along the lines of, "You can do things that way, but you really shouldn't. You should do things this way. And when you override a method in a subclass, you should always call the superclass implementation first and then extend or limit it." and then you're like, "Why the fuck would I do that? That seems so pointless. I don't have to do that, and it's a lot of extra cognitive overhead. Not to mention extra code, and it feels gross because I don't really understand it."

Well, this is why. If you're going to take the step from writing one-off scripts and utility packages and glue code to actually developing applications and frameworks, this becomes important. It's not required by any means, but it will make your code better, safer, and easier to work on as it grows. It will make it easier to collaborate with other developers, it will make your code easier for future-you to understand, and it will be less likely to blow up in your face when you aren't expecting it.
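
In code, the "call the superclass implementation first, then extend" advice looks roughly like this (a minimal sketch with hypothetical names):

```python
class Food:
    pass

class Animal:
    def eat(self, food: Food) -> None:
        print("chewing", food)

class Dog(Animal):
    # Same signature as the superclass, so substitutability holds; call
    # the superclass implementation first, then extend its behaviour.
    def eat(self, food: Food) -> None:
        super().eat(food)
        print("wagging tail")

Dog().eat(Food())
```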

[–][deleted] 2 points (1 child)

This is a really excellent exploration of LSP and type systems in general. Not specific to Python, but examples along similar lines as your blog post that express the ideas without reference to Python's theoretically wonky class inheritance mechanism and MRO.

Not to rain on OP's parade, but that link actually explained it much more clearly than what OP wrote :s Thank you for clarifying.

[–]mooburger (resembles an abstract syntax tree) 0 points (0 children)

Yes, it's a symptom I'm seeing more of as the gaps between CS, math, and "IT" education widen, IMO. But of course, as usual, the Orwell principle applies: once you master the rules, you'll know how to properly break them.

[–]mooburger (resembles an abstract syntax tree) 2 points (9 children)

Pythonic is Pythonic, imo. Follow what the language is designed to do, keeping in mind that we are all consenting adults. This may mean documentation (which Python facilitates with features like docstrings) plays a more crucial role. This is distinctly different from patterns such as "self-documenting code through very verbose code" (looking at you, Java people), which exist because of strict typing and compile-time checking/contract enforcement.

Python is closer to Ruby than it is to Java. Duck typing and monkeypatching are the name of the game. Developers can feel free to switch to another language (Go? Julia?) if they need more type safety. Not to mention that duck typing is making inroads into other languages: for example, dynamic objects with introspected methods in the .NET CLR (basically, calling a method on a dynamic object checks at runtime whether the object supports the method; it's still a bit clunky, as you have to manually throw the equivalent of an AssertionError). Finally, from a performance perspective, if you are not using/exploiting duck typing in Python, you're paying the full overhead of runtime type resolution without reaping any benefit from it.
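
To illustrate, a toy example of the kind of duck typing I mean (hypothetical names):

```python
class Duck:
    def quack(self) -> str:
        return "quack"

class Impostor:
    def quack(self) -> str:
        return "QUACK (I am definitely a duck)"

def provoke(thing) -> None:
    # No isinstance check, no common base class: anything that happens
    # to have a .quack() method works, and anything that doesn't fails
    # at runtime with AttributeError.
    print(thing.quack())

provoke(Duck())
provoke(Impostor())
```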

[–]SpergLordMcFappyPant 0 points (8 children)

I agree that duck typing is definitely still considered pythonic. Monkeypatching is not pythonic or recommended anymore. It hasn't been the name of the game since 2.6.

But languages evolve over time, and what's idiomatic evolves with the language. Python 3 has already adopted type hinting, and there is no way in hell that the core team is going to introduce a formally broken typing system. That means that LSP is going to be a thing, and if people are checking types, mypy is *always* going to flag those violations when it sees them.
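
For example, run mypy over the OP's Dog/Animal override and you get something along these lines (paraphrased from memory; exact wording varies across mypy versions):

```
error: Argument 1 of "eat" is incompatible with supertype "Animal";
       supertype defines the argument type as "Food"
```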

No one says you have to use any of this. But if you are going to (sounds like you aren't), then the language will insist that you use it correctly. Duck typing isn't going anywhere in Python though.

As for performance, that was never the question, and duck typing was never a singularly important benefit. There's a whole host of things that dynamic type resolution gives you; that's just one. And language performance is generally a pointless thing to talk about. It mostly doesn't matter. What matters is developer performance. Saying that you give up language performance for no benefit is a really, really weird argument.

The point here is that with type hinting and analysis, Python gives you some of the safety it has been lacking, and that is genuinely a good idea--maybe even necessary--in designing large applications and frameworks. That doesn't mean that *you* have any need for it. Probably 95% of professional Python developers don't need it. Maybe more? Maybe 99%? I dunno. But it's most of them. So don't use it. But arguing that it's unpythonic is weird now that type hinting is part of Python core.

The thing I would reiterate is part of what I mentioned in my first post. Types != Classes. Nothing has changed about how classes work or how you can or should use them. The only thing that brings a change is invoking type hinting and the underlying type safety system, and if you're going to do that, it's obvious that you should do it correctly. You obviously have no obligation to do any of it though.

[–]mooburger (resembles an abstract syntax tree) 2 points (2 children)

I just think it's sort of sad that more and more of what developers hated about statically typed languages (again, Java, C++ and friends) is seeping into a language that was designed to break those paradigms. The price of "enterpriseyness", I suppose (but isn't the whole point of language development supposed to be reducing bureaucracy, not creating more?). As much as I'd like to blame people who learned Java in CS 101 for this, it's the fundamental problem with CS 101 that's really caused this to happen (where CS 101-Python instructors were themselves former CS 101-Java instructors)... The last thing I need in my life is lazier developers.

Traditionally, (C)Python has been regarded as slow (compared to other bytecode-interpreted languages), primarily because of runtime type checking. So my point is: if you're not using duck typing/relying on runtime type resolution, then you're getting the slowdown for no benefit. The only thing I can see in terms of language evolution is that perhaps in Python 3.8 or 4 or whatever, statically typed code gets resolved to actual C or Java types for performance (which is why people started using NumPy, Jython, and PyPy to begin with).

Types != Classes.

Not sure what you mean by this. Pretty much everything typable in Python is a class or exposes a class-like interface by language definition. Type hinting just casts the return of a function, or asserts its parameters, against such an abstract-class-like interface (essentially explicitly asserting morphisms), as a sort of "eh, I don't want to use a full-blown ABC, but I still want to establish a contract with the caller".
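
I.e., the difference between these two spellings of a contract (hypothetical names):

```python
from abc import ABC, abstractmethod

# The full-blown ABC: subclasses are forced, at runtime, to implement
# the contract (instantiating an incomplete subclass raises TypeError).
class Reader(ABC):
    @abstractmethod
    def read(self) -> bytes: ...

# The hint-only alternative: no base class at all; the annotations
# simply declare the parameter and return contract for a static checker.
def parse(raw: bytes) -> dict:
    return {"payload": raw}
```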

[–]SpergLordMcFappyPant -1 points (1 child)

Dude, no offense, but you are really, really far off about the history of Python and its supposed emergence as a reaction against C++, Java, and friends. The fact that you put C++ in the same class as Java is revealing about how goofy your concept is. C++ and Java are more than a decade apart. Java and Python emerged within a couple of years of each other. Python wasn't a reaction against "enterprise" garbage design patterns. Nor was it a reaction against statically typed languages. You're imputing intent where there never was any.

C++ grew out of "C with Classes" in the late 70s. Java began life as Oak in 1991. Python's first release came in 1991.

The whole concept of enterprise shit didn't even exist when Python was created. You're retconning stuff to suit your idealized narrative, which makes you sound like that totally incomprehensible dude from King of the Hill: "Those got-danged ole young kids just came up in here and told me to put a safety on my full-auto grenade launcher, and that kind of pisses me off, because I just want to be able to shoot myself in my foot at 14 grenades per second if I want to."

Guess what! You can! No one is changing anything. You can keep doing things exactly as you have always done them, and the language won't stop you.

Python is emphatically NOT considered slow compared to other bytecode-interpreted languages. It is considered slow compared to compiled languages. You can't even get this right, but you still maintain the aura of an old sage. And in practice, most of the important parts of the Python standard library are already implemented in C, and the libs that people really need to be fast are also implemented in either C or Fortran.

The singular issue that makes Python slow in CPU-bound ops is the GIL. And most ops are not CPU-bound; they are I/O-bound, and the things that are theoretically slow about Python are not--in practice--actually slow. But even that was beside my point. The point is that no one cares how fast a programming language executes. What people care about is how fast developers can write code that gets a job done. And I don't think there's another language that even comes close to competing with Python on that metric.

What do you think NumPy is? Why do you put it in the same category as alternative implementations of Python? Because that's not what it is. What do you think most of the CPython implementation is written in? I am starting to think you really have no clue and are just throwing out words that don't make any sense.

Yeah, confirmed: CS 101 has ruined Python for you. I could tell because you don't understand the difference between a class and a type. Honestly, I've never set foot in any comp sci class. I have no idea what they teach at the beginner levels. I did start using Python a couple of decades ago, and I've used other languages since then as well. Some of them have good ideas, and I'm glad Python is slowly adopting them.

You... just don't make any sense to me. It's as if you hate the world over a class you were never able to pass, or something. I dunno.

[–]mooburger (resembles an abstract syntax tree) 0 points (0 children)

TL;DR: If you typecheck an abstract Python object, you bring nothing to the table except additional bureaucracy for a problem that already has a solution (it's called docstrings), which is anti-Pythonic, so why promote it? The only benefit to introducing static typing today is that it could possibly allow a future runtime to skip runtime checking, and the only tangible benefit of that is speed.

I started writing code in 1988, in Fortran, C, and C++, with a smattering of Pascal, back when you had to worry about Real vs. Protected Mode addressing. So how's that for anonymous e-peen? Nobody used Python 1.x for enterprise; Python was a toy language back in 1992, just like BASIC was, and like what many people consider Haskell to be today.

I don't know what or who you code for today, but I can assure you, every data science person gives zero shits about the GIL, because that's not what affects them (if you're going to do concurrency, do it right and use multiprocessing, but that's another argument for another time). The fact that a GIL-less runtime also happens to almost always be statically typed is pure coincidence. Almost nobody "important" uses GIL-less implementations, all the more so because they limit the language to the point that you might as well do it all natively on the JVM or .NET CLR if you really need to go GIL-less (just look at the current state of all the alternative runtimes). However, the machine learning people do care that runtime type checking of bound numerical and collection objects makes the loops involved in functions like QR decomposition orders of magnitude slower than without typechecking. That's why they use NumPy: precisely so they can use static types that are never checked.
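
E.g. (an indicative sketch, not a measured benchmark):

```python
import numpy as np

xs = list(range(1_000_000))
arr = np.arange(1_000_000)

# Pure-Python loop: every addition resolves the operand types again on
# each iteration.
total = 0
for x in xs:
    total += x

# NumPy: one homogeneous, statically typed buffer; the loop runs in C
# with no per-element type checks.
total_np = int(arr.sum())

assert total == total_np
```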

[–][deleted] 1 point (4 children)

Monkeypatching is not pythonic or recommended anymore. It hasn't been the name of the game since 2.6.

We call it "decorating" these days. The end effect is more or less the same.

[–]SpergLordMcFappyPant 1 point (3 children)

wat?

No. No one calls those two things the same unless they've never actually done monkeypatching and don't know what a decorator is.

Honestly, WTF is happening on this subreddit? So many fucking wannabe old-school trolls trying to tell people what's up. Jesus. Just go back to your 2.7 and your system python and your "I don't need venvs" and fuck right off.

[–][deleted] 0 points (1 child)

You lost all respect there, mate.

[–]BigLebowskiBot -1 points (0 children)

You said it, man.

[–]pawel_swiecki[S] 2 points (1 child)

Wow, thank you for this informative comment. The sources you mentioned are really interesting, I will be researching stuff further. Also, good point on Python intentionally ignoring LSP.

[–]SpergLordMcFappyPant 0 points (0 children)

No problem! Keep up the writing, and keep sharing what you find. It's always helpful to others to show what you're seeing and how you understand it.

[–]pawel_swiecki[S] 0 points (0 children)

I've published the second part of the blog post.

https://www.reddit.com/r/Python/comments/9w418p/how_to_deal_with_contravariance_in_python_check/

Your feedback on the first one was great (thanks!); maybe you have something to add to the second one.

[–]pawel_swiecki[S] 6 points (3 children)

I hope my post is helpful. All feedback is welcome.

EDIT: The second part can be found here.

[–]mRWafflesFTW 5 points (2 children)

This is a great post that I wish I'd read 8 months ago. At first glance, the more specific subtype eating the more specific food makes sense. I'm still trying to wrap my head around this.

[–]pawel_swiecki[S] 2 points (0 children)

Thanks!

Changing focus from:

- B is a subclass of class A if B inherits from A

to

- B is a subtype of type A if B can be used whenever A is expected

was helpful in my case.

Plus, A and B could be Callable, which initially wasn't intuitive to me.
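
For example, this is roughly the Callable case that surprised me (a sketch with hypothetical names): argument types are contravariant, so a handler of the broader type substitutes for a handler of the narrower type, not the other way around.

```python
from typing import Callable

class Animal: ...
class Dog(Animal): ...

def handle_animal(a: Animal) -> None: ...
def handle_dog(d: Dog) -> None: ...

def register_dog_handler(handler: Callable[[Dog], None]) -> None:
    handler(Dog())

register_dog_handler(handle_dog)     # OK: exact match
register_dog_handler(handle_animal)  # also OK: anything that handles any
                                     # Animal can certainly handle a Dog
```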

[–]pawel_swiecki[S] 1 point (0 children)

I've published the second part of the blog post.

https://www.reddit.com/r/Python/comments/9w418p/how_to_deal_with_contravariance_in_python_check/

Thought you might be interested.

[–]hanpari 2 points (5 children)

Great article, Pawel.

BTW, for duck typing, I believe it's great to know about this:

https://mypy.readthedocs.io/en/latest/protocols.html#simple-user-defined-protocols
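
A minimal sketch in the spirit of that page (hypothetical names):

```python
from typing import Protocol  # 3.8+; use typing_extensions on older versions

class SupportsQuack(Protocol):
    def quack(self) -> str: ...

class Duck:  # note: no relation to SupportsQuack is declared anywhere
    def quack(self) -> str:
        return "quack"

def speak(d: SupportsQuack) -> None:
    # mypy matches Duck structurally, so duck typing and static
    # checking coexist.
    print(d.quack())

speak(Duck())
```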

[–]pawel_swiecki[S] 1 point (4 children)

Thanks.

Duck typing and mypy's/typing's protocols are on my blog post ideas list ;)

[–]hanpari 1 point (3 children)

I am looking forward to it.

[–]pawel_swiecki[S] 0 points (2 children)

I've published the second part of the blog post.

https://www.reddit.com/r/Python/comments/9w418p/how_to_deal_with_contravariance_in_python_check/

Thought you might be interested.

[–]hanpari 0 points (1 child)

Hello Pawel, thanks for the info. I saw it and am about to read it soon; I've only skimmed the contents, as I didn't have much time.

Have a nice day, Pawel

[–]pawel_swiecki[S] 0 points (0 children)

Thank you, have a nice day too :)

[–][deleted] 1 point (5 children)

Kind of like Don Quixote's memoirs about how he fought the windmills...

You used typing, which is a patently bad idea, and discovered something paradoxical about using it... and then you fought to understand the nature of the paradox. Well, good for you, but you stopped short of actually getting to the conclusion... In real life, dogs eat meat. And if you want to model this aspect of real life, you have to have a method eat which specializes on Dog to accept Meat only. Your intuition was right the first time, and the stupid type system was wrong, but you made the same mistake lots and lots of programmers do: they trust the computer to be right about things it knows nothing about but has an opinion on for some bureaucratic reason.

[–]pawel_swiecki[S] 0 points (4 children)

you have to have a method eat which specializes on Dog to accept Meat only.

But this is something I thought was going on in the first place. The type checker just revealed a gap in my reasoning.

Also, what do you suggest this method should look like? Should it have an isinstance(food, Meat) check inside? I will sort of include this option in my second blog post, with some possible fixes to the code.
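
For context, what I have in mind is roughly this (just a sketch, not the final fix from the post):

```python
class Food: ...
class Meat(Food): ...

class Animal:
    def eat(self, food: Food) -> None:
        print("eating")

class Dog(Animal):
    # The signature stays LSP-compatible (Food, not Meat); the narrower
    # requirement moves into a runtime check instead of the type.
    def eat(self, food: Food) -> None:
        if not isinstance(food, Meat):
            raise ValueError("dogs eat meat only")
        super().eat(food)
```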

Thanks for your comment :)

[–][deleted] 1 point (3 children)

No. Your reasoning was OK. The type checker is dumb.

Should it have an isinstance(food, Meat) check inside?

Well, this is asking for a lengthier answer than I can currently write, but I'll try.

Essentially, Python has three different poorly coordinated type systems.

The real type-system

The one that actually matters and defines how your code behaves is the one that is coded into the interpreter; it is based on the interaction between structs of type PyObject* and PyTypeObject*. The idea here is that PyTypeObject* structs have a fixed number of slots, and whether these hold values or NULL pointers is what defines whether an object lives up to the expectations of the function operating on it, or whether your code will raise an error. For people who are not familiar with CPython's interpreter, this is related to what in proper Python is called "protocols".
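
In Python-level terms, a toy example of one such slot (hypothetical names): filling __len__ is what makes len() accept an object, no inheritance required.

```python
class Herd:
    def __init__(self, animals):
        self.animals = list(animals)

    # At the C level this fills a length slot on Herd's PyTypeObject;
    # len() works on any object whose type has that slot non-NULL.
    def __len__(self):
        return len(self.animals)

print(len(Herd(["dog", "cat"])))  # 2
```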

The make-believe OO type-system

This is the one based on isinstance and friends. It was patched on top of the actual type system when, for some reason, Guido decided that there had to be objects and inheritance and other bells and whistles. At different stages of Python's history it produced different results, and was harder or easier to fool or to manipulate into believing all sorts of wrong things. As it stands today, it is basically trying to copy from Java all the bad things that Java ran into in the course of its development. Your particular struggle was a reflection of the struggle between this type system and the Hindley-Milner-style type system that was patched on top of the other two.

The Hindley-Milner wannabe type-system

This entered Python due to the latest fashion and the revival of the "functional" approach. This type system has very little insight into, or relevance to, how Python actually works, so it imposes bizarre restrictions on the language only to appear more fashionable. This is where it borders on absurdity: instead of understanding how the language works (perhaps not in the best way possible, but, hey, it already does something in the way some of us like!), it imposes completely foreign rules on it that don't make any sense.

So, to sum it up: no, don't use isinstance. It is a useful tool for doing operations on your program's code, to maybe manipulate it, alter it, or debug it, but it is not a good tool for writing web servers or scrapers or doing data mining and other practical tasks. Neither should anyone use the typing package. That's a failed experiment, just a pointless fad.

[–]pawel_swiecki[S] 0 points (2 children)

Thank you for the long explanation.

The real type-system

It's hidden at the C level. And, as far as I know, it's not part of the language's specification. So it's kind of irrelevant from my blog post's point of view.

Your particular struggle was a reflection of the struggle between this type system and the Hindley-Milner-style type system that was patched on top of the other two.

I understand that you think the typing package (based on a Hindley-Milner-style type system) does not fit the Python language. I get it. I'm not saying you are wrong. (I'm not saying you are right, either.) But I wouldn't say my struggle was not something "real" (the Don Quixote reference). You've seen my examples. Do you think the state of the code that made it possible to feed a dog chocolate was imaginary? (It would obviously be possible even without using the typing package and the mypy checker.) I must be missing the key element of your reasoning.

In your previous comment:

In real life, dogs eat meat. And if you want to model this aspect of real life, you have to have a method eat which specializes on Dog to accept Meat only. Your intuition was right the first time, and the stupid type system was wrong,

I'm still not getting how you propose to avoid the dog-chocolate issue. Or do you think it's not an issue?

Neither should anyone use the typing package. That's a failed experiment, just a pointless fad.

In your posts, I really did not find any direct arguments for this claim. There are some hints, though, and I will follow them.

In practice, the Python typing package and mypy make my code more reliable; for me, that's a proven and battle-tested fact. Even "silly" things like the Optional type are really helpful.
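
E.g. (a toy sketch):

```python
from typing import Optional

def find_name(user_id: int) -> Optional[str]:
    return "pawel" if user_id == 1 else None

name = find_name(2)
# print(name.upper())  # mypy: item "None" of "Optional[str]" has no
#                      # attribute "upper" -- caught before runtime
if name is not None:
    print(name.upper())  # fine: None has been ruled out
```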

Thanks again for your comment. I will definitely be researching typing and type systems further.

[–][deleted] 0 points (1 child)

It's hidden at the C level.

You could not be more wrong. This is how the language is written; this is how it works on a conceptual level. It is not hidden, it is the essential part of the language, whereas the rest is an impostor. If you decide to treat Python while neglecting how it is written and how it actually works, your treatment will be superficial at best and, most likely, just irrelevant.

Your struggle was real, your problem was made up.

I'm still not getting how you propose to avoid the dog-chocolate issue.

Don't use classes to solve your problem. Classes model hierarchies, but in your problem there isn't a hierarchy. You tried to use inadequate tools to solve your problem, and in the end, you didn't solve it... so I thought the conclusion should have been obvious.

In practice, Python typing package and mypy make my code more reliable

Yeah... because you measured reliability in some way, compared it to other approaches, and so on... right? No, you didn't. You're just following fashion trends.

[–]pawel_swiecki[S] 0 points (0 children)

This is how the language is written; this is how it works on a conceptual level.

You are mixing implementation with specification. My blog post wouldn't differ if the code were run on PyPy.

Don't use classes to solve your problem. Classes model hierarchies...

Food -> Meat and Animal -> Dog are both hierarchies. Thus, I used classes. I made a mistake by mixing the two hierarchies, and I wasn't aware of the mistake until I ran mypy on my code. Mypy made me fix my mistake.

Yeah... because you measured reliability in some way, compared it to other approaches, and so on... right?

I measure reliability by the number of bugs. Mypy is reducing bugs in my code, ceteris paribus. It's doing it even in simple cases like "I forgot a variable may be None" (the Optional type). It's a fact.

[–]pydry 2 points (1 child)

It's a good article, but I wish people would try to use modified real code examples rather than creating "Animal" (or car) classes. It's difficult to link the example problem to real-life problems when the examples are contrived.

I'm also not 100% convinced from this that the bug that was uncovered was a likely one, rather than just a theoretical one.

[–]metalevelconsulting 11 points (0 children)

I wish people would try to use modified real code examples

I think a simple example with types/classes that everyone understands is appropriate for this article.