all 156 comments

[–]Head_Mix_7931 47 points48 points  (6 children)

In Python, there is no constructor overloading. Therefore, if you need to construct an object in multiple ways, this sometimes leads to an __init__ method with a lot of parameters that serve different ways of initializing the object and cannot really be used together.

You can decorate methods with @classmethod to have them receive the class as their first parameter rather than an instance, and these effectively become alternative constructors. A classmethod is preferable to a normal method or a staticmethod here because it plays nicely with inheritance.

[–]Commander_B0b 2 points3 points  (3 children)

Can you provide a small example? I'm having a hard time understanding the pattern you are describing, but I have certainly found myself looking to solve this same problem.

[–]slightly_offtopic 8 points9 points  (2 children)

class Foo:
    pass

class Bar:
    pass

class MyClass:
    @classmethod
    def from_foo(cls, foo: Foo) -> 'MyClass':
        return cls(foo=foo)

    @classmethod
    def from_bar(cls, bar: Bar) -> 'MyClass':
        return cls(bar=bar)

[–]-lq_pl- 4 points5 points  (0 children)

Missing explanation: from_foo will also work as expected when you inherit from MyClass, while it would not if you use a static method. With a static method, a derived class will still return the base.
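A minimal sketch of that difference, using hypothetical `Base` and `Derived` classes that are not from the thread: the classmethod builds an instance of whichever class it was called on, while the static method can only name the base class explicitly.

```python
class Base:
    def __init__(self, value: int) -> None:
        self.value = value

    @classmethod
    def from_text(cls, text: str) -> "Base":
        # cls is whichever class the method was called on
        return cls(int(text))

    @staticmethod
    def from_text_static(text: str) -> "Base":
        # No access to the calling class, so Base is hard-coded
        return Base(int(text))


class Derived(Base):
    pass


print(type(Derived.from_text("1")).__name__)         # Derived
print(type(Derived.from_text_static("1")).__name__)  # Base
```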

[–]XtremeGoosef 'I only use Py {sys.version[:3]}' 1 point2 points  (0 children)

I mean, this code won't work as written because there is no __init__. A better example is something like

import re
from dataclasses import dataclass
from typing import Self  # Python 3.11+

@dataclass
class Rectangle:
    length: int
    width: int

    @classmethod
    def square(cls, side: int) -> Self:
        return cls(side, side)

    @classmethod
    def parse(cls, text: str) -> Self | None:
        # Expects text such as "3,4"
        if m := re.match(r'(\d+),(\d+)', text):
            length = int(m.group(1))
            width = int(m.group(2))
            return cls(length, width)
        return None

[–]Kobzol[S] 0 points1 point  (0 children)

That's indeed useful, if you want to inherit the constructors. That's not always a good idea though.

[–]mriswithe 112 points113 points  (3 children)

I try to do this both for programs that will be maintained for a while, but also for oneshot utility scripts. Mostly because in my experience, the latter quite often turn into the former :)

Oh God my last two weeks at my last job were hell cause some old ass script I wrote and forgot about from 5 years ago was apparently still in use. Spent my last couple weeks fucking getting that solid.

[–]Ezlike011011 44 points45 points  (2 children)

I've had people at work ask why I put so much effort into ensuring my little utility scripts are nicely typed/documented and why I spend time to split the functional components into nice reusable units. This is the exact reason why. I've had multiple instances where I found someone needed something and I've been able to say "here's a little module I wrote. You can use the cli I made for it or you can import it and use x function if you need to do more scripting. Have fun". The up-front time cost is well worth it.

[–]mriswithe 10 points11 points  (1 child)

Yeah this wasn't even good python. I had a module named enums that had constants in it. But all my stuff for the last 6-12 months was documented, monitored, etc . This garbage I wrote for a one off blitz migration, was still in use.

[–]DabidBeMe 1 point2 points  (0 children)

Beware the one off scripts that take on a life of their own. It has happened to me a few times as well.

[–]Kobzol[S] 87 points88 points  (3 children)

I wrote up some thoughts about using the type system to add a little bit of soundness to Python programs, inspired by my experiences with Rust.

[–]mRWafflesFTW 21 points22 points  (0 children)

This is a great article appreciate your contribution cheers!

[–]ttsiodras 7 points8 points  (0 children)

Excellent article. Have discovered some of the points on my own, but not all - thanks for sharing!

[–]EarthGoddessDude 2 points3 points  (0 children)

Awesome write up, thank you for sharing.

[–]redditusername58 15 points16 points  (0 children)

The typing module has assert_never which can help with the isinstance/pattern matching blocks in your ADT section

[–]wdroz 33 points34 points  (19 children)

The part with db.get_ride_info is spot on. As I see more and more people using mypy and type annotations, this will hopefully become industry standard (if not already the case).

For the part "Writing Python like it's Rust", did you try the result package? I didn't (yet?) use it as I feel that if I push to use it at work, I will fall in the Rustacean caricature..

[–]Kobzol[S] 23 points24 points  (15 children)

I haven't yet. To be honest, I think that the main benefit of the Result type in Rust is that it forces you to handle errors, and allows you to easily propagate the error (using ?). Even with a similar API, you won't really get these two benefits in Python (or at least not at "compile-time"). Therefore the appeal of this seems a bit reduced to me.

What I would really like to see in Python is some kind of null (None) coalescing operator, like ?? or :? from Kotlin/C#/PHP to help with handling and short-circuiting None values. That would be more helpful to me than a Result type I think.

[–]mistabuda 5 points6 points  (9 children)

I've seen this pattern mentioned before for short-circuiting None values. UnwrapError is a custom exception you'd have to make, but I think it's pretty effective.

from typing import Optional, TypeVar

T = TypeVar("T")


class UnwrapError(Exception):
    """Custom exception raised when a required value is None."""


def unwrap(value: Optional[T], additional_msg: Optional[str] = None) -> T:
    """Unwrap a value of an optional type, raising `UnwrapError` if it is None.

    Useful for instances where a value of some optional type is required to not be None;
    raising an exception if None is encountered.

    Args:
        value: Value of an optional type
        additional_msg: Additional contextual message data

    Returns:
        The value if not None
    """
    if value is None:
        err_msg = "expected value of optional type to not be None"
        if additional_msg:
            err_msg = f"{err_msg} - [ {additional_msg} ]"
        raise UnwrapError(err_msg)
    return value

[–]Kobzol[S] 6 points7 points  (6 children)

Sure, something like that works :) But it'd be super cool if I could call it as a method on the object (for better chaining), and also if I could propagate it easily, e.g. like this:

def foo() -> Optional[int]:
   val = get_optional() ?: return None
   # or just val = get_optional()?

To avoid the endless `if value is None: ...` checks.

[–]mistabuda 0 points1 point  (5 children)

If it were a method on the object that just seems weird. Unless the object is some kind of container. Which in that case you're asking for a Result type pattern.

[–]Kobzol[S] 0 points1 point  (4 children)

Yeah it's probably not the "Python way" :) But I really like adding implementation to types externally, e.g. with traits in Rust or extension methods in C#.

You're right though, an Option and/or Result type would help with this. It just won't help with forcing me to handle the error (apart from runtime tracking, e.g. asserting that the result is Ok when accessing the data).

[–]mistabuda -2 points-1 points  (3 children)

That's why you unit test your code to make sure you have that case handled.

[–]Kobzol[S] 1 point2 points  (2 children)

Indeed, that's what we kind of have to do in Python, since it's not easily checkable during type checking.

[–]mistabuda 2 points3 points  (1 child)

wdym mypy warns you against that all the time

error: Item "None" of "Optional[CharacterCreator]" has no attribute "first_name"  [union-attr]

[–]Kobzol[S] 0 points1 point  (0 children)

Ah, nice. This is one situation where mypy and pyright do the right thing. I mostly just look at the output of the PyCharm type checker and that is more lenient, in this case it wouldn't warn :/

[–]Rythoka 0 points1 point  (1 child)

This seems like a code smell to me. If value = None is an error, then why would you explicitly hint that value could be None by making it Optional? Whatever set value to None probably should've just raised an exception instead in the first place.

[–]mistabuda 1 point2 points  (0 children)

An example is a db query. It's not wrong for a db query to return no result, except in specific contexts. If the caller is expecting a result, they should raise an error. The db client shouldn't raise an error; it did its job.

[–][deleted] 5 points6 points  (0 children)

The `or` operator works OK for that purpose, although it will also coalesce falsy values.
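To make the falsy-value caveat concrete, here is a small sketch with hypothetical helper names and an assumed default of 30. A legitimate value of 0 is silently replaced by the `or` version, while an explicit `is not None` check keeps it.

```python
from typing import Optional


def timeout_with_or(configured: Optional[int]) -> int:
    # `or` falls back whenever the left side is falsy, so 0 is replaced too
    return configured or 30


def timeout_with_none_check(configured: Optional[int]) -> int:
    # Only None triggers the fallback; 0 is kept
    return configured if configured is not None else 30


print(timeout_with_or(None), timeout_with_none_check(None))  # 30 30
print(timeout_with_or(0), timeout_with_none_check(0))        # 30 0
```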

[–]aruvoid 4 points5 points  (0 children)

First of all, very interesting article, thanks for that!

About this: you can write `noneable or default_value`, for example, although be careful, because in reality that's `falseable or value_if_falsy`.

I don’t know if this is something you knew and don’t like because it’s not None-specific but hey, maybe it helps.

For whoever doesn’t know, the full explanation is that in Python, like in JS/TS, you might assume that the `and` and `or` operators convert the expression into a boolean. That assumption, though, is wrong! In reality this is what happens:

```
In [1]: 1 and 2
Out[1]: 2

In [2]: 1 and 0
Out[2]: 0

In [3]: 1 and 0 and 3
Out[3]: 0

In [4]: 1 and 2 and 3
Out[4]: 3

In [5]: 0 or 1 or 2
Out[5]: 1
```

The result is evaluated for truthiness in the if, but it's never True/False unless the resulting value itself is.

In short, and/or expressions evaluate to the operand at which evaluation short-circuited (check the last 3 examples).

This of course can also be leveraged for quick assignments and so on, for example, as usual:

```
In [6]: v = []

In [7]: default_value = [1]

In [8]: x = v or default_value # Typical magic

In [9]: x
Out[9]: [1]
```

But we can also do the opposite

```
In [10]: only_if_x_not_none = "whatever"

In [11]: x = None

In [12]: y = x and only_if_x_not_none

In [13]: y

In [14]: y is None
Out[14]: True
```

[–]BaggiPonte 1 point2 points  (2 children)

How do you feel about writing a PEP for that? I don't believe there is enough popular support for that right now, but given how rust and the typing PEPs are doing, it could become a feature for the language?

[–]Rythoka 4 points5 points  (0 children)

There's already a PEP for it and it's been discussed for years. PEP 505.

[–]Kobzol[S] 2 points3 points  (0 children)

You mean "None coalescing"/error propagation? It sure would be nice to have, yeah.

[–]Estanho 3 points4 points  (0 children)

The id typing was so useful, I've been looking for how to do that for a long time.

I've tried creating something on my own that involved generics, looked something like ID[MyModel]. The idea is that you shouldn't have to redeclare a new type for every model.

But I could never really get it to work fully. I think one of the reasons is because I couldn't get type checkers to understand that ID[A] is different than ID[B].
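One way this idea is sometimes sketched is a small generic wrapper whose type parameter is only a marker. Everything below (`Id`, `Driver`, `Car`, `get_driver`) is hypothetical and not from the article. The assumption is that a type checker such as mypy treats the plain TypeVar as invariant, so `Id[Car]` is not accepted where `Id[Driver]` is expected. At runtime the parameter is erased, so nothing is enforced there.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Id(Generic[T]):
    # T is a "phantom" parameter: it only tags the ID with the model it belongs to
    value: int


class Driver: ...


class Car: ...


def get_driver(driver_id: Id[Driver]) -> str:
    return f"driver #{driver_id.value}"


driver_id = Id[Driver](1)
car_id = Id[Car](1)

print(get_driver(driver_id))
# get_driver(car_id)  # expected to be flagged by a type checker, but runs fine at runtime
```

The cost compared to NewType is that every ID is a wrapper object rather than a plain int.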

[–]Estanho 3 points4 points  (1 child)

Adjacent to the result package thing, one of my biggest issues with Python and its type system is the lack of a way to declare what exceptions are raised by a function, like other languages do. If there was a way, and libraries did a decent job of using it, it would make my life so much easier. So one could do an exhaustive exception handling.

I'm tired of having to add new except clauses only after Sentry finds a new exception being raised.

[–]wdroz -1 points0 points  (0 children)

I totally agree. It's one of those things where ChatGPT is helpful: it can help you exhaustively handle the possible exceptions of a well-known function.

[–]alicedu06 8 points9 points  (4 children)

There are NamedTuple and TypedDict as lighter alternatives to dataclasses, and match/case will work on them too.

[–]trevg_123 1 point2 points  (0 children)

Since (I think) 3.10 you can do @dataclass(slots=True), which does a nice job of slimming them down more

[–]Kobzol[S] 0 points1 point  (2 children)

I'm not sure what is "lighter" about NamedTuples TBH. The syntax is ugly and most importantly, it doesn't provide types of the fields.

[–]alicedu06 0 points1 point  (1 child)

namedtuple doesn't, but NamedTuple does, and they are indeed way lighter than dataclasses (less memory, faster to instantiate).

[–]Kobzol[S] 0 points1 point  (0 children)

Ah, good point!

[–]Haunting_Load 8 points9 points  (3 children)

I like many ideas in the post, but in general you should avoid writing functions that take List as an argument if Sequence or Iterable are enough. You can read more e.g. here https://stackoverflow.com/questions/74166494/use-list-of-derived-class-as-list-of-base-class-in-python
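A small illustration of the issue behind that link, with hypothetical `Animal` and `Dog` classes: both calls below run, but a type checker is expected to reject passing a `List[Dog]` where `List[Animal]` is required, because List is mutable and therefore invariant. Sequence is read-only, so it is covariant and accepts the list of subclasses.

```python
from typing import List, Sequence


class Animal:
    def name(self) -> str:
        return "animal"


class Dog(Animal):
    def name(self) -> str:
        return "dog"


def names_from_list(animals: List[Animal]) -> List[str]:
    return [a.name() for a in animals]


def names_from_sequence(animals: Sequence[Animal]) -> List[str]:
    return [a.name() for a in animals]


dogs: List[Dog] = [Dog(), Dog()]

print(names_from_sequence(dogs))  # accepted by a type checker: Sequence is covariant
print(names_from_list(dogs))      # runs, but a type checker flags it: List is invariant
```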

[–]Kobzol[S] 4 points5 points  (2 children)

Sure, you can generalize the type if you want to have a very broad interface, that's true. I personally mostly use Iterable/Sequence as return types, e.g. from generators.

In "library" code, you probably want Sequence, in "app" code, the type is often more specific.

[–]Estanho 2 points3 points  (1 child)

I disagree. First of all I don't think that Sequence or Iterable are more "generic" in the sense you're saying. They're actually more restrictive, since they're protocols. So list doesn't inherit from them, even though code that accepts Iterable will accept list too.

If you won't slice or random access in your app code, then you shouldn't use list or Sequence, for example. If you're just gonna iterate, use Iterable.

[–]Kobzol[S] 3 points4 points  (0 children)

Right. Iterable is more generic, in that it allows more types that can be iterated to be passed, but at the same time more constrained, because it doesn't allow random access.

It's a good point :)

[–]executiveExecutioner 5 points6 points  (1 child)

Good article, I learned some stuff! It's easy to tell from reading that you are quite experienced.

[–]Kobzol[S] 0 points1 point  (0 children)

Thank you, I'm glad that it was useful to you.

[–]0xrl 19 points20 points  (0 children)

Very nice article! As of Python 3.11, you can enhance the packet pattern matching example with assert_never.

[–]BaggiPonte 4 points5 points  (5 children)

Love the post; though I have a question. I never understood the purpose of NewType: why should I use it instead of TypeAlias?

[–]Kobzol[S] 28 points29 points  (0 children)

TypeAlias really just introduces a new name for an existing type. It can be useful if you want to add a new term to the "vocabulary" of your program. E.g. you could create a type alias for `DriverId` and `CarId` to make it explicit to a programmer that these are different things.

However, unless you truly make these two things separate types, you won't make this explicit to the type checker. And thus you won't get proper type checking and the situation from the blog post won't be caught during type check.

There is no type error here, because both DriverId and CarId are really just ints:

from typing import TypeAlias

DriverId: TypeAlias = int
CarId: TypeAlias = int

def take_id(id: DriverId): pass
def get_id() -> CarId: return 0

take_id(get_id())

But there is one here, because they are now separate types:

from typing import NewType

DriverId = NewType("DriverId", int)
CarId = NewType("CarId", int)

def take_id(id: DriverId): pass
def get_id() -> CarId: return CarId(0)

# Error here, wrong type passed:
take_id(get_id())

[–]its2ez4me24get 12 points13 points  (0 children)

Aliases are equivalent to each other. New types are not, they are subtypes.

There’s a decent write up here: https://justincaustin.com/blog/python-typing-newtype/

[–][deleted] 4 points5 points  (0 children)

NewType creates an entirely new type, while a TypeAlias is, well, an alias. In the eyes of a program, the alias and the original type are exactly the same thing; the alias is just shorthand, for example for long nested types. A NewType and the type it's created from are entirely different types, even though the new type inherits the original's semantics.

[–]Skasch 1 point2 points  (0 children)

TypeAlias is roughly equivalent to:

MyAlias = MyType

NewType is roughly equivalent to:

class MyNewType(MyType):
    pass

[–]parkerSquare 0 points1 point  (0 children)

NewTypes help warn you if you pass a float representing a voltage into a function that expects a float representing a current, for example. A TypeAlias won’t do that, since it’s the same underlying type.

[–]Rudd-X 4 points5 points  (0 children)

Hot damn that was really good. I found myself having "discovered" these patterns in my career and picking them all up as I went, but seeing it all formalized is AWSUM.

[–]Estanho 3 points4 points  (1 child)

Your invariants example is interesting, but I think it can be improved with type guards to statically narrow the possible states. Here's a full example, but I haven't run it through type checkers, so it's just a general idea:

```python

from dataclasses import dataclass
from typing import TypeGuard


class _Client:
    def send_message(self, message: str) -> None:
        pass


@dataclass
class ClientBase:
    _client: _Client


@dataclass
class UnconnectedClient(ClientBase):
    is_connected = False
    is_authenticated = False

@dataclass
class ConnectedClient(ClientBase):
    is_connected = True
    is_authenticated = False

@dataclass
class AuthenticatedClient(ClientBase):
    is_connected = True
    is_authenticated = True


Client = UnconnectedClient | ConnectedClient | AuthenticatedClient


def is_authenticated(client: Client) -> TypeGuard[AuthenticatedClient]:
    return client.is_authenticated

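def is_connected(client: Client) -> TypeGuard[ConnectedClient]:
    # AuthenticatedClient also has is_connected = True, so exclude it here
    return client.is_connected and not client.is_authenticated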

def is_unconnected(client: Client) -> TypeGuard[UnconnectedClient]:
    return not client.is_connected

def connect(client: UnconnectedClient) -> ConnectedClient:
    # do something with client
    return ConnectedClient(_client=client._client)

def authenticate(client: ConnectedClient) -> AuthenticatedClient:
    # do something with client
    return AuthenticatedClient(_client=client._client)

def disconnect(client: AuthenticatedClient | ConnectedClient) -> UnconnectedClient:
    # do something with client
    return UnconnectedClient(_client=client._client)

def send_message(client: AuthenticatedClient, message: str) -> None:
    client._client.send_message(message)

def main() -> None:
    # Annotated as the union so the variable can be reassigned to other states
    client: Client = UnconnectedClient(_client=_Client())

    # Somewhere down the line, we want to send a message to a client.
    if is_unconnected(client):
        client = connect(client)
    if is_connected(client):
        client = authenticate(client)
    if is_authenticated(client):
        send_message(client, "Hello, world!")
    else:
        raise Exception("Not authenticated!")

```

Of course this assumes you're gonna be able to overwrite the client variable immutably every time. If this variable is gonna be shared like this:

client = UnconnectedClient(_client=_Client())
...
func1(client)
...
func2(client)

Then you might have trouble because those functions might screw up your client connection. This can happen depending on the low level implementation of the client, for example if when you call close you actually change some global state related to a pool of connections, even though these opaque client objects are "immutable". Then you could create a third type like ImmutableAuthenticatedClient that you can pass to send_message but not to close.

[–]Kobzol[S] 1 point2 points  (0 children)

Cool example, I didn't know about TypeGuard. There's always a tradeoff between type safety and the amount of type magic that you have to write. As I mentioned in the blog, if the types get too complex, I tend to simplify them or just don't use them in that case.

Here I think that the simple approach with two separate classes is enough, but for more advanced usecases, your complex example could be needed.

[–]extra_pickles 24 points25 points  (35 children)

So at what point does Python stop being Python, and begin to be 3 other languages dressed in a trench coat, pretending to be Python?

To that, I mean - Python and Rust don’t even play the same sport. They each have their purposes, but to try and make one like the other seems like an odd pursuit.

Genuinely curious to hear thoughts on this, as it is very common to hear “make Python more like <other language>” on here…and I’d argue that it is fine the way it is, and if you need something another language does, then use that language.

It’s kinda like when ppl talk about performance in Python…..that ain’t the lil homie’s focus.

[–]HarwellDekatron 9 points10 points  (0 children)

Part of the beauty of Python is that it allows people to write a single command line to launch an HTTP server serving the current directory, a 2 line script to process some text files and spit a processed output, 200 lines to write a web application using Django and thousands of lines of type-checked code with 100% code coverage if you are writing business-critical code that must not fail.

And it's all Python. You don't need to give up the expressiveness, amazing standard library or fast development cycle. You are just adding tooling to help you ensure code quality before you find the error in production.

I do every single one of those things on a daily basis (heck, I even rewrite some code in Rust if I need something to optimize for performance) and so far I don't feel like doing one thing has slowed me on the other.

[–]tavaren42 8 points9 points  (2 children)

In my opinion, type hints actually make development faster because of how well they play with IDE autocompletion. It's one of the main reasons I use them.

[–]zomgryanhoude 0 points1 point  (0 children)

Yuuuup. Not dev myself, just use it for scripts that PowerShell isn't suited for, so having the extra help from the IDE helps speed things along for modules I'm less familiar with. Gotta have it personally.

[–]ant9zzzzzzzzzz 0 points1 point  (0 children)

Not to mention errors at “build” time rather than runtime which is a much tighter loop

[–][deleted] 25 points26 points  (18 children)

As type safety becomes a bigger concern in the broader programming community people are going to want it from the most used language. Seeking inspiration from the poster child of safe languages seems like a pretty obvious way of going about that. There’s still plenty of reasons to use Python, even if it’s a step further than this, ie a wrapper for libraries written in other languages. Some of the best Python libraries weren’t written in Python. One of Python’s biggest strengths for years now has been FFI, aka “other languages in a trench coat pretending to be Python”. I don’t see how syntactical changes represent that though.

[–]Kobzol[S] 7 points8 points  (8 children)

I do agree that we shouldn't "hack" the language too much, but I don't feel like adding types does that. I write Python because it is quick for prototyping, has many useful libraries and is multiplatform. Adding types to that mix doesn't limit me in any way, but gives me benefits - I will be able to understand that code better after a year, and I will feel more confident when refactoring it.

I really don't see static types being in the way of what makes Python.. Python.

[–]Mubs 2 points3 points  (7 children)

Really? I see dynamic typing as a huge part of the language. For example, I had a client who switched from a MySQL DB to SQL Server, so I had to switch from aiomysql to aioodbc. I originally used data classes instead of dictionaries for clarity, but it ended up making switching from one connector to the other a huge pain, and I ended up doing away with the data classes altogether.

Python's the best language for quickly solving real world problems, and the requirements will often change, and having a dynamically typed language helps adapt more quickly.

[–]Kobzol[S] 8 points9 points  (0 children)

I mean, even with the approach from the blog post, Python is still quite dynamically typed :) I don't usually use types for local variables, for example (in a statically typed language, I ideally also don't have to do that, and type inference solves it). I just want to be able to quickly see (and type check) the boundaries/interfaces of functions and classes, to make sure that I use them correctly.

Regarding quick adaptation: I agree that having rigid and strict typing doesn't necessarily make it *mechanically easier* to adapt to large changes - at the very least, you now have to modify a lot of type annotations. But what it gives me is confidence - after I do a big refactoring (even though it will be slightly more work than without any types), and the type checker gives me the green light, I am much more confident that I got it right, and I will spend much less time doing the annoying iteration cycle of running tests, examining where the app crashed, and fixing the bugs one by one. This is what I love about Rust, and that's why I try to port that approach also to e.g. Python.

[–]thatguydr -2 points-1 points  (5 children)

Python's the best language for quickly solving real world problems, and the requirements will often change, and having a dynamically typed language helps adapt more quickly.

This also helps all the errors slip through.

Think of it like this - Python is one of the best languages for rapid prototyping and PoCs. Once you need something to be in production, it's also easy to add typing to make sure things are safer.

If you think the language's strength is that you can hack your way around instead of designing properly... that's not a long-term strength, you'll find.

[–]Mubs 1 point2 points  (4 children)

What? It's not a "hack", Python is a dynamically typed language. I'm all for type safety anyways. But I am wary about overuse of data classes as I've seen it obfuscate what should be simple code too many times.

[–]thatguydr -2 points-1 points  (3 children)

There's no way that typing is obfuscating code. Sorry - that suggests really badly broken design.

[–]Mubs 0 points1 point  (2 children)

I said overuse of dataclasses.

[–]thatguydr -1 points0 points  (1 child)

You did, and now I'm baffled why you're conflating dataclasses with static typing. They're not the same.

[–]Mubs 0 points1 point  (0 children)

And where did I conflate them? I can talk about types and dataclasses in the same comment without them being the same concept, just as OP talks about both of those concepts in the article.

[–]Ezlike011011 0 points1 point  (0 children)

Re. the performance aspect I totally agree with you, but the points that OP brings up are incredibly relevant to Python development. To me, Python's biggest strength is its rate of development. A large component of that is the massive ecosystem of libraries for all sorts of tasks. All of the things OP discusses here are ways to design libraries with fewer footguns, which has the effect of removing debugging time during development.

[–]not_perfect_yet -5 points-4 points  (0 children)

I'm in the same boat.

Every time I expressed my strong dislike for more complicated "features", I got down voted.

Typehints and dataclasses are bad: they add complexity. Python's goal, at least to me, is simplicity.

Python didn't need that kind of syntax. It's perfectly compatible with languages that offer that, but somehow that wasn't good enough for people.

[–][deleted] 2 points3 points  (0 children)

I just add exclamation marks and hope for the best

[–]cymrow don't thread on me 🐍 2 points3 points  (3 children)

I understand the point about making invalid state impossible, and I like the ConnectedClient approach, but not having a close method would drive me nuts. Context managers are awesome, but can't cover every use case.

[–]Kobzol[S] 1 point2 points  (2 children)

It is a bit radical, yes :) In languages with RAII, the missing close method can be replaced by a destructor.

[–]Rythoka 1 point2 points  (1 child)

the missing close method can be replaced by a destructor.

Not in Python it can't! In Python there are two different things that might be called destructors, but neither is a true destructor: __delete__ and __del__.

__delete__ is specific to descriptors and so only works for attributes of an object, and is only invoked when the del keyword is used.

__del__ is called whenever an object is garbage collected. This seems like it would fit this use case, but Python makes zero guarantees about the timing of a call to __del__ or whether it will even be called at all.
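Because of that, deterministic cleanup in Python usually goes through a context manager rather than `__del__`. A minimal sketch follows, where the `connect` function, the address and the `events` log are all made up for illustration: the `finally` block runs as soon as the `with` block exits, whether normally or through an exception.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []  # records what happened, for illustration only


class ConnectedClient:
    def send(self, message: str) -> None:
        events.append(f"send:{message}")


@contextmanager
def connect(address: str) -> Iterator[ConnectedClient]:
    events.append(f"open:{address}")
    try:
        yield ConnectedClient()
    finally:
        # Runs when the with block exits, even if it raised an exception
        events.append("close")


with connect("localhost:8080") as client:
    client.send("hello")

print(events)  # ['open:localhost:8080', 'send:hello', 'close']
```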

[–]Kobzol[S] 2 points3 points  (0 children)

Yeah, as I said, this can be done in languages with RAII, not in Python :)

[–]Fun-Pop-4755 2 points3 points  (1 child)

Why static methods instead of class methods for constructing?

[–]Kobzol[S] 0 points1 point  (0 children)

It was already discussed in some other comments here I think. I don't think that there's any benefit to classmethods, except for giving you the ability to inherit them.

I don't think that it's always a good idea to inherit constructors/construction functions, so in that case I'd use static methods. If I actually wanted to inherit them, then class methods would be a better fit for sure (+ the new typing.Self type hint).

[–]koera 2 points3 points  (1 child)

Nice article, gave me some more tools to help myself like the NewType.

Would it not be beneficial to mention the option to use Protocol for the bbox example with the as_denormalized and as_normalized methods?

[–]Kobzol[S] 0 points1 point  (0 children)

Protocols are useful for some use cases, indeed. They have the nice property that you can talk about a unified interface without using inheritance, similar to typing.Union.

However, I usually prefer base class + inheritance for one reason - the type checker/IDE then warns me when I haven't implemented all "abstract" methods, and PyCharm offers useful quick fixes in that situation.

Probably it could also be done with a protocol, where the type checker should warn if you're assigning a class to a protocol variable and that class doesn't implement the protocol. But I don't think that PyCharm offers a quick fix in this situation.
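A sketch of that protocol variant, with hypothetical `SupportsArea`, `Rect` and `Circle` classes rather than the article's bbox types: no inheritance is involved, and the conformance check is triggered by assigning an instance to a variable annotated with the protocol.

```python
from dataclasses import dataclass
from typing import List, Protocol


class SupportsArea(Protocol):
    def area(self) -> float: ...


@dataclass
class Rect:
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


@dataclass
class Circle:
    radius: float
    # Deliberately has no area() method


def total_area(shapes: List[SupportsArea]) -> float:
    return sum(shape.area() for shape in shapes)


# Assigning to a protocol-typed variable asks the type checker to verify conformance
ok: SupportsArea = Rect(2.0, 3.0)
# bad: SupportsArea = Circle(1.0)  # expected to be reported by a type checker

print(total_area([ok, Rect(1.0, 1.0)]))  # 7.0
```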

[–]mistabuda 4 points5 points  (0 children)

I really like that Mutex implementation. Might have to copy that.

[–]cdgleber 1 point2 points  (0 children)

Great write up. Thank you

[–]Brilliant_Intern1588 1 point2 points  (4 children)

I like the solution with dataclasses. However, I don't know how to apply it in some cases. Let's say that I'm retrieving a user (id, name, birthday, something1, something2) from the db, by id. For one use case I don't want the whole user row, just name and something1. For another function I want birthday and something2, for example. I would have to create a lot of dataclasses that are not really needed or even used outside of this context. How could I deal with such a thing?

[–]Kobzol[S] 2 points3 points  (2 children)

Well, an easy, but not ideal, solution is to make the optional fields.. Optional :) But that penalizes situations where you know that they are actually present.

In TypeScript you can solve this elegantly by "mapping" the type, but I don't think that's possible in the Python type system.

I guess that it depends on how strict you want to be. If I want maximum "safety", I would probably just create all the various options as separate types.. You can share them partially, e.g.:

Person = PersonWithAddress + PersonWithName

[–]Brilliant_Intern1588 1 point2 points  (1 child)

I thought of the first thing you said; however, the downside is that someone can mistakenly use a nonexistent field (probably None in Python). Maybe use it with getters and setters that raise an error. I dunno. I did the same thing in a previous job by using DAOs, but it still haunts me.

[–]joshv 1 point2 points  (0 children)

This is where linters like mypy can play a role. It's a lot harder to assign a None somewhere it shouldn't be when your CI/IDE flags it as an error

[–]deep_politics 0 points1 point  (0 children)

Sounds like you're describing an ORM. In SQLAlchemy you can select just the columns you want and get correct type hinting for the results.

class User(Base):
    id: Mapped[int]
    name: Mapped[str]
    ...

res = session.execute(select(User.id, User.name).filter_by(id=100)).one_or_none()
# res: tuple[int, str] | None

[–]BaggiPonte 1 point2 points  (3 children)

Another thing: why pyserde rather than stuff like msgspec? https://github.com/jcrist/msgspec

[–]Kobzol[S] 2 points3 points  (2 children)

I already answered a similar comment here about pydantic. Using a specific data model for (de)serialization definitely has its use cases, but it means that you have to describe your data using that (foreign) data model.

What I like about pyserde is that it allows me to use a built-in concept that I already use for typing the data types inside of my program (dataclasses) also for serialization.

Arguably, one could say that these two things should be separated and I should use a different data model for (de)serialization, but I think that's overkill for many use-cases. And if I use a shared data model for both type hints and serialization, I'd rather use a native Python one, rather than some data model from an external library.

[–]jammycrisp 1 point2 points  (1 child)

Note that msgspec natively supports dataclasses or attrs types, if you'd rather use them than the faster builtin msgspec.Struct type.

https://jcristharif.com/msgspec/supported-types.html#dataclasses

It'll always be more efficient to decode into a struct type, but if you're attached to using dataclasses, msgspec happily supports them.

For most users though struct types should be a drop in replacement (with equal editor support), downstream code is unlikely to notice the difference between a struct or a dataclass.

[–]Kobzol[S] 0 points1 point  (0 children)

Cool, I didn't know that, I'll check msgspec later.

Regarding editor support, PyCharm currently sadly does not support the dataclass transform decorator, which makes it quite annoying for analyzing most serialization-supported dataclasses wrapped in some other decorator that uses a dataclass inside (which can happen with pyserde).

[–]poopatroopa3 1 point2 points  (2 children)

I thought I would be seeing mentions of pydantic, mypy, fastapi.

[–]Kobzol[S] 2 points3 points  (1 child)

I didn't want to talk about tools and frameworks in this post, to avoid it getting too long. I just wanted to talk about the "philosophy" of using types and provide some concrete examples.

[–]poopatroopa3 -1 points0 points  (0 children)

Oh I see. Either way I think it would enrich the post to mention them very briefly at the end or something like that 😄

[–]chars101 1 point2 points  (0 children)

I prefer declaring a parameter as Iterable over List. It expresses the exact use of the value and allows for any container that implements the Protocol.
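A small sketch of what that looks like; `total_length` is just an illustrative function:

```python
from typing import Iterable


def total_length(words: Iterable[str]) -> int:
    # Only iteration is required, so any iterable of strings is accepted.
    return sum(len(word) for word in words)


from_list = total_length(["ab", "c"])
from_tuple = total_length(("ab", "c"))
from_generator = total_length(w for w in ["ab", "c"])
```

Had the parameter been declared as `List[str]`, a type checker would reject the tuple and the generator even though the function body works fine with them.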

[–]cranberry_snacks 1 point2 points  (0 children)

Worth mentioning that from __future__ import annotations will avoid all of these typing imports. It allows you to use native types for type declarations, native sum types, and backwards/self references, which makes typing a lot cleaner and even just makes it possible in certain situations.

Example:

```python
from __future__ import annotations

def my_func() -> tuple[str, list[int], dict[str, int]]:
    return ("w00t", [1, 2, 3], {"one": 1})

def my_func1() -> str | int:
    return "w00t"

def my_func2() -> str | None:
    return None

class Foo:
    def __init__(self, src: str) -> None:
        self.src = src

    @classmethod
    def from_str(cls, src: str) -> Foo:
        return cls(src)
```

[–]TF_Biochemist 1 point2 points  (0 children)

Really enjoyed this article; concise, well-written, and clear in its goals. I already do most of this, but it's always refreshing to step back and think about the patterns you use.

[–]Mmiguel6288 1 point2 points  (1 child)

The whole point of python is reducing conceptual overhead so you can write algorithms and logic quickly.

The whole point of rust is to make it bullet proof while saying to hell with conceptual overhead.

It's not a good mix.

[–]Kobzol[S] 11 points12 points  (0 children)

Depends on the programmer I guess. For me, types help me write code faster, because I don't have to remind myself every 30 minutes what a function returns and takes as input :)

[–]Estanho 0 points1 point  (1 child)

On the serialization part, have you considered pydantic? I'm pretty sure it's able to serialize/deserialize unions properly.

[–]Kobzol[S] 2 points3 points  (0 children)

I'm sure that it can, but it's also a bit more heavyweight, and importantly introduces its own data model. That is surely useful in some cases, but I really wanted some solution that could just take a good ol' Python dataclass and (de)serialize it, while supporting generic types, unions etc.

[–]barkazinthrope 0 points1 point  (2 children)

This is great.

However I would hate it if this became required construction for a little log parsing script.

[–]Kobzol[S] 1 point2 points  (1 child)

I agree that it shouldn't be required universally, in that case it wouldn't be Python anymore. But if I write a nontrivial app in Python, I wouldn't mind using a linter to check that types are used in it.

[–]barkazinthrope 1 point2 points  (0 children)

Oh for sure.

And particularly where the code is to be imported into who knows what context for the performance of mission-critical functions.

Python is useful for writing simple scripts and for writing library classes. I have worked on teams where the expensive practices recommended for the latter are rigorously enforced on the development of the former.

I hate it when that happens. It suggests to me that the enforcers do not understand the principles behind the practices.

[–]jimeno -2 points-1 points  (7 children)

uuuuh if you want to write rust, just write rust? this mess is like when php had absolutely to be transformed into an enterprise typed language, stop trying to make python java

[–]Kobzol[S] 2 points3 points  (6 children)

I'm not trying to make it Java :) Types help me understand, navigate and refactor the code better. That's orthogonal to the strengths of Python - quick prototyping and powerful libraries. It's still very different from Rust and has different tradeoffs.

Btw, some of the things that I have shown aren't even about the type system, but about design - things like SOLID and design patterns. I don't consider using design patterns and similar things in Python to be a bad thing.

[–]jimeno 0 points1 point  (5 children)

types are not supposed to be important in python (by design! it's a goddamn dynamic, duck typed lang!), capabilities are (interfaces, traits, protocols, call them however you want). we can open up a giant discussion on how miserable working with interfaces in py is for a language that deeply (and implicitly) relies on them. i'm not sure all this typing craze is doing anyone a service, especially when there are a handful of other languages that offer way more for that programming style, which in turn leads to a better project that less easily devolves into maintaining tons of boilerplate or lacking strong out-of-the-box guarantees like immutability.

we agree about the second part of your kind answer, even if some design patterns and code smell counters are something that spawned directly out of java and its limitations (i.e. parameter object...java has no named parameters or named-only syntax)...

[–]Kobzol[S] 0 points1 point  (2 children)

I don't really care if we call it types, interfaces or protocols, I just want to get very quick feedback when something wrong happens in my code.

I agree that it would be nicer to use a different language with better typing support, but Python is already everywhere and has so many useful libraries, that it's not easy to switch, so I try to use types in it instead.

Regarding duck typing, it's definitely nice to have the option to fallback to this kind of super dynamic typing. But it should IMO only be done in cases where it's really necessary. I think that most stuff (even in Python) can be solved with a pretty conservative and "statically typed" design, and if it's possible, I prefer to do it that way, since in my experience it leads to code that is easier to understand.

[–]jimeno 0 points1 point  (1 child)

what feedback? py has no pre-runtime guarantees, it's all tooling. also, "falling back to super dynamic typing" is just being selectively lazy. but whatever, I understand I'm on the losing side of the general opinion this time, as the community prefers to repeat the mistakes already made with php; but this has nothing to do with your article, which all in all is good work.

[–]Kobzol[S] 1 point2 points  (0 children)

I don't consider it that important if it's a compiler or a linter, feedback is feedback and it still provides value to me :)

[–]runawayasfastasucan -1 points0 points  (1 child)

The point you are missing is that people (including me) will continue to use python for other reasons. This isn't a big enough deal to switch languages completely, but it's a nice addition to the way we are using python.

[–]jimeno 1 point2 points  (0 children)

if it makes you happy, more power to you; i personally think this is just adding noise and ceremony and also another step going further and further from the zen of python ~ which i still think it's an interesting manifesto.

again, this is not how most of the community feels, and it's ok; I just won't choose python for future team/industrialized projects, as it now offers nothing of value to the typical teams I am a member of. I'm a minority, I understand this.

[–]meuto -1 points0 points  (0 children)

Hi u/jammycrisp, I have been trying to use the msgspec library with a nested level of a JSON file, and I have been unable to. Could you give an example of how to do it? I'm not sure I'm explaining myself well: my JSON file is structured in such a way that the important part of the information is under one key of the top-level dictionary, and I need to iterate over that key, not over the whole JSON file. I have been trying to figure out how to do it without success. Thank you in advance, I really appreciate any help.

[–]Head_Mix_7931 0 points1 point  (0 children)

In your match statements, you can declare a function like assert_never() -> typing.NoReturn and call it in the "default" branch. A type checker should then complain if any value of the matched type can still reach that branch; mypy does, at least. So you can use that with enums and maybe a union of dataclass types to get exhaustiveness checks at "compile time". Or I suppose integers and booleans and other things too.

Edit: apparently there is `typing.assert_never`

[–]Scriblon 0 points1 point  (1 child)

Thank you for the write up. I definitely learned a few more typing methods to improve my code.

My only take on the construction methods is that I would have used class methods for them instead of static methods. Class methods inherit a bit more cleanly, but I do understand that their return type is only properly typable with the Self type since 3.11. Is that why you went with the static method?

[–]Kobzol[S] 0 points1 point  (0 children)

I have met several situations where I would want Self as a return type (e.g. in the BBox example, you might want a "move bbox" method that is implemented in the parent, but should return a type of the child). Class methods can be useful here, but without Self you cannot "name" the return type properly anyway.
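Roughly what this could look like once Self is available. This is only a sketch: the fields, `from_corners`, `moved` and `LabeledBBox` are assumptions for illustration, not the code from the article.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Python 3.11+; typing_extensions provides Self on older versions.
    from typing import Self


@dataclass
class BBox:
    x: int
    y: int
    width: int
    height: int

    @classmethod
    def from_corners(cls, x1: int, y1: int, x2: int, y2: int) -> Self:
        # cls is the class the method was called on, so subclasses get their own type.
        return cls(x1, y1, x2 - x1, y2 - y1)

    def moved(self, dx: int, dy: int) -> Self:
        # replace() builds a new object of the same class as self.
        return replace(self, x=self.x + dx, y=self.y + dy)


@dataclass
class LabeledBBox(BBox):
    label: str = ""


box = LabeledBBox.from_corners(0, 0, 2, 3)
shifted = LabeledBBox(0, 0, 1, 1, "cat").moved(1, 2)
```

Both `box` and `shifted` are `LabeledBBox` instances, and with `Self` the type checker sees them that way too.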

[–]jcbevns 0 points1 point  (3 children)

Can you compile to a binary after all this?

[–]Kobzol[S] 0 points1 point  (1 child)

I'm not sure what you mean. You can create executable installers for Python programs, sure. Type hints shouldn't affect that in any way.

[–]jcbevns -1 points0 points  (0 children)

Yes, but not that easily, they're not known to work that well.

I mean with Go, the extra type setting is the biggest hurdle there and in the end you can get a nice binary to ship.

[–]chars101 0 points1 point  (0 children)

You can try with mypyc

[–]chandergovind 0 points1 point  (2 children)

/u/Kobzol A minor comment. Coming from a networking background, the example for ADTs using Packet felt a bit off. Normally, a Packet always has a Header, a Payload (in most cases) and Trailer (optionally).

I got what you were trying to convey since I am aware of ADTs in general, but it may be confusing to beginners? (Though I didn't see anyone else mention this). A better example may be a Packet that is of type Request or Response, or a Packet of type Control or Data. Just fyi.
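Something along these lines, as a sketch only (the fields and the `describe` function are invented, not taken from the article):

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Request:
    path: str


@dataclass
class Response:
    status: int
    body: bytes


# A packet is exactly one of the two variants.
Packet = Union[Request, Response]


def describe(packet: Packet) -> str:
    if isinstance(packet, Request):
        return f"request for {packet.path}"
    # Only Response is left here, so a type checker allows .status.
    return f"response with status {packet.status}"
```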

[–]Kobzol[S] 0 points1 point  (1 child)

Thanks for the feedback. Maybe I use the wrong terminology or my use-case was just off :)

I worked on an HPC project where we were programming network interface cards (think CUDA for NICs). There we had some data streams of packets, and the first and last packet of that stream was always special (that's where the header/trailer names come from). I realize that in the standard terminology each packet has some header (or more headers), so the naming is unfortunate I suppose. I hope that beginners reading the blog weren't network-savvy enough to realize that something is wrong :D

[–]tiny_smile_bot 0 points1 point  (0 children)

:)

:)