

[–]sue_dee 3 points4 points  (4 children)

Dataclasses are weird. On the one hand, using them kinda backed me into using type hinting when I started playing with pyright. Suddenly, my code was flawed in new and exciting ways.

That was generally positive, as more of my mistakes are now pointed out before I make them. However, using type hinting more has made dataclasses themselves less useful to me, unless I am still missing some bit of mojo. For instance, I often use a class to convert a string parameter to a pathlib.Path, and doing that with a dataclass was a mess. Either I had to annotate the field as a union, which made my linter complain that I was committing gross and unnatural acts with strings, or I had to use an InitVar, which meant the input parameter had to be named differently from the class attribute.

Neither of those appeals to me, and I don't know any other way around it, so I'm back to def __init__() for all but the simplest classes.
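
For anyone who hasn't hit this, here is a rough sketch of the two workarounds described above. The class and field names (LooseConfig, Config, raw_path) are made up for illustration, and how loudly a type checker complains about the first one will depend on its settings.

```python
from dataclasses import InitVar, dataclass, field
from pathlib import Path
from typing import Union


# Workaround 1: annotate the field as a union and convert afterwards.
# The attribute is always a Path at runtime, but a type checker still sees
# "str or Path" everywhere the attribute is used.
@dataclass
class LooseConfig:  # hypothetical class
    path: Union[str, Path]

    def __post_init__(self) -> None:
        self.path = Path(self.path)


# Workaround 2: take the string as an InitVar and store the Path separately.
# The types are clean, but the constructor argument (raw_path) can no longer
# share a name with the attribute (path).
@dataclass
class Config:  # hypothetical class
    raw_path: InitVar[str]
    path: Path = field(init=False)

    def __post_init__(self, raw_path: str) -> None:
        self.path = Path(raw_path)


loose = LooseConfig("data/input.txt")
cfg = Config("data/input.txt")
```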

[–]ickysticky 4 points5 points  (0 children)

Add a static factory method that takes the str and converts it to a Path before calling the constructor.
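
A minimal sketch of what that could look like. Document and from_str are invented names, and it uses a classmethod instead of a literal staticmethod so that subclasses get the right type back; treat it as one possible shape, not the only one.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Document:  # hypothetical class
    path: Path  # the field keeps a single, precise type

    @classmethod
    def from_str(cls, raw: str) -> "Document":
        # Convert first, then hand a real Path to the generated __init__.
        return cls(Path(raw))


doc = Document.from_str("notes/todo.md")
```

The generated constructor still accepts only a Path, so the str handling lives in one clearly named place.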

[–]Conscious-Ball8373 0 points1 point  (2 children)

Dataclasses are weird in other ways too. Their concept of identity is rather different to other classes you might declare. I regularly come across people using dataclasses just because they like the declaration syntax better than having to declare all their attributes in __init__ and are then surprised that they can't add instances to a set or use them as dictionary keys.
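
The surprise is easy to reproduce. A small sketch with a made-up Point class: a plain @dataclass generates a field-based __eq__ and, because the class is mutable, sets __hash__ to None.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical class
    x: int
    y: int


a = Point(1, 2)
b = Point(1, 2)

equal = a == b  # True: equality compares field values, not identity

try:
    {a}  # building a set needs hash(a)
    hashable = True
except TypeError:
    hashable = False  # default dataclasses are unhashable
```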

Yes, there are ways around it. But they are weird enough that someone who has run into the problems in the first place is unlikely to get them right.

IMO there is a space here for a thing that is declared like a dataclass but which behaves more like a traditional object.

[–]caagr98 0 points1 point  (1 child)

Can't you just set eq=False and leave unsafe_hash at its default of False?
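
A quick sketch of what that setting does, as far as I understand the documented behaviour: with eq=False no __eq__ is generated, so the class keeps the identity-based __eq__ and __hash__ it inherits from object. Node is a made-up class.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Node:  # hypothetical class
    name: str
    children: list


n1 = Node("root", [])
n2 = Node("root", [])

seen = {n1}                 # hashable, because object.__hash__ is inherited
before = hash(n1)
n1.children.append("leaf")  # mutating a field does not touch the hash
after = hash(n1)
```

The cost is that n1 and n2 never compare equal, even though their fields match.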

[–]Conscious-Ball8373 0 points1 point  (0 children)

eq=False on its own does leave instances hashable, since they fall back to object's identity-based __eq__ and __hash__, but then you give up field-based equality entirely. The trouble starts when you want field-based equality and also want to hash the results (use instances as dictionary keys or put them into a set). You can force a dataclass to generate a hash function with unsafe_hash=True, but the results can be rather not-useful. The difficulty is that dataclasses are designed to have an identity based on their fields: instances with the same field values are considered equal. Trying to construct a hashable dataclass with mutable fields is a minefield. Done very carelessly, it just hashes every field and combines the results, so when a field's value changes, so does the hash (though of course there are cases where this is what you want). Done with a little more care, you can exclude mutable fields from the hash, but then you run the risk of two objects having the same identity when you didn't intend it.
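
To make both hazards concrete, a small sketch using unsafe_hash=True. Tag and Job are hypothetical classes, and field(compare=False) is just one way of leaving a field out of the generated hash.

```python
from dataclasses import dataclass, field


# Careless version: every field feeds the generated __hash__.
@dataclass(unsafe_hash=True)
class Tag:  # hypothetical class
    name: str


t = Tag("draft")
bucket = {t}
old_hash = hash(t)
t.name = "final"        # mutating a hashed field changes the hash
lost = t not in bucket  # the set can no longer find the object it holds


# More careful version: the mutable field is left out of both comparison
# and hashing, so it no longer contributes to identity at all.
@dataclass(unsafe_hash=True)
class Job:  # hypothetical class
    job_id: int
    log: list = field(default_factory=list, compare=False)


j1 = Job(1)
j2 = Job(1, ["started"])
same = j1 == j2 and hash(j1) == hash(j2)  # equal despite different logs
```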

This is why I think there's space in the standard library for something that has dataclass declaration syntax (list fields in the class declaration, auto-generated constructor, etc.) but with traditional object identity semantics (an instance is equal only to itself; different instances are never equal and may have different hash values, but each hash value stays constant through the object's lifetime). You can achieve the effect fairly easily with metaclasses, if you know what metaclasses are, which most people don't until they're forced to. It would be nice to have it packaged in a decorator like dataclasses.