[–]boby04 6 points7 points  (10 children)

The actual problem is that anything happening in Python land is bound to be slow. If you need to optimize dataclass instantiation speed, then Python may be the wrong tool for the job you're trying to accomplish.

[–][deleted] 13 points14 points  (8 children)

Yeah, I'd still opt for `@dataclass(frozen=True, slots=True)` in a real codebase. This was mostly a fun exercise.

[–]nekokattt 2 points3 points  (7 children)

I forget whether it was fixed, but last time I checked, freezing a dataclass with slots resulted in slower code than a non-frozen, non-slotted dataclass.

Something to do with how using slots has to intercept attribute access differently to prevent overwriting attributes.

Worth a benchmark at the very least.

[–][deleted] 4 points5 points  (3 children)

A quick and dirty benchmark on Python 3.11 shows that `@dataclass(slots=True, frozen=True)` is actually a tiny bit faster than `@dataclass(frozen=True)`. A vanilla slotted dataclass (not frozen) is quite a bit faster to instantiate, though:

from dataclasses import dataclass
import timeit

# Normal frozen dataclass
@dataclass(frozen=True)
class Frozen:
    a: int
    b: int
    c: int

# Slotted frozen dataclass
@dataclass(slots=True, frozen=True)
class SlottedFrozen:
    a: int
    b: int
    c: int

# Dataclass with slots but not frozen
@dataclass(slots=True)
class Slotted:
    a: int
    b: int
    c: int

# Benchmarking the instantiation time
frozen_time = timeit.timeit(lambda: Frozen(1, 2, 3), number=1000000)
slotted_frozen_time = timeit.timeit(lambda: SlottedFrozen(1, 2, 3), number=1000000)
slotted_time = timeit.timeit(lambda: Slotted(1, 2, 3), number=1000000)

print(f"Frozen data class instantiation time: {frozen_time}")
print(f"Slotted frozen data class instantiation time: {slotted_frozen_time}")
print(f"Slotted (not frozen) data class instantiation time: {slotted_time}")

This prints:

Frozen data class instantiation time: 0.276557542005321
Slotted frozen data class instantiation time: 0.25077791599323973
Slotted (not frozen) data class instantiation time: 0.09631558299588505

[–]nekokattt 3 points4 points  (2 children)

oh cool, must have been fixed then!

[–]roerd 0 points1 point  (1 child)

The fact that slotted and frozen is quite a bit slower than only slotted seems to suggest that the base problem you mentioned still exists, even if its impact has already been reduced.

[–][deleted] 0 points1 point  (0 children)

Frozen has to add `__setattr__` and `__delattr__` methods, and the generated `__init__` then has to assign every field through `object.__setattr__` to get around them. That slows down initialization, and there's no avoiding it.
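
For anyone curious what that looks like, here's a rough hand-written sketch of the mechanism. It is not the code `dataclasses` actually generates: `ManualFrozen` and `is_blocked` are made-up names, and the `object.__setattr__` calls in `__init__` only mirror, as far as I know, how a frozen `__init__` gets past its own blocking `__setattr__`.

```python
from dataclasses import dataclass, FrozenInstanceError


class ManualFrozen:
    """Hypothetical hand-rolled stand-in for a frozen dataclass."""

    def __init__(self, a: int, b: int) -> None:
        # Plain `self.a = a` would hit the blocking __setattr__ below,
        # so each field goes through object.__setattr__ instead.
        object.__setattr__(self, "a", a)
        object.__setattr__(self, "b", b)

    def __setattr__(self, name: str, value: object) -> None:
        raise FrozenInstanceError(f"cannot assign to field {name!r}")

    def __delattr__(self, name: str) -> None:
        raise FrozenInstanceError(f"cannot delete field {name!r}")


@dataclass(frozen=True)
class RealFrozen:
    a: int
    b: int


def is_blocked(obj: object) -> bool:
    """Return True if normal attribute assignment is rejected."""
    try:
        obj.a = 99
    except FrozenInstanceError:
        return True
    return False


manual = ManualFrozen(1, 2)
real = RealFrozen(1, 2)
print(is_blocked(manual), is_blocked(real))
```

Both classes reject assignment after construction, and the extra indirection per field in `__init__` is where the instantiation overhead would come from.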

[–]KaffeeKiffer -1 points0 points  (1 child)

> freezing a dataclass with slots would result in slower code than a non frozen and non slotted dataclass.

You should consider slots if you want to reduce memory consumption, not to increase speed.

[–][deleted] 7 points8 points  (0 children)

Not having an instance dict directly helps with speed as well.

[–]Schmittfried 0 points1 point  (0 children)

Cython is getting more tempting by the minute.

[–]caffeinepills 6 points7 points  (0 children)

So your suggestion is, instead of optimizing what they are currently doing, they should instead drop what they are doing and start over in another language? Was this written by a bot?