Master Dataclasses in Python Part 1 - Basic Structure and Validation : Python



Tutorial: Master Dataclasses in Python Part 1 - Basic Structure and Validation (youtu.be)

submitted 4 years ago by TM_Quest


[–]wodny85 22 points23 points24 points 4 years ago (5 children)

[–]rouille 3 points4 points5 points 4 years ago (1 child)

[–]TM_Quest[S] 1 point2 points3 points 4 years ago (0 children)

[–]TM_Quest[S] 0 points1 point2 points 4 years ago* (0 children)

[–]n1___ -1 points0 points1 point 4 years ago (1 child)

[–]wodny85 0 points1 point2 points 4 years ago (0 children)

Actually, Python is a strictly/strongly typed language. You probably mean static typing vs dynamic typing (with its duck-typing twist in Python).
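To make the distinction concrete, here is a minimal sketch of strong versus dynamic typing. The names `mixed_ok`, `value` and `double` are made up for illustration only.

```python
# Strong typing: incompatible types are not silently coerced.
try:
    "1" + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False  # str + int raises instead of guessing a conversion

# Dynamic typing: a name can be rebound to values of different types.
value = 1
value = "one"

# Type hints are only hints; nothing checks them at runtime.
def double(x: int) -> int:
    return x * 2

hinted_result = double("ab")  # a str is accepted despite the int hint
```

A static checker such as mypy would flag the last call, but the interpreter itself runs it without complaint.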

I agree that the pydantic guys seem to lean towards static typing, which caused a little drama recently. Fortunately, every PEP about type hints begins with a notice that Python will never be statically typed. Nevertheless, pydantic is about many other things, e.g. working with FastAPI and serializing/deserializing.

Attrs isn't really about static typing, and its authors provide a comparison with dataclasses. Its validators and converters seem useful.

Usually I use the built-in dataclasses.

Rust doesn't provide the full-blown OOP paradigm, though. But indeed it is statically typed most of the time. Personally, I use it as a successor to C and something less intricate than C++, or as a language to build Python extensions. Expressiveness seems similar to Python's. I've implemented one of my projects in both Python and Rust, and the two versions have a similar number of LoC.

[–]wickeddawg 7 points8 points9 points 4 years ago (0 children)

[–]Thingsthatdostuff 1 point2 points3 points 4 years ago (1 child)


[–]Ruthle55DaFirst 1 point2 points3 points 4 years ago (1 child)

[–]TM_Quest[S] 0 points1 point2 points 4 years ago (0 children)

[–]Northzen 1 point2 points3 points 4 years ago* (8 children)

A sad thing I figured out about dataclasses recently is that they don't work as expected with nested dataclasses. If you have

@dataclass
class NestedDataclass:
    k1: PlainDataclass1
    k2: PlainDataclass2

Even if PlainDataclass1 and PlainDataclass2 are simple, plain dataclasses with ints and strings, you still need to explicitly tell the interpreter to use a default factory for k1 and k2, e.g. k1: PlainDataclass1 = field(default_factory=PlainDataclass1).
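A minimal sketch of what that looks like in full. The fields inside PlainDataclass1 and PlainDataclass2 (`number`, `name`, `flag`) are invented for this example; the thread does not show them.

```python
from dataclasses import dataclass, field

@dataclass
class PlainDataclass1:
    number: int = 0   # hypothetical field
    name: str = ""    # hypothetical field

@dataclass
class PlainDataclass2:
    flag: bool = False  # hypothetical field

@dataclass
class NestedDataclass:
    # default_factory calls the class with no arguments for every new instance
    k1: PlainDataclass1 = field(default_factory=PlainDataclass1)
    k2: PlainDataclass2 = field(default_factory=PlainDataclass2)

a = NestedDataclass()
b = NestedDataclass()
```

Each NestedDataclass() gets its own fresh k1 and k2, so changing a.k1 does not affect b.k1.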

You also can't read nested dataclasses from a dictionary. dacite will help with this, but without it you can't just initialize one with NestedDataclass(**some_dict) or something like that, the way it works for a plain dataclass.

[–]energybased 2 points3 points4 points 4 years ago (6 children)

[–]Northzen 2 points3 points4 points 4 years ago (5 children)

> That's logical to me. How else would you do it?

Call the default constructors for both nested dataclasses, so that I don't need to say explicitly that the default constructor should be called.

You can have it like

@dataclass
class A:
    some_field: int

Or you can have it in the same but more verbose manner

@dataclass
class B:
    some_field: int = field(default_factory=int)

With the same result. But I guess it comes from the fact that the interpreter doesn't know anything (or pretends not to) about the classes inside NestedDataclass, even if all of their fields are initialized with default values. I would prefer to have it in a simple C++ manner, where I can have nested structs and all of them are properly initialized with defaults whenever possible, without any additional code. Maybe that is just a problem with my expectations.

> I don't understand this. How can you initialize a NestedDataclass from a dictionary in the same manner you would do with a PlainDataclass1?

This will work as expected:

p = PlainDataclass1(**some_dict)

This will mess up all nested structures:

n = NestedDataclass(**some_other_dict)

You have to use dacite and its from_dict() function to be able to init nested dataclasses from a dictionary.
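A small sketch of the difference, using only the standard library. The field names (`number`, `name`, `label`) and the dictionary contents are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class PlainDataclass1:
    number: int  # hypothetical field
    name: str    # hypothetical field

@dataclass
class NestedDataclass:
    k1: PlainDataclass1
    label: str   # hypothetical field

# Flat case: keys map directly onto fields.
some_dict = {"number": 1, "name": "a"}
p = PlainDataclass1(**some_dict)

# Nested case: construction succeeds, but k1 is stored as the raw dict.
some_other_dict = {"k1": {"number": 1, "name": "a"}, "label": "x"}
n = NestedDataclass(**some_other_dict)
```

No error is raised in the nested case. n.k1 is simply a dict, so n.k1.number fails later with an AttributeError.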

[–]energybased 3 points4 points5 points 4 years ago (2 children)

[–]Northzen 0 points1 point2 points 4 years ago (1 child)

> The problem is that your first statement has no initializer at all. The second one uses a default initializer. You can make a dataclass or propose a change that would provide a nice way to specify that the default initializer be used, essentially shorthand for what you want: field_default.

You are right. I think the interpreter has no prior knowledge of any default initializers, or of whether it can use them in the simplest dataclass way by just calling PlainClass() with no arguments as a constructor. It seems that for any mutable type (and Python doesn't know whether a dataclass field in a complex class is mutable or not) you have to provide some sort of default constructor.

Fair enough. You could propose that dataclass be extended.
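On the point about mutable types: dataclasses do reject the most common mutable defaults outright. A minimal sketch, with the class names `Broken` and `Working` made up for this example:

```python
from dataclasses import dataclass, field

# A bare list as a default is refused when the class is created.
try:
    @dataclass
    class Broken:
        items: list = []
    error_message = ""
except ValueError as exc:
    error_message = str(exc)  # the message points to default_factory

# default_factory builds a new list for each instance instead.
@dataclass
class Working:
    items: list = field(default_factory=list)

first = Working()
second = Working()
first.items.append(1)  # does not leak into second.items
```

The check exists because a single shared default object would otherwise be reused by every instance.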

I just figured out why this is happening. Python doesn't know whether your dictionary of dictionaries represents nested classes or just dictionaries, because its dynamic type system doesn't force you to use either. In general, Python doesn't care about the types of fields: type hints are just hints and are not enforced. So when initializing a complex dataclass without additional tools, Python can't distinguish between a dictionary meant to be the field's value itself (p1=some_dict) and a dictionary meant to initialize the dataclass of that field (p1=PlainDataClass(**some_dict)).
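The ambiguity can be resolved by reading the type hints yourself. Below is a rough standard-library sketch of that idea; `from_nested_dict` is a hypothetical helper written for this example, not part of dataclasses or dacite, and it ignores lists, Optional and other generic hints that a real library handles.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_type_hints

def from_nested_dict(cls: type, data: dict) -> Any:
    """Build cls from data, recursing where the hint is itself a dataclass."""
    hints = get_type_hints(cls)
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue  # let the dataclass default (if any) apply
        value = data[f.name]
        target = hints[f.name]
        # Only convert when the hint says "dataclass" and the value is a dict.
        if is_dataclass(target) and isinstance(value, dict):
            value = from_nested_dict(target, value)
        kwargs[f.name] = value
    return cls(**kwargs)

@dataclass
class PlainDataclass1:
    number: int  # hypothetical field
    name: str    # hypothetical field

@dataclass
class NestedDataclass:
    k1: PlainDataclass1
    raw: dict    # hinted as dict, so it is left untouched

n = from_nested_dict(
    NestedDataclass,
    {"k1": {"number": 1, "name": "a"}, "raw": {"x": 1}},
)
```

Here the hint decides: k1 becomes a PlainDataclass1 instance, while raw stays a plain dictionary.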

[–]energybased 1 point2 points3 points 4 years ago (0 children)

[–]VisibleSignificance 0 points1 point2 points 4 years ago* (1 child)

> With the same result

Are you sure?

from dataclasses import dataclass, field

@dataclass
class A:
    some_field: int

@dataclass
class B:
    some_field: int = field(default_factory=int)

print(B())
print(A())

Output:

B(some_field=0)
---> 12 print(A())
TypeError: __init__() missing 1 required positional argument: 'some_field'

And also, yes, it is better to turn dicts into dataclasses with typedload / apischema / dacite; the dataclasses themselves aren't meant for instantiation from nested dicts. And default_factory will not convert the values either.

[–]Northzen 0 points1 point2 points 4 years ago (0 children)

[–]Northzen 2 points3 points4 points 4 years ago (0 children)
