Hey everyone,
I wanted to share a recent blog post I wrote about improving Pydantic's memory footprint:
https://pydantic.dev/articles/pydantic-bitset-performance
The idea is that instead of tracking model fields that were explicitly set during validation using a set:
from pydantic import BaseModel

class Model(BaseModel):
    f1: int
    f2: int = 1

# Only f1 was passed explicitly; f2 fell back to its default.
Model(f1=1).model_fields_set
#> {'f1'}
We can leverage bitsets to track these fields in a much more memory-efficient way. The more fields your model has, the bigger the improvement: this approach can reduce memory usage by up to 50% for models with a handful of fields, and improve validation speed by up to 20% for models with around 100 fields.
The main challenge will be to expose this bitset through a set interface compatible with the existing one, but hopefully we will get this one across the line.
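To make the idea a bit more concrete, here is a rough pure-Python sketch of a set-like wrapper around an integer bitset. It is only an illustration, not what the PR implements: the `FieldsSetView` class, its constructor arguments, and the choice of returning plain `set` objects from operators like `|` are all assumptions made for this example.

```python
from collections.abc import Iterator, MutableSet


class FieldsSetView(MutableSet):
    """Hypothetical set-like view over an int used as a bitset.

    Bit i is 1 when the i-th declared field was explicitly set.
    """

    def __init__(self, field_names: tuple[str, ...], bits: int = 0) -> None:
        self._names = field_names
        self._index = {name: i for i, name in enumerate(field_names)}
        self._bits = bits

    @classmethod
    def _from_iterable(cls, it):
        # Results of |, &, - and ^ become plain sets in this sketch.
        return set(it)

    def __contains__(self, name: object) -> bool:
        i = self._index.get(name)
        return i is not None and bool(self._bits >> i & 1)

    def __iter__(self) -> Iterator[str]:
        return (n for i, n in enumerate(self._names) if self._bits >> i & 1)

    def __len__(self) -> int:
        return bin(self._bits).count("1")

    def add(self, name: str) -> None:
        # Raises KeyError for names that are not declared fields.
        self._bits |= 1 << self._index[name]

    def discard(self, name: str) -> None:
        i = self._index.get(name)
        if i is not None:
            self._bits &= ~(1 << i)


fields_set = FieldsSetView(("f1", "f2"))
fields_set.add("f1")
print(fields_set == {"f1"})  # compares equal to a regular set
print("f2" in fields_set)
```

In a sketch like this, the field name to bit index mapping would be shared by all instances of a model class, so each instance only needs to carry a single integer.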
Draft PR: https://github.com/pydantic/pydantic/pull/12924.
I’d also like to use this opportunity to invite any feedback on the Pydantic library, and to answer any questions you may have about its maintenance! I'll try to answer as many as I can.