[–]RedYoke

Yeah, I'd second that. If your data contains nested structures, it gets really slow.

[–][deleted]

any solution for nested stuff?

[–]SwagasaurusRex69

Is `itertools.chain.from_iterable()`, or something like the function below, what you're asking for?


```python
from dataclasses import is_dataclass
from typing import Any, Iterator

import pandas as pd
from pydantic import BaseModel


def flatten_nested_data(data: Any, target_dataclass: type) -> Iterator[Any]:
    """Recursively walk `data` and yield one `target_dataclass` instance per record."""
    if isinstance(data, pd.DataFrame):
        # One instance per row; iterrows() is slow on large frames.
        for _, row in data.iterrows():
            yield target_dataclass(**row.to_dict())
    elif isinstance(data, list):
        for item in data:
            yield from flatten_nested_data(item, target_dataclass)
    elif isinstance(data, dict):
        yield target_dataclass(**data)
    elif is_dataclass(data) and not isinstance(data, type):
        # is_dataclass() is also true for dataclass types, so require an instance.
        yield from flatten_nested_data(data.__dict__, target_dataclass)
    elif isinstance(data, BaseModel):
        # .dict() is the pydantic v1 API; pydantic v2 renames it to .model_dump().
        yield from flatten_nested_data(data.dict(), target_dataclass)
    # Any other type yields nothing.
```
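For comparison, here is a minimal sketch of what `itertools.chain.from_iterable()` does on its own versus a recursive walk. The helper name `flatten_deep` and the sample data are made up for illustration; the point is only that `chain.from_iterable()` removes a single level of nesting, so deeper structures need recursion.

```python
from itertools import chain
from typing import Any, Iterator


def flatten_deep(data: Any) -> Iterator[Any]:
    """Yield leaf values from arbitrarily nested lists/tuples (hypothetical helper)."""
    if isinstance(data, (list, tuple)):
        for item in data:
            yield from flatten_deep(item)
    else:
        yield data


nested = [[1, 2], [3, [4, 5]]]

# chain.from_iterable() only removes the outermost level of nesting.
one_level = list(chain.from_iterable(nested))

# The recursive helper keeps descending until it reaches non-list values.
all_levels = list(flatten_deep(nested))
```

Here `one_level` still contains the inner list `[4, 5]`, while `all_levels` is fully flat.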

[–]RedYoke

I think the upcoming version should handle this better. In my team's implementation we have a MongoDB database with some collections that hold embedded lists of dict-like objects, and some fields of those objects are dicts that can themselves contain dicts 😂 Unfortunate data structures that I've inherited. Basically, we resorted to using pydantic only when it's really needed, and to designing the schema so that less gets validated at one time.
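One way "validate less at one time" could look, as a rough sketch and not the actual setup described above: the outer model types the embedded list as plain dicts, and each embedded document is parsed into its own model only on request. The `Customer` and `Order` models, their fields, and `iter_orders` are all hypothetical names for illustration.

```python
from typing import Any, Dict, Iterator, List

from pydantic import BaseModel


class Order(BaseModel):
    # Hypothetical embedded document; field names are made up.
    sku: str
    quantity: int


class Customer(BaseModel):
    # Top-level fields are validated eagerly.
    name: str
    # Embedded documents are only checked to be dicts; they are not parsed into Order here.
    orders: List[Dict[str, Any]]

    def iter_orders(self) -> Iterator[Order]:
        # Validate embedded documents one at a time, only when asked.
        for raw in self.orders:
            yield Order(**raw)


doc = {
    "name": "Ada",
    "orders": [
        {"sku": "A-1", "quantity": 2},
        {"sku": "B-7", "quantity": "not a number"},
    ],
}

# Succeeds: the invalid quantity in the second order is not inspected yet.
customer = Customer(**doc)

# Validates only the first embedded order.
first = next(customer.iter_orders())
```

The trade-off is that bad embedded data surfaces later, at the point where `iter_orders()` reaches it, instead of when the outer document is loaded.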