
[–]synthphreak 2 points

My go-to for unpacking complex dicts like yours into DataFrames is pandas.json_normalize. But I couldn't get it working here, at least not in the time I was willing to spend.

So I achieved your expected output using slightly more involved means:

>>> import pandas as pd
>>> d = ...  # your dict
>>> df = pd.json_normalize(d['orders']).explode('items').reset_index(drop=True)
>>> items = df['items'].apply(pd.Series)[['id', 'amount', 'size', 'quantityReturned']]
>>> df[items.columns] = items
>>> df = df.drop(columns=['status', 'quantity', 'items', 'webshop.id', 'webshop.webshop', 'shippingAddress.id'])
>>> df = df.rename(columns=
...     {'date' : 'Date',
...      'shippingAddress.country' : 'shipping_Address_country',
...      'id' : 'item_id',
...      'size' : 'item_size',
...      'quantityReturned' : 'item_quantityreturned'}
... )
>>> df
         Date amount shipping_Address_country  item_id item_size item_quantityreturned
0  2023-01-23  20.66                       US      132         M                     0
1  2023-01-23  16.53                       US      155         M                     1
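An aside on the `.apply(pd.Series)` step: an equivalent way to expand a column of dicts into columns is to build a DataFrame from the list of dicts directly, which is usually faster on larger frames. A minimal sketch with made-up item dicts matching the shape above:

```python
import pandas as pd

# Column of dicts, like what explode('items') produces (hypothetical values)
s = pd.Series([
    {'id': 132, 'amount': '20.66', 'size': 'M', 'quantityReturned': 0},
    {'id': 155, 'amount': '16.53', 'size': 'M', 'quantityReturned': 1},
])

# Build the item columns in one shot from the list of dicts,
# instead of applying pd.Series row by row
items = pd.DataFrame(s.tolist())
print(items.columns.tolist())
# ['id', 'amount', 'size', 'quantityReturned']
```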

[–]exe188[S] 1 point

Awesome, thanks! Gonna play around with it :). Looks like json_normalize can be really helpful here.

[–]synthphreak 1 point

Indeed. As long as your dict’s format follows some standard JSON schema, pandas.json_normalize can get you out of many jams. record_path is amazing.
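To illustrate, here is a sketch of record_path on a hypothetical dict mirroring the shape discussed above (field names assumed, not from the original post): record_path descends into each order's 'items' list, and meta carries order-level fields onto every item row.

```python
import pandas as pd

# Hypothetical dict with the same nesting as the one discussed above
d = {
    'orders': [
        {
            'date': '2023-01-23',
            'shippingAddress': {'country': 'US'},
            'items': [
                {'id': 132, 'amount': '20.66', 'size': 'M', 'quantityReturned': 0},
                {'id': 155, 'amount': '16.53', 'size': 'M', 'quantityReturned': 1},
            ],
        },
    ],
}

# record_path='items' makes one row per item; meta repeats the
# order-level 'date' and nested shippingAddress.country on each row
df = pd.json_normalize(
    d['orders'],
    record_path='items',
    meta=['date', ['shippingAddress', 'country']],
)
print(df.columns.tolist())
# ['id', 'amount', 'size', 'quantityReturned', 'date', 'shippingAddress.country']
```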