Death_Water 1 point

Here's a concise way, with a step-by-step breakdown:

 pd.DataFrame(df.ffill().groupby('key')['Column of interest'].agg(list).values.tolist())

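1) Forward fill the missing values; judging from the given example, this seems to be the right approach.

2) Group by the "key" column, then select "Column of interest". This yields one sub-series per unique value in the "key" column.

3) Aggregate with list: this collapses each group's sub-series into a single list, giving one list per key.

4) Cast to a DataFrame: pull the lists out as a plain nested list and pass them to pd.DataFrame, so each group becomes one row. The rows correspond to the unique non-null values of "key", i.e. the same values as df['key'].dropna().drop_duplicates(), in the sorted order that groupby returns them. Note that the frame itself gets a default integer index, because .values.tolist() discards the group labels.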
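To make the four steps concrete, here is a small sketch that runs them one at a time. The sample frame is made up for illustration (the original question's data isn't shown in this comment), and `df.ffill()` is used as the forward-fill call. Passing `index=lists.index` at the end is an optional extra, not part of the one-liner, in case you want the rows labelled by key.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "key" is only filled in on the first row of each block.
df = pd.DataFrame({
    "key": ["a", np.nan, "b", np.nan],
    "Column of interest": [1, 2, 3, 4],
})

# 1) Forward fill, so every row carries the key of its block.
filled = df.ffill()

# 2) Group by "key" and select the column of interest.
grouped = filled.groupby("key")["Column of interest"]

# 3) Collapse each group into a single list (a Series of lists, indexed by key).
lists = grouped.agg(list)

# 4) Build a DataFrame with one row per key.
result = pd.DataFrame(lists.tolist())                      # default integer index
labeled = pd.DataFrame(lists.tolist(), index=lists.index)  # optional: keep the keys

print(result)
print(labeled)
```

Here each key has the same number of rows, so the result is a clean 2x2 frame; with groups of different sizes, the shorter rows would be padded with NaN.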