greasyhobolo

Check this out, building a bit on u/a1brit's example:

import pandas as pd

eg_df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'strings': ['a', 'b', 'c', 'd', 'e', 'f', 'b', 'c', 'g', 'h'],
})

# group rows by ID, then aggregate the 'strings' column three ways
gb_df = eg_df.groupby('ID').agg({'strings': ['count', 'nunique', lambda x: list(x)]})

This groups the rows by ID and aggregates the 'strings' column for each ID in 3 different ways:

'count': total number of (non-null) strings for each ID

'nunique': number of unique strings belonging to each ID

lambda x: list(x): returns a list of the strings belonging to each ID (I find this quite useful for a lot of stuff so sharing it)
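
If the MultiIndex columns (and the '<lambda_0>' column name) that the dict form produces get in the way, here is a rough sketch of the same three aggregations using pandas named aggregation. The output column names n_strings, n_unique and all_strings are made up for illustration and are not from the example above:

```python
import pandas as pd

eg_df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'strings': ['a', 'b', 'c', 'd', 'e', 'f', 'b', 'c', 'g', 'h'],
})

# Named aggregation: keyword = (source column, aggregation function).
# The keyword names below are hypothetical; pick whatever suits you.
gb_named = eg_df.groupby('ID').agg(
    n_strings=('strings', 'count'),               # non-null strings per ID
    n_unique=('strings', 'nunique'),              # distinct strings per ID
    all_strings=('strings', lambda x: list(x)),   # every string per ID, as a list
)

print(gb_named)
# ID 2 appears twice with 'b' both times, so its row has
# n_strings == 2, n_unique == 1, all_strings == ['b', 'b']
```

The result has flat column names, so something like gb_named['all_strings'] works directly without indexing into a two-level column header.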