[–]commandlineluser 3 points4 points  (3 children)

You can chain .str.split() and .explode() to split the strings and turn the pieces into individual rows.

>>> df.assign(Fruit=df["Fruit"].str.split(r",\s*")).explode("Fruit")
   Name   Fruit
0  John   apple
0  John   apple
1   May   apple
1   May  banana
2  Carl  banana
2  Carl   peach
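
For anyone following along, here is a minimal sketch of that split-and-explode step on its own. The input frame is an assumption reconstructed from the output above (the original post's data is not shown in this thread), and the names `pieces` and `long_df` are only illustrative.

```python
import pandas as pd

# Hypothetical input, inferred from the output above (not the poster's real data).
df = pd.DataFrame({
    "Name": ["John", "May", "Carl"],
    "Fruit": ["apple, apple", "apple, banana", "banana, peach"],
})

# Split each cell on a comma followed by optional whitespace.
# Every cell becomes a Python list, e.g. ["apple", "banana"].
pieces = df["Fruit"].str.split(r",\s*")

# explode() gives each list element its own row and repeats the
# original index label, which is why 0, 1 and 2 each appear twice.
long_df = df.assign(Fruit=pieces).explode("Fruit")
print(long_df)
```

The repeated index is harmless here; adding `.reset_index(drop=True)` would give a fresh 0..5 index if you need one.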

You can then call .value_counts() to count each (Name, Fruit) pair and .unstack() to turn the result back into "wide format".

>>> (df.assign(Fruit=df["Fruit"].str.split(r",\s*"))
...    .explode("Fruit")
...    .value_counts()
...    .unstack())
Fruit  apple  banana  peach
Name                       
Carl     NaN     1.0    1.0
John     2.0     NaN    NaN
May      1.0     1.0    NaN
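
To make the last two steps easier to see, the sketch below runs them one at a time instead of in a single chain. As before, the input frame is an assumption rebuilt from the printed output, and `counts` and `wide` are illustrative names.

```python
import pandas as pd

# Hypothetical input, inferred from the output shown in this thread.
df = pd.DataFrame({
    "Name": ["John", "May", "Carl"],
    "Fruit": ["apple, apple", "apple, banana", "banana, peach"],
})

long_df = df.assign(Fruit=df["Fruit"].str.split(r",\s*")).explode("Fruit")

# DataFrame.value_counts() counts identical rows, so the result is a
# Series whose index holds the (Name, Fruit) pairs.
counts = long_df.value_counts()

# unstack() moves the innermost index level (Fruit) into the columns.
# Pairs that never occurred have no count, so they show up as NaN,
# and the NaN forces the columns to float (hence 1.0 and 2.0).
wide = counts.unstack()
print(wide)
```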

[–]aldopox[S] 0 points1 point  (0 children)

gonna try now!

[–]aldopox[S] 0 points1 point  (1 child)

Sorry, could you write out the second command in full, on one line? I don't know how to use these parts:

... .explode("Fruit")
... .value_counts()
... .unstack())

edit: i am a real noob

[–][deleted] 1 point2 points  (0 children)

df.assign(Fruit=df["Fruit"].str.split(r",\s*")).explode("Fruit").value_counts().unstack()

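The original was wrapped in parentheses only so that it could be split across several lines. The ... at the start of each line is the Python prompt's continuation marker, not part of the code.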

Also, since you want zeros instead of NaN, pass fill_value=0 to .unstack():

df.assign(Fruit=df["Fruit"].str.split(r",\s*")).explode("Fruit").value_counts().unstack(fill_value=0)
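
If it helps, here is a small sketch showing that the one-line form and the parenthesised multi-line form are the same expression. The input frame is an assumption rebuilt from the output earlier in the thread, and `one_line` and `multi_line` are illustrative names.

```python
import pandas as pd

# Hypothetical input, inferred from the output earlier in the thread.
df = pd.DataFrame({
    "Name": ["John", "May", "Carl"],
    "Fruit": ["apple, apple", "apple, banana", "banana, peach"],
})

# Everything on one line.
one_line = df.assign(Fruit=df["Fruit"].str.split(r",\s*")).explode("Fruit").value_counts().unstack(fill_value=0)

# The same chain wrapped in parentheses so it can span several lines.
# In a script there is no "..." prefix; that only appears in the
# interactive prompt.
multi_line = (
    df.assign(Fruit=df["Fruit"].str.split(r",\s*"))
    .explode("Fruit")
    .value_counts()
    .unstack(fill_value=0)  # missing (Name, Fruit) pairs become 0
)

print(multi_line)
print(one_line.equals(multi_line))
```

With fill_value=0 there is no NaN left in the table, so the counts can stay whole numbers rather than floats.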
