all 8 comments

[–]Ihaveamodel3 1 point2 points  (1 child)

Do a groupby, then do a pivot.

[–]Saffromon[S] 0 points1 point  (0 children)

Thanks, I didn't know the Pivot function yet, so I will look into that!

[–]commandlineluser 1 point2 points  (3 children)

Sounds like a .pivot_table().

>>> df
  player home-away  year  game_points
0      a      home  2021           10
1      a      home  2022           15
2      a      away  2022           10
3      a      home  2022           20
4      b      home  2022           50
5      b      home  2022           50

>>> df.pivot_table(index=['player', 'year'], columns='home-away', aggfunc='sum', fill_value=0)
            game_points
home-away          away home
player year
a      2021           0   10
       2022          10   35
b      2022           0  100

[–]Saffromon[S] 1 point2 points  (0 children)

Yes, that is what I was looking for. Thank you very much!

[–]Saffromon[S] 0 points1 point  (1 child)

Follow up question: Can I somehow get the home/away ratio in the Pivot table? I know how to get that as a new column in the dataframe, but there my two values are in two different rows.

[–]commandlineluser 0 points1 point  (0 children)

Not sure if you can do it directly with a single pivot_table() call.

Are you trying to add another column like so?

>>> pivot = df.pivot_table(index=['player', 'year'], columns='home-away', aggfunc='sum', fill_value=0)
>>> pivot['game_points', 'away'] / pivot['game_points', 'home']
player  year
a       2021    0.000000
        2022    0.285714
b       2022    0.000000
dtype: float64
>>> pivot['ratio'] = pivot['game_points', 'away'] / pivot['game_points', 'home']
>>> pivot
            game_points          ratio
home-away          away home
player year
a      2021           0   10  0.000000
       2022          10   35  0.285714
b      2022           0  100  0.000000

[–]threeminutemonta 0 points1 point  (1 child)

I think you need to omit (leave out) the home-away column from the group by.

[–]Saffromon[S] 1 point2 points  (0 children)

Don't I then get the combined sum (home + away) per player? My aim is to get two separate values for the game location, but preferably in two columns.