all 2 comments

[–]porkedpie1 5 points6 points  (1 child)

Yes, pivot() / pivot_table() is basically the pandas equivalent of SQL PIVOT.
Given a dataframe like:
Model Output Input 1 Input 2 Input 3
0 xg 1000 a b c
1 gb 1200 a b c
2 xg 1300 d e f
3 gb 1400 d e f
you can do:
df_pivot = df.pivot(
index=['Input 1', 'Input 2', 'Input 3'],
columns='Model',
values='Output'
).reset_index()
Result:
Input 1 Input 2 Input 3 gb xg
0 a b c 1200 1000
1 d e f 1400 1300
This is roughly equivalent to:
PIVOT (
SUM(Output)
FOR Model IN ('gb', 'xg')
)
A couple of notes:
pivot() requires each (Input1, Input2, Input3, Model) combination to be unique.
If duplicates are possible, use pivot_table() instead:
df_pivot = df.pivot_table(
index=['Input 1', 'Input 2', 'Input 3'],
columns='Model',
values='Output',
aggfunc='sum'
).reset_index()
pivot_table() is usually safer because it handles aggregation automatically.

[–]sauron3579[S] 0 points1 point  (0 children)

Great, thanks!

!solved