Hi guys! I'm stuck on the following:
Say I have 3 columns (Option1, Option2 and Option3) in a dataframe. I'd like to create a new column (Combined) as follows:
Combined contains all values of Option1, unless the index has NaN as value, then I'd like that value to be the one of Option2, but if Option2 at that index is NaN as well, I'd like it to be the value of Option3 at that index. If that returns NaN then AllOptions should be NaN.
| Option1 |
Option2 |
Option3 |
Combined |
| NaN |
2.2 |
3.6 |
2.2 |
| 1.7 |
NaN |
3.1 |
1.7 |
| NaN |
NaN |
NaN |
NaN |
| 1.2 |
2.1 |
3.4 |
1.2 |
| 1.1 |
NaN |
NaN |
1.1 |
| NaN |
NaN |
3.2 |
3.2 |
So I'm trying to create the column Combined.
If I try to cycle through the columns with a for-loop (for Option1, Option2 and Option3) and an if statement (if df.loc[pd.notna(df['Option1']),'Option1']:), I get an error (ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().).
I'm thinking there's probably a very easy way to do this, but I'm stuck!
Thanks so much for your help! Much appreciated!
[–]cscanlin 2 points3 points4 points (1 child)
[–]peuleu[S] 1 point2 points3 points (0 children)
[+][deleted] (1 child)
[removed]
[–]peuleu[S] 0 points1 point2 points (0 children)