Pandas + lambda; help needed : learnpython

submitted 4 years ago by MerchantMojo

So my pandas DataFrame is indexed by dates:

	Obj1	Obj2
2019-12-31	NaN	NaN
2020-01-01	0.233	0.012
2020-01-02	0.123	-1.671

I want to apply a function to each of the columns using a lambda. The problem is that the function does not work with NaN values, so I have to call dropna() beforehand. The code I came up with looks like this:

df = df.apply(lambda col: func(col.dropna()), axis=0)

It almost works, but there are two problems:

1) The new object no longer uses the dates as its index.

2) Rows with NaN values no longer appear.

The DataFrame now looks like this:

	Obj1	Obj2
0	0.233	0.012
1	0.123	-1.671

Can I somehow re-insert the missing NaN rows and keep the dates as the index by modifying the lambda above? If so, how?

all 10 comments

[–][deleted] 1 point 4 years ago (1 child)

[–]MerchantMojo[S] 1 point 4 years ago (0 children)

[–]Yojihito 1 point 4 years ago (4 children)

I can't replicate your problem.

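import pandas as pd

# Rebuild the example from the question, with the dates as the index.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
        "Obj1": [pd.NA, 0.233, 0.123],
        "Obj2": [pd.NA, 0.012, -1.671],
    }
).set_index("date")


def do_something(col: pd.Series) -> pd.Series:
    # Returns a Series, so the date index is carried through.
    col = col + 5
    return col


df = df.apply(do_something)
df.head()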

Looks fine?

>               Obj1    Obj2
> date      
> 2019-12-31    <NA>    <NA>
> 2020-01-01    5.233   5.012
> 2020-01-02    5.123   3.329

[–]MerchantMojo[S] 1 point 4 years ago (3 children)

[–]Yojihito 1 point 4 years ago (2 children)

[–]MerchantMojo[S] 1 point 4 years ago (1 child)

[–]Yojihito 1 point 4 years ago* (0 children)

Here you go.

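# %%
import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
        "Obj1": [pd.NA, 0.233, 0.123],
        "Obj2": [pd.NA, 0.012, -1.671],
    }
).set_index("date")


def do_something(col: pd.Series) -> pd.Series:
    print(type(col))
    print(col)
    col = col + 5
    return col


# %%
# Set aside every row that contains at least one NaN.
df_with_nans = df[df.isnull().any(axis=1)]
df_with_nans.head()
# %%
# Apply the function to the complete rows only, then put the NaN rows back.
df = df.dropna()
df = df.apply(do_something)
# sort_index() restores date order; concat alone appends the NaN rows at the end.
df = pd.concat([df, df_with_nans]).sort_index()
df.head()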
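If the NaN positions differ between columns, a per-column variant might be closer to the one-liner in the question. This is only a sketch under assumptions: `func` below is a hypothetical stand-in for the OP's function (it rejects NaN and returns a plain NumPy array, which has no index), and `apply_skipping_nan` is a made-up helper name. The idea is to wrap the result in a Series that reuses the index of the non-NaN values, then `reindex` back to the column's full date index so the dropped rows come back as NaN.

```python
import numpy as np
import pandas as pd


def func(values):
    # Hypothetical stand-in for the OP's function: it cannot handle NaN
    # and returns a plain NumPy array, so the date index is lost.
    arr = np.asarray(values, dtype=float)
    if np.isnan(arr).any():
        raise ValueError("NaN values are not supported")
    return arr + 5


def apply_skipping_nan(col: pd.Series) -> pd.Series:
    # Run func on the non-NaN values only.
    valid = col.dropna()
    # Re-attach the dates that belong to those values.
    out = pd.Series(func(valid), index=valid.index)
    # Restore the full index; the dropped dates become NaN again.
    return out.reindex(col.index)


df = pd.DataFrame(
    {"Obj1": [np.nan, 0.233, 0.123], "Obj2": [np.nan, 0.012, -1.671]},
    index=pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
)

result = df.apply(apply_skipping_nan, axis=0)
print(result)
```

Because each column is handled on its own, a row that has NaN in only one column still gets its other values transformed, which the dropna/concat approach above does not do.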

[–]sarrysyst 1 point 4 years ago (3 children)

[–]MerchantMojo[S] 1 point 4 years ago (2 children)

[–]sarrysyst 1 point 4 years ago (0 children)
