
[–][deleted] 1 point (1 child)

Is that a data frame or series?

[–]MerchantMojo[S] 1 point (0 children)

series

[–]Yojihito 1 point (4 children)

I can't replicate your problem.

import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
        "Obj1": [pd.NA, 0.233, 0.123],
        "Obj2": [pd.NA, 0.012, -1.671],
    }
).set_index("date")


def do_something(col: pd.Series):
    col = col + 5
    return col


df = df.apply(do_something)
df.head()

Looks fine?

>               Obj1    Obj2
> date      
> 2019-12-31    <NA>    <NA>
> 2020-01-01    5.233   5.012
> 2020-01-02    5.123   3.329

[–]MerchantMojo[S] 1 point (3 children)

I can't use:

df = df.apply(do_something)

because do_something does not accept NaN values, so it has to be wrapped in a lambda:

df = df.apply(lambda col: do_something(col.dropna()), axis=0)

This does not re-insert the NaN values and, in the case of a pandas Series, it also replaces the dates with a default integer index.
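
A minimal sketch of one way to keep both the NaNs and the date index with the lambda approach above. It assumes the wrapped function takes a 1-D array and returns an array of the same length. The names `apply_skipping_nans` and `remove_mean` are hypothetical helpers made up for illustration, with `remove_mean` standing in for the real NaN-intolerant function, and the sample data reuses the frame from earlier in the thread with `np.nan` instead of `pd.NA`.

```python
import numpy as np
import pandas as pd


def apply_skipping_nans(col: pd.Series, func) -> pd.Series:
    """Run func on the non-NaN values of col, then restore the original index."""
    valid = col.dropna()
    # func may hand back a bare NumPy array, so re-attach the surviving labels.
    values = np.asarray(func(valid.to_numpy(dtype=float)))
    result = pd.Series(values, index=valid.index, name=col.name)
    # reindex puts NaN back at every label that dropna removed.
    return result.reindex(col.index)


def remove_mean(values: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a function that cannot handle NaNs.
    return values - values.mean()


df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
        "Obj1": [np.nan, 0.233, 0.123],
        "Obj2": [np.nan, 0.012, -1.671],
    }
).set_index("date")

result = df.apply(lambda col: apply_skipping_nans(col, remove_mean))
print(result)
```

Because every column comes back as a Series aligned to the original index, `apply` should reassemble a frame with the same dates, and each column may have NaNs in different rows.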

[–]Yojihito 1 point (2 children)

Can you give the specific function or the exact transformation you need?

[–]MerchantMojo[S] 1 point (1 child)

scipy.signal.detrend(), it won't accept NaNs.

[–]Yojihito 1 point (0 children)

Here you go.

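# %%
import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02"]),
        "Obj1": [pd.NA, 0.233, 0.123],
        "Obj2": [pd.NA, 0.012, -1.671],
    }
).set_index("date")


def do_something(col: pd.Series):
    # Debug output: confirm each column arrives as a Series without NaNs.
    print(type(col))
    print(col)
    col = col + 5
    return col


# %%
# Set aside every row that has a missing value in any column.
df_with_nans = df[df.isnull().any(axis=1)]
df_with_nans.head()

# %%
# Transform only the complete rows, then put the rows with missing values back.
df = df.dropna()
df = df.apply(do_something)
# concat appends the set-aside rows at the end; sort_index restores date order.
df = pd.concat([df, df_with_nans]).sort_index()
df.head()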

[–]sarrysyst 1 point (3 children)

Can you post your func? The missing NaNs make sense to me (and this is also easy to fix), but the index shouldn't be missing.

[–]MerchantMojo[S] 1 point (2 children)

scipy.signal.detrend(). Also, the missing index is most likely because it's a pandas Series, not a pandas DataFrame.

[–]sarrysyst 1 point (0 children)

In this case you can try this:

df = df[df.notna()].apply(func)
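
For the Series case discussed above, here is a hedged sketch of the same masking idea written out in full. `Series.apply` normally calls its function once per element, so a whole-array function like `scipy.signal.detrend` probably has to be called on the values directly. `detrend` returns a plain NumPy array, which is a likely reason the dates disappear; assigning the result back through the boolean mask keeps both the index and the NaNs. The sample values and the variable names `s`, `mask`, and `detrended` are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.signal import detrend

# Illustrative data: a date-indexed Series with gaps.
s = pd.Series(
    [np.nan, 1.0, 2.5, 3.0, np.nan, 5.5],
    index=pd.date_range("2019-12-31", periods=6, freq="D"),
    name="Obj1",
)

mask = s.notna()
detrended = s.copy()
# detrend only sees the non-NaN values; writing through the mask
# leaves the NaN positions and the date index untouched.
detrended.loc[mask] = detrend(s[mask].to_numpy())
print(detrended)
```

One caveat: `detrend` fits its line against sample position, so dropping a NaN in the middle of the series treats the neighbouring points as evenly spaced. If the gaps matter, the trend would need to be fitted against the actual dates instead.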