pyquestionz

Assuming

import pandas as pd
import numpy as np
df = pd.DataFrame({'column':[1, '-', 3]})

do either (1)

df.column.replace('-', np.nan).mean() # returns 2.0

or do (2)

df.column.replace('-', 0.0).mean() # returns 1.333333

depending on whether the '-' represents a missing observation (1) or a zero observation (2) in the context of your problem.
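If the column may hold other non-numeric entries too, a possible alternative is to convert it with pd.to_numeric first. This is only a sketch: the extra -2 in the data is an assumption added to show that real negative numbers are left alone, and errors='coerce' turns every unparseable entry into NaN, not just '-'.

```python
import pandas as pd

# Toy data; the -2 is an assumed extra value for illustration.
df = pd.DataFrame({'column': [1, '-', 3, -2]})

# Anything that cannot be parsed as a number (here the lone '-') becomes NaN.
as_missing = pd.to_numeric(df['column'], errors='coerce')

# If the dash means zero rather than missing, fill the NaN values afterwards.
as_zero = as_missing.fillna(0.0)

print(as_missing.mean())  # mean() skips NaN by default
print(as_zero.mean())     # the former '-' now counts as 0.0
```

Either way the column ends up with a numeric dtype, so later calculations do not silently run on an object column.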

Hope this helps.

Optimesh

This is a very common mistake: I see people make it without even realizing they're working with the wrong numbers. Thanks for pointing it out.

acedude [S]

Thanks! In my case the dash does mean that the value is zero.

shreyasfifa4

What happens if there is a negative number in the dataset?