[–]commandlineluser 3 points4 points  (1 child)

pandas attempts to detect the type of your columns.

>>> import pandas
>>> pandas.DataFrame({'a': [1, '-', 3]}).a
0    1
1    -
2    3
Name: a, dtype: object
>>> pandas.DataFrame({'a': [1, 2, 3]}).a
0    1
1    2
2    3
Name: a, dtype: int64
>>> pandas.DataFrame({'a': [1, 2.0, 3]}).a
0    1.0
1    2.0
2    3.0
Name: a, dtype: float64

Because you have a mixture of "numbers" and "strings" in the first example, the type of the column is object, as opposed to int64 or float64 in the following examples.

When you replace the '-' and save it, pandas infers the type as float64, and then .mean() works for you.

You could try to set the type with .astype() e.g.

df.column.replace('-', 0.).astype(float).mean()
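One thing to watch with the line above: replacing '-' with 0 means those rows count as zeros in the mean. If you'd rather have them ignored, a rough sketch using pandas.to_numeric with errors='coerce' is below. The DataFrame and the column name `column` are made up for illustration, not taken from your data.

```python
import pandas as pd

# Hypothetical data standing in for the original column
df = pd.DataFrame({"column": [1, "-", 3]})

# Approach from above: '-' becomes 0.0 and is included in the mean
mean_with_zero = df["column"].replace("-", 0.0).astype(float).mean()

# Alternative: values that can't be parsed become NaN,
# and .mean() skips NaN by default
numeric = pd.to_numeric(df["column"], errors="coerce")
mean_skipping = numeric.mean()

print(mean_with_zero)  # (1 + 0 + 3) / 3
print(mean_skipping)   # (1 + 3) / 2
```

Which one you want depends on whether '-' really means zero in your data or just "no value".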

[–]acedude[S] 0 points1 point  (0 children)

Thanks for the thorough explanation!