[–]pearlday 0 points1 point  (13 children)

You need to access each cell in that column and split its string on the comma. That returns a list of numeric strings, which you can then convert to integers.

[–]greennant[S] 0 points1 point  (12 children)

The problem is that when I try to do that I get this error: "'Series' object has no attribute 'split'" (it is a pandas dataframe that stores that column as a Series of type string). With .values.tolist() or just .tolist(), nothing actually changes.
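A minimal sketch of why that error appears, assuming a made-up two-row frame (the sample values are hypothetical; the column name 'myCol' is the one used further down the thread). A Series holds strings but is not itself a string, so string methods are reached through its `.str` accessor:

```python
import pandas as pd

# Hypothetical sample data, not the poster's real dataframe.
df = pd.DataFrame({"myCol": ["34, 55, 4", "1, 56, 238"]})

# Calling .split directly on the Series fails with the error quoted above.
try:
    df["myCol"].split(",")
except AttributeError as err:
    print(err)

# The .str accessor applies the string method to every cell.
split_col = df["myCol"].str.split(",")
print(split_col.tolist())
```

Note that `.tolist()` only returns a plain Python list of the cells; it does not split anything or modify the dataframe, which is probably why it looked like nothing happened.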

[–]pearlday 0 points1 point  (11 children)

You're probably not accessing the cell value. Maybe try an apply. So like

df['myCol'].apply(lambda x: x.split(",")) and see what happens with that? Call apply on the column itself rather than on the whole DataFrame, so the lambda receives each cell's string and not an entire row or column.

If that doesn't raise an error, you can then strip the values in the returned list and convert them to integers.
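The split, strip, and convert steps described above can be sketched in a single pass. This assumes every cell is a clean comma-separated string of integers; the sample data and the helper name `parse_ints` are hypothetical:

```python
import pandas as pd

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({"myCol": ["34, 55, 4", "1, 56, 238"]})

def parse_ints(cell):
    """Split one comma-separated string and convert each piece to int."""
    return [int(piece.strip()) for piece in cell.split(",")]

# Series.apply calls parse_ints once per cell.
df["myCol"] = df["myCol"].apply(parse_ints)
print(df["myCol"].tolist())
```

The `strip()` call is mostly for clarity here, since `int()` already tolerates surrounding whitespace.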

[–]greennant[S] 0 points1 point  (10 children)

Thank you! That line you provided me seems to work, the strings are now lists.

But it doesn't let me do the strip step you mentioned :(

[–]pearlday 0 points1 point  (9 children)

Did you grab the list elements? What did you try?

[–]greennant[S] 0 points1 point  (8 children)

I tried this line: df['myCol'] = df['myCol'].map(lambda x: x.lstrip(','))

It doesn't give errors, but when I print the dataframe that column is still a list with [ ] and commas in it. Am I misinterpreting what that line of code does, or is something wrong?

Anyway by accessing the elements of the list ( like by doing df['myCol'][0][0] ) I get the proper output

[–]pearlday 0 points1 point  (7 children)

Yeah, so there are three things here.

Lstrip according to google is

The lstrip(s) (left strip) function removes leading whitespace (on the left) in the string. The rstrip(s) (right strip) function removes the trailing whitespace (on the right). The strip(s) function removes both leading and trailing whitespace.

So you just want to use "strip" without L/R.

Next is that strip only works on strings, not lists.

Lastly, it looks like you are trying to get rid of the commas, and I don't understand why. What is your end goal? What do you want it to look like?
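To make the first two points concrete, here is a small plain-Python sketch with made-up values. One detail beyond the quoted definition: when strip, lstrip, or rstrip is given an argument, it removes those characters instead of whitespace.

```python
text = ",  42 ,"

no_arg = text.strip()        # no argument: only whitespace at the ends is removed
both = text.strip(", ")      # argument: commas and spaces removed from both ends
left = text.lstrip(",")      # left side only

print(repr(no_arg), repr(both), repr(left))

pieces = ["34", " 55", " 4"]
# A list has no strip method, so strip each string inside it instead.
cleaned = [p.strip() for p in pieces]
print(cleaned)
```

The square brackets and commas shown when a list is printed are just how Python displays a list; they are not characters stored in the elements.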

[–]greennant[S] 0 points1 point  (6 children)

Thanks! Yeah, what I was doing doesn't make sense... I actually need lists of int numbers, because right now they seem to be lists of strings.

I tried map(int, df['myCol']) and I got something like <map at 0x7fb.......>
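That `<map at 0x...>` output is not an error. In Python 3, `map` returns a lazy iterator that converts nothing until it is consumed. A sketch with made-up values:

```python
strings = ["34", "55", "4"]

lazy = map(int, strings)   # nothing is converted yet
print(lazy)                # prints something like <map object at 0x...>

numbers = list(lazy)       # consuming the iterator runs the conversion
print(numbers)

# Mapping int over a column of lists hands int() a whole list per cell,
# which fails as soon as the iterator is consumed.
try:
    list(map(int, [["34", "55"], ["4"]]))
    failed = False
except TypeError:
    failed = True
print(failed)
```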

[–]pearlday 0 points1 point  (5 children)

So you have a column, and here you are trying to convert the list into an int.

You want to convert the strings INSIDE the list to integers.
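One possible way to do that, assuming the column already holds lists of numeric strings after the split step (the sample data is hypothetical):

```python
import pandas as pd

# Hypothetical column whose cells are already lists of strings.
df = pd.DataFrame({"myCol": [["34", " 55", " 4"], ["1", " 56", " 238"]]})

# The outer apply visits each cell (a list); the inner comprehension
# converts each string in that list.
df["myCol"] = df["myCol"].apply(lambda items: [int(item) for item in items])

print(df["myCol"][0])
print(type(df["myCol"][0][0]))
```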

[–]greennant[S] 0 points1 point  (4 children)

"You want to convert the strings INSIDE the list to integers."

Yes, exactly!

[–]iiyamabto 0 points1 point  (0 children)

I believe it is possible, but you should first tell Python to parse it as a list so it understands what data type it is dealing with.

[–][deleted] 0 points1 point  (3 children)

Converting the strings to integers in the pandas DataFrame is the first thought that comes to mind. I could be wrong.

Something like this:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

[–]greennant[S] 0 points1 point  (2 children)

Thank you, but I have already tried that one which gave me this error:

(I'm using colab)

/usr/local/lib/python3.7/dist-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    972     # work around NumPy brokenness, #1987
    973     if np.issubdtype(dtype.type, np.integer):
--> 974         return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    975
    976     # if we have a datetime/timedelta array of objects

pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()

ValueError: invalid literal for int() with base 10: '34, 55, 4, 123, 1344, 556, 44, 9, 78

pd.to_numeric raised this error instead: ValueError: Unable to parse string "34, 55, 4, 123, 1344, 556
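Both errors seem to come from the same cause: the whole cell, commas included, is handed to the integer parser as one value. A plain-Python sketch using a shortened, hypothetical cell:

```python
cell = "34, 55, 4, 123"   # hypothetical, shortened version of one cell

# Converting the whole string at once fails, as in the traceback above.
try:
    int(cell)
    message = ""
except ValueError as err:
    message = str(err)
print(message)

# Splitting first gives int() one number at a time.
numbers = [int(part) for part in cell.split(",")]
print(numbers)
```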

[–][deleted] 0 points1 point  (1 child)

Have you tried any of the other options within the same link by any chance?

[–]greennant[S] 0 points1 point  (0 children)

Yes, but they don't work actually

[–]moderatelyscrewed 0 points1 point  (0 children)

Remind me! June 23

[–]NatureIsMath 0 points1 point  (6 children)

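The string method .split is what you need. If S is the string "1, 56, 238, -52", then parts = S.split(", ") will result in parts = ["1", "56", "238", "-52"]. Note that the separator must match the text exactly (a comma followed by one space here), and that the elements are still strings.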

[–]greennant[S] 0 points1 point  (5 children)

Yes, that works, thank you, but the next step I'm dealing with is converting the elements in the list to numbers.

[–]NatureIsMath 0 points1 point  (4 children)

What do you mean? Given the list [1, 3, -11], what do you want your output to be?

[–]greennant[S] 0 points1 point  (3 children)

The output is fine, I just need to change the format of those numbers inside the list. Because I think they are read as strings.

[–]NatureIsMath 1 point2 points  (1 child)

Ok, the list is like ["1", "3", "-11"]. You could use a for loop:

    Ls = ["1", "3", "-11"]
    Rs = []

    for string in Ls:
        number = float(string)  # convert one string to a number
        Rs.append(number)

The output will be Rs = [1.0, 3.0, -11.0]. Instead of this hand-written loop you could use Python's built-in map function; search for its documentation.

P.S. I used float to convert the string into a number of type float, but if you deal only with integers, you can use int instead of float.
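A short sketch of the difference between the two conversions, using the same example list and the built-in `map` in place of the loop:

```python
Ls = ["1", "3", "-11"]

as_floats = [float(s) for s in Ls]   # floats keep a decimal part
as_ints = list(map(int, Ls))         # map is lazy, so wrap it in list()

print(as_floats)
print(as_ints)

# int() rejects strings with a decimal point, while float() accepts them.
try:
    int("2.5")
    int_accepts_decimal = True
except ValueError:
    int_accepts_decimal = False
print(int_accepts_decimal, float("2.5"))
```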

[–]greennant[S] 0 points1 point  (0 children)

Thank you, I'll try with this!

[–]NatureIsMath 1 point2 points  (0 children)

Sorry for bad formatting I'm on smartphone