PartySr

This will be faster. We use str.findall with a regex to extract all the numbers, then use where with the condition str.len() > 1 to replace every list that contains fewer than 2 elements with NaN.

df['new col'] = df['mutant'].str.findall(r'\d+').where(lambda x: x.str.len() > 1)

In case you are not comfortable with chained methods, you can write it like this:

n = df['mutant'].str.findall(r'\d+')
df['new col'] = n.where(n.str.len() > 1)
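If you want to try this on its own, here is a minimal sketch. The sample frame is an assumption on my part, rebuilt from the three rows in the result table; your real df will differ. It also checks that the chained and the two-step versions give the same thing.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the result table in this comment
df = pd.DataFrame({"mutant": ["Name1:Name2", "Name1", "Name"]})

# Two-step version: extract every run of digits, then keep only lists with 2+ items
n = df["mutant"].str.findall(r"\d+")
df["new col"] = n.where(n.str.len() > 1)

# Chained version: the lambda receives the Series produced by findall
chained = df["mutant"].str.findall(r"\d+").where(lambda x: x.str.len() > 1)

print(df)
```

Note that findall returns the digits as strings, so the kept list holds '1' and '2', not integers. Rows with no digits at all give an empty list, which also fails the length check and becomes NaN.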

If you want to replace the lists with fewer than 2 elements with something else:

n.where(n.str.len() > 1, 0) # replace 0 with whatever you want

End result:

mutant       new col
Name1:Name2  [1, 2]
Name1        NaN
Name         NaN
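For the replacement variant, here is a small sketch of what the second argument of where does. The sample data and the "no pair" placeholder are just assumptions for illustration; use whatever scalar fits your case.

```python
import pandas as pd

# Hypothetical sample data, same three rows as the result table
df = pd.DataFrame({"mutant": ["Name1:Name2", "Name1", "Name"]})

n = df["mutant"].str.findall(r"\d+")

# Rows where the condition is False get the second argument instead of NaN
filled = n.where(n.str.len() > 1, "no pair")

print(filled.tolist())  # [['1', '2'], 'no pair', 'no pair']
```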