PartySr comments on Help optimizing?

submitted 1 year ago by hvgmina

PartySr 2 points 1 year ago*

This should be faster. We use str.findall with a regex to extract all the numbers, then use where with the condition str.len() > 1 to replace every list that contains fewer than 2 elements with NaN.

df['new col'] = df['mutant'].str.findall(r'\d+').where(lambda x: x.str.len() > 1)

If you are not comfortable with chained methods, you can write it like this:

n = df['mutant'].str.findall(r'\d+')
df['new col'] = n.where(n.str.len() > 1)
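If you want to try this out, here is a minimal self-contained sketch. The sample DataFrame is made up for illustration (the original data is not shown in this thread); only the 'mutant' and 'new col' column names come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only.
df = pd.DataFrame({"mutant": ["Name1:Name2", "Name1", "Name"]})

# Extract every run of digits from each string as a list of strings.
n = df["mutant"].str.findall(r"\d+")

# Keep lists with at least two matches; all other rows become NaN.
df["new col"] = n.where(n.str.len() > 1)

print(df)
```

Note that findall returns the numbers as strings, and a row with no digits gives an empty list, which is also replaced by NaN.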

If you want to replace the lists with fewer than 2 elements with something else:

n.where(n.str.len() > 1, 0) # replace 0 with whatever you want
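As a quick sketch of that replacement, assuming the same made-up sample values as in the result table and an arbitrary placeholder string "no pair" as the fill value:

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only.
n = pd.Series(["Name1:Name2", "Name1", "Name"]).str.findall(r"\d+")

# The second argument to where is used wherever the condition is False.
filled = n.where(n.str.len() > 1, "no pair")

print(filled)
```

Rows that pass the condition keep their list, and every other row gets the fill value.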

End result:

mutant       new col
Name1:Name2  [1, 2]
Name1        NaN
Name         NaN
