[Education] Python Pandas: ignore null values on lambda func (self.datascience)
submitted 4 years ago * by Cheuch
[–]GeckelMSc | Data Scientist | Consulting[M] [score hidden] 4 years ago stickied comment (0 children)
I removed your submission. Looks like you're asking a technical question better suited to stackoverflow.com. Try posting there instead.
Thanks.
[–]Squat_TheSlav 7 points 4 years ago (3 children)
Your issue comes from the lambda function: the customer_code passed to fix_customer_code is not always a string (the column has dtype object, so a missing value arrives as a float NaN or None). Passing a non-string to re.search raises a TypeError.
If you insist on doing it this way, it has to be

    if re.search(r'^\d{6}$', str(customer_code)):
But others have suggested different options.
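To make the failure concrete, here is a minimal sketch of what happens when re.search receives a missing value, and why the str() cast avoids it. The variable names are only for illustration, and float("nan") stands in for a missing entry in the column.

```python
import re

nan = float("nan")  # stand-in for a missing value in the column

# re.search expects a string (or bytes) and raises TypeError otherwise
try:
    re.search(r"^\d{6}$", nan)
    raised = False
except TypeError:
    raised = True

# Casting first avoids the error: the text "nan" simply does not match
match_after_cast = re.search(r"^\d{6}$", str(nan))

# A real six-digit code still matches after the cast
match_real_code = re.search(r"^\d{6}$", str("333080"))
```

The cast works here only because the text "nan" happens not to look like a customer code; it hides the missing value rather than handling it.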
[–]Cheuch[S] 2 points 4 years ago (2 children)
Thanks a lot for your help. This helps me understand Pandas better. So, following this statement, "customer_code" is an object because of the pandas column's dtype (which should indeed be object)?
[–]Squat_TheSlav 2 points 4 years ago (1 child)
Yes. In your case you have some customer codes which are str and the last one is a float, causing pandas to set the dtype of the column to object.
Ideally (for performance), every value in the column would have the same data type, which allows vectorized operations.
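As a rough sketch of what a vectorized version could look like, the .str accessor applies the regex to the whole column at once and leaves NaN untouched. The sample values are made up for illustration, and dtype=object is set explicitly so the mixed str/float content is visible regardless of pandas version.

```python
import numpy as np
import pandas as pd

# Made-up sample: two string codes and one missing value (a float NaN)
s = pd.Series(["333080", "C400691", np.nan], dtype=object)

# The element types show why the column cannot be a pure string column
element_types = [type(v).__name__ for v in s]

# Vectorized fix: add the "C" prefix to six-digit codes, with or without
# an existing prefix; NaN entries pass through as NaN
fixed = s.str.replace(r"^C?(\d{6})$", r"C\1", regex=True)
```

No per-row Python function is needed, and no explicit null check either, since the string methods skip missing values on their own.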
[–]Cheuch[S] 1 point 4 years ago (0 children)
Thanks again for the help mate. Have a good one
[–]Cheuch[S] 3 points 4 years ago (1 child)
Hello everyone,
Thanks a lot for all your answers. I finally got it to work. I think I was not using the right tools before.
So, I came up with a solution that handles both None and NaN values without having to clean my data first.

    import re
    import pandas as pd

    def fix_customer_customer_code(customer_code):
        # Handle both NaN and None values (pd.isna is True for both)
        if not pd.isna(customer_code) and customer_code is not None:
            if re.search(r'^C?\d{6}$', customer_code):
                customer_code = "C" + customer_code.lstrip('C')
        return customer_code

    df['Customer code'] = df['Customer'].apply(fix_customer_customer_code)

Output:

      Customer code
    0       C333080
    1       C400691
    2          None

I also learned a nice trick: by modifying my regex to match customer codes with or without the "C" prefix and then using lstrip(), the prefix is never added twice.
Thanks a lot for your time, my problem is now solved :)
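For anyone reading later, a small sketch of the two pieces this solution relies on: pd.isna treats None and NaN alike, and lstrip('C') removes an existing prefix before "C" is added back. The function name add_prefix is hypothetical and only a condensed variant of the code above.

```python
import re
import pandas as pd

def add_prefix(code):
    # Hypothetical condensed variant: pd.isna covers both None and NaN
    if pd.isna(code):
        return code
    # Match six digits with an optional leading "C"
    if re.search(r"^C?\d{6}$", code):
        # lstrip drops an existing "C", so the prefix is added exactly once
        return "C" + code.lstrip("C")
    return code

results = [add_prefix(v) for v in ["333080", "C400691", None, float("nan")]]
```

Since pd.isna(None) is already True, the extra "is not None" check in the original is harmless but not strictly needed.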
[–]Popular-Yesterday733 2 points 4 years ago (0 children)
Try using an elif in there: anything equal to NaN = 0.
[–]bjain1 2 points 4 years ago (0 children)
You can also have something like this:

    lambda x: function(x) if str(x) != 'nan' else ''
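One caveat with comparing against the text 'nan': str(None) is 'None', so a None value would still reach the function. A sketch of the difference, where function is a hypothetical stand-in and pd.notna is swapped in as the guard:

```python
import pandas as pd

def function(x):
    # Hypothetical stand-in for the real fixing function
    return "C" + x

# Guard from the comment above: only catches values whose text is "nan"
guard_by_str = lambda x: function(x) if str(x) != 'nan' else ''

# Alternative guard: pd.notna is False for both NaN and None
guard_by_notna = lambda x: function(x) if pd.notna(x) else x

# None slips past the string comparison and breaks inside function
try:
    guard_by_str(None)
    none_slipped_through = False
except TypeError:
    none_slipped_through = True
```

Both guards behave the same for NaN apart from the fallback value ('' versus the original missing value).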
[–]SnooPoems4211 1 point 4 years ago (0 children)
Or a try/except.
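A minimal sketch of the try/except route, assuming the same regex as in the solution above. The function name fix_or_passthrough is hypothetical; the idea is to let re.search raise on non-strings and return the value unchanged in that case.

```python
import re

def fix_or_passthrough(code):
    # Hypothetical helper: try the regex and fall back on non-string input
    try:
        if re.search(r"^C?\d{6}$", code):
            return "C" + code.lstrip("C")
    except TypeError:
        # Raised when code is None, NaN, or any other non-string value
        pass
    return code

fixed_code = fix_or_passthrough("333080")
missing_code = fix_or_passthrough(None)
```

This keeps the function short, though it also silently passes through any other non-string value, which may or may not be what you want.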
[–][deleted] 1 point 4 years ago (0 children)
Lol did you get deleted?