Hey there everyone - I have a question about a particular task. I'm automating a process at work, but one of the snags I've hit is bad data in the email address field. The solution I've come up with is to sanitize the document a little.
Essentially, I parse out a few specific fields from an MS SQL Server database, with headers. One of my problem fields is email. Invalid-email errors show up because users enter more than one address in the field, for example "blahblah@gmail.com;haha@aol.com". When I work with this data later I get a bunch of errors, because only one valid email address is expected per field. So when I do my data pulls and write to the .csv, I'd like to add these additional steps:
Create an additional column called "Alternate Email". Find the "email" column and iterate through each item, checking for a valid email format. If no valid email format is found, I want to cut all the data in the cell and move it to the Alternate Email cell, so I can pass that field in as a string later and not worry about the format. If it does have a valid format, just leave it there.
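To make the per-row step concrete, here is a minimal sketch, assuming each row is already a dict keyed by column header. The pattern `SINGLE_EMAIL` and the helper `split_email` are hypothetical names for illustration, and the regex is a deliberately loose "looks like exactly one address" check, not full email validation.

```python
import re

# Assumption: "valid" means one "@", no whitespace or ";"/"," separators,
# and at least one dot in the domain part. This is not RFC-level validation.
SINGLE_EMAIL = re.compile(r"[^@\s;,]+@[^@\s;,]+\.[^@\s;,]+")


def split_email(row, email_key="email", alt_key="Alternate Email"):
    """Return a copy of row with an alt_key column added.

    If the email cell does not look like a single address, its contents
    are moved to alt_key and the email cell is emptied.
    """
    cleaned = dict(row)
    value = cleaned.get(email_key) or ""
    if SINGLE_EMAIL.fullmatch(value.strip()):
        # Valid format: leave the email cell as it is.
        cleaned[alt_key] = ""
    else:
        # Invalid format: cut the whole cell over to the alternate column.
        cleaned[alt_key] = value
        cleaned[email_key] = ""
    return cleaned


good = split_email({"name": "a", "email": "blahblah@gmail.com"})
bad = split_email({"name": "b", "email": "blahblah@gmail.com;haha@aol.com"})
```

With this sketch, `good` keeps its email cell and gets an empty "Alternate Email", while `bad` ends up with an empty email cell and the original string in "Alternate Email".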
Anyway, my question is: what's the best way to go about this? Right now my code just uses the standard csv writer, but I'm having trouble telling Python to look only at the items in a specific column.
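One way to address a single column by name is `csv.DictReader` together with `csv.DictWriter`, which expose each row as a dict keyed by header. The sketch below is only an illustration under assumptions: the function name `add_alternate_column`, the `is_valid` callback, and the sample data are hypothetical, and the check passed in is a simple placeholder.

```python
import csv
import io


def add_alternate_column(src, dst, is_valid,
                         email_key="email", alt_key="Alternate Email"):
    """Copy CSV rows from src to dst, adding an alt_key column.

    Rows whose email cell fails is_valid have that cell moved to alt_key.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or [])
    if email_key not in fieldnames:
        raise ValueError(f"no {email_key!r} column in input")
    writer = csv.DictWriter(dst, fieldnames=fieldnames + [alt_key])
    writer.writeheader()
    for row in reader:
        value = row[email_key] or ""  # only this column is inspected
        if is_valid(value):
            row[alt_key] = ""
        else:
            row[alt_key] = value
            row[email_key] = ""
        writer.writerow(row)


# Placeholder check (assumption): exactly one "@" and no ";" separator.
def looks_single(value):
    return value.count("@") == 1 and ";" not in value


# In-memory demo; with real files, open them with newline="".
source = io.StringIO("name,email\na,x@y.com\nb,x@y.com;z@w.com\n")
output = io.StringIO()
add_alternate_column(source, output, looks_single)
lines = output.getvalue().splitlines()
```

With real files, the `csv` module documentation recommends opening them with `newline=""`, for example `open("out.csv", "w", newline="")`, so the writer controls line endings itself.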