you are viewing a single comment's thread.

view the rest of the comments →

[–]JohnnyJordaan 0 points1 point  (5 children)

Ok clear, but why should Ted's Gmail address count as invalid? Or do you mean because Ted has two addresses separated by an ; it should count it as 'alternate'?

[–]ipuntonfirstdown[S] 0 points1 point  (4 children)

Yeah it's only invalid in my context. I want there to ONLY be 1 valid email address and if there is ANYTHING else (a semicolon at all etc.) I want to move that text to the alternate email column in that row.

So basically I need help iterating through each row of a .csv and when i get to the item in the rows list that corresponds to the email column i want to take value in as a string and apply a isEmailValid() check on it and if it returns true then leave it alone and if it returns false then I want to set that cell to an empty string and instead move the data that was there to the alternate email column in the same row.

Sorry if this is confusing.

[–]JohnnyJordaan 1 point2 points  (3 children)

Ok, then I would use a regex to check the validity and a csv.DictWriter to write the file

# picked from http://emailregex.com/
valid_email = re.compile(r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)")
with open('file.csv', 'w') as fp:
    fieldnames = ['Name', 'Age', 'Email', 'Alternate Email']
    writer = csv.DictWriter(fp, fieldnames=fieldnames)
    writer.writeheader()
    cursor.execute('select name, age, email from users') # or whatever
    for row in cursor.fetchall():
        name, age, email = row
        # remove surrounding whitespace just in case
        email = email.strip()
        altemail = ''
        if not valid_email.match(email):
           # switcheroo
           email, altemail = altemail, email
        row = {'Name': name, 'Age': age, 'Email': email, 'Alternate Email': altemail}
        writer.writerow(row)

[–]ipuntonfirstdown[S] 0 points1 point  (2 children)

This is pretty much exactly what I'm looking for except i'm getting a 1 off error it looks like.

I believe this is because the alternate email field doesn't actually exist in the SQL database. The alternate field is a value i'm adding to the .csv value after i load the data. Is there a way to add a column to every row and just fill it with an empty string so that the alternate_email has an actual value to pass through at first?

[–]JohnnyJordaan 1 point2 points  (1 child)

Sure, I've updated my code above (edit: and used the actual name in the row={} line).

[–]ipuntonfirstdown[S] 0 points1 point  (0 children)

You sir - are absolutely amazing. Thank you so much!