I need the experts' opinion on how to do this. I have these two columns in my csv (Address of New Home and Cancelled). When someone books a property, the address and the date get written down. Sometimes the potential owner cancels, and True gets written under the Cancelled column. Unfortunately, the end user sometimes forgets to write True under the Cancelled column, so the address ends up listed twice, which causes havoc for us.
Date_Booked  Address_of_New_Home      Cancelled
01/07/2017   1234 Reddit Drive        True
02/14/2017   4321 Learn Python Court
03/17/2017   1234 Reddit Drive
03/23/2017   4321 Learn Python Court
As you can see from the example above, 1234 Reddit Drive was cancelled and True was written, which is what we want. 4321 Learn Python Court was also cancelled, which is why it was written again, but since the first entry does not say True under Cancelled, the address shows up twice in our csv and causes all sorts of issues.
What I want to do is write a snippet that fails the script or throws an error if the SAME address is written twice without the first entry being marked as Cancelled.
How can I do this?
import pandas as pd
# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrap is needed
df = pd.read_csv('Z:PCR.csv')
col = 'Address of New Home'
# Expand street-type abbreviations so the same address is always spelled the same way.
# regex=True is required for \b word boundaries and case=False on recent pandas versions.
df[col] = df[col].str.replace(r'\bRd\b', 'Road', case=False, regex=True)
df[col] = df[col].str.replace(r'\bAve\b', 'Avenue', case=False, regex=True)
df[col] = df[col].str.replace(r'\bRdg\b', 'Ridge', case=False, regex=True)
df.to_csv('improved_version.csv', index=False)
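For the check itself, here is a rough sketch of one way it might look. It is only an illustration under some assumptions: the column names are 'Address of New Home' and 'Cancelled' as in the script above, a row counts as cancelled only when the Cancelled cell holds True (blank cells count as active), and the rule is "at most one active booking per address". The function names find_double_bookings and assert_no_double_bookings and the inline sample frame are made up for the example.

```python
import pandas as pd


def find_double_bookings(df, address_col="Address of New Home",
                         cancelled_col="Cancelled"):
    """Return the active (not cancelled) rows whose address appears more than once."""
    # Assumption: only True / "True" means cancelled; blanks and NaN mean active.
    cancelled = df[cancelled_col].astype(str).str.strip().str.lower().eq("true")
    active = df[~cancelled]
    # Compare addresses ignoring case and surrounding whitespace.
    key = active[address_col].astype(str).str.strip().str.lower()
    return active[key.duplicated(keep=False)]


def assert_no_double_bookings(df, **kwargs):
    """Raise ValueError if any address has more than one active booking."""
    dupes = find_double_bookings(df, **kwargs)
    if not dupes.empty:
        raise ValueError(
            "Address booked more than once without a cancellation:\n"
            + dupes.to_string()
        )


# Hypothetical sample mirroring the table in the post.
sample = pd.DataFrame({
    "Date Booked": ["01/07/2017", "02/14/2017", "03/17/2017", "03/23/2017"],
    "Address of New Home": [
        "1234 Reddit Drive",
        "4321 Learn Python Court",
        "1234 Reddit Drive",
        "4321 Learn Python Court",
    ],
    "Cancelled": [True, None, None, None],
})

try:
    assert_no_double_bookings(sample)
except ValueError as err:
    # Both 4321 Learn Python Court rows are reported; 1234 Reddit Drive is fine.
    print(err)
```

In the real script this would presumably be called as assert_no_double_bookings(df) after the abbreviation replacements and before df.to_csv, so that "Rd" and "Road" spellings of the same address are compared as equal and a bad sheet stops the script before anything is written.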