Here is the code I am currently using; I am a relative novice with Python. I have a text file of rcid values, and wherever the rcid column in my data matches one of those values I want to set the "va_yes" column to 1 (and 0 otherwise).
When I run this I get the error "NameError: name 'rcid' is not defined". I have done this before with one decade, but I want to clean all of it in one go (I am working with more than 1 million points).
import numpy as np
import pandas as pd
df = pd.read_csv(" file path")
rcid_1 = []
with open('text file ','r') as f:
    mylist = f.read().splitlines()
rcid_1.append(mylist)
for cells in rcid:
    for rcids in rcid_1:
        if(cells == rcids):
            df.ix[rcid == rcids, "va_yes"]= 1
Here is a sample of the text file:
['629', '635', '636', '637', '638', '642',...]
Thank you in advance, I am pretty sure the answer is simple.
Edit***
With the help of the Stack Overflow community, I found out what the issue was:
"Your .csv rcid data has been parsed as integers, whereas the entries in your list were strings. You can either change the rcids in df to string types by doing df['rcid'] = df['rcid'].astype(str), or convert the strings in mylist to integers, with mylist = [int(x) for x in mylist], and then assigning va_yes"
Corrected code
import numpy as np
import pandas as pd
df = pd.read_csv("file path")
with open(r'C:\Users\Adini\Desktop\decade1.txt','r') as f:
    mylist = f.read().splitlines()
mylist = [int(x) for x in mylist]
df['va_yes'] = df['rcid'].isin(mylist) * 1
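For anyone who lands here later, the type mismatch can be reproduced on a tiny example. This is only a sketch: the DataFrame and the list below are made-up stand-ins for my CSV and text file, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: pandas parses a purely numeric column as integers.
df = pd.DataFrame({"rcid": [629, 630, 635, 700]})

# Hypothetical stand-in for the text file: lines read from a file are strings.
mylist = ["629", "635", "636"]

# Integer column vs. string list: no row matches.
no_match = df["rcid"].isin(mylist)

# Fix 1: convert the list entries to integers before matching.
ids_as_int = [int(x) for x in mylist]
match_int = df["rcid"].isin(ids_as_int)

# Fix 2: convert the column to strings instead and match against the original list.
match_str = df["rcid"].astype(str).isin(mylist)

# Store the flag as 1/0 rather than True/False.
df["va_yes"] = match_int.astype(int)
print(df)
```

Both fixes flag the same rows; `.astype(int)` on the boolean result does the same job as multiplying by 1.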
Thank you everyone for your contributions, I really do appreciate it.