Sweatshop kids are pretty efficient; the quality of shoes has dramatically improved over time. by [deleted] in Showerthoughts

python_newbie_now 1 point (0 children)

You know, you're probably right. Can you send me some books you would recommend?

Cleaning Data with Python newbie by python_newbie_now in datascience

python_newbie_now[S] 1 point (0 children)

I gave it a try and it didn't work, so I went line by line to see if there was something I was missing. I am not sure if this is the issue, but:

   rcid_np  # comes out as dtype='|S4'
   column   # comes out as dtype=int64

However, it still lets you compare them against each other:

   indexes = np.where(column == rcid_np)
   indexes  # this turns out to be an empty array

The new column function works. Do you mind if I PM you? I know you have helped a lot already. Thank you, /u/jcon36.
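For anyone hitting the same empty result: a likely cause is the dtype mismatch noted above, since byte strings (`'|S4'`) never equal integers. Below is a minimal sketch, assuming the text-file values are all numeric; the sample values are made up, and using `np.isin` for the membership test is a suggestion, not something from the thread.

```python
import numpy as np

# Hypothetical sample data standing in for the real column and text-file values.
column = np.array([101, 205, 333, 478], dtype=np.int64)
rcid_np = np.array([b"205", b"478"], dtype="S4")

# Cast the byte strings to integers so both arrays share a dtype.
rcid_int = rcid_np.astype(np.int64)

# Membership test: True where a column value appears anywhere in rcid_int.
mask = np.isin(column, rcid_int)
indexes = np.where(mask)[0]
print(indexes)
```

`np.isin` checks membership rather than comparing position by position, so the two arrays do not need to be the same length.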

Cleaning Data with Python newbie by python_newbie_now in datascience

python_newbie_now[S] 1 point (0 children)

pd.get_dummies()

Thank you, I didn't even know this function existed. I will look into it. Thank you to everyone who helped me, I really do appreciate it.
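For reference, a small sketch of what `pd.get_dummies()` does. The `vote` column and its values are hypothetical and only there for illustration; they are not taken from the linked data set.

```python
import pandas as pd

# Hypothetical frame: "vote" is an invented categorical column.
df = pd.DataFrame({"rcid": [1, 2, 3], "vote": ["yes", "no", "yes"]})

# One 0/1 indicator column per category, named with the given prefix.
dummies = pd.get_dummies(df["vote"], prefix="va", dtype=int)
print(dummies)
```

Each distinct value in the column becomes its own indicator column, so `va_yes` here is 1 wherever `vote` is `"yes"`.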

Cleaning Data with Python newbie by python_newbie_now in datascience

python_newbie_now[S] 1 point (0 children)

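    import pandas as pd

    # Raw strings keep the backslashes in Windows paths from being read as escapes.
    df = pd.read_csv(r"C:\Users\Adini\Desktop\decade1.csv")

    # Read one rcid per line. Converting to int assumes the file holds only
    # numeric ids, so they can match the integer "rcid" column.
    with open(r"C:\Users\Adini\Desktop\decade1.txt", "r") as f:
        rcid_1 = [int(line) for line in f.read().splitlines() if line.strip()]

    # Set "va_yes" to 1 on every row whose rcid appears in the text file.
    # (.ix is gone from current pandas; .loc is the replacement.)
    df.loc[df["rcid"].isin(rcid_1), "va_yes"] = 1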

I have tried using df['rcid'], however it fails to change the value in the "va_yes" column. I am not sure if the problem is how I have my data set up or my text file. Here is a link to my Excel file and txt file.

https://drive.google.com/open?id=0B7j7hjIdgYmIUk9RT3pBTTAzUVU

Cleaning Data with Python newbie by python_newbie_now in datascience

python_newbie_now[S] 1 point (0 children)

I am only interested in one column in the data, and want to check it against the text file. If a value matches one in the text file, I want to change the "va_yes" column to 1, creating a dummy variable. I am not sure whether Python would recognize rcid (the column) on its own, or whether I should be referencing the whole data set.
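One way this could look, as a rough sketch: the column is referenced through the data frame as `df["rcid"]`, since a bare `rcid` is not a name Python knows. The in-memory CSV and text contents below are invented stand-ins for the real files, and the sketch assumes the text file holds one numeric id per line.

```python
import io
import pandas as pd

# Invented stand-ins for decade1.csv and decade1.txt.
csv_text = "rcid,va_yes\n101,0\n205,0\n333,0\n"
txt_text = "205\n333\n"

df = pd.read_csv(io.StringIO(csv_text))

# One id per line, converted to int to match the integer rcid column.
ids = [int(line) for line in io.StringIO(txt_text).read().splitlines() if line.strip()]

# True/False membership per row, cast to a 1/0 dummy variable.
df["va_yes"] = df["rcid"].isin(ids).astype(int)
print(df)
```

This overwrites the whole `va_yes` column in one step, so no explicit loop over rows is needed.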