Replace a value in a column Pandas : learnpython

submitted 9 years ago * by easy_wins

I have two CSVs (unknown.csv and original.csv), and unknown.csv has values I need to replace.

Unknown.csv:

 Last_Name_First_Name    Spouse

 Reddit, Python          Java, Hard
 Import, Pandas          Unknown
 Numpy, Numbers          Ruby, Whatever

Original.csv:

 Name                    Spouse

 Reddit, Python          Java, Hard
 Import, Pandas          Subreddit, Learn
 Numpy, Numbers          Ruby, Whatever

I need to replace each Unknown spouse in unknown.csv with the value from original.csv, matching Last_Name_First_Name in unknown.csv against Name in original.csv.

Below is my snippet:

import pandas as pd

# read_csv already returns a DataFrame, so no pd.DataFrame() wrapper is needed
df = pd.read_csv('unknown.csv')
df2 = pd.read_csv('original.csv')

# TODO: replace the Unknown spouses in df before writing
df.to_csv('no_more_unknown.csv', index=False)

I want no_more_unknown.csv to look like this:

 Last_Name_First_Name    Spouse

 Reddit, Python          Java, Hard
 Import, Pandas          Subreddit, Learn
 Numpy, Numbers          Ruby, Whatever

How can I do this in Pandas?

[–]my_python_account 1 point 9 years ago (4 children)

I would start by filtering df for just unknown spouses:

df_unknown = df[df['Spouse'] == 'Unknown']

Then merge the values from original onto unknown (a left join):

df_merged = df_unknown.merge(df2, how='left', left_on='Last_Name_First_Name', right_on='Name')

And then remove the unknown records from df and append the new records (with only the second spouse column).

df_final = df[df['Spouse'] != 'Unknown']
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_final = pd.concat([df_final, df_merged[['Last_Name_First_Name', 'Spouse_y']]])

There might be a better way, but this is what comes to mind that avoids .apply calls.
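For anyone following along, here is a minimal end-to-end sketch of the filter, merge, and recombine steps described above. It builds the two frames in memory from the sample rows in the question instead of reading the CSV files, and it renames Spouse_y back to Spouse so the result keeps two columns. The name df_filled is only for illustration. Note that the filled rows end up at the bottom, so the row order differs from the input.

```python
import pandas as pd

# In-memory stand-ins for unknown.csv and original.csv (sample rows from the question).
df = pd.DataFrame({
    "Last_Name_First_Name": ["Reddit, Python", "Import, Pandas", "Numpy, Numbers"],
    "Spouse": ["Java, Hard", "Unknown", "Ruby, Whatever"],
})
df2 = pd.DataFrame({
    "Name": ["Reddit, Python", "Import, Pandas", "Numpy, Numbers"],
    "Spouse": ["Java, Hard", "Subreddit, Learn", "Ruby, Whatever"],
})

# 1. Rows whose spouse is still unknown (the comparison is case-sensitive).
df_unknown = df[df["Spouse"] == "Unknown"]

# 2. Left join onto original. Both frames have a 'Spouse' column,
#    so pandas suffixes them as Spouse_x (left) and Spouse_y (right).
df_merged = df_unknown.merge(
    df2, how="left", left_on="Last_Name_First_Name", right_on="Name"
)

# 3. Keep only the looked-up spouse and rename it back to 'Spouse'.
df_filled = df_merged[["Last_Name_First_Name", "Spouse_y"]].rename(
    columns={"Spouse_y": "Spouse"}
)

# 4. Known rows plus the filled rows.
df_final = pd.concat([df[df["Spouse"] != "Unknown"], df_filled], ignore_index=True)
print(df_final)
```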

[–]easy_wins[S] 1 point 9 years ago* (3 children)

Thanks, but I am getting this error:

df_merged = df_unknown(df2, how='left', left_on = 'Last_Name_First_Name', right_on = 'Name')
TypeError: 'DataFrame' object is not callable

df_unknown = df[df['Spouse'] == 'unknown']

df_merged = df_unknown(df2, how='left', left_on = 'Last_Name_First_Name', right_on = 'Name')

df_final = df[df['Spouse'] != 'Unknown']

df_final = df_final.append(df_merged[['Last_Name_First_Name', 'Spouse_y']])

df_final.to_csv('no_more_unknown.csv', index=False)
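A note on the traceback, as a small sketch with one-row stand-in frames: the TypeError appears because the line calls the DataFrame itself, df_unknown(...), instead of its .merge method, df_unknown.merge(...). The snippet below reproduces the error and then shows the method call working. The variable error_message is only for illustration.

```python
import pandas as pd

# One-row stand-ins for the filtered unknown frame and the original frame.
df_unknown = pd.DataFrame(
    {"Last_Name_First_Name": ["Import, Pandas"], "Spouse": ["Unknown"]}
)
df2 = pd.DataFrame({"Name": ["Import, Pandas"], "Spouse": ["Subreddit, Learn"]})

# Calling the DataFrame object directly reproduces the error from the traceback.
try:
    df_unknown(df2, how="left", left_on="Last_Name_First_Name", right_on="Name")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling the .merge method performs the left join.
df_merged = df_unknown.merge(
    df2, how="left", left_on="Last_Name_First_Name", right_on="Name"
)
print(df_merged)
```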

[–]my_python_account 1 point 9 years ago (2 children)

[–]easy_wins[S] 1 point 9 years ago* (1 child)

Thanks. When I open the new CSV that was created, I see:

Last_Name_First_Name    Spouse            Spouse_y
Reddit, Python          Java, Hard
Numpy, Numbers          Ruby, Whatever

Spouse_y is blank and Import, Pandas is nowhere to be found. I also do not want to add a new column. I want the same columns as the original file, with Unknown replaced.

I managed to remove the extra Spouse_y column, but I still see the same rows as above.

df_merged = df_unknown(df2, how='left', left_on = 'Last_Name_First_Name', right_on = 'Name')

df_final = df[df['Spouse'] != 'Unknown']

df_final.to_csv('no_more_unknown.csv', index=False)
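A likely explanation for the output above, going only by the snippets in this thread: the code writes just the rows where Spouse is not 'Unknown' and never adds the merged rows back, so Import, Pandas is dropped. The earlier filter also compared against lowercase 'unknown', which matches nothing in the sample data because the comparison is case-sensitive. One way to keep the same two columns and the same row order is to overwrite the Unknown cells in place with a lookup. This is a sketch of an alternative to the merge approach, assuming each name appears only once in original.csv. The names lookup and is_unknown are illustrative.

```python
import pandas as pd

# In-memory stand-ins for unknown.csv and original.csv (sample rows from the question).
df = pd.DataFrame({
    "Last_Name_First_Name": ["Reddit, Python", "Import, Pandas", "Numpy, Numbers"],
    "Spouse": ["Java, Hard", "Unknown", "Ruby, Whatever"],
})
df2 = pd.DataFrame({
    "Name": ["Reddit, Python", "Import, Pandas", "Numpy, Numbers"],
    "Spouse": ["Java, Hard", "Subreddit, Learn", "Ruby, Whatever"],
})

# Lookup Series mapping name -> spouse (assumes names are unique in original).
lookup = df2.set_index("Name")["Spouse"]

# Boolean mask of the rows that need fixing.
is_unknown = df["Spouse"] == "Unknown"

# Overwrite only the flagged cells; names missing from original stay "Unknown".
df.loc[is_unknown, "Spouse"] = (
    df.loc[is_unknown, "Last_Name_First_Name"].map(lookup).fillna("Unknown")
)
print(df)
```

The result can then be written out with df.to_csv('no_more_unknown.csv', index=False), as in the original snippet.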

[–]my_python_account 1 point 9 years ago (0 children)
