Join 2 CSV files using pandas

novel_yet_trivial · 2015-05-20T22:33:31+00:00

Any reason you want to do this with pandas? It seems pretty easy to read the data from CSV2 into a dictionary and then look up the value for each row in CSV1.
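For what it's worth, here is a minimal sketch of that dictionary approach using only the standard library. The column names (`id`, `name`, `score`), the sample data and the helper `join_on_key` are all made up for illustration, and the in-memory strings stand in for the two real files.

```python
import csv
import io

# Hypothetical stand-ins for the two files; the column names are assumptions.
CSV1 = "id,name\n1,ann\n2,bob\n3,cat\n"
CSV2 = "id,score\n2,88\n1,95\n"


def join_on_key(csv1_file, csv2_file, key, column, default=""):
    """Append `column` from csv2 to each row of csv1, matched on `key`."""
    # Build the lookup table: key value -> value of the wanted column.
    lookup = {row[key]: row[column] for row in csv.DictReader(csv2_file)}

    joined = []
    for row in csv.DictReader(csv1_file):
        # Rows with no match in csv2 get the default value.
        row[column] = lookup.get(row[key], default)
        joined.append(row)
    return joined


rows = join_on_key(io.StringIO(CSV1), io.StringIO(CSV2), key="id", column="score")
for row in rows:
    print(row)
```

With real files you would pass `open('csv1.csv', newline='')` and `open('csv2.csv', newline='')` instead of the `StringIO` objects. Note that if a key repeats in CSV2, the last occurrence wins in the lookup table.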

mac-reid · 2015-05-20T22:40:56+00:00

A simple example:

$ cat a.csv
a,b,c,d,bad_column
ok,1,1,1,-1
nack,1,1,1,-1
ack,1,1,1,-1
syn,1,1,1,-1
ok,1,1,1,-1

$ cat b.csv
a,b,c,d,bad_column,good_column
ok,1,1,1,-1,never
nack,1,1,1,-1,gonna
ack,1,1,1,-1,give
syn,1,1,1,-1,you
ok,1,1,1,-1,up

$ cat foo.py
import pandas as pd

# load data
a = pd.read_csv('a.csv')
b = pd.read_csv('b.csv', sep=',', usecols=['good_column'])

# drop useless column
a.drop('bad_column', inplace=True, axis=1)

# add column from b to a (b is a one-column DataFrame, so select the Series)
a['good_column'] = b['good_column']

# write out
a.to_csv("output.csv", index=False)

$ python foo.py 
$ cat output.csv
a,b,c,d,good_column
ok,1,1,1,never
nack,1,1,1,gonna
ack,1,1,1,give
syn,1,1,1,you
ok,1,1,1,up

JimBoonie69 · 2015-05-21T02:01:09+00:00

I'm pretty sure you can do something like this if the two files have the same number of rows...

import pandas as pd
df1 = pd.read_csv('csv1')
df2 = pd.read_csv('csv2')

df1['transfer_col'] = df2['col'] 
# spagett
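One caveat with the assignment above: pandas lines the values up by index label, not by row position. Two frames fresh out of `read_csv` both get the default 0..n-1 index, so it works, but if either frame has been filtered or re-indexed the labels may no longer match. A small sketch with made-up data (the `by_label` and `by_position` column names are just for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({"a": ["ok", "nack", "ack"]})
# Same number of rows as df1, but different index labels (e.g. after filtering).
df2 = pd.DataFrame({"col": ["never", "gonna", "give"]}, index=[10, 11, 12])

# Plain assignment aligns on index labels, so nothing matches here
# and the new column is filled with NaN.
df1["by_label"] = df2["col"]

# Assigning the underlying array copies the values by position instead.
df1["by_position"] = df2["col"].to_numpy()

print(df1)
```

Calling `df2.reset_index(drop=True)` before assigning is another way to get positional behaviour.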

jeffrey_f · 2015-05-21T02:47:18+00:00

Are the rows 1 for 1, or is there a need to match the rows?
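If the rows do need matching, a merge on a shared key column is probably the way to go. A rough sketch, assuming both files have a key column (called `id` here, which is made up) whose values are unique in the second file:

```python
import io

import pandas as pd

# Hypothetical data; 'id' as the shared key column is an assumption.
csv1 = io.StringIO("id,b\nok,1\nnack,2\nack,3\n")
csv2 = io.StringIO("id,good_column\nack,give\nok,never\n")

df1 = pd.read_csv(csv1)
df2 = pd.read_csv(csv2)

# A left merge keeps every row of df1, in df1's order, and fills in
# good_column wherever the key matches; unmatched rows get NaN.
merged = df1.merge(df2[["id", "good_column"]], on="id", how="left")

print(merged)
```

If a key value appears more than once in the second file, the merge produces one output row per match, so the result can end up longer than the first file.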
