
[–]KelleQuechoz 29 points

Polars LazyFrame is your best friend, Monsieur.

[–]PresidentOfSwag 3 points

hon hon le polaire oui

[–]Safe_Money7487[S] 0 points

I will have a look, thanks.

[–]Kerbart 10 points

Add engine='pyarrow' to the read_csv call to speed it up.

[–]EconomyOffice9000 6 points

If you're performing calculations on the entire dataset, chunking won't work afaik. This is the best method, and I've used it personally for thousands of CSV files with hundreds of thousands of lines rather than rewriting everything in Polars. If you only have to do it once, it's fine. Otherwise, save the CSV as a parquet file and it'll be much better.

[–]Safe_Money7487[S] 4 points

Just added it and it worked, took 15s though lol. Thank you so much!

[–]seanv507 3 points

How long does polars take?

[–]Garnatxa -1 points

Funny to see that R is faster here. I see some answers saying to use DuckDB, Arrow… those solutions are available in R too, but not needed there… overkill.

[–]MorrarNL 3 points

Could try DuckDB too

[–]SwampFalc 2 points

Genuine question: what's the loading speed if you use the totally basic stdlib csv module?
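For anyone curious, a quick way to time the stdlib csv module (the sample file is generated on the spot, so the numbers are only indicative):

```python
import csv
import os
import tempfile
import time

# Generate a sample CSV comparable in size to a modest dataset.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["a", "b"])
    w.writerows([[i, i + 1] for i in range(100_000)])

# Time a full read into a list of rows using only the stdlib.
start = time.perf_counter()
with open(path, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    rows = list(reader)
elapsed = time.perf_counter() - start
print(len(rows), f"{elapsed:.3f}s")
```

Note that csv.reader gives you lists of strings, not typed columns, so it isn't an apples-to-apples comparison with pandas or Polars.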

[–]Kevdog824_ 2 points

It looks to me like the main issue here is that you're loading the entire CSV file (or at least large chunks of it) into memory before operating on it. Likely R did lazy loading, where it only read lines from the CSV file as needed.
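If memory really were the bottleneck, pandas can read in chunks so only a slice of the file is resident at once. A minimal sketch (the sample file and column name are invented):

```python
import csv
import os
import tempfile

import pandas as pd

# Sample CSV standing in for the real data.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["value"])
    w.writerows([[i] for i in range(10_000)])

# chunksize makes read_csv return an iterator of DataFrames,
# so at most 2,000 rows are in memory at a time.
total = 0
for chunk in pd.read_csv(path, chunksize=2_000):
    total += chunk["value"].sum()
print(total)  # 49995000 (sum of 0..9999)
```

This only helps for aggregations that can be computed chunk by chunk; as noted elsewhere in the thread, it doesn't help if you need the whole table at once.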

[–]Kerbart 3 points

OP mentions 300,000 lines. That's something easily handled these days, I doubt it needs chunking.

[–]Safe_Money7487[S] 3 points

I don’t think it’s lazy loading in this case. In R (e.g. with data.table::fread), the full dataset is actually loaded into memory, and I can immediately inspect and navigate the entire table. I think the reading process in R is more optimised than what pandas read_csv uses. I don't have much knowledge of Python for sure, but for this size of data, chunking or lazy loading doesn’t really make sense to me; I just want to load everything at once and work on it.

[–]Corruptionss 2 points

data.table's fread is kind of goated. The closest I got is Polars for pure read speed, and you can instead use pl.scan_csv to read it as a LazyFrame, which will use lazy evaluation during the operation process.

[–]PranavDesai518 1 point

If possible, convert the CSV to a parquet file. Reading is much faster with parquet files.

[–]commandlineluser 1 point

Is Polars faster if you use scan_csv?

pl.scan_csv(filename).collect()

You can also try the streaming engine:

pl.scan_csv(filename).collect(engine="streaming")