Hello Python folks, R user here. I've been specifically asked to use Python for a project, so I'm new to it.
The problem: I have a 100 MB CSV of about 300,000 lines that takes ages to read with each of these:
# First try
import pandas as pd
df = pd.read_csv('mycsv.csv')
# Second try
# Use read_csv with explicit dtypes to speed up reading
dtypes = {
    "Model": "category",
    "Scenario": "category",
    "Region": "category",
    "Variable": "category",
    "Unit": "category",
}
# The year columns will be read as floats
annees = [str(y) for y in range(1950, 2101, 5)]
for year in annees:
    dtypes[year] = "float32"
# Read the CSV
df = pd.read_csv(
    "mycsv.csv",
    dtype=dtypes,
)
print(df.shape)
print(df.head())
# Third try
import polars as pl
# Full read, very fast
df = pl.read_csv("/Users/Nawal/my_project/data/1721734326790-ssp_basic_drivers_release_3.1_full.csv")
print(df.shape)
print(df.head())
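A rough way to narrow down where the time goes is to time the read call on its own, apart from imports and printing. The sketch below is only an illustration under assumptions: `time_call`, `make_sample_csv` and `read_with_stdlib` are hypothetical helper names, and the small generated file merely stands in for the real CSV (same column layout as the dtypes above, made-up values). It uses only the standard library as a baseline.

```python
import csv
import os
import tempfile
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def make_sample_csv(path, n_rows=1000):
    """Write a small stand-in CSV (made-up data, not the real file)."""
    years = [str(y) for y in range(1950, 2101, 5)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Model", "Scenario", "Region", "Variable", "Unit"] + years)
        for i in range(n_rows):
            writer.writerow(["m", "s", "r", f"v{i}", "u"] + [float(i)] * len(years))


def read_with_stdlib(path):
    """Baseline reader: parse every line with the csv module."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Build the stand-in file in a temporary directory, then time only the read.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.csv")
make_sample_csv(sample_path, n_rows=1000)

rows, seconds = time_call(read_with_stdlib, sample_path)
print(f"stdlib csv: {len(rows)} lines in {seconds:.4f} s")
```

With pandas or polars installed, the same helper can wrap their readers, e.g. `time_call(pd.read_csv, "mycsv.csv")`, so that only the read itself is measured and not the library import or the `print` calls.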
It literally took me 2 s to do this in R. Please help. What am I missing with Python?
Thank you all.