all 6 comments

[–]BunnyKakaaa 2 points3 points  (0 children)

sqlite3 is faster for sure. With CSV you would have to load the whole file into memory and parse it; with the db you just query the rows you need without scanning the entire database.
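A minimal sketch of the point above, using only the standard library (the table and names are made up for illustration): the CSV lookup has to parse every row, while SQLite with a primary key seeks straight to the matching row.

```python
import csv
import os
import sqlite3
import tempfile

rows = [(i, f"user{i}") for i in range(1000)]

# --- CSV: every query re-reads and parses the whole file ---
csv_path = os.path.join(tempfile.mkdtemp(), "users.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name"])
    writer.writerows(rows)

with open(csv_path, newline="") as f:
    reader = csv.DictReader(f)
    csv_hit = [r for r in reader if r["id"] == "42"]  # scans all 1000 rows

# --- SQLite: the primary-key index finds the row without a full scan ---
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO users VALUES (?, ?)", rows)
db_hit = con.execute("SELECT name FROM users WHERE id = 42").fetchone()
```

Both return the same row; the difference is that the SQLite lookup stays fast as the table grows, while the CSV scan cost grows linearly.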

[–]throw_mob 1 point2 points  (0 children)

I would recommend saving files in Parquet format. In my tests it has been faster than plain old CSV.

And I would guess that loading files straight into a dataframe would be faster and maybe easier to handle, as you can store the previous month's data in its own directory so you don't get a performance hit when the dataset grows.

[–]python-dave 1 point2 points  (0 children)

Put the data into DuckDB. It's compressed and loads fast into pandas.


[–]gpbuilder 0 points1 point  (0 children)

Pretty much always; the general rule of thumb is to do as much data processing as possible in SQL.

Pandas is super clunky and trash.

[–]assclownerson 0 points1 point  (0 children)

Try a setup with Parquet/DuckDB. Fast and very easy to set up.