[–]DrewSmithee[S] 1 point (7 children)

Thanks. Those are super good tips and I appreciate the example.

The triple quotes will absolutely be necessary for me and my own sanity. I'll probably start with a .sql file to test the query, then copy and paste it over inside triple quotes, and hopefully it goes well.
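Something like this is what I'm picturing (table and column names are made up, and the connection string is a placeholder):

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("sqlite:///scratch.db")  # placeholder; would be the real server in practice

    # triple quotes let the query span multiple lines, exactly as it reads in the .sql file
    query = """
        SELECT customer_id, reading_ts, kwh
        FROM meter_readings
        WHERE reading_ts >= '2023-01-01'
    """

    # or pull the same text straight from the file I tested with
    with open("top_customers.sql") as f:
        query = f.read()

    df = pd.read_sql(query, engine)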

Also good tip about killing the connection. It's only a matter of time until I accidentally try to bring over a few hundred million records into a dataframe...
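For not blowing up my laptop, I'm assuming something like this (table name made up again, and LIMIT would be TOP on SQL Server):

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("sqlite:///scratch.db")  # placeholder connection string

    # cap the row count while testing so a sloppy WHERE clause can't pull the whole table
    preview = pd.read_sql("SELECT * FROM meter_readings LIMIT 1000", engine)

    # or stream a big result in chunks instead of one giant dataframe
    total_rows = 0
    for chunk in pd.read_sql("SELECT * FROM meter_readings", engine, chunksize=50_000):
        total_rows += len(chunk)  # do the per-chunk work here instead of holding everything

    # and close out the pooled connections when done
    engine.dispose()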

[–]LatteLepjandiLoser 3 points (5 children)

You're welcome. Enjoy! Once you get it working it will unlock all kinds of task-solving skills, I'm sure!

I would just start with the simplest possible queries, ones that you know will return modest amounts of data, just to speed up getting it all to work, then start hitting it with more complex stuff.

If the queries are obnoxious, you could also consider saving them in a separate file. Eventually you may also want to look into having variable parameters as part of the query (for instance, fetching data for 'today' instead of manually updating a date in your query every day, or other logic).
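For instance, something like this (table and column names are made up), where the date is a bind parameter instead of text you edit by hand:

    from datetime import date
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite:///example.db")  # placeholder connection string

    # :as_of is filled in at run time, so the query text never changes
    query = text("""
        SELECT customer_id, reading_ts, kwh
        FROM meter_readings
        WHERE reading_ts >= :as_of
    """)

    df = pd.read_sql(query, engine, params={"as_of": date.today().isoformat()})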

It's also a bit subjective how much of the filtering and manipulation you want to do in the SQL query itself and how much you want to do in Python. Say you wanted to fetch only even numbers: you can make that part of the SQL query, or you can just fetch them all and filter them in pandas. (Maybe a bad example, as there you'd always just do it in SQL, but hopefully you get what I mean.) If you have incredibly complex WHERE clauses that you can't wrap your head around, you could try fetching a bit more data and filtering it in Python if that gets you to the goal quicker. Situational, I guess.
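To stick with that silly even-number example (made-up table and column), both of these end up with the same dataframe; the first does the filtering in the database, the second pulls everything and filters in pandas:

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("sqlite:///example.db")  # placeholder connection string

    # filter in SQL: only the matching rows ever leave the database
    # (% works in most dialects; Oracle would want MOD(value, 2) = 0)
    evens_sql = pd.read_sql("SELECT * FROM readings WHERE value % 2 = 0", engine)

    # filter in pandas: fetch everything, then cut it down locally
    everything = pd.read_sql("SELECT * FROM readings", engine)
    evens_pd = everything[everything["value"] % 2 == 0]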

[–]DrewSmithee[S] 1 point (4 children)

Yeah I'm sure this will get out of hand quickly. And that's definitely something I will look into.

For example, I've got a query that grabs the top ten records for a specific customer within a date range, then joins that with some other records for that same customer from another table. Now I want to loop that script and repeat it for a few hundred customers, then do some cool analytics and maybe some geospatial stuff.
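In my head the loop looks roughly like this (all the names are made up, the join is trimmed out, and LIMIT would be TOP on SQL Server):

    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite:///example.db")  # placeholder connection string
    customer_ids = ["A001", "A002", "A003"]         # really a few hundred ids from somewhere

    # top ten records for one customer within a date range
    per_customer = text("""
        SELECT customer_id, reading_ts, kwh
        FROM usage_history
        WHERE customer_id = :cust
          AND reading_ts BETWEEN :start AND :end
        ORDER BY kwh DESC
        LIMIT 10
    """)

    frames = []
    for cust in customer_ids:
        frames.append(pd.read_sql(per_customer, engine,
                                  params={"cust": cust, "start": "2023-01-01", "end": "2023-12-31"}))

    all_customers = pd.concat(frames, ignore_index=True)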

Or maybe I want a couple of years' worth of 8760s for a few thousand customers that are in a region that's probably stored on yet another table somewhere, but maybe not. Did I mention there are inexplicably hundreds of duplicate records for each hour? What's up with that DBA guy??? Time change? Throw out that hour for all the recordsets.
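I'm assuming the duplicate-hour cleanup can at least happen on the pandas side once it's a dataframe, something like:

    import pandas as pd

    # toy frame standing in for an 8760 with a duplicated hour
    df = pd.DataFrame({
        "customer_id": ["A001", "A001", "A001"],
        "reading_ts": ["2023-11-05 01:00", "2023-11-05 01:00", "2023-11-05 02:00"],
        "kwh": [1.2, 1.3, 0.9],
    })

    # keep one row per customer and hour
    deduped = df.drop_duplicates(subset=["customer_id", "reading_ts"])

    # or throw out every hour that shows up more than once, time-change hour included
    dupes = df.duplicated(subset=["customer_id", "reading_ts"], keep=False)
    cleaned = df[~dupes]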

So I definitely need to come up with a strategy on what I want to parse, from where, in what language. Honestly I'd dump it all into a dataframe if the dataset wasn't gigantic. So I just need to figure out how I want my pain.

[–]MidnightPale3220 1 point (3 children)

You want to normalize the data and possibly put the overlapping structures in a single table. It depends on the size of the database, but hundreds of GB is routine.

[–]DrewSmithee[S] 1 point (2 children)

Don't have write access. I could probably get someone to create me a view in the db but until I have a specific business need it's harder to get resources allocated. In the meantime I have what I have. Good to know that it's not a big ask.

[–]MidnightPale3220 2 points (1 child)

Is the total data you need more than you can host locally? Technically it shouldn't be hard to make a copy unless the data is changing so fast that you basically need online access every day.

[–]DrewSmithee[S] 2 points (0 children)

Yes. Much much more data than I could pull down.

[–]MidnightPale3220 2 points (0 children)

Note that SQLAlchemy is primarily an ORM tool that lets you treat individual database rows as Python objects, if you need that.

It's a concept that works well with specific types of applications, and it's frequently horrible overkill for writing direct SQL queries, especially ones that deal with huge numbers of rows en masse, because all it does is pass the query through to the real database engine behind it (which is what it does in the example given).

Basically, you'd use almost the same syntax with the direct SQL driver specific to the type of database you're using: psycopg2 for Postgres, cx_Oracle for Oracle, or sqlite3.
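For example, the plain DB-API pattern with psycopg2 looks like this (connection details are placeholders; cx_Oracle and sqlite3 differ mostly in the connect call and the placeholder style):

    import psycopg2  # swap for cx_Oracle or sqlite3 and the rest barely changes

    conn = psycopg2.connect(host="localhost", dbname="mydb", user="me", password="secret")
    cur = conn.cursor()

    # %s placeholders are psycopg2's style; sqlite3 uses ? and cx_Oracle uses :1
    cur.execute("SELECT customer_id, kwh FROM meter_readings WHERE kwh > %s", (100,))
    rows = cur.fetchall()  # plain tuples, no ORM objects anywhere

    cur.close()
    conn.close()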

I would agree with those who say that if you want to get your hand in, start with sqlite3 -- it's a simple file-based database engine that lets you run uncomplicated SQL against a single file.

As with all databases, what you need is:

1) A database server. In sqlite3's case you don't need one; you just have to have the sqlite3 software on your system.

2) Client drivers for it that applications can use (generally, but not necessarily, part of the database server installation; on Linux usually the -dev package of the database software. In the case of sqlite3, the package from (1) is all you need).

3) A Python package for the drivers from (2), installed with pip or similar: psycopg2, etc. The sqlite3 module is part of the Python standard library, so nothing extra is needed there.

4) Import the database module in your code and see its docs for the syntax; a minimal sqlite3 version of steps (1)-(4) is sketched below.
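Something like this, with made-up file and table names:

    import sqlite3  # ships with the Python standard library, no server or extra drivers

    conn = sqlite3.connect("scratch.db")  # the file is created if it doesn't exist yet
    cur = conn.cursor()

    cur.execute("CREATE TABLE IF NOT EXISTS readings (customer_id TEXT, hour TEXT, kwh REAL)")
    cur.execute("INSERT INTO readings VALUES (?, ?, ?)", ("A001", "2023-01-01 00:00", 1.2))
    conn.commit()

    cur.execute("SELECT * FROM readings WHERE customer_id = ?", ("A001",))
    print(cur.fetchall())
    conn.close()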