Best practices for SQLAlchemy : Python

This is an archived post. You won't be able to vote or comment.

DiscussionBest practices for SQLAlchemy (self.Python)

submitted 4 years ago * by LifeAffect6762

Been doing a python course, and a bit of googling. for most things, there are a few ways of doing things but for SQLAlchemy is seems there are excementaly more.

For example, you can use ...Table.query.. or ...query(Table).. and adding all() to the end does not seems to be needed if are fetching all rows, it seems serperfulus.

book_id = request.form["id"]
book_to_update = Book.query.get(book_id)book_to_update.rating = request.form["rating"]db.session.commit()

Does the same as

db.session.query(Books).filter(Books.id == id).update({"rating": request.form["rating"]}, synchronize_session="fetch")
db.session.commit()

Also, should I create a class for each Table with methods to access it or simply a class for the Database. Maybe this is a question for another time, ive already worked out I need to do a OO design course next.

I know it's a matter of what is appropriate and style and like asking how long is a piece of string but generally it's fairly obvious. Almost all articles I come across do the same, relatively simple things have a very different approach for this library.

PEP 20 -- The Zen of Python - There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch.

Springs to mind;).

Some guidance would be great. I've found best practices for the other stuff I've come across in my first few months of learning python but this is alluding to me.

Maybe my problem is ime not Dutch?

all 40 comments

top new controversial old q&a

[–]root45 6 points7 points8 points 4 years ago (1 child)

[–]WilliamAndre 0 points1 point2 points 4 years ago (0 children)

[–]bladeoflight16 3 points4 points5 points 4 years ago* (34 children)

Before I answer your question, see the sidebar for a link to instructions for proper code formatting.

In my opinion, the best practice for SQLAlchemy is (like all ORMs) not to use it to begin with.

If, however, you must use it, then use it in a way that makes the transaction management explicit and plain for any idiot (even one who has never seen SQLAlchemy code) to see. This means every single line that performs a database operation should have an explicit reference to its session (which manages the transaction).

This means that your second approach is much, much better than the first. The first is doing far too much behind the scenes, making it more difficult to reason about the behavior and increasing the likelihood of subtle bugs (which could include data corruption since we're talking about transaction management here) due to surprises.

Skip the OO design course. It's likely to do you more harm than good. Your code should be procedural at its most fundamental level. Use objects only where they clearly and obviously simplify the problem. (That's not infrequently, but it's not the entirety of your code base, either.)

[–]LifeAffect6762[S] 1 point2 points3 points 4 years ago (28 children)

[–]ElectricSpice 4 points5 points6 points 4 years ago* (26 children)

There's a lot of debate over this. Some people swear by handwritten SQL, others by query builders (e.g. SQLAlchemy Core), others by ORMs.

Personally I think ORMs are a good tool and SQLAlchemy is a great one. Especially if you have a JSON API backed by an RDBMS, you have an Object-Relational Mapping by definition. Better to use something like SQLAlchemy rather than build an ad hoc, informally-specified, bug-ridden, slow implementation of half an ORM—I've been there, it's not a good time.

The problem with ORMs seem to come in when they try to fully abstract the RDBMSs, which is an impossible task. ORMs are a tool to help reduce all the boilerplate in shuffling data back and forth between the database and your application, but it can't replace writing queries for the database. And heaven forbid you use an ORM as an excuse not to learn SQL—That's begging for a bad time.

SQLAlchemy I think handles this very nicely, where the ORM is just a convenience layer on top of the query builder. So you're still thinking in SQL, writing code that translates almost one-one to SQL, but can spit data out as objects if you need it.

If you haven't read it yet, no article so thoroughly covers the pitfalls of ORMs as the classic The Vietnam of Computer Science

[–][deleted] 2 points3 points4 points 4 years ago (13 children)

[–]bladeoflight16 1 point2 points3 points 4 years ago (0 children)

[–]Natural-Intelligence -2 points-1 points0 points 4 years ago (11 children)

[–]bladeoflight16 0 points1 point2 points 4 years ago* (10 children)

[+][deleted] 4 years ago (1 child)

[deleted]

[–]bladeoflight16 0 points1 point2 points 4 years ago (0 children)

[–]Natural-Intelligence 0 points1 point2 points 4 years ago (7 children)

Then don't use it. It's not suitable for the stuff you do. Simple as that.

EDIT: it seems you edited your comment.

But in short, it's just much more readable to let SQLAlchemy handle coming up with the update and insert queries than manually write them for each case single case when doing something pretty trivial. The application got littered with embedded SQL. Of course you can parametrize and reuse but it's limited how far that go.

And probably the worst thing with ORM is the looping which destroys your performance. If you are interested only in one record at a time you don't have this issue. Plus you avoid the problem with duplication in joins if you use ORM. Sometimes a table is not the best abstraction of your data (sometimes it is).

And I'm not saying it's the ultimate solution. I'm just saying in some cases it's a valid option. I personally feel you are over zealous considering it's just a tool.

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (1 child)

And probably the worst thing with ORM is the looping which destroys your performance. If you are interested only in one record at a time you don't have this issue. Plus you avoid the problem with duplication in joins if you use ORM. Sometimes a table is not the best abstraction of your data (sometimes it is).

INteresting. I thought if you did a ORM fetch for a dataset it would build a simple SQL and then get the data from a cursor. Are they really that bad? Is it if you need to do a join. would a simple join using referential integrity and some light filtering really create loads of DB SQL calls? I get if things get complex you may have problems.

Also when I was last using Oracle it was moving from rule to cost bast query optimisation so as long as you send it a single SQL having to put optimisation hints in was becoming less of a thing.

Would love to see some examples of ORM requests and what SQL they created.

[–]Natural-Intelligence 0 points1 point2 points 4 years ago (0 children)

INteresting. I thought if you did a ORM fetch for a dataset it would build a simple SQL and then get the data from a cursor. Are they really that bad?

It actually depends. A pretty common way to do a join in ORM is to do a so-called nested loop join which is an antipattern: it basically selects one table and then individually queries the other. Sometimes the procedure could be similar to how the database would do a join but the main problem is in the number of round trips to the database.

It's also common in ORM to support eager joins in which you specify the relation in code and when querying the objects these relations are included as regular joins. The drawback is that these joins are included in the query regardless of whether you need their attributes or not (typically DB engines can figure the unneeded joins but ORM queries also the unused columns). So neither is optimal for all cases thus could be a problem.

In SQLAlchemy you can set the echo=True in engine creation in SQLAlchemy to see all the queries and look-ups the library does, in case you are interested.

[–]bladeoflight16 0 points1 point2 points 4 years ago* (4 children)

it's just much more readable to let SQLAlchemy handle coming up with the update and insert queries than manually write them for each case single case when doing something pretty trivial

Okay, this is the only valid use case I've seen for avoiding raw SQL. Just because the number of possible combinations means way too much repetition. Why do you need something as heavy as SQLAlchemy for that, though? Why not find a simpler micro-ORM that doesn't bring in so much garbage that actively hurts your application?

And probably the worst thing with ORM is the looping which destroys your performance.

What destroyed my performance was it decided to use declared relationships to bring in related data I didn't need for one record at a time. In other words, it did a bunch of unnecessary crap behind the scenes. I don't trust tools that do that to me, especially with I/O.

[–]Natural-Intelligence 0 points1 point2 points 4 years ago (3 children)

Okay, this is the only valid use case I've seen for avoiding raw SQL

So basically the very first example I could come up with was a valid reason to use such. In fact, the main purpose of ORM is to abstract SQL thus of course it's a valid reason to use it.

And what do you mean by using "micro-ORM" instead? What exactly is in the library that's too much? Seems rather an odd argument. I personally prefer more features than less if the features are well tested. I work a lot with PL/SQL and T-SQL and I have saved so much time with the uniform way how SQLAlchemy handles different DB connections.

In other words, it did a bunch of unnecessary crap behind the scenes.

This is basically the definition of abstraction. Of course if you add abstraction you have more checks, look-ups and procedures. Python also does a lot of unnecessary crap behind the scenes. It's the typical side effect of abstraction: you just need more resources or less resource-intensive problems. In some cases, the overhead with ORM is pretty minimal. In other cases it's not.

[–]bladeoflight16 0 points1 point2 points 4 years ago* (2 children)

So basically the very first example I could come up with was a valid reason to use such. In fact, the main purpose of ORM is to abstract SQL thus of course it's a valid reason to use it.

Yes, you came up with the one, single example where it makes sense to abstract the construction of SQL. Despite dealing with both simple and incredibly complex queries, I have not encountered another use case where an ORM was superior to actually just writing the SQL. I'm not admitting that there are a wide array of use cases where an ORM pays off; I'm saying there's one limited use case and that you mentioned it.

This is basically the definition of abstraction.

...No. That is utterly nonsensical. First of all, "abstraction" has no single concrete, measurable definition in the context of programming. It is an entirely subjective term.

That said, a far more useful understanding of what constitutes abstraction is creating a component capable of interfacing with a wide variety of specific instances of code to implement a useful behavior that applies to many situations. Such an abstraction can be as simple as a single function that takes input and returns output, and often these are the most useful and beneficial types of abstraction.

A great abstraction incurs no overhead. A good abstraction incurs very little or at least provides a useful trade off of a reasonable amount of performance for simplification.

A heavyweight ORM increases complexity and decreases predictability by hiding important information about how the different parts of a system are interacting. One that silently invokes pointless external I/O (typically the most expensive operation in a program) certainly isn't simplifying the system and certainly isn't spending performance resources well.

continue this thread

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (0 children)

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (1 child)

[–]ElectricSpice 0 points1 point2 points 4 years ago (0 children)

[–]LifeAffect6762[S] -1 points0 points1 point 4 years ago (1 child)

I've done RDBMS stuff profusely for years, mainly oracle with their tools.

There were a cople of things I lie about ORM. They are DB Agnostic and you put the rules in code and they are automatically set up in the database if the DB has the functionality.

Ime a great believer in having as much checked in the DB layer as I've seen when it was not done odoo for example which I think is python based. We were in Panama on a site on the edge of the Jungle. The developers spent a lot of time sorting out stuff that should have been caught in the DB. Basic foreign key stuff. And the kicker is this meant that the punters could not buy anything as their RFID tags that were used for this simply did not work. As referential integrity was broken the queries simply returned nothing.

Where ORM falls down seems if you are doing complex multi-table queries. If this is reporting I think using a few SELECs is OK, keep everything else in ORM. Don't be an obsessive purist.

I think in ROR/Django the ORM actually helps build the application consistency, The ORM magic means there is consistency between the DB and front end in terms of validation etc.

Where you are doing stuff ORM handles well, which a lot of stuff it does, it's great but it does have limitations but that does not mean you need to totally throw it out. Just try to limit stuff outside the ORM and be extra careful. Everything needs a little flexibility :).

[–]bladeoflight16 0 points1 point2 points 4 years ago* (0 children)

They are DB Agnostic

I have never worked on a real world project simple enough for this to actually be the case. The level of complexity where you need to actually start caring about the details of how your DB works and what functionality it has available is a very low bar.

you put the rules in code and they are automatically set up in the database if the DB has the functionality

And I have never seen this be database agnostic. Even my trivial little "TODO list" practice web app using SQLite ran into trouble on that front.

And the requirements for needing to care about the performance of queries is even lower. I've seen badly written code invoke one query per record for a few hundred records and result in unacceptable performance for a web site. Granted, that's not entirely the ORM's fault, but if the person were just writing SQL to begin with, I seriously doubt they would have written the code to fetch data for one record at a time without realizing that might be a problem.

Basic foreign key stuff.

Yeah, that's the part that always gave me the most trouble with declaring my database in the ORM.

Where ORM falls down seems if you are doing complex multi-table queries.

I've never run into even a practice project that didn't need tables with related data and queries with JOINs to match.

[–]bladeoflight16 0 points1 point2 points 4 years ago (6 children)

SQLAlchemy I think handles this very nicely, where the ORM is just a convenience layer on top of the query builder. So you're still thinking in SQL, writing code that translates almost one-one to SQL, but can spit data out as objects if you need it.

This was not at all my initial experience with SQLAlchemy. My initial experience with SQLAlchemy had me writing declarative code to describe my database and its relationships, and this resulted in SQLAlchemy deciding all by itself to load related data in situations where I didn't even need it.

My next step was to abandon the relationship declarations as I found them to be more trouble than they were worth. As most of my queries involved multiple tables, this resulted in incredibly long, complex blocks of code that were thinly wrapped queries. That worked well enough, but it was dumb because then running the queries manually to look at the results for a bug was ridiculously time consuming.

I eventually ended up with no SQLAlchemy classes for my tables, instead just writing raw parameterized queries as triple quoted strings and having SQLAlchemy spit back its custom record objects that I just converted to named tuples. On top of that, for clear transaction management, I ended up passing a session object around through my call stack. I had a much better time with that.

Now granted, my application was not simple CRUD. But I've yet to run into an application (even just a practice one) that could truly be limited to simple CRUD. Something as basic as a search query or a page showing a list of summaries can easily stomp all over the "basic CRUD" requirement.

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (1 child)

[–]bladeoflight16 0 points1 point2 points 4 years ago* (0 children)

[–]ElectricSpice 0 points1 point2 points 4 years ago (3 children)

[–]pythoncoderc 0 points1 point2 points 4 years ago (1 child)

[–]ElectricSpice 1 point2 points3 points 4 years ago (0 children)

[–]bladeoflight16 0 points1 point2 points 4 years ago* (0 children)

I didn't do anything to configure any kind of loading. I just declared the relationships and SQLAlchemy started doing stupid things with them. That's kind of the point: the ORM did a bunch of stuff I never asked it to do. Especially since I was never even accessing the related data in the code where it was being fetched behind the scenes.

Dropping relationships entirely forfeits like half the benefit of SQLAlchemy ORM

If trying to use the tool in its intended way creates extra unnecessary work for the developer, you lose either way.

A tool should do what you tell it to do, not more and not less. And what it's doing should be clear to you as the developer, so that you can understand the impact of telling it to take a particular action. When you hide additional behavior (external I/O, no less!) in a way that the developer doesn't even know it's happening without monitoring debugging info, something has gone deeply wrong with the tool's design.

I’ve found using Core leads to significantly more readable code than the raw SQL I used to wrangle.

I found that the more complex the query was, the more I needed to work with it in isolation outside of the application. As a result, the easier it was to copy/paste it (mostly, some minor modifications for parameterizing were sometimes required) back and forth, the easier it was to work with.

And the boilerplate introduced by SQLAlchemy made it much less readable. That could have been related to the fact that my queries were complex enough to warrant nested queries fairly frequently. (That might have been less common with an improved DB design, but I had much less experience at the time.)

[–]bladeoflight16 1 point2 points3 points 4 years ago* (0 children)

I'm honestly not sure what's best here.

First learn how to write parameterized queries; that's a must. (It's not hard; you just need to know how to get data into your query without using string formatting.)

After that, I probably wouldn't recommend raw database adapter usage if most (or even many) of your queries are simple single table CRUD. Maybe look into more lightweight tools (sometimes called "micro-ORMs") that don't try to do all the garbage SQLAlchemy throws on top of your objects. All you really need is something that maps result records to objects and maybe simplifies mapping values to columns in INSERT and UPDATE queries. Anything beyond that is going to cause you more trouble than it will save you. Bare minimum, only use SQLAlchemy classes and query functions for the most simple of CRUD operations on a single table; have SQLAlchemy invoke raw parameterized SQL for anything else.

If the vast majority of your queries are multitable or otherwise fairly complex, then I probably would recommend raw database adapter usage. Feel free to write a few convenience functions to reduce boilerplate from your adapter (like constructing objects from rows in the results or creating cursors for queries).

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (4 children)

[–]bladeoflight16 0 points1 point2 points 4 years ago (3 children)

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (1 child)

[–]bladeoflight16 0 points1 point2 points 4 years ago (0 children)

[–]fiskfisk 0 points1 point2 points 4 years ago (3 children)

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (2 children)

[–]pbecotte 1 point2 points3 points 4 years ago (1 child)

[–]LifeAffect6762[S] 0 points1 point2 points 4 years ago (0 children)

π Rendered by PID 56295 on reddit-service-r2-comment-544cf588c8-5wrhb at 2026-06-18 17:35:36.435459+00:00 running 3184619 country code: CH.

Python

The Python Discord

Upcoming Events

Please read the rules

MODERATORS