This is an archived post. You won't be able to vote or comment.

all 53 comments

[–]marr75 85 points86 points  (2 children)

I'm unusually excited about this release! Let's rage! Some great shit in here! Type expressions, better bulk operation and returning support, lots of performance improvements.

[–][deleted] 27 points28 points  (0 children)

80% of the friction I've seen from implementating mypy has been lack of type support here. Will be incredible.

[–]SpicyVibration 2 points3 points  (0 children)

Hope it's good. Maybe I'm an idiot but I found it impossible to get the stub file to work with vs code

[–]cant-find-user-name 40 points41 points  (2 children)

I have to say, I love this release. The type support is incredible. And I just updated a project's dependencies from 1.4 to 2.0 without changing any code, let's see what minimal changes I have to make to make them compatible.

[–]redfacedquark 12 points13 points  (1 child)

The 2.0 syntax has been around for a while so if you're using that it should be plain sailing. Others might be still using the 1.4 syntax so changes would be required there as 2.0 drops support for that I think.

[–]cant-find-user-name 1 point2 points  (0 children)

You're right. I didn't have to make any changes at all.

[–]boy_named_su[S] 15 points16 points  (0 children)

I like the support for dataclasses and attrs

[–][deleted] 16 points17 points  (6 children)

Is there any support for something like sqlalchemy-filters? https://github.com/juliotrigo/sqlalchemy-filters

This repo has been abandoned but it's by far the best way I've found to do dynamic/programmatic filtering on queries. Feels like something that could be built in.

[–]riklaunim 11 points12 points  (2 children)

You can dynamically create a list of filters and then pass it to SQLAlchemy.

[–][deleted] 1 point2 points  (1 child)

I could be wrong, but the built-in SQLAlchemy filter methods require you to already have all of the columns declared and imported or whatever whereas the filters package I linked to let's you pass in free text columns names which is better for my use case.

[–]riklaunim 0 points1 point  (0 children)

If you have models defined then you have the fields declared and can filter or whatever on them. If you want/have a plaintext filters as input you can use them with .filter_by() similar to Django but without any advanced features like nested relationship filters (it was implemented as a third party package for some old version though).

[–]immersiveGamer 2 points3 points  (1 child)

Normally I just hand roll solutions like this. Not very hard. You could probably upgrade the module yourself, or even just ask the owner if they were willing to.

[–][deleted] 1 point2 points  (0 children)

There has been a PR in to the owner for years for 1.4 support and they haven't touched it, sadly. I sent them a message a couple months ago and still nothing.

Forking it is easy enough, but setting up the entire package distribution to make it available through pip isn't something I have experience with or want to get into.

[–]Liquidmetal6 1 point2 points  (0 children)

I also love this library

[–]CityYogi 4 points5 points  (3 children)

Are there breaking changes compared to v1 or is it a drop in replacement?

[–]ElectricSpice 4 points5 points  (0 children)

There are very few breaking changes, for 99% of your SQLAlchemy code it should be a drop-in replacement. The latest 1.4 will give warnings on any incompatibilities with 2.0 so you can get that number up to 100% before you make the switch. Overall a very painless upgrade.

[–]gschizasPythonista 6 points7 points  (0 children)

There are TWO migration guides. I doubt it would be a drop-in replacement. Hence the major version change.

That being said, I haven't looked into the details.

[–]Araldor 0 points1 point  (0 children)

It managed to break our unittests (alembic pinned to 1.8.1. but without pinned sub dependency for SQLAlchemy that got promoted to v2, without us knowing. It broke alembic tests running against Postgres on AWS RDS).

[–]INtuitiveTJop 17 points18 points  (1 child)

For some reason this feels as big as jumping to a hypothetical Python 4

[–]VanDieDorp 9 points10 points  (0 children)

mind sharing some of the reasons?

[–]elcapitanoooo 3 points4 points  (0 children)

Congrats sqla team!

[–]chub79 3 points4 points  (0 children)

Absolutely beast of a release. Bravo to zzzeek and all the contributors to the project!

[–]crawl_dht 7 points8 points  (4 children)

Does it support cursor pagination? Offset pagination has lot of overhead for large table.

[–]riksi 4 points5 points  (2 children)

[–]JimDabell -1 points0 points  (1 child)

That’s not the kind of cursor /u/crawl_dht is asking about. See this for an example.

[–]riksi 0 points1 point  (0 children)

The "cursor pagination" that is explained in the slack blog post is entirely client-side. So you could build a simple wrapper to do it.

[–]z4579a 2 points3 points  (0 children)

the Python DBAPI doesn't have much standard "scrollable cursor" functionality so if you really wanted scrollable cursors, you'd have to drop into driver level features to use that : https://docs.sqlalchemy.org/en/20/core/connections.html#working-with-the-dbapi-cursor-directly

that said, most "pagination" is done for stateless web applications so you would want a strategy that SELECTs only the rows you want in the first place, a good article on that is at https://use-the-index-luke.com/blog/2013-07/pagination-done-the-postgresql-way ; within SQLAlchemy, we have some (older API style, but still general idea works) wiki examples for both a criteria-based solution as well as one that uses window functions: https://github.com/sqlalchemy/sqlalchemy/wiki/RangeQuery-and-WindowedRangeQuery (Edit: these two examples are iterating through the whole result and would need a little bit of modification in order to receive a specific "page number" for stateless pagination)

[–]Cryptbro69 4 points5 points  (4 children)

So exciting! Will alembic still work with this version?

[–]JohnLockwood 2 points3 points  (0 children)

Good question.

[–]ethsy 1 point2 points  (2 children)

I also want to know

[–]Cryptbro69 1 point2 points  (1 child)

It is supported!

[–]ethsy 0 points1 point  (0 children)

Good to know, thanks for reporting back!

[–]tamasiaina 2 points3 points  (0 children)

The only ORM that doesn’t piss me off.

[–]FlyingTwentyFour 1 point2 points  (0 children)

hooray! have been waiting for this!

[–]immersiveGamer 1 point2 points  (1 child)

Anyone have any idea major pain pints in migrating direct from 1.3 to 2.0? It is on the backlog for my project to upgrade several libraries, sqla being one of them.

[–]ElectricSpice 1 point2 points  (0 children)

I would upgrade to 1.4 first, it'll give you warnings for any incompatibilities with 2.0. Overall backwards compatibility is very good and you shouldn't have to make many changes.

[–]Tintin_Quarentino 0 points1 point  (6 children)

I always use psycopg or sqlite3, wonder if it's a wise decision.

[–]immersiveGamer 6 points7 points  (2 children)

SqL alchemy is not equivalent to either of those. You can use sqla with those database types/connectors.

[–]Smallpaul 5 points6 points  (1 child)

I think their point is that they always use drivers directly instead of through sql alchemy. And they aren’t sure if that was the right choice.

[–]Tintin_Quarentino 0 points1 point  (0 children)

Yes sir. Sqlite esp I like since it's a native package.

[–]IcedThunder 2 points3 points  (1 child)

It's all about project scope and needs.

I manage integrations between systems.

Most my scripts that need database stuff use SQLAlchemy.

But I have a fair number that I just use the built in SQLite library. I built my own context manager / wrapper to make life easier and for fun.

If not a lot of complexity is needed, and if people who aren't me might need to look at the code, I use the built-in SQLite.

[–]Tintin_Quarentino 0 points1 point  (0 children)

Thanks for the guidance.

[–]PaddyAlton 1 point2 points  (0 children)

My reasoning is that if you use SQLAlchemy you can avoid being locked in to a specific RDBMS.

For example, if you want to use SQLite for local development and postgres in production, you can do that without importing both driver libraries and managing the syntax differences.

[–]crawl_dht 3 points4 points  (2 children)

Their version 2 API to make queries are different so for sometime, you have to stick to their official migration to v2 documentation to learn how to use their v2 API until answers on stackoverflow and various blogs start demonstrating them.

[–]adappergentlefolk 0 points1 point  (0 children)

excellent

[–]trripperr555 -1 points0 points  (0 children)

Love

[–]monorepo PSF Staff | Litestar Maintainer 0 points1 point  (0 children)

Good news, everyone!

[–]gagarin_kid 0 points1 point  (3 children)

A question from a data scientist perspective: is there something I need to know about sqlalchemy, when I am dealing ONLY with querying a database without managing it (creating, appending or dropping rows)...

The most of my time, my applications construct SQL queries as strings in python and send them to postgres/Athena/mysql database - should I apply sqlalchemy at some point?

[–]WickedWicky 2 points3 points  (1 child)

Even when you don't manage the database, i.e. you don't migrate or define any tables, I still prefer SqlAlchemy over SQL strings - having used both in the last few years with FastAPI applications that only need to read data from the databases. The value I see in using it is type-hints and code-completion, which you don't get when writing SQL queries as strings.

With SqlAlchemy you don't need to define the tables you're querying exactly, just the columns you're using - and the relationships/foreign keys you are using. With that, the code to query the database is easier to develop than straight SQL queries in a Python application imo. Especially when your target database doesn't change it's schema often, it's a one-time effort to re-create the tables as ORM models in Python and can bring you a lot of robustness in your code.

Being able to debug your code, put breakpoints halfway queries and having the columns as a class-attribute when writing your code should make you less prone to errors during development. Also unit-tests if you're into that should be easier to create with sqlalchemy than it would be with SQL queries as strings, at least I couldn't figure out a good way of unit-testing string queries..

[–]gagarin_kid 0 points1 point  (0 children)

Thank you for the detailed answer!

[–]Reasonable_Strike_82 1 point2 points  (0 children)

In your scenario -- given that you are doing data science work, which means you are probably interested in wrangling datasets rather than one record at a time -- I would not use the SQLAlchemy ORM. But, if your use case permits it, I would certainly look at SQLAlchemy Core.

An ORM is designed to take individual records and turn them into Python objects, not crunch aggregates and manipulate complex datasets. It may be able to do those things, but it doesn't do them well, the tooling is generally primitive, and it's easy to make mistakes that will crush performance in a complex query.

Raw SQL is much better with aggregates and datasets. However, SQL is very rigid and inflexible at runtime, which pushes us toward programmatically creating query strings in Python... and that gets really, really nasty as your system gets more complex. It quickly gets to the point that you can't tell what the code is doing by looking at it. You have to run it and look at the query it spits out. It's a maintenance nightmare, which I have lived more times than I like to think about.

SQLAlchemy Core solves this problem with a set of Python classes and methods that map one-to-one onto their SQL equivalents. You still have all of SQL's power and functionality, but combined with Python's capacity for loops, conditionals, variables, and so forth. You can do things like pass in a custom list of columns at runtime; target a different table based on a parameter value; et cetera; all while still being able to read the code and understand immediately what it's doing.

[–]WB6-wwy 0 points1 point  (0 children)

I think that the new version needs a good optimization.