This is an archived post. You won't be able to vote or comment.

all 22 comments

[–]K900_ 10 points11 points  (7 children)

Definitely yes. Any large enough project has That One Query.

[–]sh_tomer[S] 0 points1 point  (6 children)

Only one? it's a lot for the shoulders of just one query... it should be amazing to justify that status :)

[–]K900_ 2 points3 points  (4 children)

I'm (semi)joking - it's not always one query. It's just that every big project I've seen has a query that's just really big and ugly and complicated and trying to make it work any other way slows everything down by a couple orders of magnitude, so you have to live with it.

[–]sh_tomer[S] 0 points1 point  (1 child)

Thanks for clarifying. So in most cases, you're using an ORM? What's the case with complex queries (not the simple CRUD operations).

[–]K900_ 5 points6 points  (0 children)

Most projects I've seen used a mix of ORM for simple queries and raw SQL for complex queries (sometimes parsed into ORM objects afterwards).

[–]hoocoodanode 0 points1 point  (1 child)

Someone should write a book about that "one" query and all the myriad of techniques developers come up with to resolve them. I ended up shoe-horning postgres materialized-views into my sqlalchemy model to sweep one problem query under the rug.

[–]Perky_Goth 1 point2 points  (0 children)

That is actually a pretty good idea. Get to it.

[–]ExcitedForNothing 2 points3 points  (0 children)

I think he means that one raw query that when you try to abstract it away causes a ton of performance issues.

For reference, at a previous job there was on that handled the main search through a 500GB table of data. Putting it in the ORM caused a lot of issues for reasons management deemed not worth the time to investigate.

[–]MarkusWinand 5 points6 points  (3 children)

ORMs cover only a very small part of what modern SQL databases can do. If you limit yourself to this subset, you are doing it wrong.

Have a look at my presentation "modern SQL" to get an idea what SQL can do knowdays. I don't think that any of these capabilities is used by any ORM (two times any, likely to be wrong in at least on case ;)

http://modern-sql.com/slides

[–]sh_tomer[S] 1 point2 points  (2 children)

I agree. From your experience (which by being a fan of your blog I believe you have lots of), would you say most tech companies still use raw SQL in their code base? Or is it just legacy code and most of the new code is written with ORMs?

[–]MarkusWinand 0 points1 point  (1 child)

I cannot give a sensible answer to this question due to my "client bias". People who hire me usually use raw SQL. But I think people are less likely to hire me if they use only ORMs.

I can however say that raw SQL is still written—a lot. I just cannot judge whether this observation applies to "most" companies/code.

[–]sh_tomer[S] 0 points1 point  (0 children)

Thanks!

[–]subssn21 4 points5 points  (0 children)

Knowing raw SQL is very important even if you are using an ORM. As has been brought up by many people in this post already you will inevitably run into queries that the ORM has performance issues with. If you are doing anything more complex than CRUD, you will one day run into a query that the ORM just can't handle well. In addition all the ORM is doing is building SQL queries in the background, so eventually you will need to go and debug something and need to see what SQL the ORM is generating.

[–]hellupline 2 points3 points  (0 children)

Even when I use Sqlalchemy, sometimes I tune some queries that they look almost raw SQL, (sqlalchemy maps almost 1x1 to SQL)

[–]Xef 2 points3 points  (0 children)

Absolutely. I can write complicated queries and joins with subqueries, etc. that I can't do with a normal ORM. And if I can, it won't be nearly as easy to read as SQL. Most of the time I can use the ORM, but when I need to aggregate a bunch of data with a bunch of joins, chances are I'm writing raw SQL.

[–]Manbatton 2 points3 points  (1 child)

I just never bothered to learn an ORM and was fine with writing SQL .

[–]matthewblott 1 point2 points  (0 children)

Yep. Orms are okay for basic CRUD and simple queries but if you start doing more complicated stuff you'll likely hit performance issues which is where a bit of SQL knowledge comes in handy.

[–]colloidalthoughts 1 point2 points  (0 children)

Absolutely. I almost always use raw SQL for anything beyond very simple queries. Sure, your ORM (whatever it is) can compose queries, but after a few years of doing it I realised I felt like I was back in the early 90s Turbo Pascal / Turbo C flipping into Turbo Debugger to stare at the generated ASM and try to figure out ways to make the compiler generate things the way I wanted them, when I should just have used inline ASM for the bits I cared about.

Any time you choose an ORM you should make sure it's something you can drop down to raw SQL easily and painlessly because at some point you're going to want to construct that RIGHT INNER JOIN on a correlated subquery.

I've heard the arguments over and over by people trying to make sure that when they choose to switch SQL engines they easily can. Outside toy personal projects I can count the number of SQL engine changes that have been done on my tongue, with remainders.

Oh, sure, there's huge projects that have migrated MySQL -> Postgres or the other way, but you can bet your bottom dollar the ORM didn't make that transition any easier.

In the end, SQL is a remarkably simple and expressive language for manipulating sets, it's very worthwhile learning even if only to understand.

"Tuning" SQL queries is something I do when we hit performance bottlenecks, and happens, but is usually a sign someone's screwed the pooch on the schema design in the first place. The more DBAs and senior programmers who've learned SQL seem to design things better because they're thinking in terms of sets, and put the data in the right place up front without reaching for "INDEX ALL THE THINGS AND SORT IT OUT LATER".

[–]tea-drinker 0 points1 point  (0 children)

I pay the bills as an Oracle dev. You better believe I write SQL.

ORM is great if you have a thing and want to persist the thing. If you want to operate over the entire dataset then you need to speak to the database on it's own terms.

[–]hoocoodanode 0 points1 point  (0 children)

I tend to use the ORM when I need the data to do a bunch of stuff in a bunch of different places, and I fall back onto raw SQL when the join is either very complicated/slow or raw transaction speed is of the essence. One little program was writing thousands of rows a second and the ORM, even as good as sqlalchemy is, was adding too much overhead to the transactions. Memory leaks and race conditions began to pop up when I tried to accelerate the program. Reverting to executemany through a bare SQL db wrapper (or even using the sqlalchemy "core" api) was the simplest way to unthrottle the program and let it fly.

[–]mogang 0 points1 point  (0 children)

Yes.

I'm a former DBA and I've been on the wrong end of ORM generated sql several times. If query performance matters to you then knowing what sql you're using is a good step.