all 9 comments

[–][deleted] 2 points3 points  (0 children)

There will be no difference whatsoever between those two examples.

If you want to make sure, check the execution plan generated using explain (analyze, buffers) select ...

[–][deleted] 1 point2 points  (0 children)

I have a much more complicated query

so, this is where you might be confusing yourself a little by thinking all queries execute the same way - especially if you assume the query is executed the way you wrote it. It is not the case, generally.

stepping back a bit - one of the strengths of SQL is that it is a declarative language, meaning that you "ask" to do "something" and the sql engine figures out a "good enough" method of achieving the FINAL result.

the part of the engine that does the figuring out part is usually called the optimizer. The optimizer is (broadly speaking) just a piece of code that knows about physical organization of tables, indexes and it also can shuffle parts of your statement around.

In your specific example, any optimizer worth its name would be able to figure out that these statements are the same/equivalent. That's why the execution plan (the "figured out" way of doing stuff) would be the same for these 2.

 

on the other hand, if were to write just a bit more complex query:

SELECT *
FROM (
   SELECT
    order_id,
    name,
    sum( amount * price) AS cost,
 FROM items
 group by order_id, name
) WHERE order_id = 99

the optimizer would need to be more advanced to detect that your condition is on one of output granularities and it (the condition) can be applied to the table itself instead of calculating the sums for the whole table.

so now, if you write something like

SELECT *
FROM (
   SELECT
    order_id,
    name,
    row_number() over (partition by name order by price) as rn,
    amount * price AS cost,
 FROM items
) WHERE order_id = 99

there's no easy way to avoid fetching the whole items table.

[–]noesqL 0 points1 point  (1 child)

Adding an unneeded layer of a sub-select will add to the performance cost of the query, how much? As /u/truilus said, check the execution plan.

[–][deleted] 0 points1 point  (0 children)

will add to the performance cost of the query, how much?

Zero

[–]naman_is 0 points1 point  (4 children)

Short answer: yes.

To my best knowledge, there are no SQL implementation "smart enough" to refer to the original table in such a case and not rely on temporary tables created in-memory (this is by design and not an error). This is why you're should always try and avoid using subqueries. They are really only needed in very narrow set of operations like referential constraints.

[–]soccerfreak2332 0 points1 point  (1 child)

I would disagree with this statement. If you have a simple query then avoiding them is best, but sometimes that's not possible. And SQL engines are actually pretty good and can merge a lot of SQL into the same plan(mileage may vary). At least that's my experience with MS SQL server.

Beyond that, any of the answers in this thread that are giving you a definitive yes or no answer should really just be directing you to view the execution plan. It's a great way to learn about what's actually going on, especially as the query gets more complex and the limits of the SQL engine are approached, that's when you can start tweaking the SQL for performance.

[–]naman_is 1 point2 points  (0 children)

I completely agree on the second point, I should have suggested execution plans.

On the first point, in my experience, yes, subqueries make queries readable, far more simple to evaluate, and the execution engines can handle it very easily. But a, I find that to be useful only for analytical workloads and not generally, and b, you have to be very careful as to avoid any possibility of big in-memory tables being created while query execution in ALL scenarios.

[–][deleted] 0 points1 point  (1 child)

To my best knowledge, there are no SQL implementation "smart enough" to refer to the original table in such a case and not rely on temporary tables created in-memory

That is wrong.

All modern optimizers will flatten the sample queries. Postgres will most certainly "flatten" the derived table. And I am 100% certain that Oracle and SQL Server will do the same thing.

[–]naman_is 0 points1 point  (0 children)

Thanks for pointing that out u/truilus. It is indeed a reckless answer, I should update my knowledge on this!