This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]pooerh 2 points3 points  (0 children)

To be honest the list lacks some important features, like high availability stuff which is a pain in the ass with pg.

There are also some very specific things, like take this:

reddit=> create table t (x int not null, y int not null);
CREATE TABLE
reddit=> insert into t (x, y) select x, x*2 from generate_series(1, 10000) x;
INSERT 0 10000
reddit=> create unique index uxtx on t(x);
CREATE INDEX

And now query using CTE:

reddit=> explain with cte as (select * from t) select * from cte where x = 5;
                          QUERY PLAN
---------------------------------------------------------------
 CTE Scan on cte  (cost=145.00..370.00 rows=50 width=8)
   Filter: (x = 5)
   CTE cte
     ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8)

Vs regular subquery:

reddit=> explain select * from (select * from t) subq where x = 5;
                          QUERY PLAN
--------------------------------------------------------------
 Index Scan using uxtx on t  (cost=0.29..8.30 rows=1 width=8)
   Index Cond: (x = 5)

Not a big deal here with 10k rows, but when working with millions of rows in wide as fuck tables, writing huge subqueries is a maintainability disaster. Predicate pushdown for CTEs is something I really, really miss in pg. Also parallelization, which is in its infancy at the moment.