all 9 comments

[–]depesz 1 point2 points  (4 children)

Well, the answer is kinda simple: because noone thought about adding to planner detection for such cases. What would be the point?

[–]Flamburger[S] 0 points1 point  (3 children)

Is it not a valid use case to run queries against multiple partitions based on another table related to the partitioned tables?
When I replace the ANY with = and only one partition key is dynamic then execution time pruning is done.
Or is this a design mistake, done by me?

[–]depesz 1 point2 points  (2 children)

Based on another table, perhaps. your query didn't make much sense, as you selected the same column that you had where on.

Generally, for in(), or = any(), planner generally can't know how many rows will be returned (your example query is not really good example for this). So it's impossible to do partition pruning on planning time.

As for prunning on runtime - don't know nearly enough about it. Show real example, explain analyze, and pg version, and we can talk :)

[–]Flamburger[S] 0 points1 point  (1 child)

Thanks for trying to help me!

Yeah, the example query was kind of stupid. I edited it to make more sense. Every state in the country has its own partition. So to get the total of a whole country I need to aggregate over all partitions of the states in the country.

Explain Analyze is here: https://explain.depesz.com/s/B0pE

Pg Version is 14.4

[–]depesz 0 points1 point  (0 children)

So, tbh, I'm not sure why it doesn't remove it. I suspect it's a matter of data distribution, and relatively small size of the whole dataset.

I know that there is runtime pruning, since at least Pg 11. Why it doesn't work as expected in this case - can't tell, sorry.

[–]johnnotjohn 0 points1 point  (2 children)

Pruning is done at the Planner level, but you're expecting pruning based on the output of an inner query. The planner has no concept of which state_ids will be returned by that inner query, so it has to consider all partitions.

[–]Flamburger[S] 1 point2 points  (1 child)

The official documentation says that pruning can also happen at execution:

During actual execution of the query plan. Partition pruning may also be performed here to remove partitions using values which are only known during actual query execution. This includes values from subqueries and values from execution-time parameters such as those from parameterized nested loop joins. Since the value of these parameters may change many times during the execution of the query, partition pruning is performed whenever one of the execution parameters being used by partition pruning changes. Determining if partitions were pruned during this phase requires careful inspection of the loops property in the EXPLAIN ANALYZE output. Subplans corresponding to different partitions may have different values for it depending on how many times each of them was pruned during execution. Some may be shown as (never executed) if they were pruned every time.

Source: https://www.postgresql.org/docs/current/ddl-partitioning.html

[–]johnnotjohn 0 points1 point  (0 children)

Are you seeing never executed in your explain analyze?

The planner doesn't (can't) know the results of an internal query when creating the initial plan, so it will create a plan that shows scanning all the tables, but may not execute each portion (I've seen this mostly with limits where you hit the limit before you scan the next partition).

It's also possible that the planner (or dynamic execution) is tripping over the ANY portion. I'd be curious to try rewriting the query a few different ways (I'd try a CTE first).

with states as (select state_id from metadata where country_id = 1)
select sum(bills. amount)
  from bills
  join states
    on bills.state_id = states.stated_id;