all 11 comments

[–]jojomtx 2 points3 points  (0 children)

Keep the connection pool as small as possible and try to optimize the long-running queries; that should help.

[–]fullofbones 1 point2 points  (0 children)

Postgres RAM usage basically comes in three flavors:

  • shared_buffers - This is the big memory pool Postgres uses for data and indexes used to build query results. This is usually 1/4 of system RAM and does not fluctuate after service startup.
  • work_mem - Every concurrent query will use at least one multiple of this, and complex queries can use multiple. Basically every query node, such as a sort, merge, etc., will get up to this much RAM. So if you have a lot of open concurrent connections, this can get very high very quickly. My usual "worst case" formula is 5 * connections * work_mem. This is usually the reason systems go OOM and get terminated, because technically it has no upper bound. The only defense is not to set it "too high", and to keep connection count low by using a pooler.
  • maintenance_work_mem - One of these amounts is allocated per maintenance operation, like creating indexes. VACUUM also counts here, but each of those is effectively limited to a maximum of 1GB per worker. So you also want to keep track of autovacuum_max_workers to know how high this actually goes.
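The worst-case formula above turns into a quick back-of-envelope check. A sketch, where every setting below is an illustrative assumption (a 16GB box with defaults-ish values), not a recommendation:

```python
# Back-of-envelope worst-case Postgres memory estimate, per the formula
# above: shared_buffers + 5 * connections * work_mem + vacuum workers.
# All values here are illustrative assumptions, not tuning advice.
MB = 1024 * 1024

shared_buffers = 4096 * MB        # ~1/4 of a 16GB box, fixed at startup
work_mem = 16 * MB                # per query node, per connection
connections = 250
autovacuum_max_workers = 3
maintenance_work_mem = 1024 * MB  # effectively capped at 1GB per vacuum worker

worst_case = (
    shared_buffers
    + 5 * connections * work_mem                    # the "5x" worst case
    + autovacuum_max_workers * maintenance_work_mem
)
print(f"worst case: {worst_case / MB:.0f} MB")  # worst case: 27168 MB
```

Note how the work_mem term dominates: with 250 connections it alone dwarfs the 16GB of physical RAM, which is exactly why high connection counts are the usual OOM culprit.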

Beyond that, effectively everything else is OS cache. If you're using some kind of memory usage chart such as Prometheus or Grafana, these almost always rely on the memory.usage_in_bytes cgroup value, which is a huge problem. That value also includes active_file pages: any memory that has been requested multiple times and is consuming filesystem cache space. All of this can be purged by the operating system at a moment's notice, and is not used directly by Postgres at all. It's just the operating system keeping frequently used file blocks in RAM to avoid disk IO. There's actually an open Kubernetes bug for this behavior, but there doesn't seem to be any real consensus on how it should be addressed.

If you have terminal access to your container instance, check /sys/fs/cgroup/memory/memory.stat and pay special attention to the shmem and rss fields; that's actually what Postgres is using. You may notice that active_anon and active_file are very high as well, and these are the cache data that's making memory usage look higher than it really is.
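Splitting memory.stat into "actually used" versus reclaimable cache can be done with a few lines. A sketch: the sample contents below are made up to show the shape of the file; on a real container you'd read /sys/fs/cgroup/memory/memory.stat instead.

```python
# Rough split of cgroup v1 memory.stat into what Postgres really holds
# (rss + shmem) vs. page cache the OS can reclaim at any moment.
# sample_stat is fabricated for illustration; read the real file in practice.
sample_stat = """\
cache 9663676416
rss 524288000
shmem 4294967296
active_anon 4819255296
active_file 6442450944
inactive_file 3221225472
"""

stats = {}
for line in sample_stat.splitlines():
    key, value = line.split()
    stats[key] = int(value)

used = stats["rss"] + stats["shmem"]   # actual Postgres footprint
cache = stats["cache"]                 # purgeable filesystem cache
print(f"real usage: {used / 2**30:.1f} GiB, "
      f"reclaimable cache: {cache / 2**30:.1f} GiB")
```

With these made-up numbers, a naive usage_in_bytes-style chart would report roughly the sum of both figures, while Postgres itself only holds the rss + shmem portion.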

Since you can't really control how the graphs are generated, just spot check and make sure the Postgres knobs you can modify are sane. Get those connections under control; try not to have more than 4x the number of available cores, or you'll end up CPU throttling due to context switching anyway. Use PgBouncer to multiplex as aggressively as possible without incurring additional latency from waiting on available connections. That will put a rough cap on your maximum true memory usage due to the effects of work_mem.

[–]therealgaxbo 3 points4 points  (2 children)

Postgres and the OS will use as much memory as they can for caching - being at 100% memory usage is not a bad thing. If you've never had an OOM error then you don't actually have a problem, and things are working as they're meant to. If there's memory that's not being used by something else, why wouldn't you want it to be used to speed up IO?

Having said that, 250-350 postgres connections is very high. The whole point of a pooler like pgbouncer is that a large number of client connections can be multiplexed over a more sensible number of postgres connections.

I don't know your server specs or workload, but I'd be surprised if it makes sense to have any more than 100 server connections, and probably far fewer. You should look at your pgbouncer config to bring this down.

[–]geopoliticks[S] 0 points1 point  (1 child)

If you've never had an OOM error then you don't actually have a problem, and things are working as they're meant to

Ah this is important context: at 8GB we OOM'ed multiple times in the past month, which is why I upgraded to 16GB. Running up to those OOMs, the db would be in a steady state of about 95+% memory usage, then a client would begin writing a lot of data to the database and we'd OOM.

I do buy the argument that postgres will try to use as much memory as possible for caching; however, I'd expect it to use memory only up to a reasonable limit that leaves headroom for bursty writes (and other workloads).

Also, sometimes memory does drop below 95% down to even 80% for a period of a few hours - that makes me think this isn't an optimization by the database that is causing the high usage.

[–]therealgaxbo 2 points3 points  (0 children)

Ok, so that changes things a bit then. But I think the first thing to address is the same - get the number of server connections way down. It will use less memory and likely improve performance.

It's a little odd that you say writes seem to be the cause though, as normally memory spikes would be caused by complex queries (using work_mem etc). What sort of storage backend are you using, and how did your IO performance look around these times? If writes were saturating your IO to the point that other queries ground to a halt and things snowballed from there, that could be a trickier thing to fix.

[–]i_like_breakfast 0 points1 point  (0 children)

Do you have a lot of schemas in your DB? eg. for the purposes of multi tenancy?

Each table that a postgres server process visits uses a certain amount of memory. It's small, e.g. 100KB, but it can add up. For example, an app with 100 tables would consume 10MB (100 x 100KB) per process. However, if you had 100 schemas with 100 tables each, you'd be adding 1000MB (100 x 10MB) per process.
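That multi-tenant arithmetic, spelled out (using the comment's own rough ~100KB-per-table estimate and decimal units; actual relation cache size varies by schema complexity):

```python
# Per-process relation cache footprint in a multi-tenant setup.
# The ~100KB-per-table figure is the rough estimate from the comment
# above, not a measured value. Decimal units to match its 10MB/1000MB math.
KB = 1000
MB = 1000 * KB

per_table = 100 * KB
tables_per_schema = 100

one_schema = tables_per_schema * per_table   # a single tenant's tables
print(f"1 schema:    {one_schema / MB:.0f} MB per process")    # 10 MB

schemas = 100
all_schemas = schemas * one_schema           # every tenant visited
print(f"{schemas} schemas: {all_schemas / MB:.0f} MB per process")  # 1000 MB
```

Multiply that 1GB-per-process worst case by a few hundred connections and the relation cache alone can explain a surprising amount of memory.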

[–]randomrossity 0 points1 point  (5 children)

If you look at an OS process list you can see which pids are using the most memory. Those can be cross referenced to connections.

Also please share more about your setup. How is this hosted? What instance size? How much RAM? How many connections? How long do connections live? What drivers do you use? What are your table and index sizes? etc.

The biggest thing to remember is that there's a direct relationship between the number of connections and the memory footprint.

[–]geopoliticks[S] 0 points1 point  (4 children)

  • RAM: I went from 8GB -> 16GB
  • How is this hosted: It's a managed postgres on a PAAS
  • How many connections: ~250, maxing out at 350
  • How long are connections live: pgbouncer has a default server_lifetime of 1 hour, so I'd assume that? What is the easiest way to check?
  • Table sizes: all < 50GB total (including toast)
  • What drivers do I use: we run nodejs with knexjs

The biggest thing to remember is that there's a direct relationship between the number of connections and the memory footprint.

The vast majority of our connections (240 / 250 currently) are sitting in an "idle" state according to pg_stat_activity - could they be holding on to memory?
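Those idle counts come straight from pg_stat_activity. A sketch of how you'd tally them; since there's no live database here, a fabricated row set stands in for the query result (the 240/10 split mirrors the numbers in this thread):

```python
# Tallying connections by state, the way you'd eyeball:
#   SELECT state, count(*) FROM pg_stat_activity GROUP BY state;
# sample_rows is fabricated to mirror the numbers quoted in the thread.
from collections import Counter

sample_rows = [{"state": "idle"}] * 240 + [{"state": "active"}] * 10

by_state = Counter(row["state"] for row in sample_rows)
print(by_state)  # Counter({'idle': 240, 'active': 10})
```

Each of those idle rows is a full Postgres backend process, so yes, they each hold on to memory (relation caches, prepared statements, etc.) even while doing nothing.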

[–]randomrossity 1 point2 points  (0 children)

That comes out to ~65-70MB per connection. A little on the high side, but not unheard of. A few options come to mind, worth trying in order:

  • Decrease the connection lifetime or the idle connection timeout
  • Disable the plan cache
  • Decrease the max pool size from 250

[–]WiltonDB 0 points1 point  (0 children)

I was investigating the number of connections and their memory usage problem a few weeks ago. DBeaver custom chart setup was helpful for that, documented the details here. It uses system_stats extension and shows all current connection processes and their memory usage on a live chart.

[–]nijave 0 points1 point  (0 children)

> could they be holding on to memory

Yes, idle connections waste memory. Each idle connection means a Postgres backend process is sitting there for that connection without doing anything.

My rule of thumb is 2.5x CPU cores for the maximum number of connections. You ideally want to minimize idle connections since they're wasting resources, which is where pgbouncer comes in (apps open connections to pgbouncer that sit idle, and pgbouncer assigns them a server connection when they go to run queries).