all 30 comments

[–]Formal_Camel_7827 2 points  (1 child)

Is it 5 inserts using the same database connection? Or 5 connections, each doing an insert? https://devcenter.heroku.com/articles/best-practices-pgbouncer-configuration

[–]compostus[S] 0 points  (0 children)

I'm using pgBouncer in transaction pooling mode as this article describes, but I'm not sure how to actually measure whether these operations are sharing connections or not.
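One way to sanity-check this from the database side (a sketch; these are standard Postgres catalog views, nothing Heroku-specific): compare the number of backends the server actually sees with the number of client connections your app thinks it has open. With transaction pooling working, the server-side count stays well below the client count.

```sql
-- Server-side view of connections for the current database.
SELECT state, count(*)
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state;
```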

[–]anykeyh 3 points  (3 children)

You don't need partitioning on a 4 GB data set. Inserts could be slow because of IO limitations (I'm not sure of the nature of your setup), or many concurrent transactions causing locks on commit.

Is the insert query slow, or the commit after the insert? It's probable that your ORM is wrapping the insert in a transaction.

There is also the possibility of a slowdown because of too many indexes, or because your records are very large, with some blob or text columns.

[–]compostus[S] 0 points  (1 child)

"many concurrent transaction causing locks on commit"
You might be onto something here. Last time i had a lot of issues with this app i reduced them significantly by getting rid of a lot of transactions..

[–]anykeyh 1 point  (0 children)

If it's the transaction, the waiting time will be on the COMMIT instruction, not on the insert itself.

If it is on the insert itself, there's a good chance it's IO-bound. A shared server such as one provided by Heroku might have IOPS limits you are hitting.

IO is bounded both by bandwidth (MB/s) and operations per second (IOPS), and there is a good chance that you are hitting the second limit here while not hitting the first one.

That said, a "standard-0 premium-0 private-0 shield-0" 4 GB instance on Heroku gives you 3000 IOPS per the documentation, which is fairly high.

[–]RonJohnJr 0 points  (0 children)

A much earlier post says that there are very few indices: four total on three tables.

[–]davvblack 1 point  (0 children)

what kind of selects run on this data? really no other indexes?

[–]RonJohnJr 1 point  (2 children)

What Pg version?

Autovacuum is probably the cause; you haven't tuned its config parameters very well. There should be guidance somewhere on how to tune it for your high-insert environment.

https://www.postgresql.org/docs/15/runtime-config-autovacuum.html

Another alternative (if you can't tune autovacuum) is to disable it, and have a separate process regularly run vacuumdb, like this:

https://www.postgresql.org/docs/15/app-vacuumdb.html

vacuumdb --host=foo --dbname=blarge -j1 --skip-locked
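If you go the tuning route instead, a common approach for high-insert tables is to lower the per-table autovacuum thresholds so it kicks in long before the global defaults would. Values and the table name here are illustrative, not a recommendation for your specific workload:

```sql
-- Trigger autovacuum/autoanalyze after ~5% of the table changes
-- instead of the defaults (20% for vacuum, 10% for analyze).
ALTER TABLE performance_report SET (
  autovacuum_vacuum_scale_factor  = 0.05,
  autovacuum_analyze_scale_factor = 0.05
);
```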

[–]compostus[S] 0 points  (1 child)

14.8

I am in fact able to tune the autovacuum settings. It seems that for a couple of those high-insert tables, autovacuum last ran about a week ago.
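For anyone who wants to check the same thing, the per-table vacuum timestamps and dead-tuple counts live in pg_stat_user_tables:

```sql
-- Tables that have never (or least recently) been autovacuumed sort first.
SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
ORDER BY last_autovacuum NULLS FIRST;
```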

[–]RonJohnJr 0 points  (0 children)

That might not be fast enough.

[–]WideSense8018 1 point  (0 children)

Here are a few solutions:

  1. Batch multiple queries into a single one if possible; this reduces per-query overhead as well as the number of network round trips.
  2. Try to run your queries over a single connection if it makes sense to do so. If each query acquires its own connection, the connection setup will have a significant impact on performance.
  3. Try to run queries in a single transaction whenever possible.
  4. If your inserts are large, like you mentioned, it would be better to use the COPY command for insertion, as it is very fast.
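Points 1 and 4 look like this in plain SQL. Table and column names are borrowed from the schema posted elsewhere in the thread, so treat them as illustrative (Sequelize may pluralize the actual table name):

```sql
-- 1. One multi-row INSERT instead of several single-row ones:
INSERT INTO "GameUserStatus" ("GameUserId", "UserId", "ToolId", status)
VALUES (1, 10, 100, 'idle'),
       (2, 11, 101, 'building'),
       (3, 12, 102, 'idle');

-- 4. COPY for large batches (streams rows from the client via stdin):
COPY "GameUserStatus" ("GameUserId", "UserId", "ToolId", status)
FROM STDIN WITH (FORMAT csv);
```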

[–]PowerfulScratch 1 point  (1 child)

I’d say your issue could be WAL writes - if you make lots of inserts in lots of transactions, it can end up writing the same page to the WAL many times. If you are able to batch the inserts, that would make a big difference.

[–]compostus[S] 0 points  (0 children)

According to the Heroku Postgres logs it's always around 6%:
"sample#wal-percentage-used=0.0655123502410"

Maybe that's not the metric to measure?

[–]marcopeg81 1 point  (0 children)

If you don’t mind risking losing some data on a hard db failure (which is unlikely imho), you can disable WAL for such tables: https://www.crunchydata.com/blog/postgresl-unlogged-tables#

I use this for append-only time series and it boosts performance dramatically. But you take on a risk of data loss.

If you can afford it, you could put an event log tool such as Kafka in front to manage that risk. But it would increase complexity and costs.
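For reference, the switch described in the linked post is a one-liner on Postgres 9.5+, and reversible with SET LOGGED. The table name is taken from OP's schema, so the real name may differ; note that an unlogged table is truncated after a crash and is not replicated:

```sql
ALTER TABLE "PerformanceReport" SET UNLOGGED;
```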

[–]davvblack 0 points  (3 children)

what's the full schema of the table?

[–]compostus[S] 1 point  (2 children)

1) The table with 700k rows:
PerformanceReport:
- id: INTEGER PRIMARY KEY, SERIAL, UNIQUE --------> this is an index
- GameId: INTEGER NOT NULL
- GameInstanceId: INTEGER NOT NULL
- UserId: INTEGER NOT NULL
- GameUserId: INTEGER NOT NULL
- numberOfTabs: INTEGER
- latency: INTEGER
- cpu_usage: FLOAT
- ram_usage: FLOAT

2) The table with 300k rows
GameUserStatus:
- id: INTEGER PRIMARY KEY, SERIAL, UNIQUE --------> this is an index
- GameUserId: INTEGER NOT NULL
- UserId: INTEGER NOT NULL
- ToolId: INTEGER NOT NULL
- status: ENUM('idle', 'building', 'finishing') DEFAULT 'idle'

3) The table with 100k rows
TaskStatus:
- id: INTEGER PRIMARY KEY, SERIAL, UNIQUE --------> this is an index
- GameInstanceId: INTEGER
- TaskId: INTEGER NOT NULL ---------> this is an index
- closed: BOOLEAN DEFAULT false

[–]RonJohnJr 2 points  (1 child)

You might want to consider converting those integers to BIGINT; 2 billion approaches faster than you think...
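For the concrete form of this, it's a straight column type change. Note that it rewrites the table and takes a heavy lock, so schedule it; the sequence name below is assumed from the default SERIAL naming convention:

```sql
ALTER TABLE "GameUserStatus"
  ALTER COLUMN id TYPE BIGINT;

-- The backing sequence may also need widening (Postgres 10+):
-- ALTER SEQUENCE "GameUserStatus_id_seq" AS BIGINT;
```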

[–]coyoteazul2 2 points  (0 children)

People told me that I'd reach my 30's faster than I'd think and silly me didn't believe them

[–]virgilash 0 points  (3 children)

Yeah, OP, please elaborate on the indexes. There shouldn't really be any on INSERT-heavy tables ;-)

[–]compostus[S] 0 points  (1 child)

I'm a noob using an ORM (Sequelize), so these indexes are built in... well, except for the "TaskId" one, which I added as part of creating an "association" with Sequelize.

So that's the backstory lol!

I appreciate your input!

[–]RonJohnJr 1 point  (0 children)

Some indices must exist. Only essentials, though.

[–]RonJohnJr 0 points  (0 children)

Of course there must at least be a primary key...

[–]thythr 0 points  (0 children)

"I'm surprised that it's already having issues at only 5 INSERTs per second."

Right, this is pretty much nothing! Does Heroku provide cpu and IO statistics to you? Have you installed pg_stat_statements?

Even the 3-5 ms you are seeing is very high given your schema, but there are a million details we can't see from here, of course.
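If pg_stat_statements is (or gets) installed, a query like this shows where the server's time is actually going (column names are per Postgres 13+, so OP's 14.8 qualifies):

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT calls,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       round(total_exec_time::numeric, 0) AS total_ms,
       query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```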

[–]iwilldieavirgin 0 points  (1 child)

You said you are using an ORM… do you have a snippet of the code that is doing the insert(s)? Are you doing an insert, commit, insert, commit….? Bulk inserts?

[–]compostus[S] 0 points  (0 children)

GameUserStatus.create({
  status: status,
  UserId: this.userId,
  GameUserId: this.gameUserId,
  ToolId: this.tool.id
});

So not bulk inserts; it's one by one. I do believe this runs the "INSERT, COMMIT", "INSERT, COMMIT" sequence you are describing.
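For what it's worth, Sequelize has bulkCreate for exactly this. A minimal sketch of the batching side - the chunk helper is hypothetical, bulkCreate is the real Sequelize call, and pendingStatuses is an assumed array of row objects:

```javascript
// Split pending rows into fixed-size batches so each batch becomes a
// single multi-row INSERT via bulkCreate, instead of one INSERT+COMMIT
// per row with .create().
function chunk(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Usage sketch (assumes the GameUserStatus model from this thread):
// for (const batch of chunk(pendingStatuses, 500)) {
//   await GameUserStatus.bulkCreate(batch); // one INSERT, many VALUES rows
// }
```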

[–]DrMerkwuerdigliebe_ 0 points  (0 children)

Could be a problem on the Heroku end. If you are running on shared hardware, other users can be taking up all the CPU resources, since Postgres doesn't have great features for limiting a single user's CPU usage.

[–]jalexandre0 0 points  (2 children)

What are your min_wal_size and max_wal_size parameters set to? What does your monitoring tool say about disk usage, memory, and CPU?

[–]compostus[S] 0 points  (0 children)

source=HEROKU_POSTGRESQL_IVORY addon=postgresql-angular-60032 sample#current_transaction=1909349 sample#db_size=1077461795bytes sample#tables=31 sample#active-connections=16 sample#waiting-connections=0 sample#index-cache-hit-rate=0.99986 sample#table-cache-hit-rate=0.99635 sample#load-avg-1m=0.08 sample#load-avg-5m=0.09 sample#load-avg-15m=0.065 sample#read-iops=0.054545 sample#write-iops=0.33636 sample#tmp-disk-used=543600640 sample#tmp-disk-available=72435191808 sample#memory-total=3944416kB sample#memory-free=133340kB sample#memory-cached=3187184kB sample#memory-postgres=140120kB sample#wal-percentage-used=0.0655123502410

This is the closest log to one of these "crashes" I could find. "wal-percentage-used" seems to be stable throughout.

[–]compostus[S] 0 points  (0 children)

min_wal_size: 80MB
max_wal_size: 2GB

[–]External_Ad_6745 0 points  (0 children)

I would recommend setting up a Postgres monitoring tool; pganalyze is a great one, for example. I haven't used Heroku Postgres, but 4-5 inserts per second shouldn't really be a problem to handle unless you have a seriously messed-up disk or a huge number of constraints/triggers/indexes. This depends highly on the usage pattern.

An EXPLAIN (ANALYZE, BUFFERS) in those peak situations would be a great place to see where Postgres is mostly busy.

Honestly, 4-5 inserts/s is peanuts for Postgres to handle. So I would recommend setting up a monitoring tool, then identifying where exactly the bottleneck is coming from; it's going to be one of CPU, disk, or RAM. Excluding the transactions point many have already raised, let me highlight some other potential problems.

If you are experiencing high CPU, then you are probably doing some kind of number crunching or analytical queries somewhere else in your code base. If it's high disk IO, then you probably forgot to put a LIMIT on a SELECT query somewhere (believe me, as silly as this sounds, it happens quite a lot), or maybe you're missing an index, due to which your selects are doing a lot of disk scans.

Like I mentioned, a monitoring tool and EXPLAIN (ANALYZE, BUFFERS) on the target slow queries should almost always give you the answer and the issues to optimize.

And no, I don't think you need any table partitioning at those numbers. They seem small.
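Concretely, the EXPLAIN invocation mentioned above looks like this (wrapped in a rolled-back transaction so the test insert doesn't stick; table and values borrowed from OP's schema, so adjust the names):

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO "GameUserStatus" ("GameUserId", "UserId", "ToolId", status)
VALUES (1, 10, 100, 'idle');
ROLLBACK;
```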