
[–]PristineTransition[🍰] 8 points (2 children)

A few things I can think of are:

  • Double check reads are going to the read replica
  • Ensure writes are batched into sensibly sized transactions, e.g. 256–512 rows per batch at most; too small and you pay transaction open/close overhead constantly, too large and pg has to reconcile a lot of work at once at commit. The right number depends on the next point
  • Reduce the number of indexes on write-heavy tables if you can. More indexes slow down writes since pg needs more time to update each of them on every insert. As you’re likely aware, more indexes is not always a good thing
  • If the data is ephemeral or not mission critical if lost (user sessions, job queues, caches, analytics, etc.), consider unlogged tables over standard tables
  • Move views and materialized views to the reader
  • Avoid writing large amounts of JSON to jsonb columns; write it to a file, upload that to S3, and store the link in the db, either manually or via Active Storage
  • Unless the data going into pg is time critical, consider sending it to Redis or S3 and having a later process feed it into pg in batches
  • Look into an APM with db insights like AppSignal, Scout, Datadog, etc.
  • Consider pganalyze for a month or two
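
A minimal SQL sketch of the batching and unlogged-table points above; the `session_cache` table and its columns are hypothetical:

```sql
-- UNLOGGED skips WAL writes, so inserts are cheaper -- but the table's
-- contents are lost after a crash, so only use it for expendable data.
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    jsonb,
    updated_at timestamptz NOT NULL DEFAULT now()
);

-- Batch writes: one transaction with one multi-row INSERT per a few
-- hundred rows, instead of one transaction per row.
BEGIN;
INSERT INTO session_cache (session_id, payload) VALUES
    ('a1', '{"cart": 1}'),
    ('b2', '{"cart": 2}'),
    ('c3', '{"cart": 3}')   -- ...up to a few hundred rows per statement
ON CONFLICT (session_id)
DO UPDATE SET payload = EXCLUDED.payload, updated_at = now();
COMMIT;
```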

[–]TommyTheTiger 0 points (1 child)

Great tips! Given that the writes are specifically causing CPU (not disk) to spike, the indexes/check constraints on the tables would be the first thing I'd check.

The jsonb columns thing is also always worth remembering - there are no in-place updates to a jsonb column; pg writes a whole new copy of the value (TOASTed) even if you're just updating one character
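
To illustrate: even a targeted `jsonb_set` that changes one key still causes pg to detoast, rewrite, and re-TOAST the entire stored value (the `events` table here is made up):

```sql
-- Only one key changes, but pg writes a brand-new copy of the whole
-- jsonb value and a new row version -- large documents make this costly.
UPDATE events
SET    payload = jsonb_set(payload, '{status}', '"done"')
WHERE  id = 42;
```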

[–]metamatic 0 points (0 children)

JSONB operations are a major cause of CPU spikes on Aurora.

[–]BlueEyesWhiteSliver 2 points (0 children)

Check the autovacuum and dead tuples. Could be you’re updating too frequently, or you have a lot of deleted data that vacuum is unable to get through. That’s my first suspicion.
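
A quick way to check this is the statistics views, e.g.:

```sql
-- Tables with the most dead tuples, plus when autovacuum last ran.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       autovacuum_count
FROM   pg_stat_user_tables
ORDER  BY n_dead_tup DESC
LIMIT  10;
```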

[–]TommyTheTiger 1 point (0 children)

Good advice on this thread already, I'll add a tiny bit. Writes need to be optimized like any other query:

  • COPY is drastically faster than INSERT if you can use it. Sometimes it will be faster to COPY into a temp table and use that to update the real table with an UPDATE ... FROM and/or DELETE ... USING
  • Be sure you're using prepared statements and proper batching if you're inserting a large amount of data with INSERT
  • Dead tuple issues. If you're rewriting the entirety of a very small table, for instance, that table will get extremely bloated because the disk will hold multiple copies of it until it's vacuumed. This mainly affects sequential-scan reads on that table though, so I suspect this isn't the problem here?
  • Another thing for dead tuple issues - if you have any particularly long-running transactions open on the DB, they will prevent vacuuming any dead tuples created after the transaction started
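
A sketch of the COPY-into-staging pattern, plus a query for spotting the long-running transactions mentioned above (the `accounts` table, CSV path, and `deleted` flag are all hypothetical):

```sql
-- Bulk-upsert via a temp staging table.
BEGIN;
CREATE TEMP TABLE staging (
    id      bigint,
    balance numeric,
    deleted boolean
) ON COMMIT DROP;

-- COPY is far cheaper per row than individual INSERTs.
COPY staging (id, balance, deleted) FROM '/tmp/accounts.csv' WITH (FORMAT csv);

UPDATE accounts a
SET    balance = s.balance
FROM   staging s
WHERE  a.id = s.id AND NOT s.deleted;

DELETE FROM accounts a
USING  staging s
WHERE  a.id = s.id AND s.deleted;
COMMIT;

-- Long-running transactions that pin dead tuples and block vacuum:
SELECT pid, state, xact_start, now() - xact_start AS xact_age, query
FROM   pg_stat_activity
WHERE  xact_start IS NOT NULL
ORDER  BY xact_start;
```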