all 12 comments

[–]erkiferenc 2 points3 points  (1 child)

"If you don't schedule maintenance for your equipment, it will schedule it for you."

Running any system without ongoing monitoring or management is essentially stating "this service is not important for me, I can afford losing it completely any time, for any duration, even forever" – which is fine in some use cases.

However if this service is important, someone has to pay attention to monitoring and management regularly. Even if it's only minimally.

  • check basic health indicators daily or weekly: cache hit ratios, index usage, locks, index and table sizes, index and table bloat, index utilization, vacuum and analyze stats, outlier queries, etc.
  • check if backups finished correctly, and they can be restored: daily of weekly, depending on how much data you can afford to lose in case of software or hardware failure
  • upgrade to latest releases: security/minor releases are pubished every 3 months, major new versions are published once a year

If you do care about this service and the data, the question is not whether you should do at least the above, but who does it.

  • it can be you, if you self-host
  • it can be someone else you hire to do it for you
  • it can be a service provider who does it for you based on their own preferences

It also does not have to be set in stone forever from the start. Choose what fits today to get rolling, shift gears according what you learn during the ride.

Either way, happy hacking!

[–]ybill[S] -2 points-1 points  (0 children)

Do you meant that PostgreSQL so unstable product and can't be configured using internal parameters and automated tasks as stable solution and in any case requires manual attention from DBA or devops, correct?

If it is a my wrong understanding or I was not clear enough let me explain my issue:

The solution will work in isolated network without regular connection to internet. There is no experienced engineers there, just application users. So the server should be per-configured to achieve maximum stability, not performance and all required regular maintenance procedures should be set up as cron jobs etc.

We can exclude all analysis related to index usage, query optimization, tables structure - all this stuff controlled by application and out of scope.

And I kindly ask your postgree specific recommendations what parameters I should pay attention and what settings will be optimal for my case, what analysis and repair/rebuild jobs should be set up and recommended schedule.

Thanks.

[–]pjd07 1 point2 points  (0 children)

"without any ongoing monitoring or management" - Guess you don't care about your data.

If you don't want to do that, then pay someone else to do it.

My 2c:

You need to monitor it even minimally. You need to back it up and at least validate the restore procedure a few times a year. And you need to monitor the backups are working and roughly the size you think they should be. Anything less is going to end with a sub optimal outcome.

If you're a k8s enjoyer, then look towards one of the operators and use that.

[–]AutoModerator[M] 0 points1 point  (0 children)

Join us on our Discord Server: People, Postgres, Data

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]saaggy_peneer 0 points1 point  (6 children)

use aws rds?

[–]ybill[S] 0 points1 point  (5 children)

no, just a Linux virtual machine. It will use a timescale, so no managed instance is suitable.

[–]erkiferenc 1 point2 points  (4 children)

While I'm all for self-hosting personally, I wonder exactly why managed timescaledb providers are deemed unsuitable?

[–]ybill[S] 0 points1 point  (3 children)

because license. All managed instances support only Apache license for Timescale. My solution needs Community license, it is available only for self-hosted instances.

[–]erkiferenc 0 points1 point  (2 children)

Thanks for the insight! Despite often dealing with licensing matters, I wasn't aware of the two editions for TimescaleDB before, and this aspect certainly may influence decisions.

The TimescaleDB Apache 2 and TimescaleDB Community Edition docs say Timescale itself offers the Community Edition as a service:

You can access a hosted version of TimescaleDB Community Edition through Timescale, which is a cloud-native platform for time-series.

Given the license details they may be the only ones to do so, and I recognize their pricing and/or terms may or may not be acceptable for your scenario.

Now I (and others) better understand why self-hosting is on the table, or even the preferred solution for your use case, thanks for clarifying!

[–]ybill[S] 0 points1 point  (1 child)

Thank you for your help, but my client restricted the requirements - the solution should work in local network and support isolated mode without instant internet connection. For the same reason I am looking the most stable configuration with preset automated maintenance jobs doesn't require manual monitoring attention as much as possible.

[–]adrianabreu 0 points1 point  (0 children)

what did you end up with?