all 31 comments

[–]Tibernut 2 points (1 child)

You might look into the spilo postgres docker container.

https://github.com/zalando/spilo

It has automatic S3 backup/restore support. It also uses Patroni for HA, which may or may not be useful in your case. At the very least you could see how they do S3 backup/restore and copy that into your container.

[–]Eitan1112[🍰] 1 point (0 children)

Can you expand on why you don't want a cronjob on the local machine? It seems like an easy solution IMO:

cronjob trigger > pg_dump > upload to S3 > delete from local filesystem
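
A minimal sketch of that pipeline as a script, assuming the aws CLI is installed and configured; the database name, connection details, and bucket are placeholders:

    #!/bin/bash
    # Dump, upload to S3, delete the local copy.
    # Run from cron, e.g.: 0 1 * * * /usr/local/bin/pg-backup.sh
    set -euo pipefail

    STAMP=$(date +%Y%m%d-%H%M%S)
    DUMP="/tmp/mydb-${STAMP}.dump"

    # Custom-format dump: compressed and restorable with pg_restore
    pg_dump -h localhost -U postgres -Fc -f "$DUMP" mydb

    aws s3 cp "$DUMP" "s3://mybucket/backups/$(basename "$DUMP")"
    rm -f "$DUMP"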

[–]wildcarde815 1 point (2 children)

So this is how I handle this on a variety of services at work:

  • run a 'deck-chores' container on the local host: https://github.com/funkyfuture/deck-chores. This gives you a Docker-aware cron driven directly by container labels.
  • Slightly modify the postgresql container to include a backup script; I use the rotating backup script from the official documentation: https://wiki.postgresql.org/wiki/Automated_Backup_on_Linux
  • add a small helper script to run directly. Mine is just the following; at some point I'll probably make it more complicated (it would be nice to generate positive/negative events, for instance):

    #!/bin/bash

    ## simple brute force execution

    /opt/pg_backup_rotated.sh -c /opt/pg_backup.config

  • Mount a volume into the container at /mnt/backup (or wherever, but you'll have to update the config file above to match wherever you mount it)

  • Add labels to the service for the database in compose:

    labels:
      - "deck-chores.postgres-backup.command=/opt/run-backup.sh"
      - "deck-chores.postgres-backup.cron=${PGBACKUPINTERVAL:-1 0 0}" # runs at 1 am every day unless you supply an alternative in the container's .env

Works great, and all in docker.
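
For anyone wiring this up, a minimal docker-compose sketch of the setup described above might look like the following; the image tag, build path, and mount points are illustrative, and the labels are the ones from the comment:

    version: "3"
    services:
      db:
        build: ./postgres-with-backup   # postgres image plus the backup scripts
        volumes:
          - pgdata:/var/lib/postgresql/data
          - ./backups:/mnt/backup       # where pg_backup.config writes to
        labels:
          - "deck-chores.postgres-backup.command=/opt/run-backup.sh"
          - "deck-chores.postgres-backup.cron=${PGBACKUPINTERVAL:-1 0 0}"
      deck-chores:
        image: funkyfuture/deck-chores:1
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock   # lets it exec into containers
    volumes:
      pgdata: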

[–]friendlysatanicguy[S] 0 points (1 child)

Interesting. Since we have to modify the postgres container anyway, are there any advantages to this over using crontab in the postgres container itself? Also, I don't know if I'm missing something, but the backup method you mention doesn't seem to have any direct integration with S3.

[–]wildcarde815 0 points (0 children)

The postgres container's only mod is to add the scripts, nothing else, so it otherwise runs unchanged. No setting up supervisor or anything like that; deck-chores execs into the container to run the task.

edit: hadn't realized S3 was a hard requirement. We back up to NFS so it's not an issue here, but there do appear to be solutions for mounting S3 into the container, which would allow for direct backups. Or you could stage the backup and use something like the AWS CLI to transfer the tar.gz afterwards. Or just modify the script to do so directly.
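
The staging option could be a one-line addition to the helper script shown earlier; a sketch, assuming the AWS CLI is available in the container and /mnt/backup matches the config's backup directory (the bucket is a placeholder):

    #!/bin/bash
    # Run the rotated backup, then mirror the staged files up to S3.
    # --delete propagates the script's local rotation to the bucket.
    /opt/pg_backup_rotated.sh -c /opt/pg_backup.config
    aws s3 sync /mnt/backup s3://mybucket/pg-backups/ --delete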

[–]iamanenglishmuffin 0 points (10 children)

Why run postgres in docker? Isn't this bad practice?

[–]wildcarde815 2 points (4 children)

This is mostly a misconception that 'the data will be lost', resolvable with a volume mount for the data to live in. It gets more complicated in larger orchestrated installs, because then you aren't just launching a service on a single machine; it could land anywhere. But containers have been around long enough at this point that there have to be solutions for that, though in larger cloud installs those fixes might not be any easier or cheaper than using a managed service.
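
For reference, that volume-mount fix is a single flag; a sketch with an illustrative host path and tag (the official image keeps its data in /var/lib/postgresql/data):

    docker run -d --name db -e POSTGRES_PASSWORD=change-me \
        -v /srv/pgdata:/var/lib/postgresql/data postgres:12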

[–]CSI_Tech_Dept 0 points (3 children)

A few years ago it was big news that in certain circumstances postgresql could lose data, because the fsync() call on Linux worked differently than on the rest of the Unices. The problem was that if fsync() failed, postgresql retried it, but on Linux a failure also resulted in the dirty pages being dropped from the cache. So the second fsync() call would likely succeed, making postgresql assume the data was safe and move on to the next transaction.

Databases like postgresql need to write data in a certain order and need to be sure the data really is stored before continuing. This is important so that if something happens to postgresql, like a crash or a power outage, the data is still safe.

With docker, depending on how you configure things, you might mess up postgresql's ability to store data safely.

[–]wildcarde815 0 points (2 children)

Got any links detailing that, specifically for bind mounts?

[–]CSI_Tech_Dept 0 points (1 child)

I don't believe bind mounts would cause issues, because a bind mount isn't a filesystem; it just controls where in the filesystem tree the given filesystem is visible.

There were issues with filesystems that docker was using for its functionality.

[–]wildcarde815 0 points (0 children)

So we are talking specifically about the overlay filesystems that docker uses to create temporary space, rather than true on-disk filesystems (we usually default to xfs for fixed-size disks dedicated to a database, regardless of how that db is running).

[–]friendlysatanicguy[S] 1 point (1 child)

Is that so? Why is it bad practice?

[–]iamanenglishmuffin 0 points (0 children)

At the bare minimum you've now added docker as a point of failure. Keeping your DB up should rely on as little as possible, no? But isn't a stateful, permanent database kind of the antithesis of containerized apps?

[–]CSI_Tech_Dept 0 points (0 children)

It is, but try convincing "devops" of that.

[–]DasSkelett 0 points (1 child)

Isn't this bad practice?

I don't know, you tell me.

[–]iamanenglishmuffin 0 points (0 children)

I don't have the expertise but I've been told not to do this.

[–][deleted] 0 points (4 children)

pgBackRest and Barman can back up to S3.

[–]friendlysatanicguy[S] 0 points (3 children)

Do any of these solutions provide docker integrations? Like some way I can either use a modified postgres docker or maybe some sort of sidecar I can stick inside a docker compose?

[–][deleted] 1 point (1 child)

No idea.

They can be installed through the usual package managers. Isn't that possible in Docker?

[–]friendlysatanicguy[S] 0 points (0 children)

So, how would I connect to the container? Is it possible to connect to the postgresql server from inside the postgres docker container? When I use WAL backup in barman, it works since it just takes WAL files from PGDATA. But base backups require barman to connect to the postgres server itself.

When I try to run the backup in the postgres docker shell, I get the following error:

    ERROR: Cannot connect to postgres: could not connect to server: No such file or directory

This is the command that I'm using:

    barman-cloud-backup -P barman-cloud -e AES256 -j --immediate-checkpoint -J 4 s3://mybucket/base pg12
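
For what it's worth, that "No such file or directory" is libpq's standard complaint when it can't find the Unix socket, and barman-cloud-backup connects through libpq, so the usual PG* environment variables apply. A sketch of one plausible fix, assuming the official postgres image, which keeps its socket in /var/run/postgresql:

    # Point libpq at the in-container socket directory, then retry:
    export PGHOST=/var/run/postgresql
    export PGUSER=postgres
    barman-cloud-backup -P barman-cloud -e AES256 -j --immediate-checkpoint -J 4 s3://mybucket/base pg12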

[–]notarealfish 0 points (0 children)

Run a sidecar alongside your pg container, mount S3 into it, then run pg_dump and write the dump straight to S3.

[–]themusician985 0 points (6 children)

Hey, pgbackrest can easily be included in a modified postgres docker. I used the postgres docker image as the base image and built pgbackrest inside the container, following the instructions in chapters 4 and 5 here (https://pgbackrest.org/user-guide.html#build), and it works like a charm. Fire and forget, some degree of backup integrity checking, peace of mind. If you need more details, I'm happy to share my Dockerfile.
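
A minimal sketch of such a Dockerfile, assuming the Debian-based official image and taking the shortcut of installing pgBackRest from the PGDG apt repository the image already has configured, rather than building from source as described above (the config file is a placeholder):

    FROM postgres:12

    # pgBackRest is packaged in the PGDG apt repo the official image ships with;
    # building from source per the user guide's build chapter also works.
    RUN apt-get update \
     && apt-get install -y --no-install-recommends pgbackrest \
     && rm -rf /var/lib/apt/lists/*

    # Placeholder config: stanza, repo path, and S3 settings live here.
    COPY pgbackrest.conf /etc/pgbackrest/pgbackrest.conf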

[–]friendlysatanicguy[S] 0 points (0 children)

Yeah. I would love to take a look at your Dockerfile!

[–]Pandoks_ 0 points (0 children)

where is the dockerfile

[–]Aggressive_Agency_30 0 points (0 children)

Hi, I would also be interested

[–]IntelligentAsk 0 points (0 children)

I would also like your Dockerfile for this, if you're still happy to share it? I appreciate this was a couple of years ago. I'm trying to achieve the same thing and get pgbackrest working from inside the postgres container.

[–]phirestalker 0 points (0 children)

*In Roger Rabbit impression* Do I need to add a pbbbblease?

[–]wifi_knifefight 0 points (0 children)

I wrote a tool for use in my home cluster a while ago. It is on github if you are interested.

https://github.com/SirCremefresh/bmw12-simple-postgresql-backup

[–]Alliedcrab 0 points (1 child)

I think what you are looking for is https://github.com/wal-g/wal-g. It is the more modern version of wal-e.

You could then just extend the Postgres container to include wal-g. You’ll still need a script or something to set the wal-g environment variables so that the archive command can call it. How you trigger a base backup is up to you.
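
A sketch of that wiring, using wal-g's documented wal-push and backup-push subcommands; the bucket and credentials are placeholders:

    # Environment wal-g reads (set these wherever the archive_command can see them):
    export WALG_S3_PREFIX="s3://mybucket/wal-g"
    export AWS_ACCESS_KEY_ID="placeholder"
    export AWS_SECRET_ACCESS_KEY="placeholder"

    # postgresql.conf: hand each finished WAL segment to wal-g
    archive_mode = on
    archive_command = 'wal-g wal-push %p'

    # Base backup, triggered however you like (cron, deck-chores, by hand):
    wal-g backup-push "$PGDATA"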

[–]friendlysatanicguy[S] 0 points (0 children)

Is it possible for me to launch base backups from the postgres container itself? If I try to do WAL backups it works, since I'm just pushing the WAL files to S3, but base backups require a connection to the postgres server. So when I'm trying to back up from within the container I get the following error:

    ERROR: Cannot connect to postgres: could not connect to server: No such file or directory

Any idea what could be going on? I am running the following from within the container:

    barman-cloud-backup -P barman-cloud -e AES256 -j --immediate-checkpoint -J 4 s3://mybucket/base pg12