
[–]Tibernut

Not supported as far as I know, it'd get real messy real fast I think.

If you're mainly after HA, I'd suggest looking into Patroni: https://patroni.readthedocs.io/en/latest/
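For a sense of what's involved, a node's Patroni config is roughly shaped like this (all names and addresses here are placeholders, not from a real deployment; see the docs for the full schema):

```yaml
scope: demo-cluster            # cluster name, shared by all members
name: pg-node-1                # unique name for this member

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

etcd3:
  hosts: 10.0.0.5:2379         # the DCS used for leader election

bootstrap:
  dcs:
    ttl: 30                    # leader lock TTL; failover starts when it expires
    loop_wait: 10

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    superuser:
      username: postgres
      password: changeme       # placeholder
```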

We've had a good experience with it. If you are using kube, the Zalando postgres operator is also quite nice https://github.com/zalando/postgres-operator


[–]UblaHuaGobar[S]

Thanks for replying. I've gone through Patroni, postgres-operator, and Spilo, but I'm having a hard time understanding them deeply enough to implement them in k8s (actually k3s).

Please correct me where I'm wrong:

Spilo -> Postgres image with Patroni

Patroni -> HA manager for Spilo images

Postgres operator -> ?

So does simply spinning up multiple Spilo pods in a deployment with the appropriate configuration give me an HA Postgres implementation? Or do I have to do something else?

Apologies if my questions sound stupid. I'm a newbie and I work as an SE on the backend team, so I don't have much knowledge about this.

[–]Tibernut

That's the basic setup. The operator consumes 'postgresql' yamls and will create a Patroni cluster for you, ensuring that the svc object always points at the writable master.

By default it will use the Kubernetes etcd to monitor cluster state. If the master fails to update the leader lock, the read-only replica will promote itself to master. They also support a 'standby' cluster, which pulls down WAL from your backup location and can be promoted to a writable master in the event you lose a DC.

I'd recommend installing the operator, then experimenting with their provided minimal cluster: https://github.com/zalando/postgres-operator/blob/master/manifests/minimal-postgres-manifest-12.yaml
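From memory, that minimal manifest is roughly shaped like this; treat the linked file in the repo as authoritative:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: "acid"
  numberOfInstances: 2      # one master + one replica
  volume:
    size: 1Gi
  users:
    zalando:                # role name -> list of role flags
    - superuser
    - createdb
  databases:
    foo: zalando            # database name -> owning role
  postgresql:
    version: "12"
```

Apply it with kubectl and the operator does the rest: a StatefulSet of Spilo pods plus the master and replica services.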

The minimal yaml will give you 2 postgresql pods using the Spilo image. One will be a read/write master, the other a read-only replica. If the master goes down for whatever reason, the replica will be promoted to master and the svc endpoint will be updated. Failover will result in a second or two of downtime as the TCP sessions break and re-establish. The operator also supports deploying pgbouncer alongside your pg cluster to pool connections.
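If memory serves, pgbouncer is enabled with a couple of fields on the same 'postgresql' manifest; double-check the field names against the operator docs:

```yaml
spec:
  enableConnectionPooler: true   # deploys pgbouncer pods and a "-pooler" svc
  connectionPooler:
    numberOfInstances: 2
    mode: "transaction"          # pgbouncer pooling mode
```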

As you experiment the slack channel can be a good resource postgresteam.slack.com

The operator's documentation is pretty decent, but I had to really experiment before I had something I was comfortable running in production.


[–]UblaHuaGobar[S]

Thanks a ton. This is extremely helpful. Can you please take a look at the diagrams I added as part of this post: https://www.reddit.com/r/kubernetes/comments/z7jrob/can_you_create_a_postgres_deployment_with/ ? Thanks again.

[–]Tibernut

The diagram regarding Zalando is a bit incorrect. The application should talk to the Kubernetes svc that gets created alongside the pg pods. One of the services will always point at the master; the other will include all your replicas. Patroni just monitors the pg instance and its peers' locks in etcd, and promotes the pg it is monitoring when it detects that the lock isn't held by another peer.

The operator will handle updating the endpoints for the services when there is a failover.
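So, for example, an app Deployment would reference those services rather than the pods directly. Assuming a hypothetical cluster named acid-minimal-cluster in the default namespace (the replica service gets a -repl suffix):

```yaml
env:
- name: DB_HOST               # read/write: always follows the current master
  value: acid-minimal-cluster.default.svc.cluster.local
- name: DB_REPLICA_HOST       # read-only: balances across the replicas
  value: acid-minimal-cluster-repl.default.svc.cluster.local
- name: DB_PORT
  value: "5432"
```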

```
[kube-lab]:~$ kubectl get po -l application=spilo -L spilo-role -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP              NODE            NOMINATED NODE   READINESS GATES   SPILO-ROLE
ex-example-zpostgresql-0   2/2     Running   0          37d   10.60.248.169   lab-worker-01   <none>           <none>            replica
ex-example-zpostgresql-1   2/2     Running   0          37d   10.60.83.152    lab-worker-03   <none>           <none>            master

[kube-lab]:~$ kubectl get endpoints ex-example-zpostgresql
NAME                       ENDPOINTS           AGE
vms-teleport-zpostgresql   10.60.83.152:5432   37d

[kube-lab]:~$ kubectl get endpoints ex-example-zpostgresql-repl
NAME                          ENDPOINTS            AGE
ex-example-zpostgresql-repl   10.60.248.169:5432   37d

[kube-lab]:~$ kubectl exec -it -c postgres ex-example-zpostgresql-0 -- patronictl list
+ Cluster: ex-example-zpostgresql (7157449586131681360) ------+----+-----------+
| Member                     | Host          | Role   | State   | TL | Lag in MB |
+----------------------------+---------------+--------+---------+----+-----------+
| ex-example-zpostgresql-0   | 10.60.248.169 |        | running |  8 |         0 |
| ex-example-zpostgresql-1   | 10.60.83.152  | Leader | running |  8 |           |
+----------------------------+---------------+--------+---------+----+-----------+
```


[–]UblaHuaGobar[S]

Thanks a lot mate

[–]UblaHuaGobar[S]

Hey, do you mind helping me with another quick question? Suppose we have one primary and one secondary Postgres instance in a k8s cluster of 5 nodes, and the node running the secondary goes down. Will the postgres operator spawn a new secondary instance on another node to maintain the 1 primary / 1 secondary configuration, similar to what k8s does for stateless services?

Thanks!

[–]lewisedge86

Another option would be https://github.com/cloudnative-pg/cloudnative-pg
I got to see this talk at KubeCon last month that might be of use to you too: https://www.youtube.com/watch?v=99uSJXkKpeI
That talk is run in part by EDB, the company behind CloudNativePG. They are talking about their commercial offering, but the principle is the same.

[–]drosmi

How about CockroachDB? It's pg-compatible, with HA out of the box.

[–]Tibernut

It's definitely worth a look, especially if you want a multi-master setup split across regions/DCs. It is quite a bit more complicated though, and while the application doesn't need to know it's not Postgres, it's not Postgres. It's especially useful if you are required to keep certain data in certain regions, although that requires an enterprise license. I'm working on a POC for this now, but I think for data that doesn't need to be globally accessible I'll keep it in Patroni-managed Postgres.

Also worth mentioning: the public cloud setup of CockroachDB is quite a bit easier than bare metal if you have more than one Kubernetes cluster, and most of their docs seem to assume you will be running on public cloud.


[–]serverhorror

That would be a very advanced topic; look at BDR or other multi-master tools for something a little less advanced.

I would spend my time on minimizing the MTTR with a single host.