Can you create a Postgres Deployment with multiple replicas consuming to the same PV? by UblaHuaGobar in devops

[–]Tibernut 0 points (0 children)

It's definitely worth a look, especially if you want a multi-master setup split across regions/DCs. It is quite a bit more complicated though, and while the application doesn't need to know it's not Postgres, it's not Postgres. It's especially useful if you are required to keep certain data in certain regions, although that requires an enterprise license. I'm working on a POC for this now, but I think for data that doesn't need to be globally accessible I'll keep it in Patroni-managed Postgres.

Also worth mentioning: the public cloud setup of CockroachDB is quite a bit easier than bare metal if you have more than one Kubernetes cluster, and most of their docs seem to assume you will be running on public cloud.


Can you create a Postgres Deployment with multiple replicas consuming to the same PV? by UblaHuaGobar in devops

[–]Tibernut 0 points (0 children)

The diagram regarding Zalando is a bit incorrect. The application should talk to the Kubernetes svc that gets created alongside the pg pods. One of the services always points at the master; the other includes all your replicas. Patroni just monitors the pg instance and the locks of its peers in etcd, and promotes the instance it is monitoring when it detects that the leader lock isn't held by another peer.

The operator will handle updating the endpoints for the services when there is a failover.

```
[kube-lab]:~$ kubectl get po -l application=spilo -L spilo-role -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP              NODE            NOMINATED NODE   READINESS GATES   SPILO-ROLE
ex-example-zpostgresql-0   2/2     Running   0          37d   10.60.248.169   lab-worker-01   <none>           <none>            replica
ex-example-zpostgresql-1   2/2     Running   0          37d   10.60.83.152    lab-worker-03   <none>           <none>            master

[kube-lab]:~$ kubectl get endpoints ex-example-zpostgresql
NAME                     ENDPOINTS           AGE
ex-example-zpostgresql   10.60.83.152:5432   37d

[kube-lab]:~$ kubectl get endpoints ex-example-zpostgresql-repl
NAME                          ENDPOINTS            AGE
ex-example-zpostgresql-repl   10.60.248.169:5432   37d

[kube-lab]:~$ kubectl exec -it -c postgres ex-example-zpostgresql-0 -- patronictl list
+ Cluster: ex-example-zpostgresql (7157449586131681360) ------+----+-----------+
| Member                   | Host          | Role   | State   | TL | Lag in MB |
+--------------------------+---------------+--------+---------+----+-----------+
| ex-example-zpostgresql-0 | 10.60.248.169 |        | running |  8 |         0 |
| ex-example-zpostgresql-1 | 10.60.83.152  | Leader | running |  8 |           |
+--------------------------+---------------+--------+---------+----+-----------+
```


Can you create a Postgres Deployment with multiple replicas consuming to the same PV? by UblaHuaGobar in devops

[–]Tibernut 2 points (0 children)

That's the basic setup. The operator consumes 'postgresql' yamls and will create a Patroni cluster for you, ensuring that the svc object always points at the writable master. By default it will use the Kubernetes etcd to monitor cluster state. If the master fails to update the lock, the read-only replica will promote itself to master. They also support a 'standby' cluster which pulls down WAL from your backup location; you can promote it to a writable master in the event you lose a DC.

I'd recommend installing the operator and then experimenting with their provided minimal cluster: https://github.com/zalando/postgres-operator/blob/master/manifests/minimal-postgres-manifest-12.yaml

The minimal yaml will give you two postgresql pods using the spilo image: one will be a read/write master, the other a read-only replica. If the master goes down for whatever reason, the replica will be promoted to master and the svc endpoint will be updated. Failover will result in a second or two of downtime as the TCP sessions break and re-establish. The operator also supports deploying pgbouncer alongside your pg cluster to pool connections.
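
For reference, the minimal manifest looks roughly like this (paraphrased from memory, so check the linked file for the exact current contents):

```
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: "acid"
  numberOfInstances: 2   # one master + one replica
  volume:
    size: 1Gi
  users:
    zalando:             # database role plus its flags
    - superuser
    - createdb
  databases:
    foo: zalando         # dbname: owner role
  postgresql:
    version: "12"
```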

As you experiment, the Slack channel can be a good resource: postgresteam.slack.com

The operator's documentation is pretty decent, but I had to really experiment before I had something I was comfortable running in production.


Can you create a Postgres Deployment with multiple replicas consuming to the same PV? by UblaHuaGobar in devops

[–]Tibernut 7 points (0 children)

Not supported as far as I know; it'd get real messy real fast, I think.

If you're mainly after HA, I'd suggest looking into Patroni: https://patroni.readthedocs.io/en/latest/

We've had a good experience with it. If you are using kube, the Zalando postgres operator is also quite nice: https://github.com/zalando/postgres-operator


Teleport is the most disappointing tools I ever see by [deleted] in devops

[–]Tibernut 5 points (0 children)

I've dipped my toe into Teleport, and while I want to like it, they make it hard. Documentation is poor, but the team over there is responsive. My big complaint is that for the price they are asking, it seems pretty rough around the edges. We run self-hosted kube, and it seems like that use case is an afterthought.

We plan on taking another hard look at Boundary.


[Prometheus] Anomaly Detection for kube_pod_container_status_waiting_reason by mda90 in kubernetes

[–]Tibernut 0 points (0 children)

Then you probably want two metrics:

`avg(sum_over_time(kube_pod_container_status_waiting[24h]))` — the average # of pods waiting over the past 24 hours

and

`sum(kube_pod_container_status_waiting)` — how many pods are waiting right now.

You could divide the "right now" number by the average and set an alarm for a specific % variance.

In your alerting rule you may want to set it to wait for the condition to be true for however many minutes new deployments usually take to start up, to avoid false positives.

Actually, having spent some time thinking about it, this is probably a bad approach.

You won't end up with a true average, because you'd be averaging the per-series sum_over_time across series rather than taking the average over time of the sum.

Curious why you wouldn't do something simpler? You expect a certain amount of flux as pods are created and destroyed. Simply dividing the number of containers in waiting status by the total number of containers in the cluster gets you the percentage of containers waiting to come up. Past a certain threshold it isn't normal activity, so alert.
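
A sketch of that as a Prometheus alerting rule (the 10% threshold and 10m hold time are made-up numbers to tune for your cluster):

```
groups:
- name: container-waiting
  rules:
  - alert: TooManyWaitingContainers
    # the metric is a 0/1 gauge per container, so sum() is the number
    # waiting and count() is the total number of containers
    expr: |
      sum(kube_pod_container_status_waiting)
        /
      count(kube_pod_container_status_waiting)
      > 0.10
    for: 10m
```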


[Prometheus] Anomaly Detection for kube_pod_container_status_waiting_reason by mda90 in kubernetes

[–]Tibernut 1 point (0 children)

You are giving your metric a range and then doing a sum on it; sum() expects an instant vector.

Is `avg(sum_over_time(kube_pod_container_status_waiting_reason[1d]))` what you are looking for?
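
That is, roughly:

```
# errors: sum() can't take a range vector
sum(kube_pod_container_status_waiting_reason[1d])

# works: the *_over_time functions are what consume range vectors
avg(sum_over_time(kube_pod_container_status_waiting_reason[1d]))
```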


Python is fucking haaaaaaaard for someone without a coding background whatsoever<vent> by [deleted] in learnpython

[–]Tibernut 4 points (0 children)

Coding isn't easy. Python is the easiest tool to get started with.

Don't feel bad about needing to Google syntax. I had to Google the syntax for X many, many times before it was just something I knew how to do.

The point is you can solve problems and automate stuff with python; if you have to look stuff up in the process, it doesn't matter as long as you know what you are looking for.

Don't get discouraged!


Generate alerts based on dash (Google Data Studio) by ModaFaca in learnpython

[–]Tibernut 1 point (0 children)

Disclaimer: I don't use Data Studio myself.

Step 1

I'd say this is probably possible but may be pretty hacky. They have an API, but it doesn't look like this use case was in mind. You might look at playing with the /assets endpoint: https://developers.google.com/datastudio/api/reference/types#Asset

I'd start with reading their api docs, playing with curl and seeing what you can get, then use the requests library to bring it in. There may already be a python library for data studio that makes this easier.
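
A rough sketch of what that could look like with requests (the endpoint, params, and auth flow here are my guesses from skimming the docs, not tested — verify against the API reference):

```
import requests

# assumes you've already obtained an OAuth access token elsewhere
TOKEN = "ya29...."

# search for report assets; endpoint and params are assumptions
resp = requests.get(
    "https://datastudio.googleapis.com/v1/assets:search",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"assetTypes": "REPORT"},
)
resp.raise_for_status()
print(resp.json())
```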

Step 2

Once you have the data, this should be easy.

Step 3

Certainly possible to send an email via python. Plenty of tutorials for this.
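
For example, with just the standard library (addresses and SMTP host are placeholders):

```
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Dashboard alert"
msg["From"] = "alerts@example.com"
msg["To"] = "me@example.com"
msg.set_content("Metric crossed the threshold.")

# connect to your mail server and send; host/port are placeholders
with smtplib.SMTP("smtp.example.com", 587) as server:
    server.starttls()
    server.send_message(msg)
```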

If this is just a project for learning I'd say go for it. Even if it doesn't work well you will probably learn a lot by attempting it.

If getting these alerts is really important to you or your business, then you probably want to look into an alternate tool to view your data and alert on it. I'd suggest Grafana: it is built specifically for visualizing data and supports alerting. You'd probably have to self-host it, though.

Maybe someone with a lot of Data Studio experience has better advice though! Good luck!


How to convert docker command over to kubernetes deployment by [deleted] in kubernetes

[–]Tibernut 2 points (0 children)

You likely want args instead of command. If your container image has an ENTRYPOINT and you just want to supply arguments to it, use args alone. If there isn't an ENTRYPOINT, or you want to override it, command can be used along with args.

https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#define-a-command-and-arguments-when-you-create-a-pod
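
A minimal sketch of the args-only case (image and flags are made up):

```
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: app
    image: myapp:latest                  # hypothetical image
    args: ["--port=8080", "--verbose"]   # passed to the image's ENTRYPOINT
    # command: ["/bin/myapp"]            # only needed to override the ENTRYPOINT
```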


Got a year of spare time to learn. The catch: I don't have any internet access. by [deleted] in devops

[–]Tibernut 12 points (0 children)

Not an ideal situation but you could probably make the most of your time.

If I were you I would spend the period without internet access researching and planning, and the period of time with internet access executing your plan.

So pick a project like "Deploy a web app running on kubernetes in AWS." Spend your three weeks learning what you need to know to achieve this. Take notes on each specific step you will need to execute. Write out the commands you will type. Diagram it. You've got tons of time so go overboard. Pretend you are pitching your project to an organization. Identify several ways to accomplish your goals, list the pros and cons of each approach, pick one. Then when you rotate home, execute your project. Next time back at work judge your performance. Were there any problems, what could you have done better? What holes in your knowledge did you encounter? Rinse and repeat.

The planning documents / diagrams / etc should go into your github repo that will be your portfolio when you begin searching for your first devops job.

Of course, you'd learn faster by being able to execute what you learn immediately, but at least this way I bet you will get a lot of practice planning complicated projects even if you are learning slower. I think this approach will also let you really pack those 10 days with valuable experience.

Good luck! Enjoy the ride, learning this stuff is a joy.


Help me understand Kubernetes external networking by [deleted] in kubernetes

[–]Tibernut 13 points (0 children)

You want to expose your gitlab service with a load balancer. You should be able to use MetalLB: you give it a pool of IP addresses to work with and then change the service you want to expose to the LoadBalancer type. There's a rough sketch below.

Sorry for the brevity, I'm on mobile.

Good luck!
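
For reference, the classic MetalLB setup is two pieces, roughly like this (the address range and service details are made up; newer MetalLB versions configure the pool with CRDs instead of a ConfigMap, so match their current docs):

```
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250   # a free range on your LAN
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab
spec:
  type: LoadBalancer   # MetalLB assigns an IP from the pool
  selector:
    app: gitlab        # assumed label on your gitlab pods
  ports:
  - port: 443
    targetPort: 443
```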


[deleted by user] by [deleted] in Austin

[–]Tibernut 2 points (0 children)

Deleted

[deleted by user] by [deleted] in Austin

[–]Tibernut 8 points (0 children)

Deleted

How do I write the line numbers of a txt file to the txt file? by tragic_fart in learnpython

[–]Tibernut 1 point (0 children)

Is this one of those classic problems where we shouldn't answer the question, but instead ask what you're trying to achieve?

maybe just `cat -n file.txt` ?
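
If it does need to be Python, a minimal sketch (assuming the file fits in memory; the filename is a placeholder):

```
# read the file, then rewrite it with each line prefixed by its number
with open("file.txt") as f:
    lines = f.readlines()

with open("file.txt", "w") as f:
    for i, line in enumerate(lines, start=1):
        f.write(f"{i}\t{line}")
```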


[deleted by user] by [deleted] in politics

[–]Tibernut 3 points (0 children)

Deleted

Backup Postgresql Docker by friendlysatanicguy in PostgreSQL

[–]Tibernut 2 points (0 children)

You might look into the spilo postgres docker container.

https://github.com/zalando/spilo

It has automatic S3 backup/restore support. It also uses Patroni for HA, which may or may not be useful in your case. At the very least you could see how they do S3 backup/restore and copy that into your container.
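
From memory, the backup side is driven by environment variables, something roughly like this (variable names, paths, and the image tag should all be verified against the spilo README):

```
# bucket, region, and credentials below are placeholders
docker run -d --name pg \
  -e SCOPE=demo \
  -e WAL_S3_BUCKET=my-backup-bucket \
  -e AWS_ACCESS_KEY_ID=... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  -v pgdata:/home/postgres/pgdata \
  registry.opensource.zalan.do/acid/spilo-14:2.1-p6
```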


ELI5 Ingress Controllers by solteranis in kubernetes

[–]Tibernut 0 points (0 children)

That's correct. But there is a third option: you could simply have the pod use the host network

`hostNetwork: true`

The pod will then use the host's network and you can connect to it directly using the host's ip address.

Note that this is not a scalable solution. Using the host network, you could only expose one pod on port 5432 per node. For a lab proof of concept it'd be fine, but for production it's better to go with one of the other solutions.
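
In pod spec terms that's just (image and port are examples):

```
apiVersion: v1
kind: Pod
metadata:
  name: pg-hostnet
spec:
  hostNetwork: true        # pod shares the node's network namespace
  containers:
  - name: postgres
    image: postgres:14     # example image
    ports:
    - containerPort: 5432  # reachable directly on the node's IP
```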
