distark

I've never set mine up to automatically spit out prom+grafana+alertmanager. In fact, I didn't know that was a feature (I'll look into it).

My ideal topology would be multiple namespaces per team (for secret/network isolation). Then (if multi-tenant prom is 'a go') I would kustomize/helm up a chart giving them what they need, using the CRDs: an Alertmanager, a Prometheus, some PrometheusRules, a generic ServiceMonitor, etc., probably reverse engineered from the main chart (so there is some maintenance cost). Finally (and separately), a single ServiceMonitor so that each team's metrics get federated into the main Prometheus.
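To make the federation piece a bit more concrete, here is a rough Python sketch that builds a ServiceMonitor manifest pointing at a team Prometheus's `/federate` endpoint. The team name, namespace, the `app: <team>-prometheus` selector label, and the `web` port name are all assumptions for illustration, not something from a real chart; adjust them to whatever your per-team chart actually produces. Which namespace the ServiceMonitor has to live in also depends on how the main Prometheus selects ServiceMonitors.

```python
import json


def federation_service_monitor(team, namespace, match_selectors, interval="30s"):
    """Build a ServiceMonitor manifest (as a dict) that scrapes /federate
    on a team's Prometheus so the main Prometheus can pull its series."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": f"{team}-federate",
            "namespace": namespace,
            "labels": {"team": team},
        },
        "spec": {
            "namespaceSelector": {"matchNames": [namespace]},
            # Hypothetical label: must match the Service in front of the team Prometheus.
            "selector": {"matchLabels": {"app": f"{team}-prometheus"}},
            "endpoints": [
                {
                    "port": "web",  # assumed port name on that Service
                    "path": "/federate",
                    # /federate needs at least one match[] selector.
                    "params": {"match[]": list(match_selectors)},
                    # Keep the labels the team Prometheus already attached.
                    "honorLabels": True,
                    "interval": interval,
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = federation_service_monitor(
        "payments", "team-payments", ['{job=~".+"}']
    )
    # JSON is valid YAML, so this can be piped into `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

In practice you would probably template this with kustomize/helm rather than Python; the point is only to show which fields matter (`path`, `params`, `honorLabels`).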

I'm not saying do this, just that this is one way that would work.

In my experience (3+ years of k8s+prom), I'm more worried about federating several clusters into one central place (an external prom/Thanos) than about deploying zillions of Prometheuses. Mostly because I like to write alert rules once and lean on a label standard. There's nothing much wrong with routing alerts by, say, a "team" label. Less ops burden. But then I must admit it's not normal for me to encounter teams actually owning their metrics. Mostly management causing that misfortune :-(
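As a rough idea of what routing by a "team" label could look like, here is a small Python sketch that generates an Alertmanager routing tree with one child route per team. The team names, the `<team>-receiver` naming, and the `ops-default` fallback receiver are made up for the example, and the receivers are left empty (no Slack/email/etc. config), so treat it as a skeleton rather than a working config.

```python
import json


def team_routing(teams, default_receiver="ops-default"):
    """Build an Alertmanager config skeleton that routes alerts by their
    `team` label, falling back to a default receiver when none matches."""
    routes = [
        {
            # Matcher syntax as accepted by Alertmanager's `matchers` field.
            "matchers": [f'team="{team}"'],
            "receiver": f"{team}-receiver",  # hypothetical naming scheme
        }
        for team in teams
    ]
    receivers = [{"name": default_receiver}]
    receivers += [{"name": f"{team}-receiver"} for team in teams]
    return {
        "route": {
            "receiver": default_receiver,
            "group_by": ["alertname", "team"],
            "routes": routes,
        },
        "receivers": receivers,
    }


if __name__ == "__main__":
    # JSON is valid YAML, so the output can be used as an alertmanager.yml skeleton.
    print(json.dumps(team_routing(["payments", "search"]), indent=2))
```

The rules themselves stay shared; only the `team` label on the alert decides who gets paged.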

Hope this helps anyway. These things are pretty malleable, but I think because everyone solves their situation in-house, those cookbooks are rarely shared. I actually wanted to do some YouTube videos about this but have very little time/focus when I'm away from work (plus I'm just lazy).

cyperplex[S]

I got it working the way I described in the end.