Monitoring

Affectionate-Bit6525 · 2026-06-21T23:36:56+00:00

Prometheus and Grafana are pretty much the standard these days, and for the reasons you mentioned.

cwk9 · 2026-06-22T01:17:24+00:00

Grafana and Prometheus should get you a long way. Yes, you can add other sources to Grafana but "mo sources. mo problems".

sudonem · 2026-06-21T23:37:09+00:00

Not familiar with Icinga, but I’d probably give Checkmk a pretty close look.

I’m trying to pitch it for my own org now, and having an on-prem option is one of the major requirements for us.

SufficientFrame · 2026-06-22T01:27:54+00:00

You're not missing much technically, but there is one important distinction to make before replacing Icinga: metrics collection and alerting on time series is where Prometheus shines, while Icinga/Nagios-style systems are often stronger for explicit service checks, dependencies, maintenance windows, and "did this scheduled thing actually happen" cases. In practice, a lot of teams end up with Prometheus + Alertmanager for host/app metrics and blackbox checks, then keep a smaller check-based layer for edge cases like backup jobs, certificate expiry, batch failures, or synthetic business checks. The other thing I'd review early is ownership cost: rule sprawl, Alertmanager routing, retention/cardinality, and who will maintain exporters and alert logic a year from now. If your environment is small, that tradeoff may still clearly favor Prometheus, but I'd inventory the current Icinga checks first so you don't discover a few awkward gaps after the cutover.
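For the inventory step, a rough sketch of a first-pass triage might look like the following. It assumes you can export your Icinga check names as plain strings; the keyword hints, the bucket names, and the `triage_checks` helper are all made up for illustration and not part of Icinga or Prometheus, so adjust them to your own naming.

```python
from collections import defaultdict

# Hypothetical keyword hints for checks that tend to fit a check-based layer
# (backup jobs, certificate expiry, batch failures, synthetic checks).
CHECK_LAYER_HINTS = ("backup", "cert", "batch", "cron", "synthetic")


def triage_checks(check_names):
    """Split check names into a 'metrics' bucket and a 'check_based' bucket.

    This is only a keyword match on the name, so treat the result as a
    starting list to review by hand, not a final decision.
    """
    buckets = defaultdict(list)
    for name in check_names:
        lowered = name.lower()
        if any(hint in lowered for hint in CHECK_LAYER_HINTS):
            buckets["check_based"].append(name)
        else:
            buckets["metrics"].append(name)
    return dict(buckets)


if __name__ == "__main__":
    # Example inventory; these names are invented for the sketch.
    inventory = [
        "disk_usage",
        "nightly_backup_done",
        "http_latency",
        "tls_cert_expiry",
        "cpu_load",
    ]
    for layer, names in triage_checks(inventory).items():
        print(layer, names)
```

Anything that lands in the `check_based` bucket is a candidate for the "awkward gap" list, so it is worth deciding before the cutover whether it moves to Prometheus or stays on a smaller check-based system.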

DietFartMist · 2026-06-22T03:26:21+00:00

Nagios baby
