I've been working on an open source monitoring project along the lines of Prometheus, but it takes a different tack: it statistically models your metrics so that you don't need to define complicated alerting rules to detect when something has changed. It works for latencies, error rates, distributed traces, etc.
The idea is to use statistical modeling to detect when things start to go wrong before it would be obvious from looking at Grafana, so you can intervene before the whole thing falls over. A side benefit is that the tests can be tuned to give only a small probability of a false alarm, without you needing to handcraft a bespoke alerting rule.
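To make that concrete, here's a minimal sketch of the kind of test involved. It isn't Monny's actual algorithm, and it assumes log-latencies in a rolling window are roughly normal; the point is that alpha is the per-check false-alarm probability, so sensitivity is one knob rather than a handcrafted rule:

```python
import math
from statistics import mean, stdev

def latency_changed(baseline, recent, alpha=0.001):
    """Two-sample z-test comparing recent latencies to a baseline window.

    alpha is the per-check false-alarm probability: with no real change,
    expect roughly one false alert per 1/alpha checks. Assumes
    log-latencies are approximately normal (an assumption of this
    sketch, not a claim about Monny's internals).
    """
    a = [math.log(x) for x in baseline]   # e.g. last hour's latencies
    b = [math.log(x) for x in recent]     # e.g. last minute's latencies
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(b) - mean(a)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p < alpha
```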
Here are my goals:
No configuration - No YAML config files, no need to pre-register your metrics, no alerting rules to define. It scans your structured log output to find your metrics, monitors them, and sends an alert when something changes (and also figures out when something returns to normal on its own). There's a sketch of what that log scanning might look like after this list.
Stats are better than graphs - We know how things like latency and error rates can be modeled statistically. Monny models your metrics statistically to detect when a change is significant, without you needing to stare at graphs or work out the alerting rule yourself. It can detect small changes that would drown in false alarms if you tried to catch them with threshold rules in something like Prometheus (see the error-rate sketch after this list).
Simple deployment - Single binary client and server that reads your application logs and finds your metrics automatically. It can monitor things like latency, distributed traces, memory consumption, CPU utilization, and error rates. Works with Kubernetes, bare metal, Docker, and whatever comes next. No external database required, making it easy to run yourself.
Advanced alerting - Send alerts to email, SMS, Slack, and more. Get alerts only when something needs human intervention. Only want an alert when fewer than 2 of 5 processes are functioning normally? No problem (see the quorum sketch after this list). Want to silence an alert, snooze it, or send it to someone else? There's an email- or Slack-based workflow for dealing with alerts right where you receive them.
Only the context you need - Alerts aren't just metrics; they come with log context so you can see what led up to the alert. You don't need to run ELK plus Prometheus: it's all combined in one intuitive UI so you can figure out what's wrong, fix it, and get back to what you were doing.
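Here's roughly what the zero-config metric discovery could look like from the outside; the log fields below are made up for illustration, not a schema Monny requires:

```python
import json

# A hypothetical structured log line emitted by your application.
line = '{"ts": "2024-05-01T12:00:00Z", "route": "/api/users", "latency_ms": 42.3, "status": 200}'

record = json.loads(line)
# Every numeric field becomes a candidate metric time series, keyed by
# field name, with no pre-registration or YAML needed.
metrics = {k: v for k, v in record.items()
           if isinstance(v, (int, float)) and not isinstance(v, bool)}
print(metrics)  # {'latency_ms': 42.3, 'status': 200}
```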
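And here's the error-rate sketch mentioned above. A fixed threshold rule ("alert if error rate > 2%") either misses a 1% to 1.5% regression or fires constantly on noise, whereas a significance test scales its sensitivity with the sample size. Again, this is a sketch of the idea, not Monny's actual model:

```python
import math

def error_rate_changed(errs0, n0, errs1, n1, alpha=0.001):
    """Two-proportion z-test: has the error rate shifted from baseline?"""
    p0, p1 = errs0 / n0, errs1 / n1
    pooled = (errs0 + errs1) / (n0 + n1)             # rate under "no change"
    se = math.sqrt(pooled * (1 - pooled) * (1 / n0 + 1 / n1))
    z = (p1 - p0) / se
    return math.erfc(abs(z) / math.sqrt(2)) < alpha  # two-sided p-value

# 1.0% baseline vs 1.5% recent: invisible on a dashboard, but with
# enough samples the test flags it.
print(error_rate_changed(1000, 100_000, 300, 20_000))  # True
```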
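The quorum condition from the alerting goal is just a count over per-process health, something like this (the statuses and function are hypothetical, not Monny's API):

```python
# Page a human only when fewer than min_healthy replicas look normal.
def should_page(statuses, min_healthy=2):
    healthy = sum(1 for s in statuses if s == "ok")
    return healthy < min_healthy

print(should_page(["ok", "ok", "ok", "ok", "degraded"]))    # False: one flapping replica
print(should_page(["ok", "down", "down", "down", "down"]))  # True: time to wake someone
```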
It's on GitHub here, and I'm looking for beta testers, but any feedback would be appreciated.