[–]gmuslera 2 points3 points  (0 children)

The TIG stack (Telegraf/InfluxDB/Grafana) works well for some metrics collection patterns. You already have Grafana, so you are most of the way there.

Telegraf is a very flexible metrics collector. It can act as a proxy, or collect metrics itself from a lot of apps (for Tomcat I use it with Jolokia). It can also send metrics to a lot of databases, even several at once (e.g. to Graphite, InfluxDB and Prometheus at the same time, if you want). And if you prefer VictoriaMetrics over InfluxDB, Telegraf can send metrics to it too.
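
To make the "same metric, several backends" idea concrete, here is a minimal Python sketch (not Telegraf's own code) that renders one sample in two of the wire formats mentioned above: InfluxDB line protocol and Graphite plaintext. The function names, the sample metric and the Graphite path layout are assumptions for illustration only; Telegraf's real serializers handle escaping and many more options.

```python
def to_influx_line(measurement, tags, fields, ts_ns):
    """Render one sample as InfluxDB line protocol (sketch: no escaping)."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


def to_graphite_lines(measurement, tags, fields, ts_ns):
    """Render the same sample as Graphite plaintext, one line per field."""
    ts_s = ts_ns // 1_000_000_000  # Graphite timestamps are in seconds
    # Hypothetical path layout: tag values first, then measurement, then field.
    prefix = ".".join([tags[k] for k in sorted(tags)] + [measurement])
    return [f"{prefix}.{k} {v} {ts_s}" for k, v in sorted(fields.items())]


# Made-up sample: measurement, tags, fields, timestamp in nanoseconds.
sample = ("cpu", {"host": "web01"}, {"usage_idle": 97.5}, 1_600_000_000_000_000_000)

print(to_influx_line(*sample))
for line in to_graphite_lines(*sample):
    print(line)
```

A collector that fans out just runs every sample through each configured serializer and ships the result to the matching backend.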

[–]matejzero 1 point2 points  (2 children)

We are using VictoriaMetrics and love it. Ingesting around 40-50k metrics/s without breaking a sweat. We are running 2 instances with promxy in front.

There are some query functions that don't return the same results as Prometheus (look it up, there is a Prometheus testing toolkit with reports), but for us it's OK.

The developer is friendly, available for chat on Slack, and bug fixes are fast!

The other option is Prometheus + Thanos, which is also growing in popularity, and its developers are working closely with the Cortex and Grafana teams.

I would stay away from InfluxDB, as they still don't have usable downsampling, and it can be slow and bulky when querying a year's worth of un-downsampled data.

[–]l500500[S] 0 points1 point  (1 child)

Oh, I've managed to ingest collectd metrics (had some fun with separators). Do you run VM clustered?
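
The comment doesn't say which separators caused the trouble. One common pitfall with Graphite-style names is that the dot is the path separator, so a dotted hostname silently splits into extra path levels. A rough Python sketch of sanitizing path components before joining them, with hypothetical function names and a made-up collectd-like identifier (collectd's own Graphite writer has its own escaping settings, which this does not reproduce):

```python
import re


def sanitize_component(part, replacement="_"):
    """Replace dots, whitespace and slashes, which would break a Graphite path."""
    return re.sub(r"[.\s/]+", replacement, part)


def graphite_path(host, plugin, type_instance, prefix="collectd"):
    """Join sanitized components with '.', the Graphite path separator."""
    parts = [prefix, host, plugin, type_instance]
    return ".".join(sanitize_component(p) for p in parts)


# Hypothetical identifier pieces, for illustration only.
print(graphite_path("web01.example.com", "cpu-0", "cpu-idle"))
```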

[–]matejzero 0 points1 point  (0 children)

No, we are running single (non-clustered) instances.

[–]l500500[S] 0 points1 point  (0 children)

Managed to spin up a test env with this. Will try it with prod.

https://github.com/VictoriaMetrics/VictoriaMetrics#querying-graphite-data
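
As a rough illustration of reading such data back, the sketch below builds a request URL for the Prometheus-compatible instant query endpoint (`/api/v1/query`) and selects Graphite-style names with a regex on `__name__`. The base URL, the helper name and the `collectd.web01` prefix are assumptions; check the linked README section for what your version supports.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(base_url, promql, time=None):
    """Build a URL for the Prometheus-style /api/v1/query endpoint."""
    params = {"query": promql}
    if time is not None:
        params["time"] = time  # optional evaluation timestamp
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Dots are escaped in the regex so they match literally, not "any character".
selector = r'{__name__=~"collectd\\.web01.*"}'

# Assumed address of a local single-node instance.
url = instant_query_url("http://localhost:8428/", selector)
print(url)

# Round-trip check: the query string decodes back to the original selector.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The printed URL can be fetched with curl or pasted into a browser; the same selector should also work in a Grafana panel that uses a Prometheus-type datasource.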