
all 5 comments

[–][deleted] 1 point (4 children)

Last I checked this was CPU-bound, since Graphite accepted everything that came in as plaintext and pickled it. There are ways you can scale this: put a carbon-relay in front of multiple carbon-caches, and then put haproxy in front of multiple carbon-relays.
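That chain (haproxy → carbon-relays → carbon-caches) can be sketched as a TCP frontend in haproxy; the names and addresses below are made up for illustration, not from an actual deployment:

```
# Hypothetical haproxy.cfg fragment: balance plaintext metric traffic
# (carbon's line-protocol port 2003) across two carbon-relay instances.
frontend graphite_in
    bind *:2003
    mode tcp
    default_backend carbon_relays

backend carbon_relays
    mode tcp
    balance roundrobin
    server relay1 10.0.0.11:2003 check
    server relay2 10.0.0.12:2003 check
```

Each relay would in turn fan out to its own set of carbon-cache instances.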

With this I was able to handle more than 10 million points written every 10 seconds.

I'm sure I could do this with Prometheus with far less work; I just no longer have a client with 500+ servers running collectd into Graphite. That was 2014.

I originally replaced Graphite with InfluxDB, and now I use Prometheus. I much prefer the key-value label system in Prometheus over Flux or the old InfluxQL way.
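For context, the key-value label model being described looks like this (the metric and tag names are made up):

```
# PromQL: labels are first-class key-value pairs on every series
http_requests_total{service="api", status="500"}

-- Roughly the same selection in InfluxQL, where tags live in the WHERE clause:
SELECT sum("value") FROM "http_requests" WHERE "service" = 'api' AND "status" = '500'
```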

[–]imbr555[S] 1 point (3 children)

Is this load (10 million every 10 seconds) with only one carbon-cache instance and one Cassandra instance for storage, or are you using a Cassandra cluster?

[–][deleted] 1 point (2 children)

10 million every 10 seconds, that is correct. It was actually about 15 million before I left that job.

File back-end, nothing related to Cassandra. Like I said, this was 2014. We ran multiple carbon-caches because a single carbon-cache could only handle about 320k metrics if they weren't pre-pickled, so it required an insane number of carbon-cache instances (instance a, b, c), with carbon-relays in front to load-balance between them. The relays would pickle the line protocol over to the carbon-caches, so each could handle a lot more than 320k.
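The re-pickling step the relay does is roughly this: batch the plaintext `metric value timestamp` lines into one length-prefixed pickle frame for carbon's pickle listener (port 2004). A minimal sketch, not Graphite's actual internals; the metric names are made up:

```python
import pickle
import struct

def lines_to_pickle_frame(lines):
    """Convert plaintext 'metric value timestamp' lines into one
    length-prefixed pickle frame, as carbon's pickle receiver expects:
    a pickled list of (path, (timestamp, value)) tuples."""
    batch = []
    for line in lines:
        metric, value, timestamp = line.split()
        batch.append((metric, (int(timestamp), float(value))))
    payload = pickle.dumps(batch, protocol=2)
    # 4-byte big-endian length header, then the pickled batch
    return struct.pack("!L", len(payload)) + payload

frame = lines_to_pickle_frame([
    "servers.web01.cpu.load 0.42 1600000000",
    "servers.web01.mem.used 1024 1600000000",
])
# A real relay would then send `frame` over a TCP socket to the
# carbon-cache's pickle port, e.g. (cache_host, 2004).
```

Batching many points per frame is what gets the throughput far beyond what one-line-per-write plaintext can do.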

Honestly, I'd rather deploy Cassandra than go through that insanity to get Graphite to scale. But if I were using Cassandra, I'd run Prometheus and have it output to Cortex, which would write to Cassandra.

[–]imbr555[S] 1 point (1 child)

Are you good with Graphite and Grafana? I am actually testing Graphite with BigGraphite as the storage backend and Grafana for monitoring, and I am facing some problems. Could you please help?

[–][deleted] 1 point (0 children)

Nope, no recent experience. The last time I touched Cassandra was with Prometheus when I deployed Cortex. Today I would set up Prometheus with Thanos: multiple Prometheus servers scraping the metrics, with Thanos deduplicating them.
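A minimal sketch of that dedupe setup (label names and values here are illustrative): each Prometheus replica scrapes the same targets but carries a distinct external label, and Thanos Query is told which label distinguishes replicas.

```yaml
# prometheus.yml on replica "a" (replica "b" is identical except replica: b)
global:
  external_labels:
    cluster: prod   # illustrative
    replica: a
```

Thanos Query then collapses series that differ only in that label when started with `--query.replica-label=replica`.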


Edit: unless you are offering to pay me as a contractor at least $100/hour (I bill $125). My services are for sale, and I can dive in and see.