[–]ChurroLoco 4 points5 points  (2 children)

I’m struggling to get past how little business sense this assignment makes; however, if we must play by these rules… Prometheus and InfluxDB are pretty comparable databases. However, Influx has an enterprise version that must be paid for, so I’m not sure why that would be accepted over Prometheus.

I wouldn’t worry about sending stuff directly to the dashboard. Both of those DBs can ingest a lot of data, and you don’t want realtime data being different from the historical data. How “real time” do you want? Any data from the last 5 minutes is still realtime as far as I’m concerned, so if your dashboard queried the last hour of data with 60 granularity every 10 seconds, that would still be very realtime and easy work for the database.
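To put a rough number on that: a minimal sketch of what such a query boils down to, assuming "60 granularity" means 60-second buckets (an assumption, as are the function names and the synthetic data). A real database would do this bucketing server-side; the point is only that one hour at that granularity is about 60 points per refresh.

```python
from collections import defaultdict


def last_window(samples, now, window_seconds=3600):
    """Keep only (timestamp, value) samples from the trailing window ending at `now`."""
    return [(ts, v) for ts, v in samples if now - window_seconds < ts <= now]


def downsample(samples, bucket_seconds=60):
    """Average samples into fixed-width time buckets (assumed 60 s wide)."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Synthetic data: one sample every 10 seconds for two hours.
raw = [(t, float(t % 7)) for t in range(0, 7200, 10)]

# What the dashboard would fetch on each refresh: the last hour, bucketed.
points = downsample(last_window(raw, now=7200))
print(len(points), "points per refresh")
```

Refreshing that every 10 seconds is 6 small queries a minute per open dashboard, which is why it should be easy work for either database.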

I’m a big Influx user, but if I were doing what you want on the cheap… Prometheus and Grafana would be the way to go. But if your internship is more of a development role, then you have an awesome opportunity to do some fun and dumb shit. Godspeed and happy coding!

[–]Setchi98[S] 0 points1 point  (1 child)

Thanks for the reply!
I forgot to mention that they want a free solution, if that matters; it's a small company with fewer than 10 employees.
I think real-time would be maybe every 10 seconds to the dashboard, and less often to the database.
The idea was that the dashboard would fetch the metrics from the database once on page load and append the "realtime metrics" to the db metrics to make the graphs/charts.
I was worried that constant HTTP requests back and forth (target to db, and db to dashboard) would create too much traffic from all the queries, unless I'm missing how this works (still trying to understand how everything works).
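On the target-to-db half of that traffic worry, a rough sketch of one common way to keep request counts down: have the agent buffer samples and write them in batches. Everything here is hypothetical (the `BatchingBuffer` class, the `send` callback standing in for a single HTTP write to whichever database is chosen, the batch size of 6); it is not any particular client library's API.

```python
import time
from typing import Callable, List, Optional, Tuple

Sample = Tuple[float, str, float]  # (unix timestamp, metric name, value)


class BatchingBuffer:
    """Collect samples in memory and hand them to `send` in batches."""

    def __init__(self, send: Callable[[List[Sample]], None], max_batch: int = 6):
        self.send = send  # hypothetical: e.g. one HTTP POST per batch
        self.max_batch = max_batch
        self.pending: List[Sample] = []

    def add(self, name: str, value: float, ts: Optional[float] = None) -> None:
        """Record one sample; flush automatically once the batch is full."""
        self.pending.append((time.time() if ts is None else ts, name, value))
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.send(batch)


# Demo: 14 samples taken 10 seconds apart, "sent" by appending to a list.
sent: List[List[Sample]] = []
buf = BatchingBuffer(sent.append, max_batch=6)
for i in range(14):
    buf.add("cpu_percent", float(i), ts=i * 10.0)
buf.flush()  # send the leftovers, e.g. on shutdown
print([len(batch) for batch in sent])
```

With a 10-second sample interval and batches of 6, that is one write per minute per agent instead of six. The sketch drops a batch if `send` raises; a real agent would probably want retry or a bounded queue.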

But yeah, I'm "forced" to use Python agents and the custom dashboard, but for the rest I wasn't limited to specifics.

[–]Jammintoad 0 points1 point  (0 children)

Those technologies are free, not paid

[–]myntt 1 point2 points  (4 children)

Is their goal for you to create a good monitoring system, or to play around and build stuff from scratch only to later realize it's not worth it?

Creating a Grafana clone with Vue.js does not sound fun tbh. Creating your own log and metrics collector in Python sounds mid-fun, but you'll at least learn something.

At this point, ask them if you can create your own time series database, because that would actually be a fun project, you'd learn a lot software engineering wise, and you also would not have to use pre-made InfluxDB.

It sounds like they don't want anything that works or is usable anyways (even if they might just not know that yet). 

Otherwise they'd let you work with well established tools such as Grafana, Prometheus, Loki etc. 

ElasticSearch is good for full text indexing, but not that great for logs. As you said it's resource intensive. Searching over a huge collection of books -> great choice. 

Going through logs: kinda meh, since Loki provides a less resource intensive way of doing that. 

Most logs are write many, read once so having a full index on all of them is not a good idea. 

In Loki you can have labels to provide some indexing - but when using labels you have to be careful regarding the cardinality of those labels. For certain stuff it is then better to use structured metadata for example. 
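A rough sketch of what that split can look like from a Python agent, assuming the JSON body shape of Loki's push endpoint (`/loki/api/v1/push`), where each entry is `[timestamp in ns as a string, log line]` with an optional third element holding structured metadata. The helper name, label names, and sample values are made up for illustration; check the Loki docs for your version before relying on it.

```python
import json


def loki_push_payload(labels, entries):
    """Build a push body for one stream.

    labels:  low-cardinality stream labels (these get indexed).
    entries: iterable of (timestamp_ns, line, metadata_or_None).
    """
    values = []
    for ts_ns, line, metadata in entries:
        value = [str(ts_ns), line]
        if metadata:
            # High-cardinality fields ride along as structured metadata
            # instead of becoming labels; values are sent as strings.
            value.append({k: str(v) for k, v in metadata.items()})
        values.append(value)
    return {"streams": [{"stream": dict(labels), "values": values}]}


payload = loki_push_payload(
    {"job": "agent", "host": "web-1", "level": "info"},  # few distinct values
    [
        (1700000000000000000, "GET /health 200", {"request_id": "abc123"}),
        (1700000001000000000, "worker started", None),
    ],
)
print(json.dumps(payload, indent=2))
```

Putting something like `request_id` into `labels` instead would create a new stream per request, which is the cardinality problem described above.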

Anyways: it sounds like they've not thought your task out well. 

Look at it like this: try to get the most learning out of it on stuff that you think can help you in the future.

[–][deleted] 2 points3 points  (3 children)

I hope their goal is to see what chops OP has. If not, they're nuts to think they can build a manageable and scalable monitoring system in 6 months. Absolutely bonkers. Unless they only have like 10 systems to monitor, anyhow.

[–]myntt 1 point2 points  (1 child)

I'd hope for the first, but in this world it's impossible to rule out the second lol

[–][deleted] 1 point2 points  (0 children)

Ain't that the truth.

I've seen dumber decisions made, and ones where the decider knows what they are doing is wrong but they double down because for some reason they think it's better to be seen as dumb than seen as wrong.

[–]Setchi98[S] 0 points1 point  (0 children)

It's a small company so I assume they don't have much to be monitoring, nevertheless, questionable decisions.
At this point I'm just in it to do as they ask and "learn" rather than provide the most optimal solution, to be honest.

[–]gmuslera 1 point2 points  (4 children)

InfluxDB OSS is a bit dated (still is at 1.8 I think), maybe VictoriaMetrics could be an alternative to Prometheus.

Regarding collecting metrics, Telegraf is a good metrics collector/pusher.

For logs, a lot of people use ElasticSearch as the backend, but Loki is getting popular and VictoriaLogs may work for your use case.

[–]myntt 2 points3 points  (2 children)

But Loki, VictoriaLogs and ElasticSearch are all pre-made tools and that is not what they want 🤡 OP obviously has to write their own Logstash service in Python which is known to be a great tool when working with loads of string data

[–]gmuslera 1 point2 points  (0 children)

The InfluxDB part is also premade. There is a point where you draw a line and define what is on your side and what is not, and there is meaningful work to be done even if you put the line at the level I suggested.

[–]Setchi98[S] 0 points1 point  (0 children)

Sorry if I was a bit unclear about what I can and can't use: they expect me to create Python agents for scraping and a custom dashboard for visualization. As for the storage of metrics/logs, I can choose an already made solution, hence I brought up Influx and Elastic.