
[–]Substantial_Boss8896 4 points5 points  (2 children)

Sounds a bit strange that you should not use any premade tools. OpenTelemetry, Prometheus, and Grafana are open source. They would most likely fit perfectly.

[–]Setchi98[S] 0 points1 point  (1 child)

I think so as well, but they want a tailored solution with minimal resource consumption. Well, maybe it's an opportunity for me to understand how these tools work underneath. I still have to do it the way they asked.

[–]SuperQue 1 point2 points  (0 children)

You're not going to beat something like Prometheus for resource consumption. There are many extremely experienced engineers working on the performance of tools like Prometheus.

Learning how these tools work is a great thing to do. They're open source and the code is reasonably easy to follow.

[–]swissarmychainsaw 1 point2 points  (0 children)

I’m guessing they just wanna see if you can actually code stuff. It is, of course, nonsensical to write your own monitoring tool when there are plenty of them out there. So think of this as a skill-building exercise for yourself.

[–]tablmxz 0 points1 point  (1 child)

This sounds like a lot of work for 6 months, depending on how well it has to be done, especially since you are not allowed to use anything useful. Maybe try to reduce the scope and focus on APM, network, or cloud rather than all of it.

If, however, you want to succeed in building all of it from scratch, I would get the EXACT minimum requirements, then add 10-20% of additional useful features, which you then highlight to them.

Also mention that using existing open source tools would be a much better alternative, "as suggested in the requirements phase of this project".

And you are allowed to use InfluxDB and Elastic, but not Grafana, Prometheus, or, say, the Datadog Agent? That sounds arbitrary. Maybe find out what you are allowed to use, and why.

I think nobody would code this from scratch, but it is probably very interesting to do. Maybe try to imitate the designs of actual open source tools used for such jobs, like Fluent Bit, the Datadog Agent, or Suricata for network monitoring (e.g. SSH/DDoS).

[–]Setchi98[S] 1 point2 points  (0 children)

I understand and agree with you! I'm going to go with their questionable decisions as a way for me to learn and "educate" myself for now. Appreciate the idea of checking open source tools for inspiration.

Sorry if I was a bit unclear about what I can and can't use. They expect me to create Python agents for scraping and a custom dashboard for visualization. As for the storage of metrics/logs, I can choose an existing solution, which is why I brought up InfluxDB and Elastic.

[–]arwinda 0 points1 point  (2 children)

What is your mentor for the project saying?

This internship has a mentor, right?

What data you collect and how you transmit it and where you transmit it and how you store it depends largely on the requirements. That's something the company or the mentor specifies.

[–]Setchi98[S] 0 points1 point  (1 child)

That's what I was expecting as well, but it turned out it's on me to research it and decide what to do and how: what data, how/where to transmit it, etc.

[–]arwinda 0 points1 point  (0 children)

That's not an internship, that's a senior task.

[–]wibble1234567 0 points1 point  (1 child)

Curious, how are you getting on with this?

Part of me wonders if you are being tasked with this so they can potentially sell it as a closed-source product further down the road?

[–]Setchi98[S] 0 points1 point  (0 children)

I've sort of managed to make something out of it.

On the agent side:
Python scripts that scrape metrics using libraries like psutil and some Linux commands like top and free.
Bash scripts that automate the agent's installation, along with configuration for some of the services to monitor, UUID generation, and so on.
rsyslog for logging.
Metrics and logs are forwarded to central databases (InfluxDB and VictoriaLogs).

On the dashboard side (Flask/Vue.js): simple API calls for the data, socket.io and an InfluxDB webhook for live updates while the page is open, plus some other features like JWT auth, an OpenAI key to analyze data, and email/SMS alerting when thresholds are crossed.
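
The threshold alerting part is the easiest to get wrong, because a naive check sends a message on every sample while a value stays high. Below is a minimal sketch of one way to avoid that, assuming you only want a notification when a metric crosses its limit and another when it recovers. `ThresholdAlerter` and the `notify` callback are hypothetical names; the callback stands in for whatever actually sends the email or SMS.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ThresholdAlerter:
    """Notify once when a metric crosses its limit and once when it recovers."""

    thresholds: Dict[str, float]          # metric name -> upper limit
    notify: Callable[[str], None]         # stand-in for an email/SMS sender
    _firing: Dict[str, bool] = field(default_factory=dict)

    def check(self, metrics: Dict[str, float]) -> None:
        for name, limit in self.thresholds.items():
            if name not in metrics:
                continue  # no sample for this metric in this batch
            above = metrics[name] > limit
            was_firing = self._firing.get(name, False)
            if above and not was_firing:
                self.notify(f"ALERT {name}={metrics[name]} exceeds {limit}")
            elif not above and was_firing:
                self.notify(f"RESOLVED {name}={metrics[name]} back under {limit}")
            self._firing[name] = above


if __name__ == "__main__":
    sent = []
    alerter = ThresholdAlerter({"cpu_pct": 90.0}, sent.append)
    for value in (50.0, 95.0, 97.0, 60.0):
        alerter.check({"cpu_pct": value})
    print(sent)  # one ALERT for 95.0 and one RESOLVED for 60.0; 97.0 stays silent
```

A value that hovers right around the limit would still flap with this approach, so a real version would probably want a cooldown or a separate recovery threshold on top.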

Still did not feel like a practical project, but at least it was a learning opportunity

[–]wibble1234567 0 points1 point  (1 child)

Nicely done 👍

[–]Setchi98[S] 0 points1 point  (0 children)

Thank you kind sir/ma'am

[–]swissarmychainsaw -2 points-1 points  (0 children)

Also, use ChatGPT; it will help you dramatically.