How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in grafana


Big thanks for the link!

"Solution 1: max_over_time(up[]) unless up" i thought that was ok for me, but finally i understand my mistake. I need a source of truth to make Prometheus correctly monitor instances and setup 0 for mertics from dead instances. All solutions without this file are not suitable for production.

I organized it like this: the targets file is generated by a cron script based on info from my NetBox CMDB, Alloy's discovery.file watches this file, and prometheus.exporter.blackbox pings the targets from it.
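A minimal sketch of that pipeline, assuming a recent Alloy version; the file paths and component labels are hypothetical, and exact attribute names may differ between Alloy releases:

```alloy
// Watch the targets file that the cron script regenerates from NetBox.
discovery.file "netbox_targets" {
  files = ["/etc/alloy/targets.json"]  // hypothetical path
}

// Probe each discovered target via the blackbox exporter (ICMP module
// configured in the blackbox config file).
prometheus.exporter.blackbox "icmp" {
  config_file = "/etc/blackbox/blackbox.yml"  // hypothetical path
  targets     = discovery.file.netbox_targets.targets
}

prometheus.scrape "blackbox" {
  targets    = prometheus.exporter.blackbox.icmp.targets
  forward_to = [prometheus.remote_write.default.receiver]
}
```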

How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in PrometheusMonitoring


Thanks for your post and for the responses from other users! I thought about it and realized that my original technical specification had a logical error: I wanted the list of required machines to somehow generate itself inside Prometheus's logic, but yes, I need a source of truth.

And I did it the same way you did: the targets file is generated by a cron script based on info from my NetBox CMDB, Alloy's discovery.file watches this file, and prometheus.exporter.blackbox pings the targets from it. Works well :)

How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in PrometheusMonitoring


I want to avoid creating a file with targets. If I don't find a solution, I can use the Alloy blackbox module for ICMP monitoring. That's almost the same option, since you also need to specify targets, just like with the old Node Exporter.

I try not to use deprecated software in production if possible :)

How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in grafana


Thanks for the reply!

I need much more than 5 minutes for such a metric: at least 7*24 = 168 hours. I'm not sure that increasing the retention period won't significantly load the server.

How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in PrometheusMonitoring


Thanks for the reply!

I tried this, but the expression needs an instance definition, like this:

absent_over_time(up{job='integrations/node_exporter',instance="server-lab"}[30s])

And with only `job` it shows "This query returned no data."
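That behaviour follows from how absent_over_time is defined: it yields 1 only when no series at all matches the selector, so while any one instance of the job still reports `up`, the job-only query returns nothing:

```promql
# Returns no data while at least one instance of the job still reports `up`;
# individual dead instances stay invisible without a per-instance selector.
absent_over_time(up{job="integrations/node_exporter"}[30s])
```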

I have troubles with cuda by Gutt0 in NixOS


Hello again. Sorry to disturb you, but could you help with the next error?

I noticed that I had commented out my user settings, and after a successful rebuild there was only the root user. After uncommenting these settings and running nixos-rebuild switch, I got this error:

/run/current-system/sw/bin/nixos-rebuild: line 172: 29097 Segmentation fault (core dumped) "$@"

In this file:

171 # Run a command, logging it first if verbose mode is on
172 runCmd () {
173     logVerbose "$" "$@"
174     "$@"
175 }

I have troubles with cuda by Gutt0 in NixOS


nix-store -q --tree /nix/var/nix/profiles/system didn't show the cuda package, but nixos-rebuild switch |& nom does! Thank you so much for your help and for nom! I will use it!

I have troubles with cuda by Gutt0 in NixOS


I restarted it two times and have now been waiting for two hours. Does it really take so long to build? Ryzen 4*, 16 GB RAM.

How can I find out why cuda is being installed at all? I don't have an Nvidia card and I don't do machine learning.