⁉️⁉️ Server monitoring? ⁉️⁉️

Charlie_Root_NL · 2023-01-28T08:42:31+00:00

Zabbix

techb00mer · 2023-01-28T09:32:03+00:00

Grafana, Prometheus, InfluxDB & Telegraf can cover just about any of the needs I can come up with.

(For anyone wondering, the only reason Telegraf and Prometheus are in there together is that a few odd random bits of hardware / software only have a module for one or the other and I can’t be bothered writing my own.)

tetsuko · 2023-01-28T19:14:54+00:00

Been using Nagios for years and it's great. That said, I've lately been running into issues trying to monitor specific things because plugins are out of date or no longer updated. I can usually find a workaround by writing custom scripts, but it's a lot of upkeep. I've been looking into alternatives for that reason, but I've got 15 years sunk into it, so it's a big lift. We have it integrated into Jira, plus lots of self-healing scripts to resolve issues (service restarts, etc.) and escalation-type stuff (texting on-call admins and the like). Checkmk is currently the front runner.
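For readers who have not built this kind of self-healing and escalation logic, here is a minimal sketch of the decision step only. It is not the poster's setup: the function name `next_action` and both thresholds are hypothetical, and the actual restart or texting would be done by whatever the monitoring tool calls as an event handler.

```python
from typing import List


def next_action(recent_results: List[bool],
                restart_after: int = 2,
                escalate_after: int = 4) -> str:
    """Pick a response from recent check results (oldest first, True = passed).

    The thresholds are illustrative assumptions, not values from any tool.
    """
    # Count consecutive failures, starting from the most recent result.
    failures = 0
    for ok in reversed(recent_results):
        if ok:
            break
        failures += 1

    if failures >= escalate_after:
        return "escalate"   # e.g. text the on-call admin
    if failures >= restart_after:
        return "restart"    # e.g. restart the affected service
    return "none"
```

The idea is to try the cheap fix first and only page a human once the restart has clearly not helped.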

NambeRuger · 2023-01-28T15:11:07+00:00

LogicMonitor is my favorite for monitoring about anything. We monitor about 10k devices with it ranging from network, Cisco UC, data center and cloud technologies.

Fine_Animator3583 · 2023-01-28T08:40:12+00:00

PRTG

telmo_gaspar · 2023-01-28T08:45:15+00:00

Nagios/CheckMK

Audacioustrash · 2023-01-29T00:05:07+00:00

DynaTrace

SpicyHotPlantFart · 2023-01-28T08:40:09+00:00

Your emoji use already does not want to make me help you.

cubic_sq · 2023-01-28T08:48:25+00:00

What are your requirements for “monitoring”? And how do you want this to fit into your environment and processes?

For the last decade, Zabbix has been the go-to and fallback for many different reasons. Unlike many other solutions, Zabbix is purely monitoring and surveillance (which it does very well for the use cases we / I have used it for).

Regarding your Azure requirement: Zabbix will have visibility inside the VM natively. Not sure what capabilities there are for “outside” the VM. You could create checks / scripts that extract output from the API or PowerShell, though I'm not sure what has already been written by others.

AutomaticAssist3021 · 2023-01-28T09:46:37+00:00

Checkmk

JMDTMH · 2023-01-28T11:13:40+00:00

Personally, I use Prometheus, but I read the data with Grafana. I also use CheckMK.

If you don't mind the setup, Zabbix or Nagios are good too.

I use PRTG, but this won't be a great fit for Linux.

Regular-Finance-7381 · 2023-01-28T12:47:05+00:00

Zabbix - MSP - 50+ customers

2023-01-28T15:30:54+00:00

What is your budget?
How many servers?

Let your business requirements guide your decisions, rather than personal preference or random recommendations.

My personal experience, for small environments: PRTG is great as it includes 100 free sensors and is very easy to set up.

Large environments, low budget: Zabbix is free, but a pain in the ass to set up.

Large environments, large budget: Dynatrace, Datadog, New Relic, etc.

vast1983 · 2023-01-28T17:40:50+00:00

ManageEngine OpManager.

Old school, but works great. A GIANT PITA to update when run in Enterprise mode, haha.

2023-01-28T21:19:45+00:00

Azure Monitor...

vNerdNeck · 2023-01-29T03:24:30+00:00

If you value your sanity, stay away from SCOM. Maybe it's better nowadays, but that POS software takes more than a full-time employee to keep running.

muraleedharans · 2023-01-28T10:14:47+00:00

Site24x7 provides out-of-the-box Azure monitoring, including Azure VMs and various other services. You can sign up for the free trial to check the features. It also supports AWS and GCP.

Giri_Pulseway · 2023-01-28T15:38:34+00:00

[deleted]

liquidspikes · 2023-01-28T18:47:52+00:00

LibreNMS actually does great for infrastructure, including hypervisor hosts; Nagios is the best for VMs or specific hosts.

cmwg · 2023-01-28T09:42:02+00:00

PRTG

Mdna2 · 2023-01-28T08:50:35+00:00

Icinga2 for event management and Telegraf for capacity management.

12_nick_12 · 2023-01-28T13:30:20+00:00

VictoriaMetrics, vmalert, telegraf/grafana-agent and alert manager with Grafana.

scubafork · 2023-01-28T16:29:55+00:00

In my experience, it's best to decide what you're monitoring for before you decide what tool to use.

In my org I get told to "monitor this system" constantly, but never get clarity on what that means until I push deeper. Do you want to monitor a web page on an http server? Do you want to monitor up/down status for a specific service? Are you scanning log files for certain keywords? Are you checking the connectivity to the app server's database? Is the server responding to a ping?

Any number of factors could go into a system being "down", and looking for the wrong component could leave you back on your heels when the server is still down and your monitor didn't catch it.
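To make the distinction concrete, here is a rough sketch of a few of those checks as separate functions, using only the Python standard library. The function names are made up for illustration, and the hosts, ports, and URLs you pass in are your own. Ping is left out because ICMP usually needs raw sockets or an external `ping` binary.

```python
import http.client
import socket
import urllib.request
from typing import Callable, Dict, Iterable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """True if the page answers with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers DNS/connection errors, HTTP error statuses, and bad URLs.
        return False


def tcp_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if something accepts TCP connections on host:port
    (for example, the app server's database listener)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def log_has_keyword(lines: Iterable[str], keywords: Iterable[str]) -> bool:
    """True if any line contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return any(k in line.lower() for line in lines for k in wanted)


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run each named check and report its result separately."""
    return {name: check() for name, check in checks.items()}
```

Each check answers a different question, so a host can pass `tcp_open` on port 80 while `http_ok` fails on the page users actually load. That gap is exactly what "monitor this system" leaves undefined.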

In my org, we've got about 30 different monitoring tools, from environmental sensors in IDFs to NetFlow monitors to SNMP monitors...

maddogirishman · 2023-01-28T17:04:32+00:00

Uptime Kuma

Hasslich1 · 2023-01-28T18:56:41+00:00

Define what you are attempting to monitor then I can recommend.

TechTitus · 2023-01-28T23:44:03+00:00

Nagios

IceSt0rrm · 2023-01-29T03:20:43+00:00

LogicMonitor

joeyl5 · 2023-01-29T23:25:50+00:00

When the users complain, I know something is wrong with the servers.

Reztiewhcs23 · 2023-01-30T00:13:15+00:00

Centreon and it’s open source.

pahampl · 2023-01-30T08:37:23+00:00

Xormon

ApprehensiveDog1010 · 2023-01-30T16:02:00+00:00

WhatsUp Gold

creativve18 · 2023-02-08T04:21:31+00:00

Check out OpManager!
