On Mon, Jun 6, 2016, at 09:18, Manuel Marín wrote:
Dear NANOG community,
We are currently planning to upgrade our monitoring system (Opsview) due to scalability issues, and I was wondering what you would recommend for monitoring 5000 hosts and 35000 services. We would like to use a monitoring system that is compatible with the Nagios plugin format; however, we are not sure if systems like Icinga/Shinken/Op5 are the way to go.
Is anyone using systems like Op5 or Icinga2 for monitoring > 5000 hosts? Would you recommend commercial systems like Sevone, Zabbix, etc. instead of open source ones?
While not being completely drop-in compatible with Nagios plugins, Xymon (a Big Brother clone) is up to the task of monitoring this many hosts/services. Here's a page with a list of businesses that publicly report their use of Xymon and the number of hosts/services they're monitoring. ServiceNow is the biggest I've seen, with 569,869 hosts and 740,185 status messages (the different service checks being reported back in). It's really hard to find tools that can scale that large, but with the load distributed across a few Xymon proxies reporting to your centralized instance, it will scale as large as you want.

https://en.wikibooks.org/wiki/System_Monitoring_with_Xymon/User_Guide/The_Xy....

I've used it for years and greatly prefer it to everything else due to its simplicity and config format. I find Nagios's config format extremely tedious.

As for Nagios plugins: Nagios derives the result of a plugin from its exit code: 0 = green, 1 = yellow, 2 = red, if I recall correctly. If you just modify the plugin to execute a Xymon command as the last step and report the color instead of the exit code, it should work fine. There was a tool called "xynagios" that automatically made Nagios plugins work without modification, but I haven't tried to use it and don't know if it's still out there.

There are two things you might want to be aware of with Xymon. First, the monitoring data is not encrypted on the wire; for the moment it's up to you to handle that if you feel it is necessary. Second, it does not support IPv6. A huge rewrite to handle both of these was in progress for years, but it stalled out. Recently development has picked up a lot of steam, and they're scrapping the major rewrite and backporting the important things. I believe Xymon 4.4 will at least have the encrypted transport.

-- 
Mark Felder
feld@feld.me
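
A rough sketch of the plugin-wrapper idea described above, in Python rather than by editing each plugin. It runs a Nagios-style plugin, maps the exit code to a Xymon color, and builds a Xymon "status" message. The 0/1/2 mapping is the one given in the message; treating 3 (UNKNOWN) and any other code as "clear" is my own assumption, as are the function names, the `xymon` binary being on the PATH, and the placeholder server address 127.0.0.1. Treat it as an illustration, not a tested integration.

```python
import subprocess
import sys

# Nagios plugin exit codes mapped to Xymon colors. 0/1/2 follow the mapping
# in the message above; anything else (e.g. 3 = UNKNOWN) falling back to
# "clear" is an assumption of this sketch.
EXIT_TO_COLOR = {0: "green", 1: "yellow", 2: "red"}


def exit_code_to_color(code):
    """Translate a plugin exit code into a Xymon color name."""
    return EXIT_TO_COLOR.get(code, "clear")


def build_status_message(host, test, color, text):
    """Build a Xymon status message for one host/test pair."""
    # Xymon status messages write the hostname with commas instead of dots.
    return "status {}.{} {} {}".format(host.replace(".", ","), test, color, text)


def run_plugin(argv):
    """Run a Nagios-style plugin; return (exit code, first line of output)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, universal_newlines=True)
    lines = proc.stdout.splitlines()
    return proc.returncode, (lines[0] if lines else "")


def report(host, test, plugin_argv, xymon_cmd=("xymon", "127.0.0.1")):
    """Run the plugin and hand the result to the xymon client command.

    xymon_cmd is a placeholder: adjust the binary path and server address
    to match the local installation.
    """
    code, text = run_plugin(plugin_argv)
    msg = build_status_message(host, test, exit_code_to_color(code), text)
    subprocess.run(list(xymon_cmd) + [msg], check=True)
    return msg
```

A call such as `report("www.example.com", "disk", ["/path/to/check_disk", "-w", "20%", "-c", "10%"])` would then send one status per run; the plugin path and arguments here are hypothetical.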