Greatly improve the OpenVPN check (read: rewrite).
Now, instead of only checking for the status file, it also reads the status file to check whether it has been updated recently.
This has been tested in production and it properly catches a disconnection (which the previous version did not).
Added a FIXME to the plugin that will have to be fixed once our last 3 Debian 6 servers have been upgraded to Debian 7.
svn path=/trunk/nagios/; revision=2086
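A minimal sketch of the freshness test described in revision 2086, not the plugin itself. It assumes the status file carries an `Updated,<timestamp>` line (as in OpenVPN's version 1 status format) and that a 5-minute threshold is acceptable; the function name `status_is_fresh` and the threshold are hypothetical.

```python
from datetime import datetime, timedelta


def status_is_fresh(status_text, now, max_age=timedelta(minutes=5)):
    """Return True if the 'Updated,' line is no older than max_age.

    Assumption: the timestamp looks like 'Thu Jun 18 08:12:15 2009'
    and is written in the C/English locale.
    """
    for line in status_text.splitlines():
        if line.startswith("Updated,"):
            stamp = line.split(",", 1)[1].strip()
            updated = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            return now - updated <= max_age
    # No timestamp line at all: treat the file as stale.
    return False
```

A check built on this would report CRITICAL when the function returns False, since a status file that exists but is no longer rewritten suggests the daemon or tunnel is down.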
Add support for devices that provide neither "standard" health information nor temperature information.
A degraded SMART check can then be performed when requested with -ignore-details=1.
Read this commit as: "Add support for HP Smart Array P420i controller logical drives"
svn path=/trunk/nagios/; revision=2066
Add a plugin to check whether a buildbot master is in a weird state, with a build status that is likely running forever (for instance, the trigger).
Addition from Christophe Haen <christophe.haen at cern.ch>: This plugin CANNOT be used as is. You should use it with an event handler that disables
the check when it goes to critical state (so that you keep that state). Then, when you fix the buildmaster issue, you can re-enable the plugin and it goes back
to OK. This is needed because the plugin cannot check whether the log line it found really leads to an issue, and cannot know when the issue is fixed.
svn path=/trunk/nagios/; revision=2050
Add a module to monitor VMware modules. These do not use DKMS, so it helps to know when we are running without them.
This will help prevent failures on the VMware Player testbot.
svn path=/trunk/nagios/; revision=2048
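As an illustration of the idea in revision 2048, the sketch below looks for kernel modules in text shaped like `/proc/modules`. The module names `vmmon` and `vmnet` and the function name `missing_modules` are assumptions for the example; the actual plugin may check other modules or use another mechanism.

```python
def missing_modules(proc_modules_text, required=("vmmon", "vmnet")):
    """Return the required module names absent from /proc/modules text.

    Each non-empty line of /proc/modules starts with the module name.
    The default 'required' tuple is an assumption, not taken from the plugin.
    """
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]
```

On a real host the text would come from reading `/proc/modules`; a non-empty result would map to a CRITICAL state.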
Switch to a different representation for total storage in state file.
This allows having a server named "total" in the VPN
svn path=/trunk/munin/; revision=2014
Add a modified version of the check_serveraid plugin that allows monitoring ServeRAID arrays.
It has been modified to remove any SSH or sudo usage. It will be accessed through NRPE.
svn path=/trunk/nagios/; revision=1968
Use stat instead of open to effectively catch stale mount points.
Patch by Christophe Haen <christophe.haen at cern.ch>
svn path=/trunk/nagios/; revision=1423
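The change in revision 1423 can be pictured roughly as follows. This is only a sketch under the assumption that a stale NFS handle surfaces as `ESTALE` from `stat`; the function name `mount_point_state` and the returned labels are hypothetical, and the real plugin may be written differently.

```python
import errno
import os


def mount_point_state(path):
    """Classify a mount point using stat() rather than open().

    Returns 'ok', 'stale' (ESTALE reported by the kernel) or 'error'
    for any other failure. The labels are illustrative only.
    """
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return "stale"
        return "error"
    return "ok"
```

Note that on a hard-mounted NFS share whose server is unreachable, `stat` can block, so a production check would probably need a timeout around this call.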
Add the Icinga slave script. This script (to be run as a cron task) checks the Icinga master status.
If it's not OK twice, then the failover instance is enabled.
On the other hand, while failover is running, if the master comes back to life, failover is disabled again.
This allows auto recovery of Icinga monitoring.
svn path=/trunk/nagios/; revision=1417
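The failover rule from revision 1417 can be sketched as a small state transition evaluated on each cron run. It assumes "twice" means two consecutive failed checks and that the counter is persisted between runs; the name `next_state` and the `threshold` parameter are hypothetical.

```python
def next_state(failover_active, consecutive_failures, master_ok, threshold=2):
    """Return (failover_active, consecutive_failures) after one cron run.

    Assumption: failover starts after 'threshold' consecutive failures
    and stops as soon as the master answers OK again.
    """
    if master_ok:
        # Master is alive: disable failover and reset the counter.
        return False, 0
    consecutive_failures += 1
    if consecutive_failures >= threshold:
        return True, consecutive_failures
    return failover_active, consecutive_failures
```

Requiring more than one failure before switching avoids enabling the failover instance on a single transient error.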
Add a new plugin to check Snort status, plus some data regarding the network monitored by Snort.
For the moment, it does not raise any alert on suspicious outputs
svn path=/trunk/nagios/; revision=1398
Add support in the check_smart plugin for our RAID controller so that we can individually check the SMART status of the disks in the array
svn path=/trunk/nagios/; revision=1392
Add a plugin to check mount points, and more specifically NFS mount points. Like the previous (Python) one, this one compares /etc/fstab and /proc/mounts to ensure all mount points are populated.
Unlike the previous one, the plugin also attempts to open the mount points, to ensure those aren't stale. In case they are, it reports an error.
svn path=/trunk/nagios/; revision=1375
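The fstab versus /proc/mounts comparison from revision 1375 might look like the sketch below. It works on file contents passed as strings, and it assumes that swap and `noauto` entries should be ignored; those filtering rules and all function names are illustrative, not taken from the plugin.

```python
def fstab_mount_points(fstab_text):
    """Collect mount points that /etc/fstab expects to be mounted."""
    points = set()
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        # Assumption: skip swap, noauto entries and non-path mount points.
        if fs_type == "swap" or "noauto" in options or not mount_point.startswith("/"):
            continue
        points.add(mount_point)
    return points


def mounted_points(proc_mounts_text):
    """Collect mount points listed in /proc/mounts (second field)."""
    return {
        line.split()[1]
        for line in proc_mounts_text.splitlines()
        if len(line.split()) >= 2
    }


def missing_mounts(fstab_text, proc_mounts_text):
    """Return fstab mount points that are not currently mounted."""
    return sorted(fstab_mount_points(fstab_text) - mounted_points(proc_mounts_text))
```

Any entry returned by `missing_mounts` would be reported as an error; the staleness test on the mounted entries is a separate step.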
Add a plugin to check APC UPS status. This one reproduces the behaviour of the one we used previously, but with the following changes:
This one only handles what our UPSes actually support.
This one relies on the status file written by apcupsd instead of querying the UPS. This will prevent the issues we have on mothership.
The only issue is that this plugin might not be as reactive as the previous one, depending on the refresh delay of apcupsd. But that's only a matter of configuration.
svn path=/trunk/nagios/; revision=1374
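To illustrate reading the apcupsd status file as in revision 1374, here is a hedged sketch. It assumes the file holds `KEY : value` lines such as `STATUS : ONLINE`; the mapping to Nagios exit codes and the function names are assumptions for the example only.

```python
def parse_apcupsd_status(text):
    """Parse 'KEY : value' lines into a dict (split on the first colon)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def ups_exit_code(fields):
    """Map the STATUS field to a Nagios exit code (illustrative mapping).

    0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
    """
    status = fields.get("STATUS", "")
    if "ONLINE" in status:
        return 0
    if "ONBATT" in status:
        return 2
    return 3
```

Because the plugin only reads a file, it cannot hang on the UPS link, at the cost of data that is as old as apcupsd's last refresh.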
Create a directory to store our Nagios/Icinga internally developed plugins.
Add a check_ram plugin, developed in C, faster (~ +20%) than the bash one we were using previously. Servers will use this new one.
All our scripts should be converted to C programs.
svn path=/trunk/nagios/; revision=1373