I have two UPS units at home powering my batcave, which will get its own post sometime. Check_MK’s agent architecture really appeals to me, so I naturally wanted to build the solution around it.
Basically, the Check_MK agent is a small bash script that dumps a whole lot of plain-text info to STDOUT. This output is usually exposed over the network using xinetd, so when you open a socket on the agent’s port (6556 by default), with a telnet command for example, you get the whole output dumped right out.
This makes adding output very simple, especially as the agent will try to run any executable file in its plugins dir; you can read the agent script to see where that is on your installation. Creating the following script in the plugins dir simply adds its output to the text dumped by the agent whenever it is invoked.
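To make the "open a socket and read" idea concrete, here is a minimal Python sketch that does what the telnet command does. The function name `read_agent_output` and the timeout value are my own choices for illustration; only the default port comes from the text above.

```python
import socket


def read_agent_output(host, port=6556, timeout=10.0):
    """Connect to a Check_MK agent and return everything it prints."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the agent closes the connection once it is done
                break
            chunks.append(data)
    # The agent speaks plain text; undecodable bytes are replaced, not fatal.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `read_agent_output("myhost")`, where `myhost` is a placeholder for a machine running the agent, should return the same text the telnet session shows.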
#!/bin/sh
if which upsc > /dev/null 2>&1 ; then
    echo '<<<nut>>>'
    for ups in $(upsc -l)
    do
        upsc $ups | sed "s,^,$ups ,"
    done
fi
On my box, the output looks like this:
<<<nut>>>
VT650 battery.voltage: 14.10
VT650 battery.voltage.high: -1.08
VT650 battery.voltage.low: -0.87
VT650 device.type: ups
VT650 driver.name: blazer_usb
VT650 driver.parameter.bus: 005
VT650 driver.parameter.pollinterval: 2
VT650 driver.parameter.port: auto
VT650 driver.version: 2.6.4
VT650 driver.version.internal: 0.08
VT650 input.frequency: 49.9
VT650 input.voltage: 237.3
VT650 input.voltage.fault: 140.0
VT650 output.voltage: 238.3
VT650 ups.beeper.status: enabled
VT650 ups.delay.shutdown: 30
VT650 ups.delay.start: 180
VT650 ups.load: 25
VT650 ups.productid: 0000
VT650 ups.status: OL
VT650 ups.temperature: 30.0
VT650 ups.type: offline / line interactive
VT650 ups.vendorid: ffff
JP2000 battery.voltage: 28.10
JP2000 battery.voltage.high: -1.08
JP2000 battery.voltage.low: -0.87
JP2000 device.type: ups
JP2000 driver.name: blazer_usb
JP2000 driver.parameter.bus: 004
JP2000 driver.parameter.pollinterval: 2
JP2000 driver.parameter.port: auto
JP2000 driver.version: 2.6.4
JP2000 driver.version.internal: 0.08
JP2000 input.frequency: 49.9
JP2000 input.voltage: 231.8
JP2000 input.voltage.fault: 150.0
JP2000 output.voltage: 235.2
JP2000 ups.beeper.status: enabled
JP2000 ups.delay.shutdown: 30
JP2000 ups.delay.start: 180
JP2000 ups.load: 20
JP2000 ups.productid: 0000
JP2000 ups.status: OL
JP2000 ups.temperature: 30.0
JP2000 ups.type: offline / line interactive
JP2000 ups.vendorid: ffff
I’ll go into further detail on my NUT setup in a later post.
Now we have the data, which is great, but Check_MK can’t make heads or tails of it; we need to parse it on the server side. Since Check_MK’s check API is written in Python, the parser has to be written in Python too. The parser should live in your Check_MK server’s checks directory.
Since the Python script itself is a bit on the long side, you can find it in my Check_MK plugin repo on GitHub. The script was written for Check_MK v1.1.12p7, but if you’re interested in adapting it to the latest (and more elegant) API, you’re welcome to fork it on GitHub or ask me to do it.
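To give a feel for the parsing step, here is a minimal standalone sketch that groups the agent lines by UPS name. It is not the plugin from my repo and it does not use the Check_MK check API; the function name `parse_nut_section` and the shortened sample data are made up for illustration.

```python
def parse_nut_section(lines):
    """Group '<ups> <variable>: <value>' lines into {ups: {variable: value}}."""
    units = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("<<<"):
            continue  # skip blank lines and the section header
        ups, _, rest = line.partition(" ")
        variable, sep, value = rest.partition(":")
        if not sep:
            continue  # not a "variable: value" line
        units.setdefault(ups, {})[variable.strip()] = value.strip()
    return units


# A shortened sample in the same format as the agent output.
sample = """<<<nut>>>
VT650 battery.voltage: 14.10
VT650 ups.load: 25
VT650 ups.type: offline / line interactive
JP2000 ups.load: 20
JP2000 ups.status: OL
"""

units = parse_nut_section(sample.splitlines())
print(float(units["VT650"]["ups.load"]))  # 25.0
print(units["JP2000"]["ups.status"])      # OL
```

Splitting on the first space and the first colon keeps values that contain spaces, such as the `ups.type` line, in one piece.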
The end result is quite pleasing and allows me to easily track my UPS load, get live email alerts about power outages at home and have all this data collected into RRDtool graphs through PNP4Nagios.
2 thoughts on “UPS monitoring with NUT, Nagios and Check_MK”
Great plugin, works well, except that nut.input is throwing unknown errors. Everything else works. This is on an APC-1500; I’m wondering if it’s a parsing issue?
Do you still support the Check_MK plugin for NUT? The plugin does not check whether the string “RB” (= replace battery) appears in ups.status. At least, that is what the APC UPS puts out when the battery is defective.
If the battery is defective, the following output appears:
ups.status: OL RB
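A status line like this carries several space-separated flags, so a check has to look at each flag instead of comparing the whole string. Below is a minimal sketch of that idea. It is not the code from the plugin, and the severity given to each flag is an assumption chosen for illustration; the function name `status_to_state` is hypothetical as well.

```python
# Severity per flag is an assumption made for this sketch, using the usual
# Nagios state numbers: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
FLAG_STATES = {
    "OL": 0,  # on line power
    "OB": 1,  # on battery
    "RB": 1,  # replace battery
    "LB": 2,  # low battery
}


def status_to_state(ups_status):
    """Return (state, flags) for a ups.status value such as 'OL RB'."""
    flags = ups_status.split()
    if not flags:
        return 3, []  # no status reported at all
    # Flags not listed above are treated as harmless; the worst flag wins.
    state = max(FLAG_STATES.get(flag, 0) for flag in flags)
    return state, flags


print(status_to_state("OL RB"))  # (1, ['OL', 'RB'])
```

With this mapping a unit that is online but reports a bad battery shows up as a warning, while a unit running on a low battery is critical.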