Please go to https://www.claudiokuenzler.com/monitoring-plugins/check_smart.php for a complete and updated documentation including changelog, extensive usage examples, monitoring configurations (including Nagios, Icinga 1, Icinga 2, Shinken and Naemon).
This is a plugin to monitor the health and values of SMART attributes of hard (HDD), solid state (SSD) and NVMe drives. The plugin is a fork of check_smart released in 2009 by Kurt Yoder. Since then the plugin has undergone a lot of changes. It allows to monitor drives behind hardware controllers and added a lot of parameters to fine tune the checks and set thresholds (on a per attribute setting).
This plugin needs to run as root, otherwise you're not able to lauch smartctl correctly. You have two options
-
Launch the plugin itself as root with sudo
-
Lauch the plugin as Nagios user and the smartctl command as root with sudo
Entry in sudoers (of course adapt your paths if necessary):
nagios ALL = NOPASSWD: /usr/lib/nagios/plugins/check_smart.pl # for option 1
nagios ALL = NOPASSWD: /usr/local/sbin/smartctl # for option 2
SATA disk behind MDRaid (Software Raid) on Linux:
/usr/lib/nagios/plugins/check_smart.pl -d /dev/sda -i ata
MegaRAID on Linux:
/usr/lib/nagios/plugins/check_smart.pl -d /dev/sda -i megaraid,8
Intel RAID on FreeBSD 9.2 ("kldload mfip.ko" required):
/usr/local/libexec/nagios/check_smart.pl -d /dev/pass0 -i scsi
SATA drives behind Intel RAID on FreeBSD 9.2 ("kldload mfip.ko" required):
/usr/local/libexec/nagios/check_smart.pl -d /dev/pass12 -i sat
SCSI drives behind HP RAID (CCISS) on FreeBSD 6.0:
/usr/local/libexec/nagios/check_smart.pl -d /dev/ciss0 -i cciss,0
OK: no SMART errors detected|defect_list=0 sent_blocks=3093462752 temperature=24;;68
/usr/local/libexec/nagios/check_smart.pl -d /dev/ciss0 -i cciss,3
WARNING: 48 Elements in grown defect list | defect_list=48 sent_blocks=1137657348 temperature=22;;68
Using threshold option (-b) to ignore 1 bad element, warning only when 2 bad elements are found:
/usr/local/libexec/nagios/check_smart.pl -d /dev/ciss0 -i cciss,1 -b 2
OK: 1 Elements in grown defect list (but less than threshold 2)|defect_list=1;2;2;; sent_blocks=2769458900762624 temperature=27;;65
SCSI drives behind HP RAID (CCISS) on Linux (Ubuntu hardy):
/usr/lib/nagios/plugins/check_smart.pl -d /dev/cciss/c0d0 -i cciss,0
OK: no SMART errors detected. |
Check all SATA disks (sda - sdz) at the same time on Linux:
/usr/lib/nagios/plugins/check_smart.pl -g "/dev/sd[a-z]" -i ata
OK: [/dev/sda] - Device is clean --- [/dev/sdb] - Device is clean|
Check all SCSI disks behind Intel RAID on FreeBSD 9.2 ("kldload mfip.ko" required):
/usr/local/libexec/nagios/check_smart.pl -g /dev/pass[1-9] -i scsi
OK: [/dev/pass0] - Device is clean --- [/dev/pass1] - Device is clean --- [/dev/pass2] - Device is clean --- [/dev/pass3] - Device is clean --- [/dev/pass4] - Device is clean --- [/dev/pass5] - Device is clean --- [/dev/pass6] - Device is clean --- [/dev/pass7] - Device is clean --- [/dev/pass8] - Device is clean --- [/dev/pass9] - Device is clean |
Single SCSI drive on FreeBSD 10.1:
/usr/local/libexec/nagios/check_smart.pl -d /dev/da0 -i scsi
OK: no SMART errors detected. |sent_blocks=14067306 temperature=34;;60
Single NVMe drive on Linux:
/usr/lib/nagios/plugins/check_smart.pl -d /dev/nvme0 -i nvme
OK: Drive Samsung SSD 970 PRO 512GB S/N XXXXXXXXXXXXXXX: no SMART errors detected. |Temperature=34 Available_Spare=100 Available_Spare_Threshold=10 Percentage_Used=0 Data_Units_Read=2854 Data_Units_Written=107590 Host_Read_Commands=67150 Host_Write_Commands=1406316 Controller_Busy_Time=20 Power_Cycles=16 Power_On_Hours=105 Unsafe_Shutdowns=6 Media_and_Data_Integrity_Errors=0 Error_Information_Log_Entries=0 Warning__Comp._Temperature_Time=0 Critical_Comp._Temperature_Time=0 Temperature_Sensor_1=34 Temperature_Sensor_2=33
see https://www.claudiokuenzler.com/monitoring-plugins/check_smart.php for more examples