Friday, February 8, 2013

Monitoring custom applications on Windows with SNMP

The native SNMP service in Windows provides basic metrics such as CPU, memory and disk, but it lacks the “extend” feature of net-snmp, which lets you run a script for application monitoring. Net-snmp cannot simply replace the Windows SNMP service, because some SNMP extension agents rely on the native service and there are known issues, such as the HOST-RESOURCES MIB not working in net-snmp.

The good news is that net-snmp can co-exist with Windows SNMP: you get net-snmp features such as extend and, at the same time, pass the other functions through to the native Windows SNMP service.

As of Net-SNMP 5.4, the Net-SNMP agent can load the Windows SNMP service extension DLLs through the Net-SNMP winExtDLL extension. The extension requires a net-snmp binary that is native to the platform (a 32-bit net-snmp build won’t work on 64-bit Windows).

A 64-bit net-snmp binary is hard to find. It seems only net-snmp-5.5.0-2 ships a pre-compiled 64-bit binary, so you might need to compile other versions yourself.

Install net-snmp

Run the net-snmp binary installer and select “with Windows Extension” instead of the standard agent. Deselect “net-snmp trap service” and “Perl SNMP modules”. The default installation path is c:\usr

Configure net-snmp

Register net-snmp as Windows service

Edit c:\usr\registeragent.bat to disable the modules that conflict with the Windows SNMP service by adding a parameter.
(Note: if system_mib is also disabled, SNMPv2-MIB::sysuptime won’t report the correct time)
Run c:\usr\registeragent.bat

Edit C:\usr\etc\snmp\snmpd.conf

rocommunity public
#Test the extend feature by executing a script; the script path must use Unix-style ‘/’
extend userscript c:/temp/test1.bat

Start the Windows service “net-snmp agent” (the native SNMP service must be stopped)


#Test standard SNMP metrics; HOST-RESOURCES-MIB comes from the native SNMP service's extension DLLs (loaded through winExtDLL), not from net-snmp itself
[root@zabbix]#/usr/bin/snmpwalk -v 2c  -c public   HOST-RESOURCES-MIB::hrSystemUptime
HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (640892116) 74 days, 4:15:21.16
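
The Timeticks value in parentheses counts hundredths of a second. If a monitoring script needs the raw number rather than the formatted string, the conversion can be sketched as below. The helper names ticks_to_uptime and extract_ticks are hypothetical and not part of net-snmp.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of net-snmp) for working with Timeticks.

# Pull the raw tick count out of a snmpwalk/snmpget output line on stdin.
extract_ticks() {
  sed -n 's/.*Timeticks: (\([0-9]*\)).*/\1/p'
}

# Convert Timeticks (1/100 s) to "D days, H:MM:SS.cc".
ticks_to_uptime() {
  local ticks=$1
  local secs=$(( ticks / 100 )) centis=$(( ticks % 100 ))
  printf '%d days, %d:%02d:%02d.%02d\n' \
    $(( secs / 86400 )) $(( secs % 86400 / 3600 )) \
    $(( secs % 3600 / 60 )) $(( secs % 60 )) "$centis"
}

line='HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (640892116) 74 days, 4:15:21.16'
ticks_to_uptime "$(echo "$line" | extract_ticks)"   # prints: 74 days, 4:15:21.16
```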

#The extend feature is provided by net-snmp; querying it with snmpwalk executes the script
[root@zabbix]#/usr/bin/snmpwalk -v 2c -Ov -c public 'NET-SNMP-EXTEND-MIB::nsExtendOutLine."userscript"'
STRING: web-time=80
STRING: web-status=[ok]
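
On the monitoring host, each "key=value" line returned by the extend script still carries the "STRING: " type prefix. A minimal sketch of picking out one value follows; get_extend_value is a hypothetical helper name, and the input is assumed to be the output of snmpwalk with -Ov as shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "snmpwalk -Ov" output on stdin and print the
# value that belongs to the key given as the first argument.
get_extend_value() {
  local key=$1 line
  while IFS= read -r line; do
    line=${line#STRING: }     # drop the type prefix added by -Ov
    line=${line%$'\r'}        # defensive: drop a trailing CR if the script emitted one
    if [[ $line == "$key="* ]]; then
      printf '%s\n' "${line#*=}"   # everything after the first "="
    fi
  done
}

printf 'STRING: web-time=80\nSTRING: web-status=[ok]\n' | get_extend_value web-time
```

In practice the printf on the last line would be replaced by the snmpwalk command itself, piped into the function.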


To check which Windows modules are loaded, start snmpd from the command line with the “winExtDLL” debug token:
Snmpd.exe -I-udp,udpTable,tcp,tcpTable,icmp,ip,interfaces,snmp_mib  -DwinExtDLL 



Thursday, February 7, 2013

Shell script to check Oracle Tablespace usage

I searched for a shell script to check Oracle tablespace usage. Most of the scripts I found use complex SQL statements and don’t report usage accurately, because auto-extend and multiple data files are not taken into account in the calculation. In fact, starting from Oracle 10g there is a built-in view, “dba_tablespace_usage_metrics”, for exactly this purpose.
The following script checks Oracle database availability and tablespace usage and measures the response time. The script outputs “key=value” pairs, which can easily be discovered by LLD in Zabbix. (With LLD, Zabbix can dynamically discover any number of items to monitor without adding the items manually.)

Script sample output

db-time= 71
db-status=[OK]: Name:SYSAUX SizeMB:1024 Used%: 73 ; Name:SYSTEM SizeMB:1024 Used%: 72 ; Name:USERS SizeMB:5 Used%: 20 ; Name:TEMP SizeMB:2048 Used%: 2 ; Name:UNDOTBS1 SizeMB:2048 Used%: 1 ;  8 rows selected.
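
Zabbix discovery rules consume JSON, so the tablespace names in the db-status line have to be turned into a discovery document at some point. The sketch below shows one way to do that under stated assumptions: tablespaces_to_lld is a hypothetical helper, {#TSNAME} is a macro name chosen for illustration, and the {"data":[...]} layout should be checked against the Zabbix version in use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the script output on stdin and print a
# Zabbix-style discovery document with one entry per tablespace name.
tablespaces_to_lld() {
  local names name first=1
  # Tablespace names follow "Name:" and contain no spaces.
  names=$(grep -o 'Name:[^ ]*' | cut -d: -f2)
  printf '{"data":['
  for name in $names; do
    [ "$first" -eq 1 ] || printf ','
    first=0
    printf '{"{#TSNAME}":"%s"}' "$name"   # {#TSNAME} is an assumed macro name
  done
  printf ']}\n'
}

echo 'db-status=[OK]: Name:SYSAUX SizeMB:1024 Used%: 73 ; Name:SYSTEM SizeMB:1024 Used%: 72 ;' |
  tablespaces_to_lld
# prints: {"data":[{"{#TSNAME}":"SYSAUX"},{"{#TSNAME}":"SYSTEM"}]}
```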

The Oracle login used by the script needs permission to read the view, or must have the “select_catalog_role” role granted.

Script detail

function checkdb {

t1="$(date +%s%N)"

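#TABLESPACE_SIZE is in database blocks; *8/1024 converts to MB for an 8 KB block size
rt=$($ORACLE_HOME/bin/sqlplus -S ${OUSER}/${OPASS}@${TNSNAME}<< _END
set heading off
set linesize 200
select
   'Name:'|| tablespace_name,
   'SizeMB:'||round(TABLESPACE_SIZE*8/1024)||' Used%:',
   round(used_percent), ';'
from dba_tablespace_usage_metrics
order by 3 desc;
_END
)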

t2="$(date +%s%N)"
echo "db-time= $((($t2 - $t1)/1000000))"
#Remove blank lines, ignore UNDOTBS, take the first (highest) row and strip tabs and spaces to get the numeric value
tbpct=$(echo "$rt" | egrep -v '^$|UNDOTBS' | head -1 | sed 's/.*Used%:\(.*\);/\1/'  |  sed 's/[ \t]*//g')
#Critical condition: non-numeric value returned or threshold > 95
if ! [[ "$tbpct" =~ ^[0-9]+$ ]] || [ "$tbpct" -gt 95 ] ; then
 echo "db-status=[CRITICAL]:" $rt
else
 echo "db-status=[OK]:" $rt
fi
}
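
The threshold check can be tried without a database by feeding it rows shaped like the sqlplus output. The sketch below is only an illustration of that logic: classify_usage is a hypothetical helper, and the optional first argument (default 95) is an assumed way of passing the threshold.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the check in checkdb: reads sqlplus-style rows
# on stdin and prints OK or CRITICAL. Optional argument: threshold (default 95).
classify_usage() {
  local limit=${1:-95} pct
  # Rows are sorted by usage, so the first non-blank, non-UNDOTBS row is the highest.
  pct=$(grep -Ev '^[[:space:]]*$|UNDOTBS' | head -1 |
        sed 's/.*Used%:\(.*\);.*/\1/' | tr -d '[:space:]')
  # Anything non-numeric (empty output, ORA- errors) counts as critical.
  if ! [[ $pct =~ ^[0-9]+$ ]] || [ "$pct" -gt "$limit" ]; then
    echo CRITICAL
  else
    echo OK
  fi
}

echo 'Name:SYSAUX SizeMB:1024 Used%:         73 ;' | classify_usage      # prints: OK
echo 'Name:USERS SizeMB:5 Used%:         97 ;' | classify_usage          # prints: CRITICAL
```

Testing the numeric pattern first means the integer comparison never runs on an error message or an empty string.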