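Today I wrote a database monitoring script. While testing whether it could raise alerts properly, I found that the mail could not be sent.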
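The environment is as follows. Most machines in the system sit on the 192.168.1 subnet, which is the trust area, and the database host is on this subnet as well. A few other machines sit on the 192.168.3 subnet, which is the DMZ area. Mail cannot be sent directly from the database host; it has to be relayed through an SMTP server on the 192.168.3 subnet. Assume this SMTP server's IP is 192.168.3.99. Below we configure the db host so that mail sent with the mailx command on it is relayed to the SMTP server for delivery.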
The OS on the db host is AIX 5.3, and the file to configure is /etc/sendmail.cf. We back up this file first and then modify it.
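The backup step could look like the sketch below. The helper name backup_cf and the timestamped .bak suffix are illustrative assumptions, not something the original setup prescribes.

```shell
# Hypothetical helper: copy a config file to a timestamped backup next to it.
# Defaults to /etc/sendmail.cf; prints the backup path on success.
backup_cf() {
    cf_file=${1:-/etc/sendmail.cf}
    backup="${cf_file}.$(date +%Y%m%d%H%M%S).bak"
    # -p keeps the original mode and timestamps on the copy
    cp -p "$cf_file" "$backup" && echo "$backup"
}
```

On the db host this would be run as root, for example as backup_cf /etc/sendmail.cf, before any line in the file is touched.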
In this file, find the following lines (add them yourself if they are missing):
# for sendmail
DSsmtp:[192.168.3.99]
DwMYDAB02
Cwlocalhost
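Here DSsmtp:[192.168.3.99] sets the smart relay host, that is, the SMTP server at 192.168.3.99. The square brackets make sendmail use the address directly instead of doing an MX lookup.
Dw is followed directly by the local host name (MYDAB02 here).
Cw is followed by localhost; the Cw class lists the names sendmail treats as the local host.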
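To double-check the three settings after editing, something like the following sketch can list them. The function name show_relay_settings is hypothetical; it relies only on grep -E.

```shell
# Hypothetical helper: print the DS, Dw and Cw lines of a sendmail.cf-style file.
# Commented-out lines (starting with #) are not matched.
show_relay_settings() {
    cf_file=${1:-/etc/sendmail.cf}
    grep -E '^(DS|Dw|Cw)' "$cf_file"
}
```

Each of DS, Dw and Cw should show up with the expected value. A stock sendmail.cf may already contain its own DS or Dw line, so seeing more than one line for the same letter is a hint that an older entry still needs to be removed or commented out.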
After changing the parameters above, refresh the sendmail service so that it picks up the new configuration:
refresh -s sendmail
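Before wiring the relay into a monitoring script, a one-off test message is a cheap way to confirm it works. The sketch below assumes mailx lives at /usr/bin/mailx (the path used by the script in this post); send_test_mail is a hypothetical helper and the subject text is arbitrary.

```shell
# Hypothetical helper: send a one-line test message through mailx.
# MAIL_TOOL can be overridden; it defaults to /usr/bin/mailx.
send_test_mail() {
    recipient=$1
    echo "sendmail relay test from $(hostname) at $(date)" |
        "${MAIL_TOOL:-/usr/bin/mailx}" -s "relay test" "$recipient"
}
```

Calling send_test_mail with your own address should produce a message in that mailbox shortly afterwards. If nothing arrives, the local mail queue and the relay's logs are the first places to look.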
The db host can now use the mailx command to relay the monitoring alert mail to the SMTP server, which then sends it out centrally:
#!/usr/bin/sh
###################################################################
# This script is written by username@cn.ibm.com at 2010-08-12.
# Because HQ monitor cannot cover all the db parameters,
# they need to be monitored by this script.
# main_normal.sh monitors the normal processes and runs every 2 hours
# main_crital.sh monitors the critical processes and runs every 2 mins
####################################################################
#### PARAMETER AND WORKING PATH SETTING
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=/u01/app/oracle/oracle/product/10.2.0/db_1
export PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_SID=MDBPRD
WORKPATH=/u03/db_monitor
LOGPATH=${WORKPATH}/log
SRPTPATH=${WORKPATH}/bin
MAILPATH=${WORKPATH}/mailresult
CLOG=${LOGPATH}/db_monitor_${ORACLE_SID}_$(date +%Y%m%d).clog
NLOG=${LOGPATH}/db_monitor_${ORACLE_SID}_$(date +%Y%m%d).nlog
MRESULT=${MAILPATH}/mail_result_${ORACLE_SID}_$(date +%Y%m%d).mresult
MAIL_TOOL=/usr/bin/mailx
TO_MAIL=jianminh@cn.ibm.com
CC_MAIL=jianminh@cn.ibm.com
cd ${WORKPATH}
#### CHECKING CRITICAL PROCESS
v_lsnr=`ps -ef |grep tns |grep -v grep |wc -l`
v_process1521=`netstat -an |grep 1521|grep -v grep |wc -l`
v_crit_process=`ps -ef |grep ora_ |grep ${ORACLE_SID} |grep -v grep|wc -l`
#### WRITE CHECKING RESULT TO LOG
echo "#################################################">>$CLOG
echo "============ CRITICAL REPORT BEGIN ============">>$CLOG
date>>$CLOG
echo "====THE NUMBER OF LSNR====">>$CLOG
echo $v_lsnr>>$CLOG
echo " ">>$CLOG
echo "====THE NUMBER OF PROCESS USING PORT 1521====">>$CLOG
echo $v_process1521>>$CLOG
echo " ">>$CLOG
echo "====THE NUMBER OF ORACLE BGPROCESS====">>$CLOG
echo $v_crit_process>>$CLOG
echo " ">>$CLOG
echo "============= CRITICAL REPORT END =============">>$CLOG
if [ $v_lsnr -lt 1 ]
then
cat /dev/null > $MRESULT
tail -12 $CLOG>$MRESULT
$MAIL_TOOL -s "IMPORTANT! MYDAB02 LSNR DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
echo " ok "
fi
if [ $v_process1521 -lt 10 ]
then
cat /dev/null > $MRESULT
tail -12 $CLOG>$MRESULT
$MAIL_TOOL -s "IMPORTANT! MYDAB02 PORT 1521 DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
echo " ok "
fi
if [ $v_crit_process -lt 5 ]
then
cat /dev/null > $MRESULT
tail -12 $CLOG>$MRESULT
$MAIL_TOOL -s "IMPORTANT! MYDAB02 ORACLE DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
echo " OK "
fi
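The three if blocks in the script repeat the same compare-and-alert pattern. As a sketch only, and not part of the original script, the comparison could be factored into one helper. The name check_min and its output format are hypothetical.

```shell
# Hypothetical helper: compare a counted value against its minimum.
# Prints "ALERT ..." and returns 1 when value < min; prints "ok ..." and returns 0 otherwise.
check_min() {
    label=$1
    value=$2
    min=$3
    if [ "$value" -lt "$min" ]; then
        echo "ALERT $label: $value is below $min"
        return 1
    fi
    echo "ok $label"
    return 0
}
```

In the script, each check would then become a single call such as check_min LSNR $v_lsnr 1, with the tail and mailx commands run only when the call returns non-zero. Passing $v_lsnr unquoted lets the shell strip any padding that wc -l adds around the number.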