对于生产数据库,DBA除了备份外,监控也是重中之重,通常需要监控的包含:
1,备份视图
备份监控v$rman_backup_job_details
select d.name,
d.dbid,
to_char(r.start_time,'mm/dd/yyyy hh24:mi:ss') start_time,
to_char(r.end_time,'mm/dd/yyyy hh24:mi:ss') end_time,
r.TIME_TAKEN_DISPLAY,
status,
--decode(r.status,'COMPLETED','DONE','WARNING/ERROR') "BACKUP STATUS",
session_recid
from v$database d,V$RMAN_BACKUP_JOB_DETAILS r
where to_char(r.start_time,'mm/dd/yyyy hh24:mi:ss')=(select to_char(max(start_time),'mm/dd/yyyy hh24:mi:ss') from v$rman_backup_job_details)
NAME DBID START_TIME END_TIME TIME_TAKEN_DISPLAY STATUS SESSION_RECID
---------- ---------- ------------------- ------------------- -------------------- ----------------------- -------------
HDB 1582089372 06/09/2012 12:00:04 06/09/2012 12:00:46 00:00:42 COMPLETED WITH WARNINGS 614
查看备份进度V$SESSION_LONGOPS
SELECT SID, SERIAL#, CONTEXT, SOFAR, TOTALWORK,
ROUND(SOFAR/TOTALWORK*100,2) "%_COMPLETE"
FROM V$SESSION_LONGOPS
WHERE OPNAME LIKE 'RMAN%'
AND OPNAME NOT LIKE '%aggregate%'
AND TOTALWORK != 0
AND SOFAR <> TOTALWORK;
通过shell脚本收集备份信息例子-1
[oracle@oracle backup]$ less rman_backup_report.sh
#! /bin/bash
# Author : ***
# Date : 06/11/2012 11:10:00
# Purpose: To check the rman backup status of database
# Define variable
THIS_SCRIPT=$(basename $0)
# Start script
DBNAME="hdb.oracle.sannet.net itods.oracle.sannet.net RDPSDB"
DATE=`date +"%Y%m%d"`
export SH_HOME=/home/oracle/backup
export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 //当用到环境变量时一定要export,否则cron执行会出现问题.
for i in $DBNAME
do
$ORACLE_HOME/bin/sqlplus sys/oracle@$i as sysdba > $SH_HOME/rman_$i.log << EOF
@SH_HOME/rman_report.sql
exit;
EOF
cat $SH_HOME/rman_$i.log |awk NF|sed -n '9,9p' >> $SH_HOME/rman_temp_$DATE.log
done
echo "Database RMAN backup status report on $HOSTNAME `date +%m/%d/%Y`" > $SH_HOME/rman_report_$DATE.log
echo "Start report ........................................................" >> $SH_HOME/rman_report_$DATE.log
echo -e '\t' >> $SH_HOME/rman_report_$DATE.log
echo "DBNAME DBID START_TIME END_TIME TIME_TAKEN_DISPLAY STATUS SESSION_RECID" >> $S
H_HOME/rman_report_$DATE.log
cat $SH_HOME/rman_temp_$DATE.log >> $SH_HOME/rman_report_$DATE.log
echo -e '\t' >> $SH_HOME/rman_report_$DATE.log
echo "End report .........................................................." >> $SH_HOME/rman_report_$DATE.log
# Check report
if [ -f "$SH_HOME/rman_report_$DATE.log" ];
then grep -E -i "invalid|failed|warnings|SP2-0306|errors" $SH_HOME/rman_report_$DATE.log >> /dev/null
if [ $? -eq 0 ];
then mail -s "Error Backup Status Report on $HOSTNAME `date +"%m/%d/%Y"`" monitor@oraclegc.com < $SH_HOME/rman_report_$DATE.log
else
mail -s "Successful Backup Status Report on $HOSTNAME `date +"%m/%d/%Y"`" monitor@oraclegc.com < $SH_HOME/rman_report_$DATE.log
fi
else mail -s "No Backup Status Report on $HOSTNAME `date +"%m/%d/%Y"`" monitor@oraclegc.com << EOF
No backup status report on $HOSTNAME `date +"%m/%d/%Y"`, please check it!!!
EOF
fi
[oracle@oracle backup]$less rman_report.sql
set linesize 132
col TIME_TAKEN_DISPLAY for a20
select d.name DBNAME,
d.dbid DBID,
to_char(r.start_time,'mm/dd/yyyy hh24:mi:ss') start_time,
to_char(r.end_time,'mm/dd/yyyy hh24:mi:ss') end_time,
r.TIME_TAKEN_DISPLAY,
status,
--decode(r.status,'COMPLETED','DONE','WARNING/ERROR') "BACKUP STATUS",
session_recid SESSION_RECID
from v$database d,V$RMAN_BACKUP_JOB_DETAILS r
where to_char(r.start_time,'mm/dd/yyyy hh24:mi:ss')=(select to_char(max(start_time),'mm/dd/yyyy hh24:mi:ss')
from v$rman_backup_job _details);
exit;
b. 对log进行筛选,有异常时发送告警邮件,见上script的check report部分.2,磁盘空间监控
select name,round((1-free_mb/total_mb),4)*100 USED_PCT,ROUND(FREE_MB/1024,3) FREE_GB FROM V$ASM_DISKGROUP;
NAME USED_PCT FREE_GB------------------------------ ---------- ----------
DATA 66.45 1.676
RECOVERY 83.99 .8
使用脚本定期检查ASM磁盘组的使用比例,及时进行磁盘的调整.
3,alert日志监控
alter日志的监控可以通过系统定义对alter日志进行筛选,并发出错误与异常的邮件;
当alert日志出现ORA-,errors,failure等字眼时发送邮件通知DBA去查看异常.我们只需要在crontab里定义每半小时,对alert日志进行一次处理.
参见check_alert.sh.
4.通过Grid Control EM进行Metrics监控.
定义的内容主要
主机状态:数据库的开关机信息,Alert ORA-#####异常.
主机负载:CPU/内存负载,数据库服务器其他指定的重要目录空间使用百分比,磁盘I/O.
数据库空间:表空间,归档空间,闪回空间.
数据库应用:锁定的会话,用户等待事件.
可以参考<Oracle Enterprise Manager邮件及规则>