1. Health Monitor简介
Health Monitor是11g里新增加的特性,用于数据库的各层和各个组建的诊断检查。例如可以检查:文件损坏、物理逻辑块损坏、redo和undo故障、数据字典损坏等。HM可以根据检查的结果产生一个报表,并提供解决问题的建议。
1.1 运行方式:
1). Reactive
Fault diagnosability infrastructure能自动响应严重的错误(critical error)。
2). Manually
可以通过DBMS_HM系统包或 Enterprise Manager来手工运行HM。
1.2 运行模式
1). DB-ONLINE
2). DB-OFFLINE
所有HM都能在online模式下运行,只有Redo Integrity Check 和 DB Structure Integrity Check能在offline模式下运行。
2. HM的类型
也可以通过视图来查询:
SYS@PROD1>SELECT name ,DESCRIPTION FROM v$hm_check WHERE internal_check='N'; --internal_check='N' 可以手动检查的项目
NAME DESCRIPTION
---------------------------------------------------------------- --------------------------------------------------
DB Structure Integrity Check Checks integrity of all database files
CF Block Integrity Check Checks integrity of a control file block
Data Block Integrity Check Checks integrity of a data file block
Redo Integrity Check Checks integrity of redo log content
Transaction Integrity Check Checks a transaction for corruptions
Undo Segment Integrity Check Checks integrity of an undo segment
Dictionary Integrity Check Checks dictionary integrity
ASM Allocation Check Diagnose allocation failure
8 rows selected.
3. 手工运行HM
3.1 通过DBMS_HM系统包
例如检查字典完整性检查:
BEGIN
DBMS_HM.RUN_CHECK(CHECK_NAME=>'Dictionary Integrity Check',RUN_NAME=>'my_run5');
END;
/
该程序允许有输入参数,可以通过如下方式查找:
set linesize 200
set pagesize 200
col check_name for a50
col parameter_name for a20
col type for a20
col default_value for a20
col description for a50
SELECT c.name check_name, p.name parameter_name, p.type,p.default_value, p.description
FROM v$hm_check_param p, v$hm_check c
WHERE p.check_id = c.id and c.internal_check = 'N'
ORDER BY c.name;
CHECK_NAME PARAMETER_NAME TYPE DEFAULT_VALUE DESCRIPTION
-------------------------------------------------- -------------------- -------------------- -------------------- --------------------------------------------------
ASM Allocation Check ASM_DISK_GRP_NAME DBKH_PARAM_TEXT ASM group name
CF Block Integrity Check CF_BL_NUM DBKH_PARAM_UB4 Control file block number
Data Block Integrity Check BLC_DF_NUM DBKH_PARAM_UB4 File number
Data Block Integrity Check BLC_BL_NUM DBKH_PARAM_UB4 Block number
Dictionary Integrity Check CHECK_MASK DBKH_PARAM_TEXT ALL Check mask
Dictionary Integrity Check TABLE_NAME DBKH_PARAM_TEXT ALL_CORE_TABLES Table name
Redo Integrity Check SCN_TEXT DBKH_PARAM_TEXT 0 SCN of the latest good redo (if known)
Transaction Integrity Check TXN_ID DBKH_PARAM_TEXT Transaction ID
Undo Segment Integrity Check USN_NUMBER DBKH_PARAM_TEXT Undo segment number
9 rows selected.
带参数实例:
BEGIN
DBMS_HM.RUN_CHECK (check_name => 'Transaction Integrity Check',run_name => 'my_run1', input_params => 'TXN_ID=7.33.2');
DBMS_HM.RUN_CHECK(CHECK_NAME=>'Data Block Integrity Check',RUN_NAME=>'my_run2',TIMEOUT=>null,INPUT_PARAMS=>'BLC_DF_NUM = 5; BLC_BL_NUM = 203');
END;
3.2 通过OEM允许
Advisor Central -》Checkers -》 Run
4. 查看HM报表
4.1 通过DBMS_HM系统包
实例:
SET LONG 100000
SET LONGCHUNKSIZE 1000
SET PAGESIZE 1000
SET LINESIZE 512
SELECT DBMS_HM.GET_RUN_REPORT('my_run3') FROM DUAL;
DBMS_HM.GET_RUN_REPORT('MY_RUN3')
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Basic Run Information
Run Name : my_run3
Run Id : 381
Check Name : Data Block Integrity Check
Mode : MANUAL
Status : COMPLETED
Start Time : 2019-08-20 15:26:50.517521 +08:00
End Time : 2019-08-20 15:26:50.964425 +08:00
Error Encountered : 0
Source Incident Id : 0
Number of Incidents Created : 0
Input Paramters for the Run
BLC_DF_NUM=1
BLC_BL_NUM=23456
Run Findings And Recommendations
4.2 通过ADRCI查看
adrci> show hm_run
adrci> show hm_run
ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_3987733264_80:
*************************************************************************
0 rows fetched
<ADR_RELATION name="">
<ADR_HOME name="/u01/app/oracle/diag/clients/user_oracle/host_3987733264_80">
ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_772309917_80:
*************************************************************************
0 rows fetched
</ADR_HOME>
<ADR_HOME name="/u01/app/oracle/diag/clients/user_oracle/host_772309917_80">
ADR Home = /u01/app/oracle/diag/rdbms/prod1/PROD1:
*************************************************************************
**********************************************************
HM RUN RECORD 4
**********************************************************
RUN_ID 381
RUN_NAME my_run3
CHECK_NAME Data Block Integrity Check
NAME_ID 3
MODE 0
START_TIME 2019-08-20 15:26:50.517521 +08:00
RESUME_TIME <NULL>
END_TIME 2019-08-20 15:26:50.964425 +08:00
MODIFIED_TIME 2019-08-20 15:30:51.962982 +08:00
TIMEOUT 0
FLAGS 0
STATUS 5
SRC_INCIDENT_ID 0
NUM_INCIDENTS 0
ERR_NUMBER 0
REPORT_FILE /u01/app/oracle/diag/rdbms/prod1/PROD1/hm/HMREPORT_my_run3.hm
4.3 通过OEM
Advisor Central -》Checkers-》Runs -》View Report
5. HM视图
除了可以创建一个checker报表外,还可以在ADR里直接查看已经产生的报表。可查看的视图有:
V$HM_RUN
,V$HM_FINDING
, and V$HM_RECOMMENDATION
.
例如:
SYS@PROD1>SELECT run_id, name, check_name, run_mode, src_incident FROM v$hm_run;
RUN_ID NAME CHECK_NAME RUN_MODE SRC_INCIDENT
---------- -------------------------------- -------------------------------------------------- -------- ------------
301 my_run1 Data Block Integrity Check MANUAL 0
361 my_run2 Data Block Integrity Check MANUAL 0
381 my_run3 Data Block Integrity Check MANUAL 0
401 my_run5 Dictionary Integrity Check MANUAL 0
241 my_run Dictionary Integrity Check MANUAL 0
SYS@PROD1> SELECT type, description FROM v$hm_finding WHERE run_id =381;
TYPE DESCRIPTION
------------- -----------------------------------------
FAILURE Block 64349 in datafile 1: '/u01/app/oracle/oradata/PROD1/system01.dbf ' is media corrupt
FAILURE Block 64351 in datafile 1: '/u01/app/oracle/oradata/PROD1/system01.dbf ' is media corrupt