环境说明:
DB :Oracle 11.2.0.4.0
OS:Redhat 7.9
问题:
PGA 占用内存告警。
告警信息:
[ID:U_33XXX33][CJC 集团系统(AAAA ):CJC 集团系统,10.0.10.12][ 数据库][3 级][AAAA],10.0.0.100-oracle,2022.11.15 07:21:30 PGA per Session On CCCJCXXX-DB-vip-oracle is warnning!(10.0.0.100-oracle)[CJC 集团], 故障发生时间:2022-11-15 07:21:57
恢复信息:
[ID:U_33XXX33, 故障恢复][CJC 集团系统(AAAA ):CJC 集团系统,10.0.10.12][ 数据库][3 级][AAAA],10.0.0.100-oracle,2022.11.15 07:21:30 PGA per Session On CCCJCXXX-DB-vip-oracle is warnning!(10.0.0.100-oracle)[CJC 集团], 故障恢复时间:2022-11-15 07:37:00
问题分析:
收集问题时间段的 AWR 和 ASH 报告,有两个 SQL 耗时 900 秒,和故障时间完全吻合。
耗时SQL 文本如下:
—1
900 秒
begin dbms_feature_usage_internal.exec_db_usage_sampling(:bind1); end;
—2
900.24 秒
SQL_ID:2d1p0p5k3f8fu
select p, NULL, NULL from (select count(*) p from v$rman_status where operation = 'BLOCK MEDIA RECOVERY');
生成SQL 执行计划
set linesize 150
set pagesize 2000
select * from TABLE(dbms_xplan.display_cursor('2d1p0p5k3f8fu'));
Plan hash value: 139239320
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)| |
| 1 | VIEW | | 1 | 13 | 4 (100)| 00:00:01 |
| 2 | SORT AGGREGATE | | 1 | 153 | | |
|* 3 | FILTER | | | | | |
|* 4 | HASH JOIN OUTER | | 1 | 153 | 4 (100)| 00:00:01 |
| 5 | MERGE JOIN CARTESIAN | | 1 | 96 | 4 (100)| 00:00:01 |
| 6 | MERGE JOIN CARTESIAN | | 1 | 70 | 2 (100)| 00:00:01 |
| 7 | FIXED TABLE FULL | X$KCCRSR | 1 | 44 | | |
| 8 | BUFFER SORT | | 1 | 26 | 2 (100)| 00:00:01 |
| 9 | VIEW | | 1 | 26 | 2 (100)| 00:00:01 |
| 10 | HASH GROUP BY | | 1 | 91 | 2 (100)| 00:00:01 |
|* 11 | HASH JOIN OUTER | | 1 | 91 | 1 (100)| 00:00:01 |
| 12 | FIXED TABLE FULL | X$KCCRSR | 1 | 26 | | |
| 13 | VIEW | | 1 | 65 | 1 (100)| 00:00:01 |
| 14 | HASH GROUP BY | | 1 | 65 | 1 (100)| 00:00:01 |
| 15 | FIXED TABLE FULL| X$KSFQP | 1 | 65 | | |
| 16 | BUFFER SORT | | 1 | 26 | 2 (100)| 00:00:01 |
| 17 | VIEW | | 1 | 26 | 2 (100)| 00:00:01 |
| 18 | HASH UNIQUE | | 1 | 62 | 2 (100)| 00:00:01 |
|* 19 | HASH JOIN OUTER | | 1 | 62 | 1 (100)| 00:00:01 |
| 20 | FIXED TABLE FULL | X$KCCRSR | 1 | 26 | | |
| 21 | VIEW | | 1 | 36 | 1 (100)| 00:00:01 |
| 22 | WINDOW SORT | | 1 | 49 | 1 (100)| 00:00:01 |
|* 23 | FIXED TABLE FULL | X$KSFQP | 1 | 49 | | |
|* 24 | FIXED TABLE FULL | X$KRBMRST | 1 | 57 | | |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter((NVL(UPPER("OPER_KRBMRST"),UPPER("R2"."RSROP"))='BLOCK MEDIA
RECOVERY' AND "HH"."RECID"=NVL("ID_KRBMRST","R2"."RSRRID") AND
"HH"."STAMP"=NVL("STAMP_KRBMRST","R2"."RSRTST") AND
"ODEV"."RECID"=NVL("ID_KRBMRST","R2"."RSRRID") AND
"ODEV"."STAMP"=NVL("STAMP_KRBMRST","R2"."RSRTST")))
4 - access("R2"."RSRRID"="ID_KRBMRST" AND "R2"."RSRTST"="STAMP_KRBMRST")
11 - access("R"."RSRRID"="RS"."RMAN_STATUS_RECID" AND
"R"."RSRTST"="RS"."RMAN_STATUS_STAMP")
19 - access("R"."RSRRID"="RS"."RMAN_STATUS_RECID" AND
"R"."RSRTST"="RS"."RMAN_STATUS_STAMP")
23 - filter("TYPE"=2)
24 - filter(("STATUS_KRBMRST"=1 OR "STATUS_KRBMRST"=9 OR "STATUS_KRBMRST"=17 OR
"STATUS_KRBMRST"=25))
54 rows selected.
检查表数据量
select count(*) from sys.X$KCCRSR; ---4512
select count(*) from sys.X$KSFQP; ---1703
select count(*) from sys.X$KRBMRST; ---0
表虽然都不大,但是通过MERGE JOIN CARTESIAN 连接方式,最多可以生成4512*1703=7683936 结果集。
查看历史执行信息,执行开始时间没什么规律,单次执行耗时基本都是15 分钟。
SET PAGESIZE 10000
SET LINE 300
COL EVENT FOR A30
select TO_CHAR(SAMPLE_TIME,'YYYY-MM-DD HH24:MI:SS') TIME,SQL_ID,EVENT,USER_ID FROM dba_hist_active_sess_history WHERE SQL_ID='2d1p0p5k3f8fu' ORDER BY 1 desc;
TIME SQL_ID EVENT USER_ID
-------------------------------------- -------------------------- ------------------------------ ----------
2022-11-15 07:34:05 2d1p0p5k3f8fu direct path write temp 0
......
2022-11-15 07:19:14 2d1p0p5k3f8fu direct path write temp 0
......
2022-11-14 02:33:41 2d1p0p5k3f8fu 0
......
2022-11-14 02:18:49 2d1p0p5k3f8fu direct path write temp 0
......
解决方案
通过SQL 文本和执行计划可以定位到,此问题和Bug 14078947 吻合
- 禁用_optimizer_mjc_enabled 参数
alter system set "_optimizer_mjc_enabled" = false;
- 删除并锁定X$KCCRSR 统计信息
exec dbms_stats.DELETE_TABLE_STATS('SYS','X$KCCRSR');
exec dbms_stats.LOCK_TABLE_STATS('SYS','X$KCCRSR');
exec dbms_stats.UNLOCK_TABLE_STATS('SYS','X$KCCRSR');
检查效果 , 耗时由 15 分钟提升到 0.07 秒。
set timing on
select p, NULL, NULL from (select count(*) p from v$rman_status where operation = 'BLOCK MEDIA RECOVERY');
P N N
---------- - -
0
Elapsed: 00:00:00.07
查看新执行计划,表连接方式已经由MERGE JOIN CARTESIAN 换成HASH JOIN 。
set line 300
set pagesize 1000
set autotrace on
select p, NULL, NULL from (select count(*) p from v$rman_status where operation = 'BLOCK MEDIA RECOVERY');
Execution Plan
----------------------------------------------------------
Plan hash value: 2827575901
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4 (100)| 00:00:01 |
| 1 | VIEW | | 1 | 13 | 4 (100)| 00:00:01 |
| 2 | SORT AGGREGATE | | 1 | 153 | | |
|* 3 | HASH JOIN | | 1 | 153 | 4 (100)| 00:00:01 |
|* 4 | HASH JOIN | | 1 | 127 | 2 (100)| 00:00:01 |
| 5 | VIEW | | 100 | 2600 | 2 (100)| 00:00:01 |
| 6 | HASH GROUP BY | | 100 | 9100 | 2 (100)| 00:00:01 |
|* 7 | HASH JOIN OUTER | | 100 | 9100 | 1 (100)| 00:00:01 |
| 8 | FIXED TABLE FULL | X$KCCRSR | 100 | 2600 | 0 (0)| 00:00:01 |
| 9 | VIEW | | 1 | 65 | 1 (100)| 00:00:01 |
| 10 | HASH GROUP BY | | 1 | 65 | 1 (100)| 00:00:01 |
| 11 | FIXED TABLE FULL| X$KSFQP | 1 | 65 | | |
|* 12 | FILTER | | | | | |
|* 13 | HASH JOIN OUTER | | 100 | 10100 | 0 (0)| 00:00:01 |
| 14 | FIXED TABLE FULL | X$KCCRSR | 100 | 4400 | 0 (0)| 00:00:01 |
|* 15 | FIXED TABLE FULL | X$KRBMRST | 1 | 57 | | |
| 16 | VIEW | | 100 | 2600 | 2 (100)| 00:00:01 |
| 17 | HASH UNIQUE | | 100 | 6200 | 2 (100)| 00:00:01 |
|* 18 | HASH JOIN OUTER | | 100 | 6200 | 1 (100)| 00:00:01 |
| 19 | FIXED TABLE FULL | X$KCCRSR | 100 | 2600 | 0 (0)| 00:00:01 |
| 20 | VIEW | | 1 | 36 | 1 (100)| 00:00:01 |
| 21 | WINDOW SORT | | 1 | 49 | 1 (100)| 00:00:01 |
|* 22 | FIXED TABLE FULL | X$KSFQP | 1 | 49 | | |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ODEV"."RECID"=NVL("ID_KRBMRST","R2"."RSRRID") AND
"ODEV"."STAMP"=NVL("STAMP_KRBMRST","R2"."RSRTST"))
4 - access("HH"."RECID"=NVL("ID_KRBMRST","R2"."RSRRID") AND
"HH"."STAMP"=NVL("STAMP_KRBMRST","R2"."RSRTST"))
7 - access("R"."RSRRID"="RS"."RMAN_STATUS_RECID"(+) AND
"R"."RSRTST"="RS"."RMAN_STATUS_STAMP"(+))
12 - filter(NVL(UPPER("OPER_KRBMRST"),UPPER("R2"."RSROP"))='BLOCK MEDIA
RECOVERY')
13 - access("R2"."RSRRID"="ID_KRBMRST"(+) AND
"R2"."RSRTST"="STAMP_KRBMRST"(+))
15 - filter("STATUS_KRBMRST"(+)=1 OR "STATUS_KRBMRST"(+)=9 OR
"STATUS_KRBMRST"(+)=17 OR "STATUS_KRBMRST"(+)=25)
18 - access("R"."RSRRID"="RS"."RMAN_STATUS_RECID"(+) AND
"R"."RSRTST"="RS"."RMAN_STATUS_STAMP"(+))
22 - filter("TYPE"=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
650 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
通过10053 看下SQL 转换情况
alter session set tracefile_identifier='10053';
alter session set events='10053 trace name context forever,level 1';
select count(*) from v$rman_status;
alter session set events '10053 trace name context off';
select value from v$diag_info where name='Default Trace File';
查看SQL 转换
原始SQL
******************************************
----- Current SQL Statement for this session (sql_id=bv3y9sfd6dx38) -----
select p, NULL, NULL from (select count(*) p from v$rman_status where operation = 'BLOCK MEDIA RECOVERY')
*******************************************
转换后的SQL
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "from$_subquery$_001"."P" "P", NULL "NULL", NULL "NULL"
FROM (SELECT COUNT(*) "P"
FROM SYS."X$KCCRSR" "R2",
SYS."X$KRBMRST" "X$KRBMRST",
(SELECT "R"."RSRRID" "RECID",
"R"."RSRTST" "STAMP",
SUM("RS"."AGGRCOL") / 1048576 "MBYTES",
SUM("RS"."INPCOL") "INPBYTES",
SUM("RS"."OUTCOL") "OUTBYTES"
FROM SYS."X$KCCRSR" "R",
(SELECT "X$KSFQP"."RMAN_STATUS_RECID" "RMAN_STATUS_RECID",
"X$KSFQP"."RMAN_STATUS_STAMP" "RMAN_STATUS_STAMP",
SUM(CASE "X$KSFQP"."TYPE"
WHEN 3 THEN
"X$KSFQP"."BLOCKS" *
"X$KSFQP"."BLOCK_SIZE"
ELSE
0
END) "AGGRCOL",
SUM(CASE "X$KSFQP"."TYPE"
WHEN 1 THEN
"X$KSFQP"."BLOCKS" *
"X$KSFQP"."BLOCK_SIZE"
ELSE
0
END) "INPCOL",
SUM(CASE "X$KSFQP"."TYPE"
WHEN 2 THEN
"X$KSFQP"."BLOCKS" *
"X$KSFQP"."BLOCK_SIZE"
ELSE
0
END) "OUTCOL"
FROM SYS."X$KSFQP" "X$KSFQP"
GROUP BY "X$KSFQP"."RMAN_STATUS_RECID",
"X$KSFQP"."RMAN_STATUS_STAMP") "RS"
WHERE "R"."RSRRID" = "RS"."RMAN_STATUS_RECID"(+)
AND "R"."RSRTST" = "RS"."RMAN_STATUS_STAMP"(+)
GROUP BY "R"."RSRRID", "R"."RSRTST") "HH",
(SELECT DISTINCT "R"."RSRRID" "RECID",
"R"."RSRTST" "STAMP",
"RS"."DEVICE_TYPE" "DEVICE_TYPE"
FROM SYS."X$KCCRSR" "R",
(SELECT "X$KSFQP"."RMAN_STATUS_RECID" "RMAN_STATUS_RECID",
"X$KSFQP"."RMAN_STATUS_STAMP" "RMAN_STATUS_STAMP",
DECODE(COUNT(DISTINCT "X$KSFQP"."DEVTYPE")
OVER(PARTITION BY
"X$KSFQP"."RMAN_STATUS_RECID",
"X$KSFQP"."RMAN_STATUS_STAMP"),
1,
FIRST_VALUE("X$KSFQP"."DEVTYPE")
OVER(PARTITION BY
"X$KSFQP"."RMAN_STATUS_RECID",
"X$KSFQP"."RMAN_STATUS_STAMP"),
0,
NULL,
'*') "DEVICE_TYPE"
FROM SYS."X$KSFQP" "X$KSFQP"
WHERE "X$KSFQP"."TYPE" = 2) "RS"
WHERE "R"."RSRRID" = "RS"."RMAN_STATUS_RECID"(+)
AND "R"."RSRTST" = "RS"."RMAN_STATUS_STAMP"(+)) "ODEV"
WHERE NVL(UPPER("X$KRBMRST"."OPER_KRBMRST"), UPPER("R2"."RSROP")) =
'BLOCK MEDIA RECOVERY'
AND "HH"."RECID" = NVL("X$KRBMRST"."ID_KRBMRST", "R2"."RSRRID")
AND "HH"."STAMP" =
NVL("X$KRBMRST"."STAMP_KRBMRST", "R2"."RSRTST")
AND "ODEV"."RECID" = NVL("X$KRBMRST"."ID_KRBMRST", "R2"."RSRRID")
AND "ODEV"."STAMP" =
NVL("X$KRBMRST"."STAMP_KRBMRST", "R2"."RSRTST")
AND "R2"."RSRRID" = "X$KRBMRST"."ID_KRBMRST"(+)
AND "R2"."RSRTST" = "X$KRBMRST"."STAMP_KRBMRST"(+)
AND ("X$KRBMRST"."STATUS_KRBMRST"(+) = 1 OR
"X$KRBMRST"."STATUS_KRBMRST"(+) = 9 OR
"X$KRBMRST"."STATUS_KRBMRST"(+) = 17 OR
"X$KRBMRST"."STATUS_KRBMRST"(+) = 25)) "from$_subquery$_001"
欢迎关注我的公众号《IT小Chen》