DBA手记:OEM罪几何?-空间监控的性能问题
在某金融行业用户的ERP数据库中,一个小时的采样报告,位于Elapsed Time消耗排行第二位的SQL消耗了19.41%的DB Time,该SQL同样是OEM发出来的,其SQL
Module是Oracle
Enterprise Manager.Metric Engine,这个SQL每次执行需要245.77秒的时间,是极其缓慢的,数据库环境是Oracle Database
10g 10.2.0.4版本:
该SQL的文本内容是:
insert into mgmt_db_size_gtt
select tablespace_name, NVL(sum(bytes) / 1048576, 0) sz
from sys.dba_free_space
group by tablespace_name
这显然是通过dba_free_space来计算各表空间的Free空间,这个SQL同样是OEM发出的,其执行计划可以通过AWR获得:
SQL> select *
from table(dbms_xplan.display_awr('4d6m2q3ngjcv9'));
insert into
mgmt_db_size_gttselecttablespace_name,NVL(sum(bytes)/1048576, 0)
sz
from
sys.dba_free_spacegroup by
tablespace_name
Plan hash value: 2413628916
---------------------------------------------------------------------------------
| Id| Operation| Name| Rows| Bytes | Cost|
---------------------------------------------------------------------------------
|0 | INSERT STATEMENT||||82 |
|1 |SORT GROUP BY||189 |5670 |82 |
|2 |VIEW|
DBA_FREE_SPACE|189 |5670 |35 |
|3 |UNION-ALL|||| |
|4 |NESTED LOOPS||68 |2584 |6 |
|5 |NESTED LOOPS||68 |2176 |6 |
|6 |TABLE ACCESS FULL| TS$|1 |23 |5 |
|7 |TABLE ACCESS CLUSTER| FET$|136 |1224 |1 |
|8 |INDEX UNIQUE SCAN|
I_FILE2|1 |6 ||
|9 |NESTED LOOPS||119 |5355 |6 |
|10 |NESTED LOOPS||119 |4641 |6 |
|11 |TABLE ACCESS FULL| TS$|19 |551 |5 |
|12 |FIXED TABLE FIXED INDEX| X$KTFBFE (ind:1)
|6 |60 |1 |
|13 |INDEX UNIQUE SCAN|
I_FILE2|1 |6 ||
|14 |NESTED LOOPS||1 |126 |20 |
|15 |NESTED LOOPS||1 |120 |20 |
|16 |NESTED LOOPS||1 |68 |3 |
|17 |TABLE ACCESS FULL| RECYCLEBIN$|1 |39 |2 |
|18 |TABLE ACCESS CLUSTER| TS$|1 |29 |1 |
|19 |INDEX UNIQUE SCAN| I_TS#|1 |||
|20 |FIXED TABLE FIXED INDEX| X$KTFBUE
(ind:1) |100 |5200 |17 |
|21 |INDEX UNIQUE SCAN| I_FILE2|1 |6 ||
|22 |NESTED LOOPS||1 |81 |3 |
|23 |NESTED LOOPS||1 |58 |2 |
|24 |NESTED LOOPS||1 |52 |2 |
|25 |TABLE ACCESS FULL| RECYCLEBIN$|1 |39 |2 |
|26 |TABLE ACCESS CLUSTER| UET$|1 |13 ||
|27 |INDEX UNIQUE SCAN| I_FILE#_BLOCK#|1 |||
|28 |INDEX UNIQUE SCAN| I_FILE2|1 |6 ||
|29 |TABLE ACCESS CLUSTER| TS$|1 |23 |1 |
|30 |INDEX UNIQUE SCAN| I_TS#|1 |||
---------------------------------------------------------------------------------
通过执行计划可以看到,在Oracle
Database 10g引入了回收站功能后,会将回收站(RECYCLEBIN$)中的空间计算为自由空间,加入到dba_free_space字典中。
如果数据库中存在大量的回收站对象,则这部分回收站空间的计算将会极为耗时,在这个数据库环境中,共有5万多个回收站对象:
SQL> select
count(*) from RECYCLEBIN$;
COUNT(*)
----------
51986
清理这些回收站对象可以大幅提升这个SQL查询的性能,在OEM中禁用这个Metric监控则可以彻底去除这个SQL访问。
在SQL报告中,显示了该SQL如下的详细信息:
在$ORACLE_HOME/rdbms/admin/catspace.sql脚本中,可以找到创建DBA_FREE_SPACE视图的脚本:
create or replace
view DBA_FREE_SPACE
(TABLESPACE_NAME, FILE_ID, BLOCK_ID,BYTES,
BLOCKS, RELATIVE_FNO)
as
select ts.name,
fi.file#, f.block#, f.length * ts.blocksize, f.length, f.file#
from sys.ts$ ts,
sys.fet$ f, sys.file$ fi
where ts.ts# = f.ts#
and f.ts# = fi.ts# and f.file# = fi.relfile# and ts.bitmapped = 0
union all
select/*+ ordered use_nl(f) use_nl(fi) */ ts.name, fi.file#, f.ktfbfebno,
f.ktfbfeblks * ts.blocksize,
f.ktfbfeblks, f.ktfbfefno
from sys.ts$ ts,
sys.x$ktfbfe f, sys.file$ fi
where ts.ts# =
f.ktfbfetsn and f.ktfbfetsn = fi.ts# and f.ktfbfefno = fi.relfile#
and ts.bitmapped <> 0 and ts.online$ in
(1,4) and ts.contents$ = 0
union all
select /*+ ordered use_nl(u) use_nl(fi) */ ts.name,
fi.file#, u.ktfbuebno,
u.ktfbueblks * ts.blocksize,
u.ktfbueblks, u.ktfbuefno
from sys.recyclebin$ rb, sys.ts$ ts,
sys.x$ktfbue u, sys.file$ fi
where ts.ts# =
rb.ts# and rb.ts# = fi.ts# and u.ktfbuefno = fi.relfile#
and u.ktfbuesegtsn = rb.ts# and
u.ktfbuesegfno = rb.file# and u.ktfbuesegbno = rb.block#
and ts.bitmapped <> 0 and ts.online$ in
(1,4) and ts.contents$ = 0
union all
select ts.name,
fi.file#, u.block#,u.length * ts.blocksize, u.length, u.file#
from sys.ts$ ts,
sys.uet$ u, sys.file$ fi, sys.recyclebin$
rb
where ts.ts# = u.ts#
and u.ts# = fi.ts# and u.segfile# = fi.relfile#
and u.ts# = rb.ts# and u.segfile# = rb.file#
and u.segblock# = rb.block# and ts.bitmapped
= 0
/
以上脚本中,后面两个UNION
ALL查询块是Oracle 10g引入的,并且为了修正这个视图带来的Bug,Oracle一直不停的在改进视图语句。注意视图中Hints的制定对于执行计划的强制影响。
我们要时刻牢记的是:当Oracle引入了某个新功能之后,同时也会引入很多问题,所以在使用新功能、新特性时要加强监控,及时发现和解决可能出现的问题。
By eygle on 2011-02-21 08:30 |
Comments (2) |
Case | 2731 |
2 Comments
曾经,我说的是曾经。
一套跑在p595满配上的10g rac系统
因为有了GC来监控,直接宕机
从此以后,所有的服务器上不准用GC