1.今天发生一起enq: HW - contention引起的性能问题,记录一下:
2.查看enq: HW - contention有关的信息
select SAMPLE_TIME,MACHINE,SESSION_ID,BLOCKING_SESSION ,
SQL_ID,EVENT,TIME_WAITED,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3
from dba_hist_active_sess_history where to_char(sample_time,'YYYY-MM-DD HH24:MI:SS') BETWEEN '2022-08-23 07:30:00' AND '2022-08-23 09:18:00' --and sql_id='7zkmy6p6ugy53'
AND event='enq: HW - contention'
结果:
3.其中p3参数是block address,我们可以使用dbms_utility工具去定位file_id与block号,进而定位object
select dbms_utility.data_block_address_file(p3) file_id,dbms_utility.data_block_address_block(p3) block# from (
select SAMPLE_TIME,MACHINE,SESSION_ID,BLOCKING_SESSION ,
SQL_ID,EVENT,TIME_WAITED,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3
from dba_hist_active_sess_history where to_char(sample_time,'YYYY-MM-DD HH24:MI:SS') BETWEEN '2022-08-23 07:30:00' AND '2022-08-23 09:18:00' ---and sql_id='7zkmy6p6ugy53'
AND event='enq: HW - contention'
)
结果:
4.上述sql再套一层,定位object,发现是回滚段
select owner,segment_name from dba_extents a,
(select dbms_utility.data_block_address_file(p3) file_id,dbms_utility.data_block_address_block(p3) block# from (
select SAMPLE_TIME,MACHINE,SESSION_ID,BLOCKING_SESSION ,
SQL_ID,EVENT,TIME_WAITED,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3
from dba_hist_active_sess_history where to_char(sample_time,'YYYY-MM-DD HH24:MI:SS') BETWEEN '2022-08-23 07:30:00' AND '2022-08-23 09:18:00' ---and sql_id='7zkmy6p6ugy53'
AND event='enq: HW - contention'
)) b
where b.file_id=a.file_id and b.block# between a.block_id and a.block_id+a.blocks
结果:
5.由于查出来是回滚段,file_id是2,我们再来看看回滚段的资讯
select file_id,file_name,bytes/1024/1024 bytes,maxbytes/1024/1024 maxbytes,autoextensible from dba_data_files where file_id=2
结果:
6.再看看目前回滚段的水位线
select tablespace_name, usedsize, freesize , autosize,used_per
from
(
select /* + RULE */ allsize.tablespace_name,allsize.usedsize , allsize.freesize, NVL(aut.autosize ,0) autosize,
Round((allsize.usedsize -allsize.freesize) * 100 / (usedsize + NVL(aut.autosize ,0)) ) used_per
FROM
(SELECT df.tablespace_name tablespace_name ,
ROUND(df.bytes / (1024 * 1024 * 1024),2) usedsize,
ROUND(SUM(fs.bytes) / (1024 * 1024 * 1024 ),2) freesize,
Nvl(Round(SUM(fs.bytes) * 100 / df.bytes),1) pused,
Round((df.bytes - SUM(fs.bytes)) * 100 / df.bytes) pfree
FROM dba_free_space fs,
(SELECT tablespace_name,SUM(bytes) bytes
FROM dba_data_files GROUP BY tablespace_name) df
WHERE fs.tablespace_name (+) = df.tablespace_name
AND df.tablespace_name NOT IN ('SYSAUX','SYSTEM' )
GROUP BY df.tablespace_name,df.bytes ) allsize,
(SELECT tablespace_name , ROUND(SUM((MAXBYTES-USER_BYTES)/(1024*1024*1024)),2) autosize
from dba_data_files
WHERE AUTOEXTENSIBLE ='YES' GROUP BY tablespace_name ) aut
where aut.tablespace_name (+) = allsize.tablespace_name
order by used_per desc
)
结果:
这里,事情就明朗了,主要是由于dml操作,产生大量的undo,用户进程一直推高undo tablespace的水位线,oracle首先会使用已分派的free space,如果free space不够,才会从undo tablespace扩展新得free space(前提是autoextensible=y),假如无法扩展,它将重用expired的空间,之后才会使用unexpired的空间,甚至从其他undo segment steal(偷)expired或者unexpired空间
undo分派空间的机制可见这篇文章Oracle怎样重用已过期和未过期的 undo extent?
SELECT * FROM dba_hist_undostat WHERE TO_CHAR(BEGIN_TIME,'YYYY-MM-DD HH24:MI:SS')>='2022-08-23 07:30:00' AND TO_CHAR(END_TIME,'YYYY-MM-DD HH24:MI:SS')<='2022-08-23 09:30:00'
从上图中可以看出,已经开始偷了,所以先解除undotablespace 的size限制,设定为可以自动扩充
alter database datafile 2 autoextend on maxsize unlimited
另外,需要注意的是,如果enq:hw-contention发生在table,可以使用如下语法预分配
alter table TABNAME allocate extent