Oracle 磁盘IO异常诊断定位

本文详细分析了一个系统因SQL查询导致的高IO负载问题,通过查询数据库会话、物理读统计、等待事件等信息,发现特定SQL存在大量物理读和直接路径读。关闭直接路径读并检查执行计划后,提出优化建议,包括停用相关业务和调整SQL以减少对存储设备的压力。
摘要由CSDN通过智能技术生成

说明:
客户反应,系统IO负载很高,前台业务卡顿,远程连接排查问题.
首先IO负载高存在两种情况.
操作系统层面观察IO浮动情况,判断是否由于存储设备异常导致.
数据库层面定位物理读,进而确定导致IO异常的原因.

由于操作系统IO负载很高,我们在数据库内部可以直接通过v$ sesstat,v$ statname以及v$session三个视图获取到物理读最多的会话正在执行的SQL.

确定物理读统计信息号:
select name,statistic# from v$statname where name ='physical reads';

查询物理读:
select a.sid,b.sql_id,value as "physical reads" from v$sesstat a,v$session b 
where a.sid=b.sid and a.statistic#=94 order by value; 
       SID SQL_ID                                  physical reads
---------- --------------------------------------- --------------
      1584                                                3560472
      1758                                                3750105
      3003                                                4061739
      2917                                                4217401
      4236                                                4681117
       597                                                4875034
      3578 bztcfgg1by2zv                                  9638387
      5156                                               27184195
      1261                                               28419847
      1096 70phss1ap5z03                                 41173207
      5242                                               41764813

       SID SQL_ID                                  physical reads
---------- --------------------------------------- --------------
      5065                                               43494710
      3580                                               45471744
      2754 70phss1ap5z03                                 47600984
      2326 70phss1ap5z03                                 48762014
       836                                               48931546
      3741                                               48931548
        87 70phss1ap5z03                                 49082225
      2502 70phss1ap5z03                                 49330412
      5230 70phss1ap5z03                                 49331403
      4498 70phss1ap5z03                                 49576109
      1580                                               51155712

       SID SQL_ID                                  physical reads
---------- --------------------------------------- --------------
      1913                                               51402846
      4662 70phss1ap5z03                                 52789677
      2506                                               54615520

~
从SQL结果就可以看出很多会话因为这条SQL进行了很多的物理读
跟踪IO浮动情况,发现在没有会话执行这条SQL时,IO负载很小(百分之30以下),在存在会话执行该SQL时,IO负载很高(百分之99).

获取SQL_TEXT内容:

SQL> select sql_text from v$sql where sql_id='70phss1ap5z03';
SQL_TEXT
--------------------------------------------------------------------------------
SELECT Count ( *) FROM l_patientinfo WHERE jgid =:1 and txsj < sysdate AND resul
tstatus < 4 AND labdepartment =:2 AND stayhospitalmode =1 AND ( SubStr ( checker
opinion , 1 , 1 ) <> '§' Or checkeropinion Is Null )

拉取当天9.30-10.00的AWR报告以及3天前9.30-10.00的AWR报告
(由于2天前进行过数据库重启,与数据库重启前进行对比 )
/home/oracle/awrrpt_1_72800_72801.html

select event,count(*) from gv$session_wait where wait_class<>‘Idle’ group by event;
存在很多direct path read等待事件(Oracle11g建议关闭直接路径读功能)

根据等待事件查询一下SQL_ID:

select sid,serial#,event,sql_id from gv$session where wait_class<>'Idle' and event like  '%direct%' ;
       SID    SERIAL# EVENT                SQL_ID
---------- ---------- -------------------- --------------------
        87      24639 direct path read     70phss1ap5z03
      1096       9483 direct path read     70phss1ap5z03
      2502       1099 direct path read     70phss1ap5z03
      2754        819 direct path read     70phss1ap5z03
      4498       1287 direct path read     70phss1ap5z03
      4662      14049 direct path read     70phss1ap5z03
      5065       4973 direct path read     70phss1ap5z03
      5230       5015 direct path read     70phss1ap5z03

根据SQL_ID捕获bind变量:

SELECT t.name,t.POSITION,T.DATATYPE_STRING,t.VALUE_STRING, t.VALUE_ANYDATA
FROM v$sql_bind_capture t
WHERE sql_id = '70phss1ap5z03'
:1	1	NUMBER	      6
:2	2	CHAR(32)	3081
:1	1	NUMBER	      7
:2	2	CHAR(32)	3081
:1	1	NUMBER	      2
:2	2	CHAR(32)	3087

历史BIND查询视图:DBA_HIST_SQLBIND

代入bind变量,最终运行SQL

SELECT Count ( *) FROM bslis52.l_patientinfo 
WHERE jgid =6 and txsj < sysdate 
AND resultstatus < 4 AND labdepartment =3081 
AND stayhospitalmode =1 
AND ( SubStr ( checkeropinion , 1 , 1 ) <> '§' Or checkeropinion Is Null )
直接通过PLSQL进行解析执行计划.
全表扫描.

查询该表相关索引:

select index_name,table_name,table_owner,column_name from dba_ind_columns 
where table_name='L_PATIENTINFO'  and table_owner='BSLIS52';

L_PATIENTINFO_CHECKTIME	     L_PATIENTINFO	BSLIS52	CHECKTIME
L_PATIENTINFO_YBID	            L_PATIENTINFO	BSLIS52	YBID
PK_PATIENTINFO	            L_PATIENTINFO	BSLIS52	SAMPLENO
L_PATIENTINFO_SFZH	            L_PATIENTINFO	BSLIS52	SFZH
IDX_L_PATIENTINFO_DOCTADVISENO   L_PATIENTINFO	BSLIS52	DOCTADVISENO
PK_PATIENTINFO	            L_PATIENTINFO	BSLIS52	JGID
IDX_L_PATIENTINFO_PATEIENTID     L_PATIENTINFO	BSLIS52	PATIENTID

查询该表统计信息是否过期:

select owner, table_name, object_type, stale_stats, last_analyzed
from dba_tab_statistics
where owner='BSLIS52' 
and table_name='L_PATIENTINFO';
OWNER      TABLE_NAME           OBJECT_TYP STALE_STA LAST_ANALYZED
---------- -------------------- ---------- --------- -------------------
BSLIS52    L_PATIENTINFO        TABLE      NO        2021-01-02 22:09:46

获取真实的执行计划看看:

SQL> alter session set STATISTICS_LEVEL = ALL;
Session altered.

SQL> SELECT Count ( *) FROM bslis52.l_patientinfo 
  2  WHERE jgid =6 and txsj < sysdate 
  3  AND resultstatus < 4 AND labdepartment =3081 
  4  AND stayhospitalmode =1 
  5  AND ( SubStr ( checkeropinion , 1 , 1 ) <> '§' Or checkeropinion Is Null );

  COUNT(*)
----------
         0

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  fzsvyndrunzk3, child number 0
-------------------------------------
SELECT Count ( *) FROM bslis52.l_patientinfo WHERE jgid =6 and txsj <
sysdate AND resultstatus < 4 AND labdepartment =3081 AND
stayhospitalmode =1 AND ( SubStr ( checkeropinion , 1 , 1 ) <> '§' Or
checkeropinion Is Null )

Plan hash value: 1785457606

-------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |      1 |        |      1 |00:00:12.43 |     247K|    247K|
|   1 |  SORT AGGREGATE    |               |      1 |      1 |      1 |00:00:12.43 |     247K|    247K|
|*  2 |   TABLE ACCESS FULL| L_PATIENTINFO |      1 |     37 |      0 |00:00:12.43 |     247K|    247K|
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("RESULTSTATUS"<4 AND "JGID"=6 AND "STAYHOSPITALMODE"=1 AND
              TO_NUMBER("LABDEPARTMENT")=3081 AND ("CHECKEROPINION" IS NULL OR
              SUBSTR("CHECKEROPINION",1,1)<>'§') AND "TXSJ"<SYSDATE@!))


24 rows selected.

由于在执行该问题SQL时,出现的等待事件为直接路径读,起初分析直接路径读从PGA中直接读取数据,是否可能存在PGA占用过高情况.如果关闭直接路径读,是否可以降低物理读情况,一次性将该表扫描至buffer cache中,后续从buffer cache中获取.

查询会话执行问题SQL时的PGA占用情况:

select a.sid,value/1024/1024,c.sql_id,c.event as Mbytes from v$sesstat a, v$statname b, v$session c
where                                                               
a.STATISTIC# = b.STATISTIC# and  a.sid=c.sid                                  
and b.name = 'session pga memory' order by 2;

关闭直接路径.让其进行离散读:

alter system set "_serial_direct_read"='FALSE' scope=both sid='*';

查询直接路径读是否开启:
SELECT   ksppinm, ksppstvl, ksppdesc
FROM   x$ksppi x, x$ksppcv y
WHERE   x.indx = y.indx AND  ksppinm = '_serial_direct_read';

先将该条SQL对应的业务停掉.后续给出优化建议.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值