Read by other session等待事件

http://www.xifenfei.com/1200.html

今天中午发现福建生产库报负载有点异常,处理思路记录下来:
1、使用top命令查看系统,发现系统负载是比以前要搞(平时都是1以下,今天已经稳定在4左右,总是有部分进城占用cpu比较高,系统cpu等待明显)
1.1)第一反应是有人执行sql导致,抓取占用cpu较高的spid,查询出对应sql,发现都是一些比较简单sql
1.2)查询这些spid的客户端是应用服务器,也就是说不是人为执行,那在一个稳定的系统中,不会出现sql突然改变的原因

2、查询系统是否因为有对象被阻塞导致,查询发现无对象被阻塞

3、查询系统等待事件,发现几十个read by other session等待,都是从一台web的服务器上连接过来
SELECT * FROM v$session WHERE wait_class#<>6;

4、read by other session等待事件比较陌生,幸好伴随有db file sequential read的等待事件,初步怀疑读取数据到内存中等待导致

5、查询资料发现
read by other session Definition: When information is requested from the database, Oracle will first read the data from disk into the database buffer cache. If two or more sessions request the same information, the first session will read the data into the buffer cache while other sessions wait.
In previous versions this wait was classified under the “buffer busy waits” event.
However, in Oracle 10.1 and higher this wait time is now broken out into the “read by other session” wait event. Excessive waits for this event are typically due to several processes repeatedly reading the same blocks, e.g. many sessions scanning the same index or performing full-table scans on the same table. Tuning this issue is a matter of finding and eliminating this contention.
Confio concludes with a summary that “read by other session waits” are very similar to buffer busy waits
When a session waits on the “read by other session” event, it indicates a wait for another session to read the data from disk into the Oracle buffer cache.
If this happens too often the performance of the query or the entire database can suffer. Typically this is caused by contention for “hot” blocks or objects so it is imperative to find out which data is being contended for. Once that is known this document lists several alternative methods for solving the issue.

总结:两个或者多个会话同时需要把硬盘中的对象装载到data buffer中,当其中一个会话把对象装入后,其他会话就处于read by other session等待状态;这个是oracle 10g 从oracle 9i的buffer busy waits中分离出来的,也是需要一种热块现象

6、根据FILE#,BLOCK#查询热块对象
SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
FROM DBA_EXTENTS A
WHERE FILE_ID = &FILE_ID
AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS – 1;

7、通过这个对象查询出对应的操作语句
select * from v$sql where upper(sql_text) like ‘%object_name%’;

8、直接查找热点块对象语句

SELECT *
   FROM ( SELECT O.OWNER, O.OBJECT_NAME, O.OBJECT_TYPE, SUM (TCH) TOUCHTIME,
           FROM X$BH B, DBA_OBJECTS O
          WHERE B.OBJ = O.DATA_OBJECT_ID
            AND B.TS# > 0
          GROUP BY O.OWNER, O.OBJECT_NAME, O.OBJECT_TYPE
          ORDER BY SUM (TCH) DESC )
  WHERE ROWNUM <= 10
--或者
SELECT E.OWNER, E.SEGMENT_NAME, E.SEGMENT_TYPE
   FROM DBA_EXTENTS E,
        ( SELECT *
           FROM ( SELECT ADDR, TS#, FILE#, DBARFIL, DBABLK, TCH
                   FROM X$BH
                  ORDER BY TCH DESC )
          WHERE ROWNUM < 11) B
  WHERE E.RELATIVE_FNO = B.DBARFIL
    AND E.BLOCK_ID <= B.DBABLK
    AND E.BLOCK_ID + E.BLOCKS > B.DBABLK;

9、直接查找热点块操作语句

SELECT /*+ rule */
  HASH_VALUE, SQL_TEXT
   FROM V$SQLTEXT
  WHERE (HASH_VALUE, ADDRESS) IN
        ( SELECT A.HASH_VALUE, A.ADDRESS
           FROM V$SQLTEXT A,
                ( SELECT DISTINCT A.OWNER, A.SEGMENT_NAME, A.SEGMENT_TYPE
                   FROM DBA_EXTENTS A,
                        ( SELECT DBARFIL, DBABLK
                           FROM ( SELECT DBARFIL, DBABLK
                                   FROM X$BH
                                  ORDER BY TCH DESC )
                          WHERE ROWNUM < 11) B
                  WHERE A.RELATIVE_FNO = B.DBARFIL
                    AND A.BLOCK_ID <= B.DBABLK
                    AND A.BLOCK_ID + A.BLOCKS > B.DBABLK) B
          WHERE A.SQL_TEXT LIKE '%' || B.SEGMENT_NAME || '%'
            AND B.SEGMENT_TYPE = 'TABLE' )
  ORDER BY HASH_VALUE, ADDRESS, PIECE;

10、也可以通过awr查询出来相关对象
Segments by Buffer Busy Waits
语句需要通过这些等待对象进行判断

到了1点钟左右,数据库read by other session等待事件消失,数据库恢复正常负载(因为系统还能够承受,所以当时没有采用kill进程的方法)

事后对查询出来的热点块对象和操作语句,已经访问服务器和开发确认的结果为:这个是一个报表功能,但是平时没有人用,所以也没有关注,今天突然被人用了下,导致这个问题发生,他们承诺在升级下个版本中解决这个问题。



===========================

今天上午发现dw数据库比正常要慢,很多过程长时间未结束。

在此之前发现从0点到10点多数据库生成的wrh快照都在一个snap_id下,正常情况应该是每小时都生成新的snap_id。

之后是先将scheduler中的GATHER_STATS_JOB,AUTO_SPACE_ADVISOR_JOB 停到

同时将这两个job正在执行session 杀掉。


之后查询等待事件,发现有几个等待事件等待时间特别长

db file scattered read 这个长时间等待开始没注意,本以为是数据仓库大数据量的全部扫描,到最后才发现是AUTO_SPACE_ADVISOR_JOB的长时间等待,之前由于只把活动session 杀掉了,对应的进程还在。找到对应的进程杀掉后,所有长时间等待时间均消失。

read by other session 最开始问题是从这里入手的,通过上面的文章介绍后,知道了这是有多个不同session在等待某个session将相同的数据库读取到buffer cache中,

但是找了好久不知道被等待的session是哪个。后来听说read by other session 和db file scattered read 事件一般同时出现,向前一看有个db file scattered read ,



  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值