Long running JOB due to parallel execution

今天遇到一个并行运行案例, ETL开发人员发信说某个job跑了6小时了还没跑完.

他们只给了我一个SID, SID=4861, 于是我根据SID 查询等待事件,该等待事件是

PX Deq: Execute Reply , 只要是PX 开头的等待事件,它通常与并行执行有关,这里分享

一个小心得,遇到并行执行,我们不要去查SID,而是要去查询SQL_ID

select (sysdate-logon_time)*24*60 minutes, username,machine,sid,serial#,event,p1,p2,p3,

row_wait_obj# obj#,row_wait_file# file#,row_wait_block# block#,sql_id,sql_child_number from

gv$session where sql_id='9tuyvbacwkc5k';

MINUTES USERNAME MACHINE SID SERIAL# EVENT P1 P2 P3 OBJ# FILE# BLOCK# SQL_ID SQL_CHILD_NUMBER 442 ADWU_OPTIMA_WE11_ETL bdhp4462 2237 180 PX Deq: Execution Msg 268566527 10600 0 -1 0 0 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 2303 19672 read by other session 80 982406 1 61027312 80 982406 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 2375 29536 read by other session 80 982342 1 61027312 80 982342 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 2402 56932 PX Deq: Execution Msg 268566527 10600 0 -1 0 0 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 4816 60934 read by other session 80 982406 1 61027312 80 982406 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 4818 5555 unspecified wait event 0 0 0 61027312 80 982406 9tuyvbacwkc5k 1 442 ADWU_OPTIMA_WE11_ETL bdhp4462 4827 51041 PX Deq: Execution Msg 268566527 10601 0 -1 0 0 9tuyvbacwkc5k 1 443 ADWU_OPTIMA_WE11_ETL bdhp4462 4861 8746 PX Deq: Execute Reply 200 768 0 65764751 217 54823 9tuyvbacwkc5k 0 442 ADWU_OPTIMA_WE11_ETL bdhp4462 4867 42001 PX Deq: Execution Msg 268566527 10600 0 -1 0 0 9tuyvbacwkc5k 1

大家请看 SID=2303, SID=4816 他们都是read by other session 并且 p1, p2都相等.

我查看了执行计划,有个表会扫描8次,而且是并行的扫描,所以就出现了read by other session

我们只需要关闭并行,这个job就可以跑完了.最终这个JOB跑了不到30分钟

read by other session 等待事件的解释(Metalink 732891.1):

When information is requested from the database, Oracle will first read the data from disk into the database buffer cache. If two or more sessions request the same information, the first session will read the data into the buffer cache while other sessions wait. In previous versions, this wait was classified under the "buffer busy waits" event. However, in Oracle 10.1 and higher, this wait time is now broken out into the "read by other session" wait event. Excessive waits for this event are typically due to several processes repeatedly reading the same blocks, e.g. many sessions scanning the same index or performing full table scans on the same table. Tuning this issue is a matter of finding and eliminating this contention.

也就是说当多个session需要读取同一个blockbuffer cache(注意:该block没有在buffer cache)

这个时候只能有一个session读,另外的session只能等待

10g之前,这个等待事件叫做 buffer busy waits,到了10g之后,这个等待事件叫做read by other session

不过让我搞不明白的就是为什么 并行扫描会读取block到 buffer cache中而不是 direct path read.莫非和跨节点并行扫描有关?

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值