oracle 回话堵塞,oracle如何处理阻塞会话,library cache lock

问题现象

##### 统一监控告警平台-DB: [触发]

`告警集群:` jq-prod-db-prometheus

`告警描述:` 数据库Oracle堵塞会话超过5分钟,会话ID:1057

`告警环境:` prod

`告警类型:` oracle_prod

`告警实例:` nba_stb

`告警IP:` 11.11.11.11

`实例别名:` nba_stb

`当前数值:` 2

`告警级别:` 严重

`触发时间:` 2021-03-02 13:33:59

`持续时间:` 10s

`告警次数:` 1

`告警应用:`nba_oracle

`应用负责人:` unknown

`告警集群:` jq-prod-db-prometheus

`告警描述:` 数据库Oracle堵塞会话超过5分钟,会话ID:4137

`告警环境:` prod

`告警类型:` oracle_prod

`告警实例:` nba_stb

`告警IP:` 11.11.11.11

`实例别名:` nba_stb

`当前数值:` 1

`告警级别:` 严重

`触发时间:` 2021-03-02 13:33:59

`持续时间:` 10s

`告警次数:` 1

`告警应用:`nba_oracle

`应用负责人:` unknown

处理过程

登录11.11.11.11进入oracle,并设置列宽(如果使用plsql则不用设置)

[bzops@nba-db-01 ~]$ sudo -i

[root@nba-db-01 ~]# su - oracle

[oracle@nba-db-01 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 2 15:05:36 2021

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>

SQL> col sample_time for a30

col event for a30

col program for a30

col machine for a40

set lines 200 pages 999

查看当前的阻塞会话是空,说明当前阻塞已经结束,blocking_session代表阻塞者,此字段不为空则说明当前有阻塞

SQL> select program,sql_id,event,blocking_session,sid,last_call_et from v$session where blocking_session is not null;

no rows selected

SQL>

虽然当前阻塞结束了,但是oracle有活动会话历史视图v$active_session_history可以查看最近每个活动会话超过1秒的状态,

从结果中可以看到,在下午2点16分,不管是后台进程LGWR还是用户进程JDBC的等待事件都是library cache lock,且阻塞者是4323和1057

library cache lock是oracle内存结构中shared pool的内存锁,

library cache是oracle内存中解析各种对象的地方,此等待事件一般说明有对象被大量解析导致争用

接下来分析阻塞者4323和1057 在干什么

SQL> select sample_time,sql_id,event,blocking_session,program from v$active_session_history where blocking_session is not null and rownum <20;

SAMPLE_TIME SQL_ID EVENT BLOCKING_SESSION PROGRAM

------------------------------ ------------- ------------------------------ ---------------- ------------------------------------------------

02-MAR-21 02.16.09.315 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.09.315 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.09.315 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.08.315 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.08.315 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.08.315 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.07.315 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.07.315 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.07.315 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.06.315 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.06.315 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.06.315 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.05.305 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.05.305 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.05.305 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.04.305 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

02-MAR-21 02.16.04.305 PM library cache lock 1057 JDBC Thin Client

02-MAR-21 02.16.04.305 PM library cache lock 1057 sqlplus@nbastb.cloud.bz (TNS V1-V3)

02-MAR-21 02.16.03.305 PM library cache lock 4323 oracle@nbastb.cloud.bz (LGWR)

查看会话4323的历史发现blocking_session为空,说明它不是被别的会话阻塞的,也就说明它是阻塞的源头

通过其他信息可以看到会话4323是从主机名WORKGROUP\WIN-1DNF2S54TTS的设备上使用plsqldev.exe连上来的,执行的sql是1t7jfygm1b944

SQL> select sample_time,sql_id,event,blocking_session,program,machine from v$active_session_history where session_id=4323 and rownum <20;

SAMPLE_TIME SQL_ID EVENT BLOCKING_SESSION PROGRAM MACHINE

------------------------------ ------------- ------------------------------ ---------------- ------------------------------ ----------------------------------------

02-MAR-21 02.16.09.315 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.08.315 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.07.315 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.06.315 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.05.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.04.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.03.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.02.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.01.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.16.00.305 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.59.295 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.58.295 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.57.285 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.56.285 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.55.285 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.54.285 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.53.285 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.52.275 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

02-MAR-21 02.15.51.275 PM 1t7jfygm1b944 plsqldev.exe WORKGROUP\WIN-1DNF2S54TTS

查看1t7jfygm1b944的具体sql内容是什么,发现查不出来,oracle一般会把sql语句hash出sql_id便于用来解析,

有sql_id但是没有语句的情况一般是由于解析不成功,所以不在内存里,

但是如果真的是解析不成功,那么在开启10035事件的情况下,是可以在后台日志里看到解析失败的sql的

巧了,这个库是开启的

SQL> select sql_fulltext from v$sql where sql_id='1t7jfygm1b944';

no rows selected

顺道看一下会话4323是谁连上来的,等一下去通知他

SQL> select sid,osuser,machine from v$session where machine='WORKGROUP\WIN-1DNF2S54TTS';

SID OSUSER MACHINE

---------- ------------------------------ ----------------------------------------

1972 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

2074 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

2405 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

2458 zsh4305 WORKGROUP\WIN-1DNF2S54TTS

3226 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3315 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3460 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3508 csh11556 WORKGROUP\WIN-1DNF2S54TTS

3555 csh11556 WORKGROUP\WIN-1DNF2S54TTS

3609 CSH8245 WORKGROUP\WIN-1DNF2S54TTS

3700 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3748 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3895 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

3987 csh11556 WORKGROUP\WIN-1DNF2S54TTS

4137 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

4227 zsh4305 WORKGROUP\WIN-1DNF2S54TTS

4281 CSH8245 WORKGROUP\WIN-1DNF2S54TTS

4323 zsh11571 WORKGROUP\WIN-1DNF2S54TTS

查看后台日志,果然定位到了2点16分的失败解析sql,是一个包含大量in的语句,

in包含的对象太多,对于oracle的解析工作是个非常大的负担,这种情况应该改写成join

[oracle@nbastb ~]$ less /s01/app/oracle/diag/rdbms/nba_stb/nba_stb/trace/alert_nba_stb.log

Tue Mar 02 14:16:09 2021

PARSE ERROR: ospid=178903, error=1013 for statement:

select

s.slip_code1,

s.slip_code2,

s.outer_order_code outer_order#,

s.code,

(s.slip_code2||'_'||sk.ext_code2) vlookup_01,

s.shop_id,

s.status,

e.name,

nvl(decode(e.name,

'',

'POD',

ˢ',

'Shipment confirmation',

'',

'Waiting for shipment',

'ȡ',

'Cancel'),

0) STATUS_T,

sk.code SKU_CODE,

sk.supplier_code,

sk.ext_code1,

sk.ext_code2,

sl.requested_qty,

sl.unit_price,

sl.total_actual,

m.province,m.city,

wh.name WH_NAME,

to_char(s.create_time + 0, 'yyyy-MM-dd hh24:mi:ss') create_time,

to_char(s.tb_payment_time + 0, 'yyyy-MM-dd hh24:mi:ss') tb_payment_time,

to_char(s.cancel_time + 0, 'yyyy-MM-dd hh24:mi:ss') cancel_time,

to_char(s.delivery_time + 0, 'yyyy-MM-dd hh24:mi:ss') delivery_time,

s.sys_memo,

s.to_wh_remark,

s.is_cod_order,

m.trans_time_type,

s.trans_code

from

nba.t_so_sales_order s

left join nba.t1 m on m.id = s.id

left join nba.t2 wt on wt.ref_slip_code = s.code

left join nba.t3 sl on s.id = sl.so_id

left join nba.t4 sk on sk.ext_code1 = sl.extension_code

LEFT JOIN nba.t5 wh ON s.wh_id = wh.id,

nba.t6 e

where

1 = 1

and s.shop_id in (4422, 4423)

and wt.workflow_Id = 3

and e.node_no = wt.current_node_no

And to_char(S.Create_Time, 'yyyy mmdd hh24mi') >= '2021 0223 0000'

-- And to_char(S.Create_Time, 'yyyy mmdd hh24mi') < '2020 0901 0000'

-- and s.code like 'B%'

and s.slip_code2 in ('NBA1185360744',

'NBA1186058875',

'NBA1186058891',

'NBA1186058880',

'NBA1186058874',

'NBA1186070646',

'NBA1186068022',

'NBA1186128429',

'NBA1186142312',

'NBA1186142062',

'NBA1186141961',

'NBA1186141905',

'NBA1186143256',

'NBA1186142919',

'NBA1186143035',

'NBA1186142819',

'NBA1186142067',

'NBA1186143571',

'NBA1186143575',

'NBA1186144320',

'NBA1186144239',

'NBA1186144248',

'NBA1186142180',

'NBA1186142155',

'NBA1186142080',

'NBA1186146271',

'NBA1186145407',

'NBA1186145779',

'NBA1186146081',

'NBA1186145458',

'NBA1186145997',

'NBA1186145481',

'NBA1186144368',

'NBA1186146097',

'NBA1186146061',

'NBA1186146085',

'NBA1186146019',

'NBA1186145971',

'NBA1186144654',

'NBA1186145984',

'NBA1186145148',

'NBA1186144718',

'NBA1186145708',

'NBA1186145818',

'NBA1186144845',

'NBA1186144664',

'NBA1186146049',

'NBA1186145585',

'NBA1186144448',

'NBA1186145518',

'NBA1186145835',

'NBA1186145544',

'NBA1186145232',

'NBA1186146089',

'NBA1186146277',

'NBA1186145785',

'NBA1186145773',

'NBA1186144753',

'NBA1186146032',

'NBA1186144655',

'NBA1186146052',

'NBA1186144499',

'NBA1186144703',

'NBA1186144775',

'NBA1186146038',

'NBA1186144536',

'NBA1186144890',

'NBA1186146099',

'NBA1186144493',

'NBA1186146116',

'NBA1186145258',

'NBA1186144729',

'NBA1186145513',

'NBA1186145522',

'NBA1186146253',

'NBA1186146100',

'NBA1186145482',

'NBA1186144352',

'NBA1186144699',

'NBA1186145537',

'NBA1186145219',

'NBA1186145995',

'NBA1186144656',

'NBA1186145480',

'NBA1186145803',

'NBA1186145977',

'NBA1186145400',

'NBA1186146110',

'NBA1186144765',

'NBA1186145998',

'NBA1186146259',

'NBA1186145970',

'NBA1186146034',

'NBA1186145991',

'NBA1186145515',

'NBA1186145332',

'NBA1186144832',

'NBA1186145778',

'NBA1186145987',

'NBA1186145357',

'NBA1186146008',

......

解决办法

通过刚才找到的osuser也就是工号确定是谁,然后让他把in改成join

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值