HASH_AREA内存不足导致job Hung

昨天晚上 我做了一个关于Hash Join的调整

老外说 某个JOB Hung住了,用户名是IDWSQ6

于是我查询等待事件:direct patch write temp

SQL> SELECT INST_ID,SID,EVENT,P1,P2,P3,MACHINE FROM GV$SESSION WHERE USERNAME='IDWSQ6';

INST_ID SID EVENT P1 P2 P3 MACHINE

---------- ---------- ------------------------------ ---------- ---------- ---------- -------------------

4 4752 SQL*Net message from client 1413697536 1 0 bdhp4423

4 4813 direct path write temp 20008 1132507 7 bdhp4423

等待事件为direct path write temp ,说明这个session在写数据到磁盘中

SQL> select a.username,a.inst_id, a.sid, a.serial#, a.machine,a.sql_id,b.tablespace, b.blocks*

2 (select value from v$parameter where name='db_block_size')/1024/1024 "Size(M)",b.segtype

3 from gv$session a, gv$tempseg_usage b where a.inst_id=b.inst_id and a.saddr = b.session_addr

4 and a.inst_id=4 and a.sid=4813;

USERNAME INST_ID SID SERIAL# MACHINE SQL_ID TABLESPACE Size(M) SEGTYPE

-------------------- ---------- ---------- ---------- --------------- --------------- --------------- ---------- --------

IDWSQ6 4 4813 35649 bdhp4423 bqamju47q25wu TEMP 1 LOB_DATA

IDWSQ6 4 4813 35649 bdhp4423 bqamju47q25wu TEMP 1893 HASH

根据查询我发现 是由于HASH JOIN导致的,并且占用了1893m

下面是 SQL的执行计划:

select * from table(dbms_xplan.display_cursor('bqamju47q25wu','0'));

Plan hash value: 1285921926

----------------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |

----------------------------------------------------------------------------------------------------------------------------------

| 0 | CREATE TABLE STATEMENT | | | | 63603 (100)| | | |

| 1 | LOAD AS SELECT | | | | | | | |

|* 2 | HASH JOIN OUTER | | 91 | 72709 | 63601 (49)| 00:10:37 | | |

|* 3 | HASH JOIN RIGHT OUTER | | 91 | 68523 | 48900 (49)| 00:08:09 | | |

|* 4 | MAT_VIEW ACCESS FULL | SH30_V_PROD_BOM_DENORM_FAC | 1 | 51 | 1847 (11)| 00:00:19 | | |

| 5 | VIEW | SH30_8530337_8453_0_TMP | 91 | 63882 | 47052 (51)| 00:07:51 | | |

|* 6 | HASH JOIN | | 91 | 28574 | 47051 (51)| 00:07:51 | | |

|* 7 | HASH JOIN | | 1 | 77 | 301 (12)| 00:00:04 | | |

|* 8 | HASH JOIN | | 14 | 784 | 224 (14)| 00:00:03 | | |

|* 9 | TABLE ACCESS BY INDEX ROWID | TIME_PERD | 301 | 6321 | 76 (4)| 00:00:01 | | |

|* 10 | INDEX RANGE SCAN | TIME_PERD_IDX3 | 602 | | 6 (17)| 00:00:01 | | |

|* 11 | HASH JOIN | | 564 | 19740 | 147 (19)| 00:00:02 | | |

|* 12 | TABLE ACCESS BY INDEX ROWID| TIME_PERD | 301 | 6321 | 76 (4)| 00:00:01 | | |

|* 13 | INDEX RANGE SCAN | TIME_PERD_IDX3 | 602 | | 6 (17)| 00:00:01 | | |

| 14 | INDEX FULL SCAN | TIME_PERD_IDX2 | 22887 | 312K| 63 (26)| 00:00:01 | | |

|* 15 | TABLE ACCESS BY INDEX ROWID | TIME_PERD | 301 | 6321 | 76 (4)| 00:00:01 | | |

|* 16 | INDEX RANGE SCAN | TIME_PERD_IDX3 | 602 | | 6 (17)| 00:00:01 | | |

| 17 | TABLE ACCESS FULL | SH30_8530337_8452_0_TMP | 6093K| 1377M| 44845 (49)| 00:07:29 | | |

| 18 | PARTITION RANGE SINGLE | | 5024K| 220M| 13130 (41)| 00:02:12 | 3 | 3 |

|* 19 | TABLE ACCESS FULL | PROD_UNIT_OF_MEASR | 5024K| 220M| 13130 (41)| 00:02:12 | 3 | 3 |

----------------------------------------------------------------------------------------------------------------------------------

我查看了SQL,执行计划是正确的。

但是这个 这个JOB 就一直Huang住,已经跑了10多个小时了。

其实在这个Job Huang住之前,之前的Job(不同的SQL) 也是由于这个原因被Huang住过。

那么出现这样的问题就比较郁闷了。我采取了一个很 “变态” 的方法:

SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ

2 FROM x$ksppi x, x$ksppcv y

3 WHERE x.inst_id = USERENV ('Instance')

4 AND y.inst_id = USERENV ('Instance')

5 AND x.indx = y.indx

6 AND x.ksppinm LIKE '%pga_max_size%'

7 /

NAME VALUE DESCRIB

------------------------------ ---------- ------------------------------

_pga_max_size 2147483648 Maximum size of the PGA memory

for one process

SQL> select 2147483648/1024/1024/1024 from dual;

2147483648/1024/1024/1024

-------------------------

2

Oracle有个限制 ,一个进程不能够分配超过2G的内存

SQL> alter session set workarea_size_policy = manual;

Session altered.

SQL> alter session set hash_area_size =2147483648;

alter session set hash_area_size =2147483648

*

ERROR at line 1:

ORA-02017: integer value required 这里超过了2G 就报错了

SQL> alter session set hash_area_size =2147483647;

Session altered.

我手工配置接近2G hash_area_size就解决了Job Huang住的问题。

SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ

2 FROM x$ksppi x, x$ksppcv y

3 WHERE x.inst_id = USERENV ('Instance')

4 AND y.inst_id = USERENV ('Instance')

5 AND x.indx = y.indx

6 AND x.ksppinm LIKE '%_smm_max_size%'

7 /

NAME VALUE DESCRIB

------------------------------ ---------- ------------------------------

_smm_max_size 1048576 maximum work area size in auto

mode (serial)

隐含参数_smm_max_size表示在auto 模式下 最大的work area

SQL> select 1048576/1024/1024 from dual;

1048576/1024/1024

-----------------

1

这里查询出来为1G

由于手动设置加大了HASH_AREA的大小,使得HASH JOIN能够尽可能的避免写数据到temp 表空间,所以解决了Job Huang 问题。大家如果遇到类似的case,不妨可以一试。当然了前提是 SQL不能优化的前提下。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值