Hash join semi带来的temp tablespace使用过度问题-Part1

最新推荐文章于 2023-06-28 22:44:56 发布

cser3000

最新推荐文章于 2023-06-28 22:44:56 发布

阅读量249

点赞数

前日，Production DB N突然反应变慢(RHEL4UP4,9208,64bit,MTS mode)

用top –d 1 察看，发现CPU usage大约在40%~70%间浮动，IOWAIT大约在4%~9%左右。Sys使用量1%~3%，和平常一样。

再用iostat –x 1 看了下各个mount point的IO，发现/u04的IOPS很重，使用率一直在100%。

[@more@]

Hash join semi带来的temp tablespace使用过度问题。

前日，Production DB N突然反应变慢(RHEL4UP4,9208,64bit,MTS mode)

用top –d 1 察看，发现CPU usage大约在40%~70%间浮动，IOWAIT大约在4%~9%左右。Sys使用量1%~3%，和平常一样。

再用iostat –x 1 看了下各个mount point的IO，发现/u04的IOPS很重，使用率一直在100%。

/u04摆放的是TEMP,UNDO,历史Partition,和几个INDEX的Datafiles（旧系统，目前只用local disks，最近由于资料量成长到较大，10月准备上SAN Storage）。

由于观察到/u04上的IOPS主要是write,考虑是否是产生了大量UNDO,但是由REDO的产生量看，和往常并没差太多。

历史Partition的资料可能性更小，查询Datafile的Physical IO也证明不是它们在造成Write IO。

再想是否是那几个Index Datafile,这些file 是因为原本放Index的mount point空间不足以支持到SAN Storage上线而临时作的调整，决定哪些file的时侯我已经充分考虑到他们的IO量在整个系统中的权重。

还是先看下Session wait:

select * from v$session_wait where event not in

(select event from stats$idle_event)

连着刷了十几遍，发现一些端倪，有几个Session的wait events一直是direct path write。

这是个在我们的系统中很不常见的Wait events。

官方的释义：

direct path write and direct path write temp

When a process is writing buffers directly from PGA (as opposed to the DBWR writing them from the buffer cache), the process waits on this event for the write call to complete. Operations that could perform direct path writes include when a sort goes to disk, during parallel DML operations, direct-path INSERTs, parallel create table as select, and some LOB operations.

Like direct path reads, the number of waits is not the same as number of write calls issued if the I/O subsystem supports asynchronous writes. The session waits if it has processed all buffers in the PGA and is unable to continue work until an I/O request completes.

于是抓取该Session的SQL:

DELETE FROM table_d

WHERE device IN (SELECT device

FROM table_d

WHERE class = 94

AND info = to_char(:b1))

看一下PLAN:

DELETE STATEMENT, GOAL = CHOOSE Cost=111 Cardinality=125 Bytes=4875

DELETE Object owner=TP Object name=table_d

HASH JOIN SEMI Cost=111 Cardinality=125 Bytes=4875

TABLE ACCESS FULL Object owner=TP Object name=table_d Cost=43 Cardinality=61807 Bytes=865298

TABLE ACCESS FULL Object owner=TP Object name=table_d Cost=43 Cardinality=1 Bytes=25

HASH JOIN SEMI.

再从v$sort_usage里查看下：

select * from v$sort_usage;

这几个Session果然在用TEMP做HASH

SQL> show parameter hash;

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

hash_area_size integer 2097152

hash_join_enabled boolean TRUE

hash_area_size设定为2M，这在平时确实足够了。

到这里原因就弄清楚了，HASH_JOIN在PGA里没足够空间，使用了TEMP，引起/u04的IOPS突增，影响到UNDO，拖慢整个系统。

解决的办法：Hash_area_size只能在Session级别修改。

打算检查了下PLAN是否正确：中秋刚Analyze过，但发现table实际大小和Statistics差得比较多，Gather stats后再RUN，发现已经走

DELETE STATEMENT, GOAL = CHOOSE Cost=93 Cardinality=1 Bytes=39

DELETE Object owner=TP Object name=table_d

HASH JOIN Cost=93 Cardinality=1 Bytes=39

SORT UNIQUE

TABLE ACCESS FULL Object owner=TP Object name=table_d Cost=43 Cardinality=1 Bytes=25

TABLE ACCESS FULL Object owner=TP Object name=table_d Cost=43 Cardinality=61807 Bytes=865298

检查v$sort_usage 已经没有再用temp进行hash。检查了一下其他Production DB，发现一样的SQL 一样的PLAN，并没发生HASH_AREA不足的情况，table_d的资料量和Production DB N的table_d差得比较大，和Developer人员讨论了下，认为是最近的资料转档可能发生点问题，有该删除的没删除掉，那接下来的事情就交给他们了。

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/10856805/viewspace-1011024/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/10856805/viewspace-1011024/

cser3000

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Hash join semi带来的temp tablespace使用过度问题-Part1

前日，Production DB N突然反应变慢(RHEL4UP4,9208,64bit,MTS mode)用top –d 1 察看，发现CPU usage大约在40%~70%间浮动，IOWAIT大约在4%~9%左右。Sys使用...
复制链接

扫一扫