对Hash Join的一次优化

最新推荐文章于 2024-02-26 20:27:03 发布

Harvey.Wang

最新推荐文章于 2024-02-26 20:27:03 发布

阅读量335

点赞数

分类专栏：转载文章标签： hash join 优化

转载专栏收录该内容

1 篇文章 0 订阅

订阅专栏

转载原地址

前两天解决了一个优化SQL的case，SQL语句如下，big_table为150G大小，small_table很小，9000多条记录，不到1M大小
hash_area_size, sort_area_size均设置足够大，可以进行optimal hash join和memory sort

select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where (a.category  = b.from_cat or
       a.category2 = b.from_cat) and
       a.site_id  = b.site_id and
       a.sale_end >= sysdate;

--------------------------------------------------------------------------
| Id  | Operation            |  Name        | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |     2 |   174 |    18  (17)|
|   1 |  SORT UNIQUE         |              |     2 |   174 |    18  (17)|
|*  2 |   HASH JOIN          |              |     2 |   174 |    17  (12)|
|   3 |    TABLE ACCESS FULL | SMALL_TABLE  |  1879 | 48854 |    14   (8)|
|*  4 |    TABLE ACCESS FULL | BIG_TABLE    |     4 |   244 |     3  (34)|
--------------------------------------------------------------------------

粗略来看，PLAN非常的完美，SQL HINT写的也很到位，小表在内build hash table,大表在外进行probe操作，
根据经验来看，整个SQL执行的时间应该和FTS BIG_TABLE的时间差不多

但是FTS BIG_TABLE的时间大约是8分钟，而真个SQL执行的时间长达3~4小时

那么问题究竟出在哪里?

FTS时间应该不会有太大变化，那么问题应该在hash join，设置event来trace一下hash join的过程。

SQL> alter session set events '10104 trace name context forever, level 2';

Session altered.

select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where (a.category  = b.from_cat or
       a.category2 = b.from_cat) and
       a.site_id  = b.site_id and
       a.sale_end >= sysdate;

从trace file中Hash Table中这一段找出了问题所在：

### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
#       than the true number.
Number of buckets with   0 rows:      16373
Number of buckets with   1 rows:          0
Number of buckets with   2 rows:          0
Number of buckets with   3 rows:          1
Number of buckets with   4 rows:          0
Number of buckets with   5 rows:          0
Number of buckets with   6 rows:          0
Number of buckets with   7 rows:          1
Number of buckets with   8 rows:          0
Number of buckets with   9 rows:          0
Number of buckets with between  10 and  19 rows:          1
Number of buckets with between  20 and  29 rows:          1
Number of buckets with between  30 and  39 rows:          3
Number of buckets with between  40 and  49 rows:          0
Number of buckets with between  50 and  59 rows:          0
Number of buckets with between  60 and  69 rows:          0
Number of buckets with between  70 and  79 rows:          0
Number of buckets with between  80 and  89 rows:          0
Number of buckets with between  90 and  99 rows:          0
Number of buckets with 100 or more rows:          4
### Hash table overall statistics ###
Total buckets: 16384 Empty buckets: 16373 Non-empty buckets: 11
Total number of rows: 9232
Maximum number of rows in a bucket: 2531
Average number of rows in non-empty buckets: 839.272705

仔细看，在一个bucket中最多的行数竟然有2531行，因为bucket中是一个链表的结构，所以这几千行都是串在一个链表上。
由这一点想到这个Hash Table所依赖的hash key的distinct value可能太少，重复值太多。否则不应该会有这么多行在同一个bucket里面。

因为Join条件里面有两个列from_cat和site_id,穷举法有三种情况

Build hash table based on (from_cat,site_id):

SQL> select site_id,from_cat,count(*) from SMALL_TABLE group by site_id,from_cat having count(*)>100;

no rows selected

Build hash table based on (from_cat):

SQL> select from_cat,count(*) from SMALL_TABLE group by from_cat having count(*)>100;

no rows selected

Build hash table based on (site_id):

SQL> select site_id,count(*) from SMALL_TABLE group by site_id having count(*)>100;

   SITE_ID   COUNT(*)
---------- ----------
         0       2531
         2       2527
       146       1490
       210       2526

到这里可以发现，基于site_id这种情况和trace file中这两行很相符：

Number of buckets with 100 or more rows: 4
Maximum number of rows in a bucket: 2531

所以推断这个hash table是基于site_id而建的，而Big_Table中大量的行site_id=0，都落在这个linked list最长的bucket中.
而大部分行都会扫描完整个链表而最后被丢弃掉,所以这个Hash Join的操作效率非常差，几乎变为了Nest Loop操作

找到了根本原因，问题也就迎刃而解了。

理想状况下，hash table应当建立于(site_id,from_cat)上，那么问题肯定出在这个OR上，把OR用UNION改写

select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where  a.category  = b.from_cat and
       a.site_id  = b.site_id and
       a.sale_end >= sysdate
UNION
select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where  a.category2 = b.from_cat and
       a.site_id  = b.site_id and
       a.sale_end >= sysdate;

--------------------------------------------------------------------------
| Id  | Operation            |  Name        | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |     2 |   148 |    36  (59)|
|   1 |  SORT UNIQUE         |              |     2 |   148 |    36  (59)|
|   2 |   UNION-ALL          |              |       |       |            |
|*  3 |    HASH JOIN         |              |     1 |    74 |    17  (12)|
|   4 |     TABLE ACCESS FULL| SMALL_TABLE  |  1879 | 48854 |    14   (8)|
|*  5 |     TABLE ACCESS FULL| BIG_TABLE    |     4 |   192 |     3  (34)|
|*  6 |    HASH JOIN         |              |     1 |    74 |    17  (12)|
|   7 |     TABLE ACCESS FULL| SMALL_TABLE  |  1879 | 48854 |    14   (8)|
|*  8 |     TABLE ACCESS FULL| BIG_TABLE    |     4 |   192 |     3  (34)|
--------------------------------------------------------------------------

初看这个PLAN好像不如第一个PLAN，因为执行了两次BIG_TABLE的FTS，但是让我们在来看看HASH TABLE的结构

### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
#       than the true number.
Number of buckets with   0 rows:       9306
Number of buckets with   1 rows:       5310
Number of buckets with   2 rows:       1436
Number of buckets with   3 rows:        285
Number of buckets with   4 rows:         43
Number of buckets with   5 rows:          4
Number of buckets with   6 rows:          0
Number of buckets with   7 rows:          0
Number of buckets with   8 rows:          0
Number of buckets with   9 rows:          0
Number of buckets with between  10 and  19 rows:          0
Number of buckets with between  20 and  29 rows:          0
Number of buckets with between  30 and  39 rows:          0
Number of buckets with between  40 and  49 rows:          0
Number of buckets with between  50 and  59 rows:          0
Number of buckets with between  60 and  69 rows:          0
Number of buckets with between  70 and  79 rows:          0
Number of buckets with between  80 and  89 rows:          0
Number of buckets with between  90 and  99 rows:          0
Number of buckets with 100 or more rows:          0
### Hash table overall statistics ###
Total buckets: 16384 Empty buckets: 9306 Non-empty buckets: 7078
Total number of rows: 9232
Maximum number of rows in a bucket: 5
Average number of rows in non-empty buckets: 1.304323

这就是我们所需要的Hash Table，最长的链表只有五行数据

整个SQL的执行时间从三四个小时缩短为16分钟，大大超出了developer的预期

这个SQL单纯从PLAN上很难看出问题所在，需要了解Hash Join的机制，进行更深一步的分析

Harvey.Wang

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
对Hash Join的一次优化

转载原地址前两天解决了一个优化SQL的case，SQL语句如下，big_table为150G大小，small_table很小，9000多条记录，不到1M大小 hash_area_size, sort_area_size均设置足够大，可以进行optimal hash join和memory sortselect /*+ leading(b) use_hash(a b) */ distinct a.I
复制链接

扫一扫