排序合并连接的使用原则和分析（use_merge）

最新推荐文章于 2021-04-12 11:19:53 发布

cpfengyo322322

最新推荐文章于 2021-04-12 11:19:53 发布

阅读量784

点赞数

文章标签：大数据

=============================================================================================================================================
                                                                                                         排序合并连接（示例）

1。由于排序合并连接两个连接表都是独立执行排序，然后合并连接，没有驱动表的概念

SYS@standby1/2011-05-05 10:55:46>explain plan for select /*+leading(a) use_merge(b a)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.02
SYS@standby1/2011-05-05 10:55:49>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2310871475

-----------------------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 1201 | 25221 |       | 1190   (2)| 00:00:15 |
|   1 | MERGE JOIN         |           | 1201 | 25221 |       | 1190   (2)| 00:00:15 |
|   2 |   SORT JOIN         |           | 12248 |   119K|   504K| 1175   (2)| 00:00:15 |
|* 3 |    TABLE ACCESS FULL| TEST_TAB | 12248 |   119K|       | 1123   (2)| 00:00:14 |
|* 4 |   SORT JOIN         |           | 1201 | 13211 |       |    15   (7)| 00:00:01 |
|* 5 |    TABLE ACCESS FULL| TEST_TAB2 | 1201 | 13211 |       |    14   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("A"."T1"<=110 AND "A"."T1">=100)
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')

SYS@standby1/2011-05-05 10:56:07>explain plan for select /*+leading(b) use_merge(a b)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.00
SYS@standby1/2011-05-05 10:58:48>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2103792417

-----------------------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 1201 | 25221 |       | 1190   (2)| 00:00:15 |
|   1 | MERGE JOIN         |           | 1201 | 25221 |       | 1190   (2)| 00:00:15 |
|   2 |   SORT JOIN         |           | 1201 | 13211 |       |    15   (7)| 00:00:01 |
|* 3 |    TABLE ACCESS FULL| TEST_TAB2 | 1201 | 13211 |       |    14   (0)| 00:00:01 |
|* 4 |   SORT JOIN         |           | 12248 |   119K|   504K| 1175   (2)| 00:00:15 |
|* 5 |    TABLE ACCESS FULL| TEST_TAB | 12248 |   119K|       | 1123   (2)| 00:00:14 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("A"."T1"<=110 AND "A"."T1">=100)

总结：排序合并连接执行步骤为首先各自进行排序（上面使用的都是全表扫描），然后sort join 然后merge join（采取的是扫描的方式，而非随机读）
执行计划的主要成本在各自对两表的排序行为。没有驱动表的概念，执行顺序无关，

=============================================================================================================================================

2。连接列使用索引和不使用索引，可否避免排序操作

SYS@standby1/2011-05-05 11:08:03>explain plan for select /*+leading(b) use_merge(a b) index(a pk_test_id) index(b pk_test2_id)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.00
SYS@standby1/2011-05-05 11:09:27>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 209170912

-----------------------------------------------------------------------------------------------------
| Id | Operation                     | Name        | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             | 1201 | 25221 |       | 6384   (1)| 00:01:17 |
|   1 | MERGE JOIN                   |             | 1201 | 25221 |       | 6384   (1)| 00:01:17 |
|* 2 |   TABLE ACCESS BY INDEX ROWID | TEST_TAB2   | 1201 | 13211 |       |    61   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN            | PK_TEST2_ID | 9999 |       |       |    22   (0)| 00:00:01 |
|* 4 |   SORT JOIN                   |             | 12248 |   119K|   504K| 6323   (1)| 00:01:16 |
|* 5 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB    | 12248 |   119K|       | 6272   (1)| 00:01:16 |
|   6 |     INDEX FULL SCAN           | PK_TEST_ID | 1000K|       |       | 2240   (1)| 00:00:27 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("A"."T1"<=110 AND "A"."T1">=100)

SYS@standby1/2011-05-05 17:22:04>explain plan for select /*+leading(a) use_merge(b a) index(a pk_test_id) index(b pk_test2_id)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.03
SYS@standby1/2011-05-05 17:33:48>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2961992521

---------------------------------------------------------------------------------------------
| Id | Operation                     | Name        | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             | 1201 | 25221 | 6334   (1)| 00:01:17 |
|   1 | MERGE JOIN                   |             | 1201 | 25221 | 6334   (1)| 00:01:17 |
|* 2 |   TABLE ACCESS BY INDEX ROWID | TEST_TAB    | 12248 |   119K| 6272   (1)| 00:01:16 |
|   3 |    INDEX FULL SCAN            | PK_TEST_ID | 1000K|       | 2240   (1)| 00:00:27 |
|* 4 |   SORT JOIN                   |             | 1201 | 13211 |    62   (2)| 00:00:01 |
|* 5 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB2   | 1201 | 13211 |    61   (0)| 00:00:01 |
|   6 |     INDEX FULL SCAN           | PK_TEST2_ID | 9999 |       |    22   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."T1"<=110 AND "A"."T1">=100)
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')

分析：成本主要在处理大表上，小表使用排序好的索引，避免了从索引读取数据块后的排序操作，索引扫描为全索引扫描，通过改变leading的值却改变了不需要sort join的对象（不知道原因，成本却又没有改变还有就是始终有一个表需要sort join操作同样不明白（求高手分析！！！）本人觉得两个索引都是连接列的索引，应该都已经排序好了为何还要执行sort join操作）

SYS@standby1/2011-05-05 11:17:33>explain plan for select /*+leading(b) use_merge(a b) index(a ind_test_t1) index(b pk_test2_id )*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.01
SYS@standby1/2011-05-05 11:18:03>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3358827034

-----------------------------------------------------------------------------------------------------
| Id | Operation                     | Name        | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             | 1201 | 25221 |       | 10851   (1)| 00:02:11 |
|   1 | MERGE JOIN                   |             | 1201 | 25221 |       | 10851   (1)| 00:02:11 |
|* 2 |   TABLE ACCESS BY INDEX ROWID | TEST_TAB2   | 1201 | 13211 |       |    61   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN            | PK_TEST2_ID | 9999 |       |       |    22   (0)| 00:00:01 |
|* 4 |   SORT JOIN                   |             | 12248 |   119K|   504K| 10790   (1)| 00:02:10 |
|   5 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB    | 12248 |   119K|       | 10739   (1)| 00:02:09 |
|* 6 |     INDEX RANGE SCAN          | IND_TEST_T1 | 12248 |       |       |    28   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   6 - access("A"."T1">=100 AND "A"."T1"<=110)

分析：当没有使用PK_TEST_ID索引时，索引采用范围扫描，成本较小，返回了12248行，但是通过索引读取对应数据块时成本太大，耗时00:02:09 ，
主要是因为IND_TEST_T1 的clustor factor太大，导致每次读一行差不多要读取一个数据块

创建id和t1列的组合索引，因为是以id列为先导列所以clustor factor非常号，下面查询提供佐证

SYS@standby1/2011-05-05 11:18:04>create index ind_test_id_t1 on test_tab(id,t1);

Index created.

Elapsed: 00:00:10.94
SYS@standby1/2011-05-05 11:24:11>@index_statistics.sql
Enter value for tablename: test_tab

TYPE     INDEX          TABLE            BLOCKS NUM_ROWS    DIS_KEY   KEY_ROWS LEAFBLK CLU_FACTOR
-------- -------------- ------------ ---------- --------- ---------- ---------- ------- ----------
NORMAL   IND_TEST_ID_T1 TEST_TAB           4087   1000000    1000000          1    2785       4015
NORMAL   IND_TEST_T1    TEST_TAB           4087   1000000        899       1112    2091     874010
NORMAL   PK_TEST_ID     TEST_TAB           4087   1000000    1000000          1    2226       4015

3 rows selected.

SYS@standby1/2011-05-05 11:24:22>explain plan for select /*+leading(b) use_merge(a b) index(a ind_test_id_t1) index(b pk_test2_id )*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.00
SYS@standby1/2011-05-05 11:27:52>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3926643204

--------------------------------------------------------------------------------------------------------
| Id | Operation                     | Name           | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                | 1201 | 25221 |       | 2968   (1)| 00:00:36 |
|   1 | MERGE JOIN                   |                | 1201 | 25221 |       | 2968   (1)| 00:00:36 |
|* 2 |   TABLE ACCESS BY INDEX ROWID | TEST_TAB2      | 1201 | 13211 |       |    61   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN            | PK_TEST2_ID    | 9999 |       |       |    22   (0)| 00:00:01 |
|* 4 |   SORT JOIN                   |                | 12248 |   119K|   504K| 2907   (1)| 00:00:35 |
|   5 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB       | 12248 |   119K|       | 2855   (1)| 00:00:35 |
|* 6 |     INDEX FULL SCAN           | IND_TEST_ID_T1 | 12248 |       |       | 2805   (1)| 00:00:34 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   6 - access("A"."T1">=100 AND "A"."T1"<=110)
       filter("A"."T1"<=110 AND "A"."T1">=100)

分析：通过创建组合索引IND_TEST_ID_T1，这样可以利用id列索引的clustor factor 又能通过t1列的
6 - access("A"."T1">=100 AND "A"."T1"<=110) 缩减查询范围，致使成本大减，

SYS@standby1/2011-05-05 11:53:30>explain plan for select   a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 110 and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.01
SYS@standby1/2011-05-05 11:53:53>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1566271925

--------------------------------------------------------------------------------
| Id | Operation          | Name      | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 1201 | 25221 | 1138   (2)| 00:00:14 |
|* 1 | HASH JOIN         |           | 1201 | 25221 | 1138   (2)| 00:00:14 |
|* 2 |   TABLE ACCESS FULL| TEST_TAB2 | 1201 | 13211 |    14   (0)| 00:00:01 |
|* 3 |   TABLE ACCESS FULL| TEST_TAB | 12248 |   119K| 1123   (2)| 00:00:14 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."ID"="B"."ID")
   2 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   3 - filter("A"."T1"<=110 AND "A"."T1">=100)

分析：当什么提示都不加，优化器采用hash join方式，成本最低，默认时以小表作为build input 大表作为probe input，

=============================================================================================================================================

=============================================================================================================================================


3。查询条件列建立组合索引的影响

SYS@standby1/2011-05-05 16:45:25>explain plan for select /*+ use_merge(b a) */ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 150 and a.str2 between 'AAAAA' and 'DDDDD' and b.str1 between 'AAA' and 'DDD' and b.t2 between 30000 and 40000;

Explained.

Elapsed: 00:00:00.00
SYS@standby1/2011-05-05 16:48:02>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2103792417

-----------------------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |   134 | 4020 |       | 1180   (2)| 00:00:15 |
|   1 | MERGE JOIN         |           |   134 | 4020 |       | 1180   (2)| 00:00:15 |
|   2 |   SORT JOIN         |           |   134 | 2010 |       |    15   (7)| 00:00:01 |
|* 3 |    TABLE ACCESS FULL| TEST_TAB2 |   134 | 2010 |       |    14   (0)| 00:00:01 |
|* 4 |   SORT JOIN         |           | 6815 |    99K|   392K| 1165   (2)| 00:00:14 |
|* 5 |    TABLE ACCESS FULL| TEST_TAB | 6815 |    99K|       | 1127   (2)| 00:00:14 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("B"."STR1"<='DDD' AND "B"."T2"<=40000 AND "B"."T2">=30000 AND
              "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("A"."T1"<=150 AND "A"."STR2"<='DDDDD' AND "A"."T1">=100 AND
              "A"."STR2">='AAAAA')

SYS@standby1/2011-05-05 16:52:12>create index ind_test2_t2_str1 on test_tab2(t2,str1);

Index created.

Elapsed: 00:00:00.09

SYS@standby1/2011-05-05 16:52:12>create index ind_test2_t2_str1 on test_tab2(t2,str1);

Index created.

Elapsed: 00:00:00.09
SYS@standby1/2011-05-05 16:52:34>@index_statistics
Enter value for tablename: test_tab2

INDEX              TABLE        TB_BLKS ID_BLKS LEAFBLK NUM_ROWS    DIS_KEY   KEY_ROWS CLU_FACTOR
------------------ ------------ ------- ------- ------- --------- ---------- ---------- ----------
IND_TEST2_T2_STR1 TEST_TAB2         48      32      28      9999       9999          1       9744
IND_TEST2_T2       TEST_TAB2         48      32      23      9999       9481          1       9746
PK_TEST2_ID        TEST_TAB2         48      24      21      9999       9999          1         39

3 rows selected.

SYS@standby1/2011-05-05 16:51:00>create index ind_test_t1_str2 on test_tab(t1,str2);

Index created.

Elapsed: 00:00:07.64

SYS@standby1/2011-05-05 16:53:19>@index_statistics
Enter value for tablename: test_tab

INDEX              TABLE        TB_BLKS ID_BLKS LEAFBLK NUM_ROWS    DIS_KEY   KEY_ROWS CLU_FACTOR
------------------ ------------ ------- ------- ------- --------- ---------- ---------- ----------
IND_TEST_T1_STR2   TEST_TAB        4096    2944    2932   1000000     999951          1     999768
IND_TEST_T1        TEST_TAB        4096    2176    2091   1000000        899       1112     874010
PK_TEST_ID         TEST_TAB        4096    2304    2226   1000000    1000000          1       4015

3 rows selected.

查询二：使用查询条件列的组合索引

SYS@standby1/2011-05-05 16:53:38>explain plan for select /*+ use_merge(b a) index(a ind_test_t1_str2) index(b ind_test2_t2_str1)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 150 and a.str2 between 'AAAAA' and 'DDDDD' and b.str1 between 'AAA' and 'DDD' and b.t2 between 30000 and 40000;

Explained.

Elapsed: 00:00:00.03
SYS@standby1/2011-05-05 16:55:25>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1083562757

-----------------------------------------------------------------------------------------------------------
| Id | Operation                     | Name              | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                   |   134 | 4020 |       | 7162   (1)| 00:01:26 |
|   1 | MERGE JOIN                   |                   |   134 | 4020 |       | 7162   (1)| 00:01:26 |
|   2 |   SORT JOIN                   |                   |   134 | 2010 |       |   137   (1)| 00:00:02 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB2         |   134 | 2010 |       |   136   (0)| 00:00:02 |
|* 4 |     INDEX RANGE SCAN          | IND_TEST2_T2_STR1 |   134 |       |       |     5   (0)| 00:00:01 |
|* 5 |   SORT JOIN                   |                   | 6815 |    99K|   392K| 7025   (1)| 00:01:25 |
|   6 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB          | 6815 |    99K|       | 6987   (1)| 00:01:24 |
|* 7 |     INDEX RANGE SCAN          | IND_TEST_T1_STR2 | 6815 |       |       |   170   (1)| 00:00:03 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("B"."T2">=30000 AND "B"."STR1">='AAA' AND "B"."T2"<=40000 AND "B"."STR1"<='DDD')
       filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   5 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   7 - access("A"."T1">=100 AND "A"."STR2">='AAAAA' AND "A"."T1"<=150 AND "A"."STR2"<='DDDDD')
       filter("A"."STR2"<='DDDDD' AND "A"."STR2">='AAAAA')

分析：查询一对于两表都使用全表扫描，然后排序，最后合并，是最常规的排序合并，查询二使用，查询条件列建立的组合索引，
分别返回了 6815和134行数据，主要成本在于，对大表的6815行数据进行随即读，由于IND_TEST_T1_STR2 的clustor factor接近数据行数
所以执行计划较差，若是当排序合并的两个表通过组合索引返回的数据很少时，查询成本如下：查询三

查询三

SYS@standby1/2011-05-05 17:05:53>explain plan for select /*+ use_merge(b a) */ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 104 and a.str2 between 'AAAAA' and 'BBBBB' and b.str1 between 'AAA' and 'BBB' and b.t2 between 30000 and 31000;

Explained.

Elapsed: 00:00:00.03
SYS@standby1/2011-05-05 17:06:44>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1083562757

---------------------------------------------------------------------------------------------------
| Id | Operation                     | Name              | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                   |     5 |   150 |   251   (1)| 00:00:04 |
|   1 | MERGE JOIN                   |                   |     5 |   150 |   251   (1)| 00:00:04 |
|   2 |   SORT JOIN                   |                   |     5 |    75 |     8 (13)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB2         |     5 |    75 |     7   (0)| 00:00:01 |
|* 4 |     INDEX RANGE SCAN          | IND_TEST2_T2_STR1 |     5 |       |     2   (0)| 00:00:01 |
|* 5 |   SORT JOIN                   |                   |   223 | 3345 |   243   (1)| 00:00:03 |
|   6 |    TABLE ACCESS BY INDEX ROWID| TEST_TAB          |   223 | 3345 |   242   (0)| 00:00:03 |
|* 7 |     INDEX RANGE SCAN          | IND_TEST_T1_STR2 |   223 |       |    19   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("B"."T2">=30000 AND "B"."STR1">='AAA' AND "B"."T2"<=31000 AND
              "B"."STR1"<='BBB')
       filter("B"."STR1"<='BBB' AND "B"."STR1">='AAA')
   5 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   7 - access("A"."T1">=100 AND "A"."STR2">='AAAAA' AND "A"."T1"<=104 AND
              "A"."STR2"<='BBBBB')
       filter("A"."STR2"<='BBBBB' AND "A"."STR2">='AAAAA')

查询四：

SYS@standby1/2011-05-05 17:12:23>explain plan for select /*+ use_merge(b a) no_index(a ind_test_t1_str2) no_index(b ind_test2_t2_str1)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 104 and a.str2 between 'AAAAA' and 'BBBBB' and b.str1 between 'AAA' and 'BBB' and b.t2 between 30000 and 31000;

Explained.

Elapsed: 00:00:00.01
SYS@standby1/2011-05-05 17:12:52>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2103792417

---------------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |     5 |   150 | 1143   (2)| 00:00:14 |
|   1 | MERGE JOIN         |           |     5 |   150 | 1143   (2)| 00:00:14 |
|   2 |   SORT JOIN         |           |     5 |    75 |    15   (7)| 00:00:01 |
|* 3 |    TABLE ACCESS FULL| TEST_TAB2 |     5 |    75 |    14   (0)| 00:00:01 |
|* 4 |   SORT JOIN         |           |   223 | 3345 | 1127   (2)| 00:00:14 |
|* 5 |    TABLE ACCESS FULL| TEST_TAB |   223 | 3345 | 1126   (2)| 00:00:14 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("B"."STR1"<='BBB' AND "B"."T2"<=31000 AND "B"."T2">=30000
              AND "B"."STR1">='AAA')
   4 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   5 - filter("A"."T1"<=104 AND "A"."STR2"<='BBBBB' AND "A"."T1">=100
              AND "A"."STR2">='AAAAA')

分析：查询四禁止使用查询条件的组合索引，导致排序连接的两个表都使用全表扫描，主要原因是
3 - filter("B"."STR1"<='BBB' AND "B"."T2"<=31000 AND "B"."T2">=30000
              AND "B"."STR1">='AAA')
5 - filter("A"."T1"<=104 AND "A"."STR2"<='BBBBB' AND "A"."T1">=100
              AND "A"."STR2">='AAAAA')（查询三）

这些条件只是起到了过滤全表扫描返回不满足的记录，而查询三通过组合索引使随机读的数量大为减少，
因为
4 - access("B"."T2">=30000 AND "B"."STR1">='AAA' AND "B"."T2"<=31000 AND
              "B"."STR1"<='BBB')
7 - access("A"."T1">=100 AND "A"."STR2">='AAAAA' AND "A"."T1"<=104 AND
              "A"."STR2"<='BBBBB')（查询四）

都起到了缩减数据查询范围的作用，仅仅返回的行数分别是 223 和5，IND_TEST_T1_STR2 的clustor factor再差，也只是发生223次随即读，所以较为
高效

总结：当查询条件能有效缩减查询范围时，而不用依靠嵌套外循环提供的常量时，使用排序合并较为高效，连接列上的索引对于排序合并的执行计划成本影响不大
排序合并查询成本，对于两个排序集合的随机读或者全表扫描及其排序操作再有就是连接时的扫描合并（所占比重较小因为排序好的数据扫描很快速）

排序合并连接总结：
特点：
1。两个连接对象处理无关，只是按照自身的查询条件读取数据后排序，各个集合独立处理数据，不依赖和受其它集合干扰，没有驱动表的概念
2。只有当排序操作结束之后在能是实现连接操作，应该必须等两个集合都处理完后才能返回数据（无法实现先返回一部分数据）
3。连接时按照扫描方式，只有在为了缩减自身查询范围通过查询条件上的索引获得的rowid访问数据块时才发生随机读操作
4。连接条件列上的索引对执行计划影响不大（最多也就可能减少结果集的排序操作，可以尝试用连接列作为组合索引的先导列）

应用准则：
1。适合于最大吞吐量优化模式
2。当查询条件能够很好缩减自身查询范围（不像嵌套查询，需要依靠驱动表提供连接列的常量）
3。当连接列没有索引的时候，比较适合于处理大数据量情况
4。需要消耗内存空间，主义sort_area_size 的大小设置合适

=============================================================================================================================================

粘上一个复杂点的执行计划阅读

SYS@standby1/2011-05-05 17:12:53>explain plan for select /*+ use_merge(b a) index(a ind_test_t1_str2) index(b ind_test2_t2_str1)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 in(select t1 from test_tab where t1 between 100 and 104) and a.str2 between 'AAAAA' and 'BBBBB' and b.str1 between 'AAA' and 'BBB' and b.t2 in (select t2 from test_tab where t2 between 30000 and 31000);

Explained.

Elapsed: 00:00:00.11
SYS@standby1/2011-05-05 17:22:02>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2007291439

------------------------------------------------------------------------------------------------------
| Id | Operation                        | Name              | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                   |     1 | 6034 | 6101   (1)| 00:01:14 |
|   1 | VIEW                            | VM_NWVW_2         |     1 | 6034 | 6101   (1)| 00:01:14 |
|   2 |   HASH UNIQUE                    |                   |     1 |    57 | 6101   (1)| 00:01:14 |
|* 3 |    HASH JOIN                     |                   |     1 |    57 | 6087   (1)| 00:01:14 |
|   4 |     MERGE JOIN                   |                   |     1 |    53 | 4962   (1)| 00:01:00 |
|   5 |      SORT JOIN                   |                   |     5 |   130 |     8 (13)| 00:00:01 |
|   6 |       TABLE ACCESS BY INDEX ROWID| TEST_TAB2         |     5 |   130 |     7   (0)| 00:00:01 |
|* 7 |        INDEX RANGE SCAN          | IND_TEST2_T2_STR1 |     5 |       |     2   (0)| 00:00:01 |
|* 8 |      SORT JOIN                   |                   | 2000 | 54000 | 4954   (1)| 00:01:00 |
|   9 |       TABLE ACCESS BY INDEX ROWID| TEST_TAB          | 2000 | 54000 | 4953   (1)| 00:01:00 |
|* 10 |        INDEX FULL SCAN           | IND_TEST_T1_STR2 | 2000 |       | 2952   (1)| 00:00:36 |
|* 11 |         FILTER                   |                   |       |       |            |          |
|* 12 |          INDEX RANGE SCAN        | IND_TEST_T1       |     6 |    18 |     5   (0)| 00:00:01 |
|* 13 |     TABLE ACCESS FULL            | TEST_TAB          | 11134 | 44536 | 1125   (2)| 00:00:14 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."T2"="T2")
   7 - access("B"."T2">=30000 AND "B"."STR1">='AAA' AND "B"."T2"<=31000 AND "B"."STR1"<='BBB')
       filter("B"."STR1"<='BBB' AND "B"."STR1">='AAA')
   8 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
10 - access("A"."STR2">='AAAAA' AND "A"."STR2"<='BBBBB')
       filter("A"."STR2"<='BBBBB' AND "A"."STR2">='AAAAA' AND EXISTS (SELECT 0 FROM
              "TEST_TAB" "TEST_TAB" WHERE :B1<=104 AND :B2>=100 AND "T1"=:B3 AND "T1"<=104 AND "T1">=100))
11 - filter(:B1<=104 AND :B2>=100)
12 - access("T1"=:B1)
       filter("T1"<=104 AND "T1">=100)
13 - filter("T2"<=31000 AND "T2">=30000)

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

4。组合索引中between 和in比较运算符的一个小区别

SYS@standby1/2011-05-05 12:27:10>create index ind_test_id_t1_str1 on test_tab(id,t1,str1);

Index created.

Elapsed: 00:00:09.56

SYS@standby1/2011-05-05 12:27:54>explain plan for select /*+ use_merge(b a) index(a ind_test_id_t1_str1) */ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 between 100 and 150 and a.str1 in ('ARL','ARS') and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.00
SYS@standby1/2011-05-05 12:28:42>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 538279467

-------------------------------------------------------------------------------------------
| Id | Operation           | Name                | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                     |     6 |   126 | 3379   (1)| 00:00:41 |
|   1 | MERGE JOIN         |                     |     6 |   126 | 3379   (1)| 00:00:41 |
|* 2 |   INDEX FULL SCAN   | IND_TEST_ID_T1_STR1 |     6 |    60 | 3364   (1)| 00:00:41 |
|* 3 |   SORT JOIN         |                     | 1201 | 13211 |    15   (7)| 00:00:01 |
|* 4 |    TABLE ACCESS FULL| TEST_TAB2           | 1201 | 13211 |    14   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."T1">=100 AND "A"."T1"<=150)
       filter("A"."T1"<=150 AND ("A"."STR1"='ARL' OR "A"."STR1"='ARS') AND
              "A"."T1">=100)
   3 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   4 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')

由于test_tab 表select－list列都在索引中，没有读取表，直接从索引中获取结果，但是由于组合索引第二列使用between运算符，
使用的是全索引扫描，下面的执行计划通过in和子查询实现

SYS@standby1/2011-05-05 17:33:49>select count(*) from test_tab where t1 between 100 and 150;

COUNT(*)
----------
     56808

1 row selected.

Elapsed: 00:00:00.13
SYS@standby1/2011-05-05 18:44:45>select count(*) from test_tab where t1 between 100 and 150 and str1 in ('ARL','ARS');

COUNT(*)
----------
         6

1 row selected.

分析：上面查询主要成本在通过索引访问
2 - access("A"."T1">=100 AND "A"."T1"<=150)
       filter("A"."T1"<=150 AND ("A"."STR1"='ARL' OR "A"."STR1"='ARS') AND
              "A"."T1">=100)
虽然显示只返回6行，但是那只是通过filter条件过滤后的结果，从索引返回了56808行数据，然后过滤的剩下6行（使用了INDEX FULL SCAN）

SYS@standby1/2011-05-05 12:29:19>explain plan for select /*+ use_merge(b a) index(a ind_test_id_t1_str1)*/ a.id,a.t1,a.str1,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 in (select t1 from test_tab where t1 between 100 and 150) and a.str1 in ('ARL','ARS') and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.02
SYS@standby1/2011-05-05 12:32:20>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3199294679

-------------------------------------------------------------------------------------------------
| Id | Operation                 | Name                | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                     |   408 | 2404K|   954   (2)| 00:00:12 |
|   1 | VIEW                     | VM_NWVW_2           |   408 | 2404K|   954   (2)| 00:00:12 |
|   2 |   HASH UNIQUE             |                     |   408 | 19584 |   954   (2)| 00:00:12 |
|   3 |    NESTED LOOPS           |                     |   408 | 19584 |   953   (2)| 00:00:12 |
|   4 |     MERGE JOIN            |                     |     6 |   270 |   934   (2)| 00:00:12 |
|   5 |      SORT JOIN            |                     |     6 |   132 |   919   (2)| 00:00:12 |
|* 6 |       INDEX FAST FULL SCAN| IND_TEST_ID_T1_STR1 |     6 |   132 |   918   (2)| 00:00:12 |
|* 7 |      SORT JOIN            |                     | 1201 | 27623 |    15   (7)| 00:00:01 |
|* 8 |       TABLE ACCESS FULL   | TEST_TAB2           | 1201 | 27623 |    14   (0)| 00:00:01 |
|* 9 |     INDEX RANGE SCAN      | IND_TEST_T1         |    63 |   189 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("A"."T1"<=150 AND ("A"."STR1"='ARL' OR "A"."STR1"='ARS') AND "A"."T1">=100)
   7 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   8 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   9 - access("A"."T1"="T1")
       filter("T1"<=150 AND "T1">=100)

分析：使用了INDEX FAST FULL SCAN，原因是将between 换成in 导致的，这个执行计划用到了子查询，以后再分析

添加 a.str2列，使之无法实现index_ffs

SYS@standby1/2011-05-05 12:32:22>explain plan for select /*+ use_merge(b a) index(a ind_test_id_t1) */ a.id,a.t1,a.str1,a.str2,b.str1,b.str2 from test_tab a,test_tab2 b where a.id=b.id and a.t1 in (select t1 from test_tab where t1 between 100 and 150) and a.str1 in ('ARL','ARS') and b.str1 between 'AAA' and 'DDD' ;

Explained.

Elapsed: 00:00:00.01
SYS@standby1/2011-05-05 12:34:57>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3465583884

--------------------------------------------------------------------------------------
| Id | Operation              | Name        | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             |   408 | 3201K| 1161   (2)| 00:00:14 |
|   1 | VIEW                  | VM_NWVW_2   |   408 | 3201K| 1161   (2)| 00:00:14 |
|   2 |   HASH UNIQUE          |             |   408 | 21624 | 1161   (2)| 00:00:14 |
|   3 |    NESTED LOOPS        |             |   408 | 21624 | 1160   (2)| 00:00:14 |
|   4 |     MERGE JOIN         |             |     6 |   300 | 1142   (2)| 00:00:14 |
|   5 |      SORT JOIN         |             |     6 |   162 | 1127   (2)| 00:00:14 |
|* 6 |       TABLE ACCESS FULL| TEST_TAB    |     6 |   162 | 1126   (2)| 00:00:14 |
|* 7 |      SORT JOIN         |             | 1201 | 27623 |    15   (7)| 00:00:01 |
|* 8 |       TABLE ACCESS FULL| TEST_TAB2   | 1201 | 27623 |    14   (0)| 00:00:01 |
|* 9 |     INDEX RANGE SCAN   | IND_TEST_T1 |    63 |   189 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("A"."T1"<=150 AND ("A"."STR1"='ARL' OR "A"."STR1"='ARS') AND
              "A"."T1">=100)
   7 - access("A"."ID"="B"."ID")
       filter("A"."ID"="B"."ID")
   8 - filter("B"."STR1"<='DDD' AND "B"."STR1">='AAA')
   9 - access("A"."T1"="T1")
       filter("T1"<=150 AND "T1">=100)

总结：自己分析吧有点累了！！！
＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/24890594/viewspace-694546/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/24890594/viewspace-694546/

cpfengyo322322

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
排序合并连接的使用原则和分析（use_merge）

========================================================================================================...
复制链接

扫一扫