mysql not exists not in 效率,EXISTS 和 IN ,NOT EXISTS 和 NOT IN的效率比较

在我们一般的观点中,总是认为使用EXISTS(或NOT EXISTS)通常将提高查询的效率,所以一般推荐使用exists来代替in。但实际情况是不是这个样子呢?我们用实际的例子来看一下,我们分别在两种不同的优化器模式下来看。

create table test1 as select * from dba_objects ;

create table test2 as select * from dba_tables ;

create index idx_object_name on test1(object_name) ;

create index idx_table_name on test2(table_name) ;

exec dbms_stats.gather_table_stats('XIGUA','TEST1') ;

exec dbms_stats.gather_table_stats('XIGUA','TEST2') ;

在CBO模式下:

select * from test1

where object_name in

( select table_name from test2 where table_name like 'M%') ;

419 rows selected.

Elapsed: 00:00:00.09

Execution Plan

----------------------------------------------------------

Plan hash value: 2580328806

-------------------------------------------------------------------------------

| Id  | Operation            | Name           | Rows  | Bytes | Cost (%CPU)| Tim

e     |

-------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |                |   423 | 47376 |   163   (2)| 00:

00:02 |

|*  1 |  HASH JOIN RIGHT SEMI|                |   423 | 47376 |   163   (2)| 00:

00:02 |

|*  2 |   INDEX RANGE SCAN   | IDX_TABLE_NAME |   420 |  7980 |     3   (0)| 00:

00:01 |

|*  3 |   TABLE ACCESS FULL  | TEST1          |  1125 |   102K|   160   (2)| 00:

00:02 |

-------------------------------------------------------------------------------Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("OBJECT_NAME"="TABLE_NAME")

2 - access("TABLE_NAME" LIKE 'M%')

filter("TABLE_NAME" LIKE 'M%')

3 - filter("OBJECT_NAME" LIKE 'M%')

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

727  consistent gets

0  physical reads

0  redo size

22152  bytes sent via SQL*Net to client

682  bytes received via SQL*Net from client

29  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

419  rows processed

select * from test1

where exists

( select 1 from test2 where test1.object_name = test2.table_name and test2.table_name like 'M%') ;

419 rows selected.

Elapsed: 00:00:00.08

Execution Plan

----------------------------------------------------------

Plan hash value: 2580328806

| Id  | Operation            | Name           | Rows  | Bytes | Cost (%CPU)| Tim

e     |

-------------------------------------------------------------------------------|   0 | SELECT STATEMENT     |                |   423 | 47376 |   163   (2)| 00:

00:02 |

|*  1 |  HASH JOIN RIGHT SEMI|                |   423 | 47376 |   163   (2)| 00:

00:02 |

|*  2 |   INDEX RANGE SCAN   | IDX_TABLE_NAME |   420 |  7980 |     3   (0)| 00:

00:01 |

|*  3 |   TABLE ACCESS FULL  | TEST1          |  1125 |   102K|   160   (2)| 00:

00:02 |

-------------------------------------------------------------------------------Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("TEST1"."OBJECT_NAME"="TEST2"."TABLE_NAME")

2 - access("TEST2"."TABLE_NAME" LIKE 'M%')

filter("TEST2"."TABLE_NAME" LIKE 'M%')

3 - filter("TEST1"."OBJECT_NAME" LIKE 'M%')

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

727  consistent gets

0  physical reads

0  redo size

22152  bytes sent via SQL*Net to client

682  bytes received via SQL*Net from client

29  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

419  rows processed

在CBO模式下,我们可以看到这两者的执行计划完全相同,统计数据也相同。

我们再来看一下RBO模式下的情况,这种情况相对复杂一些。

SQL> select /*+ rule */ * from test1

2  where object_name in

3        ( select table_name from test2 where table_name like 'M%') ;

419 rows selected.

Elapsed: 00:00:00.13

Execution Plan

----------------------------------------------------------

Plan hash value: 618497113

-------------------------------------------------------

| Id  | Operation                   | Name            |

-------------------------------------------------------

|   0 | SELECT STATEMENT            |                 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST1           |

|   2 |   NESTED LOOPS              |                 |

|   3 |    VIEW                     | VW_NSO_1        |

|   4 |     SORT UNIQUE             |                 |

|*  5 |      INDEX RANGE SCAN       | IDX_TABLE_NAME  |

|*  6 |    INDEX RANGE SCAN         | IDX_OBJECT_NAME |

-------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

5 - access("TABLE_NAME" LIKE 'M%')

filter("TABLE_NAME" LIKE 'M%')

6 - access("OBJECT_NAME"="$nso_col_1")

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

646  consistent gets

0  physical reads

0  redo size

27884  bytes sent via SQL*Net to client

682  bytes received via SQL*Net from client

29  SQL*Net roundtrips to/from client

1  sorts (memory)

0  sorts (disk)

419  rows processed

SQL> select /*+ rule */ * from test1

2  where exists

3        ( select 1 from test2 where test1.object_name = test2.table_name and test2.table_name like 'M%') ;

419 rows selected.

Elapsed: 00:00:00.87

Execution Plan

----------------------------------------------------------

Plan hash value: 1694944701

---------------------------------------------

| Id  | Operation          | Name           |

---------------------------------------------

|   0 | SELECT STATEMENT   |                |

|*  1 |  FILTER            |                |

|   2 |   TABLE ACCESS FULL| TEST1          |

|*  3 |   INDEX RANGE SCAN | IDX_TABLE_NAME |

---------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter( EXISTS (SELECT 0 FROM "TEST2" "TEST2" WHERE

"TEST2"."TABLE_NAME"=:B1 AND "TEST2"."TABLE_NAME" LIKE 'M%'))

3 - access("TEST2"."TABLE_NAME"=:B1)

filter("TEST2"."TABLE_NAME" LIKE 'M%')

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

70840  consistent gets

0  physical reads

0  redo size

22152  bytes sent via SQL*Net to client

682  bytes received via SQL*Net from client

29  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

419  rows processed

在这里,我们可以看到实际上,使用in效率比exists效率更高。我们可以这样来理解这种情况:

对于in,rule优化器选择的内存查询的结果作为驱动表来进行nest loops连接,所以当内存查询的结果集比较小的时候,这个in的效率还是比较高的。

对于exists,则是利用外查询表的全表扫描结果集过滤内查询的结果集,当外查询的表比较大的时候,相对效率比较低。

再来看一下内查询结果集比较大,而外查询较小的时候的情况。

SQL> select /*+ rule */ * from test2

where exists

2    3        ( select 1 from test1 where test1.object_name = test2.table_name and test1.object_name like 'S%') ;

172 rows selected.

Elapsed: 00:00:00.08

Execution Plan

----------------------------------------------------------

Plan hash value: 833525739

----------------------------------------------

| Id  | Operation          | Name            |

----------------------------------------------

|   0 | SELECT STATEMENT   |                 |

|*  1 |  FILTER            |                 |

|   2 |   TABLE ACCESS FULL| TEST2           |

|*  3 |   INDEX RANGE SCAN | IDX_OBJECT_NAME |

----------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter( EXISTS (SELECT 0 FROM "TEST1" "TEST1" WHERE

"TEST1"."OBJECT_NAME"=:B1 AND "TEST1"."OBJECT_NAME" LIKE 'S%'))

3 - access("TEST1"."OBJECT_NAME"=:B1)

filter("TEST1"."OBJECT_NAME" LIKE 'S%')

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

4837  consistent gets

0  physical reads

2596  redo size

13642  bytes sent via SQL*Net to client

506  bytes received via SQL*Net from client

13  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

172  rows processed

SQL> select /*+ rule */ * from test2

2  where table_name in

3        ( select object_name from test1 where object_name like 'S%') ;

172 rows selected.

Elapsed: 00:00:00.12

Execution Plan

----------------------------------------------------------

Plan hash value: 2497755124

-------------------------------------------------------

| Id  | Operation                   | Name            |

-------------------------------------------------------

|   0 | SELECT STATEMENT            |                 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST2           |

|   2 |   NESTED LOOPS              |                 |

|   3 |    VIEW                     | VW_NSO_1        |

|   4 |     SORT UNIQUE             |                 |

|*  5 |      INDEX RANGE SCAN       | IDX_OBJECT_NAME |

|*  6 |    INDEX RANGE SCAN         | IDX_TABLE_NAME  |

-------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

5 - access("OBJECT_NAME" LIKE 'S%')

filter("OBJECT_NAME" LIKE 'S%')

6 - access("TABLE_NAME"="$nso_col_1")

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

2214  consistent gets

0  physical reads

7556  redo size

13378  bytes sent via SQL*Net to client

506  bytes received via SQL*Net from client

13  SQL*Net roundtrips to/from client

1  sorts (memory)

0  sorts (disk)

172  rows processed

从理论上分析,当内存查询的结果集比较大,而外查询比较小的时候,应该增加in的cost,对于exists的filter来说,也应该效率有所提高,所以再这种情况下,我们应该看到exists效率更高。但是在实际的测试中,我们可以看到,这两者并不存在明显的性能上的差异,甚至于用in的效果比exists更好。

总结:

1、在CBO模式下,两者效率一致。

2、在RBO模式下,当外层数据集远大于内层数据集时,使用in的效率比较高。

3、在RBO模式下,当外层数据集远小于内层数据集时,两者效率相差不大。

可能我的测试数据选择不太理想,欢迎大家讨论。

==========

补充not exists和not in的测试数据

先来看一下CBO模式下的情况:

SQL> select * from test2

2  where table_name not in

3        ( select object_name from test1 where object_name like 'A%') ;

2950 rows selected.

Elapsed: 00:00:13.93

Execution Plan

----------------------------------------------------------

Plan hash value: 833525739

---------------------------------------------------------------------------

| Id  | Operation          | Name            | Rows  | Bytes | Cost (%CPU)| Time

|

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |                 |  1576 |   315K|  6364   (1)| 00:0

1:17 |

|*  1 |  FILTER            |                 |       |       |            |

|

|   2 |   TABLE ACCESS FULL| TEST2           |  1577 |   315K|    14   (0)| 00:0

0:01 |

|*  3 |   INDEX RANGE SCAN | IDX_OBJECT_NAME |     1 |    25 |     8   (0)| 00:0

0:01 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST1" "TEST1" WHERE

"OBJECT_NAME" LIKE 'A%' AND LNNVL("OBJECT_NAME"<>:B1)))

3 - access("OBJECT_NAME" LIKE 'A%')

filter("OBJECT_NAME" LIKE 'A%' AND LNNVL("OBJECT_NAME"<>:B1))

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

60806  consistent gets

29  physical reads

1104  redo size

165661  bytes sent via SQL*Net to client

2541  bytes received via SQL*Net from client

198  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

2950  rows processed

SQL> select * from test2

2  where not exists

3        ( select 1 from test1 where test1.object_name = test2.table_name and test1.object_name like 'A%') ;

2950 rows selected.

Elapsed: 00:00:00.19

Execution Plan

----------------------------------------------------------

Plan hash value: 349475409

-------------------------------------------------------------------------------

| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Ti

me     |

--------------------------------------------------------------------------------

--------

|   0 | SELECT STATEMENT     |                 |  1538 |   345K|    23   (5)| 00

:00:01 |

|*  1 |  HASH JOIN RIGHT ANTI|                 |  1538 |   345K|    23   (5)| 00

:00:01 |

|*  2 |   INDEX RANGE SCAN   | IDX_OBJECT_NAME |  1259 | 31475 |     8   (0)| 00

:00:01 |

|   3 |   TABLE ACCESS FULL  | TEST2           |  1577 |   315K|    14   (0)| 00

:00:01 |

-------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("TEST1"."OBJECT_NAME"="TEST2"."TABLE_NAME")

2 - access("TEST1"."OBJECT_NAME" LIKE 'A%')

filter("TEST1"."OBJECT_NAME" LIKE 'A%')

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

314  consistent gets

0  physical reads

0  redo size

165661  bytes sent via SQL*Net to client

2541  bytes received via SQL*Net from client

198  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

2950  rows processed

在CBO模式下,not in采用的是filter,对外层表做全表扫描,再去filter内层查询结果集。not exists采用内层查询结果集作hash join连接。

从结果明显可以看到,not exists效率比较高。把内层和外层的表对换一下,结果也是相同的,not exists的效率明显高很多。

在rule模式下:

SQL> select /*+ rule */ * from test1

2  where object_name not in

3        ( select table_name from test2 where table_name like 'A%' ) ;

229092 rows selected.

Elapsed: 00:00:22.34

Execution Plan

----------------------------------------------------------

Plan hash value: 1694944701

---------------------------------------------

| Id  | Operation          | Name           |

---------------------------------------------

|   0 | SELECT STATEMENT   |                |

|*  1 |  FILTER            |                |

|   2 |   TABLE ACCESS FULL| TEST1          |

|*  3 |   INDEX RANGE SCAN | IDX_TABLE_NAME |

---------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter( NOT EXISTS (SELECT 0 FROM "TEST2" "TEST2" WHERE

LNNVL("TABLE_NAME"<>:B1) AND "TABLE_NAME" LIKE 'A%'))

3 - access("TABLE_NAME" LIKE 'A%')

filter(LNNVL("TABLE_NAME"<>:B1) AND "TABLE_NAME" LIKE 'A%')

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

348779  consistent gets

0  physical reads

0  redo size

11059645  bytes sent via SQL*Net to client

168377  bytes received via SQL*Net from client

15274  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

229092  rows processed

SQL>

SQL> select /*+ rule */ * from test1

2  where not exists

3        ( select 1 from test2 where test1.object_name = test2.table_name and test2.table_name like 'A%' ) ;

229092 rows selected.

Elapsed: 00:00:08.00

Execution Plan

----------------------------------------------------------

Plan hash value: 1694944701

---------------------------------------------

| Id  | Operation          | Name           |

---------------------------------------------

|   0 | SELECT STATEMENT   |                |

|*  1 |  FILTER            |                |

|   2 |   TABLE ACCESS FULL| TEST1          |

|*  3 |   INDEX RANGE SCAN | IDX_TABLE_NAME |

---------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter( NOT EXISTS (SELECT 0 FROM "TEST2" "TEST2" WHERE

"TEST2"."TABLE_NAME"=:B1 AND "TEST2"."TABLE_NAME" LIKE 'A%'))

3 - access("TEST2"."TABLE_NAME"=:B1)

filter("TEST2"."TABLE_NAME" LIKE 'A%')

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

352897  consistent gets

0  physical reads

0  redo size

11059645  bytes sent via SQL*Net to client

168377  bytes received via SQL*Net from client

15274  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

229092  rows processed

在RBO模式下,两者的执行计划完全相同。

总结:

1、CBO模式下,not exists效率较高。

2、RBO模式下,两者效率差不多。

[本帖最后由 BTxigua 于 2008-10-14 10:42 编辑]

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值