关联查询时使用树状查询要小心

很多人为了方便经常将几个表的关联查询和树状查询合在一起,这样做很可能产生一些并不需要的记录。


如果在查询的时候两张或两张以上的表进行关联查询,同时需要对其中一张执行CONNECT BY操作,那么建议要不然将CONNECT BY放到查询最里层,要不然将CONNECT BY放到查询最外层,而不要将CONNECT BY与多表连接放在一起。

通过一个例子来说明这个问题吧:

SQL> CREATE TABLE T (ID NUMBER, FID NUMBER, NAME VARCHAR2(30));

表已创建。

SQL> ALTER TABLE T ADD CONSTRAINT PK_T PRIMARY KEY (ID);

表已更改。

SQL> INSERT INTO T VALUES (1, 0, 'TABLE');

已创建 1 行。

SQL> INSERT INTO T VALUES (2, 0, 'INDEX');

已创建 1 行。

SQL> INSERT INTO T SELECT 10000 + ROWNUM, 1, TABLE_NAME FROM DBA_TABLES;

已创建1624行。

SQL> INSERT INTO T SELECT 20000 + ROWNUM, 2, INDEX_NAME FROM DBA_INDEXES;

已创建2401行。

SQL> CREATE INDEX IND_T_FID ON T(FID);

索引已创建。

SQL> CREATE TABLE T1 (ID NUMBER, USERNAME VARCHAR2(30), TID NUMBER);

表已创建。

SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (ID);

表已更改。

SQL> INSERT INTO T1 VALUES (1, 'YANGTK', 1);

已创建 1 行。

SQL> INSERT INTO T1 VALUES (2, 'YANGTK', 10005);

已创建 1 行。

SQL> INSERT INTO T1 VALUES (3, 'YANGTK', 10006);

已创建 1 行。

SQL> INSERT INTO T1 VALUES (4, 'TEST', 1);

已创建 1 行。

SQL> CREATE INDEX IND_T1_NAME ON T1(USERNAME);

索引已创建。

构造一个很简单的树型结构的表,并构造一张关联表,下面执行一个简单的连接查询:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ -------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

如果在这个查询的基础上加上一个树型查询,START WITH FID = 0,看看执行的结果:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ -----------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

现在结果变成了5条,为什么不加树型查询得到3条记录,而添加了树型查询结果变成了5条呢。

观察一下执行计划:

SQL> SET AUTOT ON EXP
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

执行计划
----------------------------------------------------------
Plan hash value: 1198160839

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T1"."USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

Oracle先进行的连接,然后是CONNECT BY,最后使用filter("T1"."USERNAME"='YANGTK')来进行过滤。

这就是造成数据重复的原因。

实际上这个SQL相当于先执行了一个:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 START WITH FID = 0
5 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
1 0 TABLE TEST
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

已选择6行。

执行计划
----------------------------------------------------------
Plan hash value: 2039799266

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 8 | HASH JOIN | | | | | |
| 9 | CONNECT BY PUMP | | | | | |
| 10 | COUNT | | | | | |
| 11 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 15 | COUNT | | | | | |
| 16 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 17 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 18 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("FID"=0)
2 - filter("FID"=0)
7 - access("T"."ID"="T1"."TID")
8 - access("FID"=NULL)
14 - access("T"."ID"="T1"."TID")
19 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

然后再次基础上执行了T1.USERNAME = 'YANGTK'

直接的关联加树型查询等价于下面的SQL

SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 START WITH FID = 0
7 CONNECT BY PRIOR T.ID = FID
8 )
9 WHERE USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

执行计划
----------------------------------------------------------
Plan hash value: 778818883

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 240 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 4 | 240 | 3 (0)| 00:00:01 |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

而这个结果显然不是希望得到的。

对于这种即包含关联又包含树型查询的SQL,最好的方法是先做一个,再做另一个,这样可以确保得到预期的结果:

SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 AND T1.USERNAME = 'YANGTK'
7 )
8 START WITH FID = 0
9 CONNECT BY PRIOR ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

执行计划
----------------------------------------------------------
Plan hash value: 3886537187

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 219 | 2 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 16 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 17 | COUNT | | | | | |
| 18 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 20 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 21 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 22 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("FID"=0)
2 - filter("FID"=0)
6 - access("T1"."USERNAME"='YANGTK')
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
14 - access("T1"."USERNAME"='YANGTK')
16 - access("T"."ID"="T1"."TID")
20 - access("T1"."USERNAME"='YANGTK')
22 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

刚才是先做关联,然后做树型查询,也可以反过来,先做树型查询:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM
3 (
4 SELECT ID, FID, NAME
5 FROM T
6 START WITH FID = 0
7 CONNECT BY PRIOR ID = FID
8 ) T, T1
9 WHERE T.ID = T1.TID
10 AND T1.USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

执行计划
----------------------------------------------------------
Plan hash value: 973607771

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40 | 2920 | 3 (34)| 00:00:01 |
|* 1 | HASH JOIN | | 40 | 2920 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 4 | VIEW | | 40 | 1720 | 1 (0)| 00:00:01 |
|* 5 | CONNECT BY WITH FILTERING | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID | T | | | | |
|* 7 | INDEX RANGE SCAN | IND_T_FID | 2 | 26 | 1 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | | | | |
| 9 | BUFFER SORT | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | TABLE ACCESS BY INDEX ROWID| T | 40 | 1720 | 1 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | IND_T_FID | 16 | | 1 (0)| 00:00:01 |
|* 13 | TABLE ACCESS FULL | T | 40 | 1720 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."ID"="T1"."TID")
3 - access("T1"."USERNAME"='YANGTK')
5 - filter("FID"=0)
7 - access("FID"=0)
12 - access("FID"=NULL)
13 - access("FID"=NULL)

Note
-----
- dynamic sampling used for this statement

两种方法选择那种,就需要根据具体情况进行分析了。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-69444/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/4227/viewspace-69444/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值