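When a predicate is applied to a view or a subquery, the optimizer may, during query rewrite, use predicate pushing (push predicate): it moves the predicate inside the subquery so that fewer rows remain before later operations, such as joins to other tables, are performed. This can sometimes be extremely efficient.
Database version: 10.2.0.4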
create table zsj_users as select user_id,username from dba_users;
alter table zsj_users add constraint pk_zsj_users primary key(user_id);
create table zsj_objs as
select u.user_id,o.object_name,o.object_id,o.object_type
from dba_objects o,dba_users u
where o.owner=u.username;
alter table zsj_objs modify(object_name not null);
create index ind1_zsj_objs on zsj_objs(user_id,object_name);
desc zsj_users;
Name Null? Type
-------------------- -------- --------
USER_ID NOT NULL NUMBER
USERNAME NOT NULL VARCHAR2(30)
desc zsj_objs;
Name Null? Type
-------------------- -------- ---------
USER_ID NOT NULL NUMBER
OBJECT_NAME NOT NULL VARCHAR2(128)
OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
exec dbms_stats.gather_table_stats(user,'ZSJ_USERS',cascade => TRUE,estimate_percent => 100,method_opt => 'FOR ALL COLUMNS SIZE 1');
exec dbms_stats.gather_table_stats(user,'ZSJ_OBJS',cascade => TRUE,estimate_percent => 100,method_opt => 'FOR ALL COLUMNS SIZE 254');
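The test relies on the optimizer having a histogram on ZSJ_OBJS.USER_ID. A minimal sketch of how that could be checked after gathering statistics, using the standard dictionary view USER_TAB_COL_STATISTICS (this query is an addition, not part of the original session):

```sql
-- List the column statistics of ZSJ_OBJS, including the histogram type.
-- With 'FOR ALL COLUMNS SIZE 254' and only 20 distinct USER_ID values,
-- one would expect USER_ID to show a histogram rather than NONE.
SELECT column_name,
       num_distinct,
       num_buckets,
       histogram
  FROM user_tab_col_statistics
 WHERE table_name = 'ZSJ_OBJS'
 ORDER BY column_name;
```

For ZSJ_USERS, gathered with SIZE 1, the same query (with the table name changed) should report no histogram on any column.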
select user_id,count(1) from zsj_objs group by user_id order by 2;
USER_ID COUNT(1)
---------- ----------
21 3
46 8
57 8
25 8
11 9
45 10
61 12
24 46
36 189
35 282
37 313
26 315
5 454
39 677
49 720
47 932
54 1341
44 1721
60 2247
0 23328
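We test with user id 21, the user that owns the fewest objects. Because I gathered histogram statistics on the user_id column of zsj_objs, the optimizer knows that user id 21 corresponds to very few rows.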
SELECT *
  FROM (SELECT *
          FROM (SELECT o.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID)
         WHERE Rn = 1)
 WHERE USER_ID = 21;
Execution Plan
----------------------------------------------------------
Plan hash value: 969073438
------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     3 |   366 |     3   (0)| 00:00:01 |
|*  1 |  VIEW                          |               |     3 |   366 |     3   (0)| 00:00:01 |
|*  2 |   WINDOW NOSORT                |               |     3 |   141 |     3   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |               |     3 |   141 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| ZSJ_USERS     |     1 |    10 |     1   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN         | PK_ZSJ_USERS  |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| ZSJ_OBJS      |     3 |   111 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | IND1_ZSJ_OBJS |     3 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"=1)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "O"."USER_ID" ORDER BY
"O"."OBJECT_NAME")<=1)
5 - access("U"."USER_ID"=21)
7 - access("O"."USER_ID"=21)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
1 rows processed
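Clearly, predicate pushing was used here: the predicate user_id=21 was pushed inside the subquery, where it appears as the access predicates of operations 5 and 7.
But now look at an equivalent variant, in which the select list returns u.USER_ID instead of o.USER_ID: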
SELECT *
  FROM (SELECT *
          FROM (SELECT u.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID)
         WHERE Rn = 1)
 WHERE USER_ID = 21;
Execution Plan
----------------------------------------------------------
Plan hash value: 113949649
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 32623 |  3886K|       |   390   (2)| 00:00:05 |
|*  1 |  VIEW                    |           | 32623 |  3886K|       |   390   (2)| 00:00:05 |
|*  2 |   WINDOW SORT PUSHED RANK|           | 32623 |  1306K|  3336K|   390   (2)| 00:00:05 |
|*  3 |    HASH JOIN             |           | 32623 |  1306K|       |    41   (3)| 00:00:01 |
|   4 |     TABLE ACCESS FULL    | ZSJ_USERS |    24 |   240 |       |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL    | ZSJ_OBJS  | 32623 |   987K|       |    38   (3)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."USER_ID"=21 AND "RN"=1)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "O"."USER_ID" ORDER BY
"O"."OBJECT_NAME")<=1)
3 - access("U"."USER_ID"="O"."USER_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
206 consistent gets
1 rows processed
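This query is logically equivalent to the first SQL, yet the predicate is clearly not pushed here: USER_ID=21 stays as a filter on the VIEW at operation 1. As a result, its logical I/O is the same as that of the following query: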
SELECT *
  FROM (SELECT o.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY u.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID)
 WHERE Rn = 1 and USER_ID = 21;
It is also the same as that of this query, which has no USER_ID predicate at all:
SELECT *
  FROM (SELECT u.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID)
 WHERE Rn = 1;
The execution plan of the last SQL above:
Execution Plan
----------------------------------------------------------
Plan hash value: 113949649
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 32623 |  3886K|       |   390   (2)| 00:00:05 |
|*  1 |  VIEW                    |           | 32623 |  3886K|       |   390   (2)| 00:00:05 |
|*  2 |   WINDOW SORT PUSHED RANK|           | 32623 |  1306K|  3336K|   390   (2)| 00:00:05 |
|*  3 |    HASH JOIN             |           | 32623 |  1306K|       |    41   (3)| 00:00:01 |
|   4 |     TABLE ACCESS FULL    | ZSJ_USERS |    24 |   240 |       |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL    | ZSJ_OBJS  | 32623 |   987K|       |    38   (3)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"=1)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "O"."USER_ID" ORDER BY
"O"."OBJECT_NAME")<=1)
3 - access("U"."USER_ID"="O"."USER_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
206 consistent gets
20 rows processed
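Since the join condition U.USER_ID = O.USER_ID makes the two columns equal, and the analytic function is partitioned by O.USER_ID, filtering on O.USER_ID before ROW_NUMBER() is computed cannot change the result for the remaining partition. A sketch of pushing the predicate by hand under that reasoning follows. The extra `AND O.USER_ID = 21` line is an addition to the original query, and the resulting plan was not verified in the original test:

```sql
-- Hand-written predicate push: the filter on the PARTITION BY column
-- is placed inside the innermost query block instead of the outer one.
SELECT *
  FROM (SELECT u.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID
           AND O.USER_ID = 21)   -- added by hand; not in the original query
 WHERE Rn = 1;
```

Written this way the query no longer depends on the optimizer deciding to push anything, at the cost of repeating the literal inside the view, which is not possible when the view is a stored one queried with varying values.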
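In principle, the two hints push_pred and no_push_pred can be used to request or to prevent predicate pushing.
So I tried to force the predicate to be pushed with hints: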
SELECT /*+ push_pred(v) */ *
  FROM (SELECT *
          FROM (SELECT u.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID)
         WHERE Rn = 1) v
 WHERE USER_ID = 21;
SELECT /*+ push_pred(v1) */ *
  FROM (SELECT /*+ push_pred(v2) */ *
          FROM (SELECT u.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID) v2
         WHERE Rn = 1) v1
 WHERE USER_ID = 21;
SELECT /*+ push_pred(v1) */ *
  FROM (SELECT u.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID) v1
 WHERE v1.USER_ID = 21 and rn=1;
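Yet none of these three gets the predicate pushed. In other words, the optimizer really does not allow this predicate to be pushed here: pushing is not even considered as a candidate execution plan, as opposed to being considered and then discarded because its computed cost was higher. But why is that? Could it be that transitive closure (Transitive Closure) cannot be carried through an analytic function?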
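One way to look into whether the transformation was considered at all is the optimizer trace (event 10053), which records the query transformations attempted during a hard parse. The following is only a sketch of how such a trace could be captured for the second query. The comment tag `trace_push_test` is a made-up marker to make the statement text unique, and the labels used for filter push-down in the trace file differ between versions:

```sql
-- Switch on the optimizer trace for this session.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

-- EXPLAIN PLAN optimizes the statement, so the trace is written for it.
EXPLAIN PLAN FOR
SELECT /* trace_push_test */ *
  FROM (SELECT *
          FROM (SELECT u.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID)
         WHERE Rn = 1)
 WHERE USER_ID = 21;

-- Switch the trace off again.
ALTER SESSION SET EVENTS '10053 trace name context off';

-- The trace file is written to the directory reported here.
SELECT value FROM v$parameter WHERE name = 'user_dump_dest';
```

Repeating the same steps for the first query (the one selecting o.USER_ID) and comparing the transformation sections of the two trace files should show at which step the two statements diverge.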
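What also puzzles me is that, for the SQL in which the predicate is pushed, I cannot prevent the push with hints either. Is it because I am using the hints incorrectly?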
SELECT /*+ no_merge(v) no_push_pred(v) */ *
  FROM (SELECT *
          FROM (SELECT o.USER_ID,
                       u.username,
                       ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
                       O.OBJECT_ID,
                       O.OBJECT_NAME
                  FROM ZSJ_USERS U, ZSJ_OBJS O
                 WHERE U.USER_ID = O.USER_ID)
         WHERE Rn = 1) v
 WHERE USER_ID = 21;
SELECT /*+ no_merge(v) no_push_pred(v) */ *
  FROM (SELECT o.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID) v
 WHERE USER_ID = 21 and Rn = 1;
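A possible explanation, offered here as an assumption rather than a verified answer: Oracle's SQL reference describes PUSH_PRED and NO_PUSH_PRED as controlling whether a join predicate is pushed into a view, whereas USER_ID = 21 is a filter against a constant, so these hints may simply not apply to it. To see where the predicate ends up for each hinted variant, the plan can be displayed with DBMS_XPLAN; the sketch below does this for the last statement, and the 'ALL' format also prints the query block and alias section that hints refer to:

```sql
-- Store the plan of the hinted statement in PLAN_TABLE.
EXPLAIN PLAN FOR
SELECT /*+ no_merge(v) no_push_pred(v) */ *
  FROM (SELECT o.USER_ID,
               u.username,
               ROW_NUMBER() OVER(PARTITION BY O.USER_ID ORDER BY O.OBJECT_NAME) Rn,
               O.OBJECT_ID,
               O.OBJECT_NAME
          FROM ZSJ_USERS U, ZSJ_OBJS O
         WHERE U.USER_ID = O.USER_ID) v
 WHERE USER_ID = 21 and Rn = 1;

-- Show the plan with predicates, projections and query block names.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, NULL, 'ALL'));
```

If USER_ID=21 still appears as an access predicate on the index operations rather than as a filter on the VIEW line, the hint did not stop the push.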