A表和B表都有个字段是code,code是表示部门编码的意思,如03表示一个大部门,0301,0302是03的小部门,030101,030102是0301的子部门,现在有个需求,就是查A表中部门的子部门在B表中的记录,可以是两种方式:
一种substr(A.code,1,length(B.code))=B.code,
另一种B.code like A.code||'%'。
下面我们来做个试验:
SQL> create table test_object as select * from all_objects;SQL> create table test_obj1 as select * from test_object where length(object_name)<8;
SQL> insert into test_obj1 select * from test_object
where length(object_name) < 16
and length(object_name) >= 8;
SQL> insert into test_obj1 select * from test_object where
length(object_name)<24 and length(object_name)>=16;
SQL> commit;
SQL> select count(*) from test_object;
COUNT(*)
----------
50250
SQL> select count(*) from test_obj1;
COUNT(*)
----------
22564
SQL> exec dbms_stats.gather_table_stats(user,'test_object');
SQL> exec dbms_stats.gather_table_stats(user,'test_obj1');
SQL> set timing on
SQL> set autotrace traceonly
SQL> select a.*, b.object_name
from TEST_OBJECT a, TEST_OBJ1 b
where substr(a.object_name, 1, length(b.object_name)) = b.object_name;
已选择67744行。
已用时间: 00: 07: 36.09
执行计划
----------------------------------------------------------
Plan hash value: 661671976
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 77053 | 8427K| 3169K (1)| 11:42:32 |
| 1 | NESTED LOOPS | | 77053 | 8427K| 3169K (1)| 11:42:32 |
| 2 | TABLE ACCESS FULL| TEST_OBJECT | 50250 | 4563K| 147 (1)| 00:00:02 |
|* 3 | TABLE ACCESS FULL| TEST_OBJ1 | 2 | 38 | 63 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJ
ECT_NAME")))
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
15586427 consistent gets
0 physical reads
0 redo size
2210117 bytes sent via SQL*Net to client
50061 bytes received via SQL*Net from client
4518 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
67744 rows processed
执行计划是nested loops,特别是需要这种大表,效率低下可想而知。没有试验like,因为用它结果会更糟。下面我们采用分片处理的方法,object_name长度大于3的用一种方式处理,小于等于3的用一种方式处理。
SQL> select a.*, b.object_namefrom TEST_OBJECT a, TEST_OBJ1 b
where substr(a.object_name, 1, length(b.object_name)) = b.object_name
and substr(a.object_name, 1, 4) = substr(b.object_name, 1, 4)
and length(b.object_name) > 3
union all
select a.*, b.object_name
from TEST_OBJECT a, TEST_OBJ1 b
where substr(a.object_name, 1, length(b.object_name)) = b.object_name
and length(b.object_name) < 4;
已选择67744行。
已用时间: 00: 00: 08.57
执行计划
----------------------------------------------------------
Plan hash value: 4080738151
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3892 | 425K| 164K(100)| 00:36:27 |
| 1 | UNION-ALL | | | | | |
|* 2 | HASH JOIN | | 39 | 4368 | 214 (2)| 00:00:03 |
|* 3 | TABLE ACCESS FULL| TEST_OBJ1 | 1128 | 21432 | 65 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TEST_OBJECT | 50250 | 4563K| 147 (1)| 00:00:02 |
| 5 | NESTED LOOPS | | 3853 | 421K| 164K (1)| 00:36:24 |
|* 6 | TABLE ACCESS FULL| TEST_OBJ1 | 1128 | 21432 | 65 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL| TEST_OBJECT | 3 | 279 | 145 (0)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(SUBSTR("A"."OBJECT_NAME",1,4)=SUBSTR("B"."OBJECT_NAME",1,4))
filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
CT_NAME")))
3 - filter(LENGTH("B"."OBJECT_NAME")>3)
6 - filter(LENGTH("B"."OBJECT_NAME")<4)
7 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
CT_NAME")))
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
44813 consistent gets
0 physical reads
0 redo size
2310942 bytes sent via SQL*Net to client
50061 bytes received via SQL*Net from client
4518 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
67744 rows processed
可以看到执行计划是复杂了很多,但执行时间缩短了60倍,逻辑读也大大降低。分片的思想是大部分数据处理采用hash join(由于引起了等值条件substr(a.object_name, 1, 4) = substr(b.object_name, 1, 4),所以执行计划改变),少部分数据处理采用nested loops。这种思想对我们的日常调优非常有用。