oracle 外连接改写_第61篇必须通过改写SQL才能提升性能的一些情况

最新推荐文章于 2022-08-26 10:28:57 发布

何普

最新推荐文章于 2022-08-26 10:28:57 发布

阅读量193

点赞数

文章标签： oracle 外连接改写

本文链接：https://blog.csdn.net/weixin_34979632/article/details/113309813

版权

这篇文章介绍了一些需要通过改写才能提高性能的SQL写法，也是对本人以前公众号改写相关文章的一个总结(也有新内容)。同时也对网络上流传的一些不太准确的说法给予纠正。改写的首要任务是等价，其次才是性能的提高，不等价的改写危害更大。

希望能对大家有一些帮助。一家之言，欢迎留言讨论。

1 大结果集标量子查询改外关联

这个改写网上介绍的挺多，改写的两个关键点就是：

1、主查询返回结果集比较大，如果主查询返回结果集小，没有改写必要。

2、改写方式是改成外关联(a left join b on a.id=b.id或 a.id=b.id(+))，而不是不等价的内连接(inner join 或 a.id=b.id)。

示例：

原SQL：

select owner,object_id ,

nvl((select object_name from t2 b where a.object_id=b.object_id),'unknown') as t2_name

from t1 a where owner not in ('SYS');

改写后的SQL：

select a.owner,a.object_id ,

nvl(b.object_name,'unknown') as t2_name

from t1 a ,t2 b

where a.owner not in ('SYS')

and a.object_id=b.object_id(+);

2 not in的写法建议改成not exists(而in 和 exists基本没差别，不需要纠结)

not in：

select object_id,object_name from t2

where object_id not in (select object_id from t1)；

not exists：

select object_id,object_name from t2

where not exists(select object_id from t1 where t1.object_id=t2.object_id);

说明：

1、如果主查询和子查询的关联字段(上面对应object_id)定义都是not null时，not in和not exists是等价的，没有区别；

2、如果主查询或子查询的关联字段可为null时，返回结果集可能不同：

not in：如果子查询结果集有null值(object_id为null)，那整个查询结果就没有返回，这个大部分情况不是想要的结果；如果子查询结果集没有null值，主查询的null值记录也不会返回；

not exists：子查询有null值不会返回空结果集；主查询为null值的记录也会返回。

3、两者的性能差别主要体现在子查询的关联字段定义为null时：

not in：子查询一定要全表扫描；

not exists：子查询不一定要全表扫描，主表小，子查询表大时效率高(nested loop)；

结论：建议使用not exists，不用not in。如果要改写，注意等价性，一般来说not exists返回的是需要的结果。

注意，下图的这种说法是不对的，优化器不会对t2表做两次扫描：

3 两种OR的改写

1、 两个字段谓词条件的or

select object_name,object_type,object_id from t1 where object_name='T1' or object_id<=10;

上面这种情况，如果两个字段的选择性可以，而且都存在索引，不论是oracle还是mysql，优化器都是会自动改写的，上面的sql如果要手工改写，可以这样改：

select object_name,object_type,object_id from t1 where object_name='T1'

union all

select object_name,object_type,object_id from t1 where object_id<=10 and lnnvl(object_name='T1') ;

注意：等价改写是用union all，而不是网上普遍流传的union，既不等价，效率又低；需要使用union all，但不要忘了lnnvl的补充条件，而且注意，不要写成object_name<>'T1'，两种又是不同的。

2、 or exists

select object_name,object_type,object_id from t1 where object_name='T1' or exists (select 1 from t2 where t1.object_id=t2.object_id and t2.object_id<=10);

这种情况优化器就不会自动帮你改写了(oracle 12.2版本及以上可以自动使用or_expand做查询转换)，还是按照上面的思路：

select object_name,object_type,object_id from t1 where object_name='T1'

union all

select object_name,object_type,object_id from t1 where exists (select 1 from t2 where t1.object_id=t2.object_id and t2.object_id<=10) and lnnvl(object_name='T1');

4 取分组后最大(最小)值的那一条记录

下面这个SQL是取每个owner最后创建的对象信息，不但低效，而且结果集也不是想要的。先group by，再自关联：

select t1.owner, object_type,object_name,object_id,created

from t1 ,

(select owner,max(created) as max_createdfrom t1 group by owner) t2 where t1.owner=t2.owner and t1.created=t2.max_created;

如果owner对应最大的created有重复，这样关联后还会返回重复记录。

如果要得到不重复的结果集，这种写法需要把created需要换成object_id:

select t1.owner, object_type,object_name,object_id,created

from t1 ,

(select owner,max(object_id) as max_id from t1 group by owner) t2

where t1.owner=t2.owner and t1.object_id=t2.max_id;

高效而且可以使用created的写法，需要使用row_number分析函数：

select * from

(select owner,object_type,object_name,object_id,created,

row_number() over (partition by owner order by created desc) as RN

from t1

) where RN=1;

这里如果不用row_number()，而是使用max(created)，也会得到和前面的max(created)一样的有重复记录的结果。

5 让like '%ABCDE' (百分号在前)的写法使用索引

这个比较简单，先创建reverse 函数索引，再使用reverse函数改写sql。

原SQL无法使用索引：

select owner,object_name from t1 where object_name like '%ABCDE';

改写的同时，还需要创建reverse函数索引：

create index idx_t1_objectname_rev on t1(reverse(object_name));

select owner,object_name from t1 where reverse(object_name) like reverse('%ABCDE');

6 让like '%ABCDE%' 这样的SQL效率得到一些提升

前提：表字段数较多(两三个字段的表就没必要折腾了)；返回记录数少。

create index idx_t1_object_name on t1(object_name);

原SQL无法使用object_name字段上的索引：

Select object_id,object_type,object_name from t1

where object_name like '%ABCDE%';

改写后的SQL是这样的：

Select object_id ,object_type,object_name from t1

Where object_name in

(select object_name from t1 where object_name like '%ABCDE%');

原理就是利用索引比表小，用索引全扫描(index fast full scan)来代替表的全扫描。

这里要特别说明的是，网上流传着另一种蹩脚的优化方法：

需要先创建一个保存rowid和object_name的中间表：

create table mid_t1 as select rowid as rid , object_name from t1;

sql改写成下面这样：

select object_id ,object_type,object_name from t1

where rowid in (select rid from mid_t1 where object_name like '%ABCDE%');

然后需要在t1表上创建增删改的3种trigger，随时同步数据到mid_t1表。

这个方法的查询效率基本上等同于上面直接使用索引的方法，但是还要同时维护另一个“索引”中间表，这么差的一个方法据说还是从oracle.com.cn流传出来，被多人模仿写在博客和书里，真是让人无语啊。

7 让<> / != / not in (1,2) 这样的SQL也能用上索引

前提是：这些过滤条件应用后，确实能返回较少的记录

当前存在这样的索引：create index idx_t1_status on t1(status);

这里假设t1表的status字段大部分记录都是'VALID',还有其他10几个唯一值，占比相对较少。

原sql，无法使用索引：

select owner,object_name,status from t1

where status<>'VALID';

优化方法，需要先创建函数索引：

create index idx_t1_status_fun on t1(decode(status,'VALID',null,null,'NULL','OTHER'));

注：这个decode函数写法可以自由发挥

再改写SQL：

select owner,object_name,status from t1

where decode(status,'VALID',null,null,'NULL','OTHER')='OTHER';

8 严格区间检索SQL的最佳写法

严格区间的定义：区间没有重叠，最多只返回一条记录。

原SQL，根据ip地址(已经转换成number数值)，找到对应的国家代码：

Select country_code

From COUNTRY_IP_RANGE IP

WHERE IP.Start_IP1 <= :number_ip

AND IP.End_IP1 >= :number_ip;

普通改写：

select country_code

from COUNTRY_IP_RANGE IP

WHERE IP.Start_IP1 <= :number_ip

AND IP.End_IP1 >= :number_ip

And rownum=1;

此时需要创建end_ip1,start_ip1两个字段联合索引。如果一个绑定变量的值匹配不到任何区间，上面这个写法消耗还是比较高，绑定变量值越小，匹配不到区间的消耗越高。

最佳处理：

select case when start_ip1<= :number_ip then COUNTRY_CODE else 'no_match' end

from

(SELECT COUNTRY_CODE, start_ip1,end_ip1

FROM COUNTRY_IP_RANGE

WHERE end_ip1 >= :number_ip order by end_ip1

) where ROWNUM = 1;

这种写法，只需要end_ip1一个字段上的索引，不管能否匹配到区间，都是最小消耗。

返回结果说明：匹配到区间，返回对应的COUNTRY_CODE；超出区间最大值，没有返回；其他匹配不到区间情况(包括低于区间最小值)，返回no_match。

如果需要超出区间最大值也返回no_match时，可以再套一层：

即：select nvl(上面 sql,'no_match') from dual;

9 nvl(expire_date,sysdate)

原SQL：

select count(*) from t1 where nvl(expire_date,sysdate)>=sysdate;

这种SQL没有办法创建函数索引(如果nvl内不是sysdate，而是一个具体的日期，这种情况可以创建nvl函数索引)。

改写：

select count(*) from t1 where expire_date>=sysdate or expire_date is null;

需要创建(expire_date,0) 联合索引。

10 分页查询rownum的位置

11g及以下版本，一般使用rownum做分页查询(12c 有了比较简洁的offset fetch语句)。

低效写法：

SELECT * FROM

( SELECT A.*, ROWNUM RN

FROM

(SELECT owner,created,object_id,object_type FROM t1 where owner='SYS' order by object_id desc) A

) WHERE RN > 10 and RN<=20 ;

高效写法：

SELECT * FROM

( SELECT A.*, ROWNUM RN

FROM

(SELECT owner,created,object_id,object_type FROM t1 where owner='SYS' order by object_id desc) A

WHERE ROWNUM <= 20

) WHERE RN > 10 ;

为了达到最佳性能，上面sql需要配合 owner + object_id 两字段联合索引。

何普

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
oracle 外连接改写_第61篇必须通过改写SQL才能提升性能的一些情况

这篇文章介绍了一些需要通过改写才能提高性能的SQL写法，也是对本人以前公众号改写相关文章的一个总结(也有新内容)。同时也对网络上流传的一些不太准确的说法给予纠正。改写的首要任务是等价，其次才是性能的提高，不等价的改写危害更大。希望能对大家有一些帮助。一家之言，欢迎留言讨论。1 大结果集标量子查询改外关联这个改写网上介绍的挺多，改写的两个关键点就是：1、主...
复制链接

扫一扫