一、简介
本文是总结Oracle查询优化方法与改写技巧的第二篇文章,接着第一篇文章,继续。。。
二、优化技巧
【1】新增插入注意的几点问题:
- 如果insert语句中没有包含默认值的列,则会添加默认值。
- 如果包含默认值得列,必须显式指定default,才会添加默认值,否则不会添加
- 如果已经显式指定了莫列的值为null或值,则不会再加上默认值。
【2】复制数据表的定义和数据
--复制表结构,包括数据
create table emp2222 as select * from emp;
select * from emp2222;
--复制表结构,不包括数据
create table emp222 as select * from emp where 1 = 0;
select * from emp222;
--插入某个表的数据
insert into emp2222 select * from emp
【3】遍历字符串
有时候需要将字符串拆分为单个字符进行一些校验等操作,比如校验下面字符串的拼音首字母是否正确,这时候就需要将字符串拆分为:
with temp as
(select '天天向上' as l, 'TTXS' as v from dual)
select
substr(t.l, level, 1) as first_label,
substr(t.v, level, 1) as first_value
from temp t
connect by level <= length(t.l);
connect by 是oracle树查询的一个子句,看如下示例:
将level循环显示4行。基于以上思想,我们就可以将‘天天向上’进行拆分:
with temp as
(select '天天向上' as l, 'TTXS' as v from dual)
select t.*,
level,
substr(t.l, level, 1) as first_label,
substr(t.v, level, 1) as first_value
from temp t
connect by level <= length(t.l);
【4】字符串中包含引号的使用方法
在字符串中使用引号的话,只需要把单引号变成双引号即可,如下示例:
select 'g''day mate'
from dual
union all
select 'beavers''teeth'
from dual
union all
select ''''
from dual;
至于q-quote界定符,工作中暂时还没遇到过相关需求,用的相对少,此处不做研究,用到的时候再查阅资料即可。
【5】统计字符在字符串中出现的次数
假设,需要统计以上字符串中字符出现的次数。oracle 11g提供了一个regexp_count(str,'分隔符')用于统计各个子串的个数。
--方法1 : 使用regexp_count(),需要加1数量才对
with temp as
(select 'CLARK,KING,MILLER' as names from dual)
select regexp_count(names, ',') + 1 as cnt from temp;
除了以上方法,还可以使用translate(str,fromstring,tostring)实现同样的功能:
--方法2 : 使用translate
with temp as
(select 'CLARK,KING,MILLER' as names from dual)
select length(translate(names, ',' || names, ',')) + 1 as cnt from temp;
但是,如果分隔字符串有多个的话,就要做进一步处理了,要除以分隔符长度。
--如果分隔字符有多个的话,需要除以分隔符长度
--错误写法:
with temp as
(select 'CLARK$#KING$#MILLER' as names from dual)
select length(translate(names, '$#' || names, '$#')) + 1 as cnt from temp;
--正确写法:
with temp as
(select 'CLARK$#KING$#MILLER' as names from dual)
select length(translate(names, '$#' || names, '$#')) / length('$#') + 1 as cnt
from temp;
如果是多个分隔字符,使用regexp_count(str,'分隔字符')就不需要考虑分隔符长度了。
--如果分隔字符串有多个的话,使用regexp_count就不用考虑分隔符长度(注意需要转义特殊字符)
with temp as
(select 'CLARK$#KING$#MILLER' as names from dual)
select regexp_count(names, '\$#') + 1 as cnt from temp;
【6】删除字符串中不必要的字符
要求去掉员工姓名中含有的【AEIOU】元音字母:
(1) 方法一:使用translate结合replace
select r.ename,
translate(r.ename, 'AEIOU', 'aaaaa') as n1,
replace(translate(r.ename, 'AEIOU', 'aaaaa'), 'a', '') as n2
from emp r;
(2) 方法二:直接使用translate
select r.ename, translate(r.ename, '1AEIOU', '1') as n1 from emp r;
(3) 方法三:使用regexp_replace正则函数
select e.ename, regexp_replace(e.ename, '[AEIOU]') as n1 from emp e;
【7】将字符与数字分隔开
假设需要将部门名称与部门编号分隔开:
(1)方法一:使用regexp_replace正则表达式分割
with temp as
(select d.dname || d.deptno as val from dept d)
select t.val,
regexp_replace(t.val, '[1234567890]', '') as dname,
regexp_replace(t.val, '[^0123456789]', '') as deptno
from temp t;
(2)方法二:使用translate进行分割
with temp as
(select d.dname || d.deptno as val from dept d)
select t.val,
translate(t.val, 'a0123456789', 'a') as dname,
translate(t.val, '0123456789' || t.val, '0123456789') as deptno
from temp t;
【8】查询只包含字母或数字的数据
使用正则替换进行过滤:
with temp as
(select '123' as val
from dual
union all
select ' 234sdf' as val
from dual
union all
select '$#234' as val
from dual
union all
select 'adg' as val
from dual)
select t.val from temp t where regexp_like(t.val, '^[0-9a-zA-Z]+$');
下面对regexp_replace相关知识进行扩展:
- regexp_like(t.val,'AB') 表示t.val like '%A%' or t.val like '%B%'
- regexp_like(t.val,'[0-9a-zA-Z]') 表示t.val like '%数字%' or t.val like '%大写字母%' or t.val like '%小写字母%'
- ^:表示字符串开始
- $:表示字符串结束
--对比regexp_like与like
select t.val from temp t where regexp_like(t.val, 'A');
--等价于前后模糊查询
select t.val from temp t where t.val like '%A%';
select t.val from temp t where regexp_like(t.val, '^A');
--等价于‘A’开头模糊查询
select t.val from temp t where t.val like '%A';
select t.val from temp t where regexp_like(t.val, 'A$');
--等价于 ‘A’结尾模糊查询
select t.val from temp t where t.val like 'A%';
select t.val from temp t where regexp_like(t.val, '^A$');
--等价于 ‘A’精确查询
select t.val from temp t where t.val like 'A';
- +:表示匹配前面的子表达式一次或多次
- *:表示匹配前面的子表达式零次或多次
-- +:表示匹配前面的子表达式一次或多次
-- *:表示匹配前面的子表达式零次或多次
with temp as
(select '123' as val from dual union all select '234567' as val from dual)
select t.val from temp t;
with temp as
(select '167' as val from dual union all select '1234567' as val from dual)
select t.val from temp t where regexp_like(t.val, '16+'); --至少匹配6一次
--等价于
--select t.val from temp t where t.val like '16%'
with temp as
(select '167' as val from dual union all select '1234567' as val from dual)
--select t.val from temp t where regexp_like(t.val, '16*'); --匹配6零次或多次
--等价于
select t.val from temp t where t.val like '1%'
- regexp_like(t.val,'^[12]+$')
- --等价于
- t.val like '1' or t.val like '2' or t.val like '11' or t.val like '22' or t.val like '12' or t.val like '21'
- regexp_like(t.val,'^[12]*$')
- --等价于
- t.val like '1' or t.val like '2' or t.val like '11' or t.val like '22' or t.val like '12' or t.val like '21'or t.val like '
【9】按字符串中的数字进行排序
首先,我们可以将字符串中的除数字之外的都替换为空,可以使用translate或者regexp_replace进行替换,然后再进行排序即可。
替换非数字字符:
with temp as
(select d.dname || d.deptno || d.loc as val from dept d)
select regexp_replace(t.val, '[^0-9]', '') as val from temp t order by 1 desc;
.
当然也可以使用translate进行替换:
with temp as
(select d.dname || d.deptno || d.loc as val from dept d)
select to_number(translate(t.val, '0123456789' || t.val, '0123456789')) as val
from temp t
order by 1 desc;
【10】根据表中的行创建一个分隔列表
其实就是listagg聚合函数的使用,或者wm_concat()函数的使用。
--统计各个部门的员工以及总工资
select deptno,
sum(e.sal) as totalsal,
listagg(e.ename, ',') within group(order by e.ename) as names,
wm_concat(e.ename) as names2
from emp e
group by e.deptno
【11】分解IP地址
--分解IP地址
select regexp_substr(t.ip, '[^.]+', 1, 1) as a,
regexp_substr(t.ip, '[^.]+', 1, 2) as b,
regexp_substr(t.ip, '[^.]+', 1, 3) as c,
regexp_substr(t.ip, '[^.]+', 1, 4) as d
from (select '192.168.6.67' as ip from dual) t
【12】将分隔数据转换为可以使用in语句列表
可见另外一篇博客:https://blog.csdn.net/Weixiaohuai/article/details/84789139
【13】常用聚合函数
常用聚合函数有: min() max() avg() count() sum()等,以下是基本使用方法:
select e.deptno,
avg(e.sal) as avgsal,
min(e.sal) as minsal,
max(e.sal) as maxsal,
sum(e.sal) as totalsal,
count(*) as totalcount,
count(e.comm) as commtotalcount,
avg(e.comm) as error_avgcomm, --错误的平均提成算法
avg(nvl(e.comm, 0)) as right_avgcomm, --需要先将comm为空的转化为0再计算平均值
avg(coalesce(e.comm, 0)) as right_avgcomm2
from emp e
group by e.deptno
注意点:聚合函数会忽略空值,对sum()求总和没什么影响,但是对avg()和count()来说可能结果就不一致了,需要特别注意这一点,实际项目中根据具体需求来决定是否将空值转换为0之后再进行聚合操作。
【14】生成累计和
案例:按照进入公司的先后顺序(empno排序)来统计成本累计和:
--成本累计和
select e.empno,
e.ename,
e.sal,
sum(e.sal) over(order by e.empno) as totallv --第一行到当前行所有工资的总和
from emp e
where e.deptno = 30
order by e.empno
【15】计算累计差
首先构造测试数据:
--计算累计差
with temp as
(select 1000 as bh, '预交金额' as xmm, 30000 as val
from dual
union all
select 1001 as bh, '支出一' as xmm, 1000 as val
from dual --支出1000元
union all
select 1002 as bh, '支出二' as xmm, 2000 as val
from dual --支出2000元
union all
select 1003 as bh, '支出三' as xmm, 3000 as val
from dual --支出3000元
)
select t.bh, t.xmm, t.val from temp t;
案例:我们需要计算每一笔支出之后剩余的余额还有多少
思想: 通过对编号进行排序,将支出的金额变为负数之后再进行累计和,就达到本次案例要求
【a】第一步: 根据编号进行排序
--计算累计差
with temp as
(select 1000 as bh, '预交金额' as xmm, 30000 as val
from dual
union all
select 1001 as bh, '支出一' as xmm, 1000 as val
from dual --支出1000元
union all
select 1002 as bh, '支出二' as xmm, 2000 as val
from dual --支出2000元
union all
select 1003 as bh, '支出三' as xmm, 3000 as val
from dual --支出3000元
)
select rownum, t.bh, t.xmm, t.val from temp t order by t.bh;
【b】使用case when将支出的金额变为负数
--计算累计差
with temp as
(select 1000 as bh, '预交金额' as xmm, 30000 as val
from dual
union all
select 1001 as bh, '支出一' as xmm, 1000 as val
from dual --支出1000元
union all
select 1002 as bh, '支出二' as xmm, 2000 as val
from dual --支出2000元
union all
select 1003 as bh, '支出三' as xmm, 3000 as val
from dual --支出3000元
)
select r.rm,
r.bh,
r.xmm,
r.val,
case
when r.rm = 1 then
r.val
else
-r.val
end as newval
from (select rownum as rm, t.bh, t.xmm, t.val from temp t order by t.bh) r;
【c】累计和计算
--计算累计差
with temp as
(select 1000 as bh, '预交金额' as xmm, 30000 as val
from dual
union all
select 1001 as bh, '支出一' as xmm, 1000 as val
from dual --支出1000元
union all
select 1002 as bh, '支出二' as xmm, 2000 as val
from dual --支出2000元
union all
select 1003 as bh, '支出三' as xmm, 3000 as val
from dual --支出3000元
)
select r.rm,
r.bh,
r.xmm,
r.val,
sum(case
when r.rm = 1 then
r.val
else
-r.val
end) over(order by r.rm) as ee
from (select rownum as rm, t.bh, t.xmm, t.val from temp t order by t.bh) r;
【16】更改累计和的数值
首先,构造测试数据:模拟存取款记录数据
with temp as
(select 1 as id, 'cunkuan' as tye, 100 as val
from dual
union all
select 2 as id, 'cunkuan' as tye, 200 as val
from dual
union all
select 3 as id, 'qukuan' as tye, 50 as val
from dual
union all
select 4 as id, 'cunkuan' as tye, 100 as val
from dual
union all
select 5 as id, 'qukuan' as tye, 100 as val
from dual)
select t.* from temp t;
案例:统计金额流水,每次存款、取款之后的剩余余额等
思想:同样根据tye是存款还是取款,将取款的金额变为负数之后再进行累加和计算即可
with temp as
(select 1 as id, 'cunkuan' as tye, 100 as val --存款
from dual
union all
select 2 as id, 'cunkuan' as tye, 200 as val --存款
from dual
union all
select 3 as id, 'qukuan' as tye, 50 as val --取款
from dual
union all
select 4 as id, 'cunkuan' as tye, 100 as val --存款
from dual
union all
select 5 as id, 'qukuan' as tye, 100 as val --取款
from dual)
select t.id,
t.tye,
t.val,
case
when t.tye = 'qukuan' then
-t.val
else
t.val
end as newval, --转换取款金额为负数
sum(case
when t.tye = 'qukuan' then
-t.val
else
t.val
end) over(order by t.id) as yue
from temp t;
【17】返回各部门工资排名前三名的员工
一看到这个需求,就需要搞清楚前三名是要怎么算,并且还要考虑并列成绩相同的情况。
select e.deptno,
e.deptno,
e.sal,
--row_number()遇到数值相同的还是会排 1 , 2 , 3..
row_number() over(partition by e.deptno order by e.sal desc) as row_number,
--rank()遇到数值相同的用相同的数字(下一个会跳跃),类似 1 , 1, 3..
rank() over(partition by e.deptno order by e.sal desc) as rank,
--dense_rank()遇到数值相同的用相同的数字(下一个不会跳跃,相当于有两个第一名的感觉),类似 1 , 1 , 2 ..
dense_rank() over(partition by e.deptno order by e.sal desc) as dense_rank
from emp e
where e.deptno in (20, 30)
order by e.deptno, e.sal desc
基于row_number()/rank()/dense_rank(),在不同的场景下返回的结果也会有所差异:
- 如果使用row_number()返回工资第一名的员工,显然会漏掉一条数据,因为对应相同工资有一个序号排为‘2’了。
- 如果使用dense_rank()返回工资前两名的员工,显然会返回多出一条数据,因为是‘1 ,1 ,2...’。
以下分别使用row_number()/rank()/dense_rank()实现查询工资排名前三名的员工信息:
select r.*
from (select e.deptno,
e.sal,
row_number() over(partition by e.deptno order by e.sal desc) as row_number
from emp e
where e.deptno in (20, 30)) r
where r.row_number <= 3
select r.*
from (select e.deptno,
e.sal,
rank() over(partition by e.deptno order by e.sal desc) as rank
from emp e
where e.deptno in (20, 30)) r
where r.rank <= 3
select r.*
from (select e.deptno,
e.sal,
dense_rank() over(partition by e.deptno order by e.sal desc) as dense_rank
from emp e
where e.deptno in (20, 30)) r
where r.dense_rank <= 3
【18】计算出现次数最多的值
完成这个查询需要三步:
【a】查询各个工资出现的次数:
select e.sal, count(*) as cnt from emp e where e.deptno = 20 group by e.sal
【b】按工资出现的次数进行排序:
select r.sal, r.cnt, dense_rank() over(order by r.cnt desc) as ordernum --排序号
from (select e.sal, count(*) as cnt
from emp e
where e.deptno = 20
group by e.sal) r
【c】取出排序为第一的数据:
select *
from (select r.sal,
r.cnt,
dense_rank() over(order by r.cnt desc) as ordernum --排序号
from (select e.sal, count(*) as cnt
from emp e
where e.deptno = 20
group by e.sal) r) rr
where rr.ordernum = 1 --排第一名
基于上面的示例,如果加上partition by e.deptno按部门分区进行统计,那么就可以统计各个部门中工资排名第一名的工资信息:
select *
from (select r.sal,
r.deptno,
r.cnt,
dense_rank() over(partition by r.deptno order by r.cnt desc) as ordernum --排序号
from (select e.sal, e.deptno, count(*) as cnt
from emp e
group by e.sal, e.deptno) r) rr
where rr.ordernum = 1 --排第一名
【19】求总和的百分比
案例:计算各部门的总工资,并且计算部门总工资占总工资的比例
要完成这个查询,需要三步:
【a】统计各个部门的总工资
select e.deptno, sum(e.sal) as depttotalsal from emp e group by e.deptno
【b】统计总工资金额
select r.deptno, r.depttotalsal, sum(depttotalsal) over() as totalsal --累加和计算总工资
from (select e.deptno, sum(e.sal) as depttotalsal
from emp e
group by e.deptno) r
【c】计算占比
select rr.deptno,
rr.depttotalsal,
round((rr.depttotalsal / rr.totalsal) * 100, 2) as lv --工资占比
from (select r.deptno,
r.depttotalsal,
sum(depttotalsal) over() as totalsal --累加和计算总工资
from (select e.deptno, sum(e.sal) as depttotalsal
from emp e
group by e.deptno) r) rr
除了使用上面的方法,还可以使用ratio_to_report()实现:
select r.deptno,
round(ratio_to_report(r.depttotalsal) over() * 100, 2) as lv
from (select e.deptno, sum(e.sal) as depttotalsal
from emp e
group by e.deptno) r
【20】查询各员工占本部门工资的占比
要实现这个查询,基于上一个示例,再按照e.deptno进行分区统计即可。
select e.deptno,
e.sal,
e.ename,
round(ratio_to_report(e.sal) over(partition by e.deptno) * 100, 2) as lv
from emp e
除了上面的方法,还可以使用下面的sql统计各员工工资占本部门总工资的占比:
select ee.sal,
ee.ename,
ee.deptno,
r.depttotalsal,
round((ee.sal / r.depttotalsal) * 100, 2) as lv --工资占比
from emp ee
left join (select sum(e.sal) as depttotalsal, e.deptno
from emp e
group by e.deptno) r
on r.deptno = ee.deptno
三、总结
这是第二部分的总结以及一些案例,下一篇继续总结。积少成多,keeping....