当一个子查询介于select 和 from 之间,这种子查询就叫标量子查询
例如:
select e.ename , e.sal ,
(select d.dname from dept d where d.deptno=e.deptno) dname
from emp e;
explain analyze
select e.ename , e.sal ,
(select d.dname from dept d where d.deptno=e.deptno) dname
from emp e;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on emp e (cost=0.00..13.72 rows=12 width=98) (actual time=0.657..0.675 rows=12 loops=1)
SubPlan 1
-> Seq Scan on dept d (cost=0.00..1.05 rows=1 width=46) (actual time=0.007..0.007 rows=1 loops=12)
Filter: (deptno = e.deptno)
Rows Removed by Filter: 3
Planning Time: 1.304 ms
Execution Time: 0.691 ms
标量子查询类似一个天然的嵌套循环,而且驱动表固定为主表(EMP)。标量子查询中子查询的表的连接列必须包含在索引中。
主表EMP通过连接列DEPTNO传值给子查询中的表DEPT,执行计划中Filter: 3就表示传值,传值过程一共进行了3次。
应尽量避免使用标量子查询,假如主表返回大量数据,主表的连接列基数很高,那么子查询中的表会被多次扫描,从而严重影响sql性能。如果主表数据量小或者主表连接列的基数低,这时候可以使用标量子查询,但是要记得在连接列上建立索引
标量子查询可以改写成外连接,从而是他们进行HASH连接
为什么是改成外连接而不是内连接,因为标量子查询是一个传值的过程,主表传值给子查询,子查询没有找到数据就会显示NULL。
select d.dname , d.loc ,
(select max(e.sal) from emp e where e.deptno=d.deptno) max_sal
from dept d ;
改成:
select d.dname, d.loc, e.max_sal
from dept d
left join (select max(sal) max_sal , deptno from emp group by deptno) e
on d.deptno = e.deptno;
explain analyze
select d.dname, d.loc, e.max_sal
from dept d
left join (select max(sal) max_sal , deptno from emp group by deptno) e
on d.deptno = e.deptno;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=2.27..2.55 rows=4 width=122) (actual time=0.059..0.063 rows=4 loops=1)
Hash Cond: (emp.deptno = d.deptno)
-> HashAggregate (cost=1.18..1.30 rows=12 width=44) (actual time=0.012..0.013 rows=3 loops=1)
Group Key: emp.deptno
-> Seq Scan on emp (cost=0.00..1.12 rows=12 width=26) (actual time=0.003..0.004 rows=12 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=102) (actual time=0.035..0.035 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=102) (actual time=0.005..0.006 rows=4 loops=1)
Planning Time: 0.121 ms
Execution Time: 0.096 ms
如果连接列是主键列,就没必要改成外连接了,直接改成内连接,因为主键不为NULL
explain analyze
select d.dname, d.loc, e.max_sal
from dept d
inner join (select max(sal) max_sal , deptno from emp group by deptno) e
on d.deptno = e.deptno;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Hash Join (cost=2.27..2.55 rows=4 width=122) (actual time=0.030..0.032 rows=3 loops=1)
Hash Cond: (emp.deptno = d.deptno)
-> HashAggregate (cost=1.18..1.30 rows=12 width=44) (actual time=0.012..0.013 rows=3 loops=1)
Group Key: emp.deptno
-> Seq Scan on emp (cost=0.00..1.12 rows=12 width=26) (actual time=0.002..0.003 rows=12 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=102) (actual time=0.014..0.014 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=102) (actual time=0.009..0.010 rows=4 loops=1)
Planning Time: 0.111 ms
Execution Time: 0.059 ms