oracle标量子查询的优势,oracle标量子查询简介和表连接改写

oracle标量子查询简介和表连接改写

之前小鱼就听过了标量子查询,不过对于其中的细节理解还是远远不够,借助一部分资料和自己测试对标量子查询做一点简单的分析和介绍。

Oracle允许在select子句中包含单行子查询,这个也就是oracle的标量子查询,标量子查询有点类似于外连接,当使用到外连接时我们可以灵活的将其转化为标量子查询。

SQL> create table t1 as select * from all_users;

Table created.

SQL> create table t2 as select * from all_objects;

Table created.

SQL> select a.object_id,(select b.username from t1 b where

a.owner=b.username) f

rom t2 a;

49812 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1364172329

--------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

|

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 53276 | 1560K| 152 (1)| 00:00:02

|

|* 1 | TABLE ACCESS FULL| T1 | 1 | 17 | 2 (0)| 00:00:01

|

| 2 | TABLE ACCESS FULL| T2 | 53276 | 1560K| 152 (1)| 00:00:02

|

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("B"."USERNAME"=:B1)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

13 recursive calls

0 db block gets

4282 consistent gets

0 physical reads

0 redo size

1176699 bytes sent via SQL*Net to client

37012 bytes received via SQL*Net from client

3322 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

49812 rows processed

标量子查询其实还是一个子查询,那么它究竟是如何查询的:首先走的是外部的查询,比如上一个sql语句执行计划,先全表扫描的T2

a,然后取T2 a表的每一行数据就去和T1

b去过滤,过滤条件是a.owner=b.username,如果符合则返回子查询的值,如果不符合则用null补充。当然这个时候还有个类似的filter去重的运算,对于t2

a中重复的数据行不用再去和t1 b去过滤。

而上面这个标量子查询的sql语句其实是等价于下面外连接sql语句的:

SQL> select a.object_id,b.username from t2 a,t1 b

2 where a.owner=b.username(+) ;

49812 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 535089106

------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

|

------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 53276 | 2445K| 155 (2)| 00:00:02

|

|* 1 | HASH JOIN RIGHT OUTER| | 53276 | 2445K| 155 (2)|

00:00:02 |

| 2 | TABLE ACCESS FULL | T1 | 23 | 391 | 2 (0)| 00:00:01

|

| 3 | TABLE ACCESS FULL | T2 | 53276 | 1560K| 152 (1)|

00:00:02 |

------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("A"."OWNER"="B"."USERNAME"(+))

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

9 recursive calls

0 db block gets

4043 consistent gets

0 physical reads

0 redo size

1176659 bytes sent via SQL*Net to client

37012 bytes received via SQL*Net from client

3322 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

49812 rows processed

而如果标量子查询中如果主查询的一行对应子查询返回有多个值,这个是不允许的

SQL> select a.username,b.object_id from t1 a,t2 b

2 where a.username=b.owner(+);

29742 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1823443478

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

|

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 53276 | 2445K| 155 (2)| 00:00:02

|

|* 1 | HASH JOIN OUTER | | 53276 | 2445K| 155 (2)| 00:00:02

|

| 2 | TABLE ACCESS FULL| T1 | 23 | 391 | 2 (0)| 00:00:01

|

| 3 | TABLE ACCESS FULL| T2 | 53276 | 1560K| 152 (1)| 00:00:02

|

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("A"."USERNAME"="B"."OWNER"(+))

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

7 recursive calls

0 db block gets

2726 consistent gets

0 physical reads

0 redo size

654542 bytes sent via SQL*Net to client

22294 bytes received via SQL*Net from client

1984 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

29742 rows processed

SQL> select a.username,(select b.object_id from t2 b where

a.username=b.owner) f

rom t1 a;

select a.username,(select b.object_id from t2 b where

a.username=b.owner) from t

1 a

*

ERROR at line 1:

ORA-01427: single-row subquery returns more than one row

这里由于a.username=b.owner,其中b.owner有多个相同的值,所以这里返回的b.object_id可能有多个值,这里就出现上述的ora-01427错误。

标量子查询中也可以有聚合函数的出现:

SQL> set autotrace traceonly;

SQL> select a.username,max(b.object_id) from t1 a,t2

b

2 where a.username=b.owner(+)

3 group by a.username;

23 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 577572187

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

|

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 53276 | 2445K| 159 (4)| 00:00:02

|

| 1 | HASH GROUP BY | | 53276 | 2445K| 159 (4)| 00:00:02

|

|* 2 | HASH JOIN OUTER | | 53276 | 2445K| 155 (2)| 00:00:02

|

| 3 | TABLE ACCESS FULL| T1 | 23 | 391 | 2 (0)| 00:00:01

|

| 4 | TABLE ACCESS FULL| T2 | 53276 | 1560K| 152 (1)| 00:00:02

|

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("A"."USERNAME"="B"."OWNER"(+))

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

92 recursive calls

0 db block gets

777 consistent gets

685 physical reads

0 redo size

1169 bytes sent via SQL*Net to client

503 bytes received via SQL*Net from client

3 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

23 rows processed

SQL> select a.username,(select max(b.object_id) from t2 b

where b.owner=a.userna

me) from t1 a;

23 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 367820

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time

|

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 23 | 391 | 2 (0)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 30 | | |

|* 2 | TABLE ACCESS FULL| T2 | 533 | 15990 | 152 (1)| 00:00:02

|

| 3 | TABLE ACCESS FULL | T1 | 23 | 391 | 2 (0)| 00:00:01

|

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("B"."OWNER"=:B1)

Note

-----

- dynamic sampling used for this statement

Statistics

----------------------------------------------------------

9 recursive calls

0 db block gets

15929 consistent gets

0 physical reads

0 redo size

1206 bytes sent via SQL*Net to client

503 bytes received via SQL*Net from client

3 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

23 rows processed

但是我们注意到上述标量子查询却存在一个问题,就是无法将子查询展开为表连接,换句话说无法采用灵活的hash join

outer的关联方式。

关于标量子查询和表关联的性能简介:

如果主查询返回的数据较多,而子查询中又没有高效的索引,关联列对应的主查询表又没有较多的重复值,那么这个标量子查询的执行成本是很大的,如上面的标量子查询和外连接的sql语句中可以看出外连接IO成本要明显小于标量子查询。

但是标量子查询oracle内部确是有优化的,优化器cache了中间的结果,如果结果集不大,子查询中又有高效的索引,那么这个标量子查询可能会比常规的表关联更加高效。

小鱼列出几种常会涉及到的标量子查询和表连接的sql改写:

1 最简单的标量子查询

table :a(a1,a2),b(a1,b2)

select a2,(select b2 from b where b.a1=a.a1) from a

表连接:

select a2,b2 from a,b where a.a1=b.a1(+);

2 子查询中限制返回一行数据

table :a(a1,a2),b(a1,b2)

select a2,(select b2 from b where b.a1=a.a1 and rownum=1) from

a

表连接:

SELECT a2, c.b2

FROM a,

(SELECT b2, ROW_NUMBER () OVER (PARTITION BY a1 ORDER BY a1)

cn

FROM b

WHERE cn = 1) c

WHERE a.a1 = c.b2(+);

3 子查询带有聚合函数

table :a(a1,a2),b(a1,b2)

select a2,(select sum(b2) from b where b.a1 = a.a1) from

a

表连接1:

SELECT a2, bb.sum_value

FROM a,

( SELECT SUM (b2) sum_value, a1

FROM b

GROUP BY a1) bb

WHERE a.a1 = b.a1(+);

表连接2:

SELECT a2, SUM (b2)

FROM a, b

WHERE a.a1 = b.a1(+)

GROUP BY a2;

4 查询中包括好几个标量子查询

table :a(a1,a2),b(a1,b2),c(a1,b2,c2)

SELECT a.a2,

(SELECT c2

FROM b, c

WHERE b.a1 = a.a1 AND b.b2 = c.b2 AND ROWNUM = 1),

(SELECT b2

FROM b

WHERE b.a1 = a.a1 AND ROWNUM = 1),

(SELECT c2

FROM c

WHERE c.a1 = a.a1 AND ROWNUM = 1)

FROM a

表连接:

SELECT a.a2,

bb.c2,

cc.b2,

dd.c2

FROM (SELECT c2,

b.a1,

ROW_NUMBER () OVER (PARTITION BY b.a1 ORDER BY b.a1) cnt

FROM b, c

WHERE b.b2 = c.b2 AND cnt = 1) bb,

(SELECT b2, ROW_NUMBER () OVER (PARTITION BY a1 ORDER BY a1)

cnt

FROM b

WHERE cnt = 1) cc,

(SELECT c2, ROW_NUMBER () OVER (PARTITION BY a1 ORDER BY a1)

cnt

FROM c

WHERE cnt = 1) dd,

a

WHERE a.a1 = bb.a1(+) AND a.a1 = cc.a1(+) AND a.a1 =

dd.a1(+);

SELECT a2,

(SELECT SUM (c2)

FROM b, c

WHERE b.a1 = a.a1 AND b.b2 = c.b2),

(SELECT SUM (b2)

FROM b

WHERE b.a1 = a.a1),

(SELECT SUM (c2)

FROM c

WHERE c.a1 = a.a1)

FROM a

表连接:

SELECT a2,

bb.sum1,

cc.sum2,

dd.sum3

FROM ( SELECT SUM (c2) sum1, b.a1

FROM b, c

WHERE b.b2 = c.b2

GROUP BY b.a1) bb,

( SELECT SUM (b2) sum2, a1

FROM b

GROUP BY a1) cc,

( SELECT SUM (c2) sum3, a1

FROM c

GROUP BY a1) dd,

a

WHERE a.a1 = bb.a1(+) AND a.a1 = cc.a1(+) AND a.a1 =

dd.a1(+);

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
对于 Oracle 标量子查询中的 `LISTAGG` 函数,可以考虑以下几种优化方法: 1. 确保索引的使用:确保子查询中涉及的表上有适当的索引,以加快数据检索速度。如果子查询中使用了过滤条件,可以考虑在相关列上创建索引。 2. 限制结果集大小:如果子查询返回的结果集很大,可以考虑在子查询中添加适当的过滤条件,以限制结果集的大小。这可以减少 `LISTAGG` 函数处理的数据量,提高性能。 3. 使用内联视图:将子查询转换为内联视图(Inline View),可以避免多次执行子查询,从而提高性能。内联视图可以通过 WITH 子句或者嵌套查询的方式实现。 4. 考虑使用分析函数:如果子查询中需要对数据进行排序、分组或其他复杂的操作,可以考虑使用分析函数(Analytic Functions)来替代 `LISTAGG` 函数。分析函数通常比标量子查询更高效。 5. 优化查询计划:使用合适的查询提示(Query Hints)或者调整查询语句的结构,以促使 Oracle 生成更优化的查询计划。可以通过 EXPLAIN PLAN、SQL Trace 或者 SQL Monitoring 等工具来分析和优化查询计划。 6. 考虑使用其他技术:如果标量子查询性能问题仍然存在,可以考虑使用其他技术来实现相同的功能,例如使用连接查询、临时表或者其他编程方式。 请注意,优化标量子查询涉及到具体的数据模型、查询语句和数据库配置等因素,因此需要根据具体情况进行分析和调整。建议在进行任何优化操作之前,先进行性能测试和评估,以确保优化策略的有效性。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值