[PT]Sort Operations

学习Sort Operations的时候希望能看到一些更详细的内容,去google了一把,于是又找到了wanghai的这篇文章,摘录在此。最近摘录了很多别人的文章,但是在阅读这些文章的过程中感觉到自己学到了很多东西,每天都在进步。虽然求职的道路很漫长也很艰难,但是我相信只要我持续地努力,不放下技术,总有成功的一天。

oracle sort 系列一作者:wanghai
链接:http://wzwanghai.spaces.live.com/blog/cns!56626E237AFBD116!156.entry


相信做oracle开发和管理的朋友对sort肯定不会陌生,大家通常都遇到这样那样的排序性能问题,所以我写这一系列关于sort的文章告诉大家在oracle里面sort是怎么一回事以及如果调整sort获得更好的性能。

首先,我们来回顾一下什么情况下需要sort,当取distinct值的时候需要,当进行merge join的时候也需要,当语句包含group by,order by的时候需要,当创建索引的时候需要等等。那么我们来看一下在oracle里面排序分为哪几种方式呢?

一.SORT UNIQUE

sort unique发生在我们需要取distinct值的时候

SQL> desc test
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(18)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
SQL> set autotrace trace exp;
SQL> select distinct owner from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=477 Card=15 Bytes=75
)
1 0 SORT (UNIQUE) (Cost=477 Card=15 Bytes=75)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=967440)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2649 consistent gets
0 physical reads
0 redo size
564 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
15 rows processed


也会发生在in的子查询里

SQL> select owner from test where object_id in(select object_id from test1 where rownum<10);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=530 Card=145116 Byte
s=3047436)
1 0 HASH JOIN (Cost=530 Card=145116 Bytes=3047436)
2 1 VIEW OF 'VW_NSO_1' (Cost=256 Card=9 Bytes=117)
3 2 SORT (UNIQUE)
4 3 COUNT (STOPKEY)
5 4 TABLE ACCESS (FULL) OF 'TEST1' (Cost=256 Card=1934
88 Bytes=580464)
6 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=1547904)



为什么in子查询需要排序呢,因为oracle会把基于in的子查询
"select owner from test where object_id in(select object_id from test1)"
转换为类似
"select a.owner from test a,(select distinct object_id from test1) b where a.object_id=b.object_id"
如果in子查询里面选出的值可以通过主键或unique索引搜索得到,那么将不会进行sort unique的操作。

但自从10g R2开始,sort unique有了一些变化,sort unique变成了hash unique,采用新的hash算法代替了传统的sort unique,据oracle称在某些情况下排序性能是原来的5倍。

注意,sort的统计值将为0
SQL> select distinct owner from test;

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
692 consistent gets
0 physical reads
0 redo size
781 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
19 rows processed

SQL> select distinct owner from test;
Execution Plan
----------------------------------------------------------
Plan hash value: 2203132549
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19 | 95 | 164 (5)| 00:00:02 |
| 1 | HASH UNIQUE | | 19 | 95 | 164 (5)| 00:00:02 |
| 2 | TABLE ACCESS FULL| TEST | 49909 | 243K| 158 (2)| 00:00:02 |
--------------------------------------------------------------------------

SQL> select owner from test where object_id in(select object_id from test1 where rownum<10);
Execution Plan
----------------------------------------------------------
Plan hash value: 2420636264
--------------------------------------------------------------------------------
------------------
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
--------------------------------------------------------------------------------
------------------
| 0 | SELECT STATEMENT | | 5989 | 116K| 284
(2)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 665 | 4655 | 14
(0)| 00:00:01 |
| 2 | NESTED LOOPS | | 5989 | 116K| 284
(2)| 00:00:04 |
| 3 | VIEW | VW_NSO_1 | 9 | 117 | 156
(2)| 00:00:02 |
| 4 | HASH UNIQUE | | 9 | 18 |
| |
|* 5 | COUNT STOPKEY | | | |
| |
| 6 | TABLE ACCESS FULL | TEST1 | 49909 | 99818 | 156
(2)| 00:00:02 |
|* 7 | INDEX RANGE SCAN | IND_TEST_OBJECT_ID | 665 | | 2
(0)| 00:00:01 |

二.SORT AGGREGATE
sort aggregate通常发生在使用一些聚合函数的时候,sum(),avg(),min(),max(),count()等等,实际上sort aggregate不做真正的sort,并不会用到排序空间,而是通过一个全局变量+全表或全索引扫描来实现。伪代码如下
max_so_far = -INFINITE;
while (row* r=get_a_row_from_table(test_for_max)) {
if (r->n > max_so_far) max_so_far=n;
}
return max_so_far



SQL> select count(*) from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488)

SQL> select sum(object_id) from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=580464)

SQL> select min(object_id) from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 INDEX (FULL SCAN (MIN/MAX)) OF 'IND_TEST_OBJECT_ID' (NON
-UNIQUE) (Cost=257 Card=193488 Bytes=580464)

SQL> select max(object_id) from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 INDEX (FULL SCAN (MIN/MAX)) OF 'IND_TEST_OBJECT_ID' (NON
-UNIQUE) (Cost=257 Card=193488 Bytes=580464)

SQL> select avg(object_id) from test;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=580464)

SQL> select count(*) from test;

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2649 consistent gets
0 physical reads
0 redo size
381 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


三.SORT GROUP BY
SORT GROUP BY会发生在有group by子句的时候
SQL> select owner from test group by owner;
15 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=477 Card=15 Bytes=75
)
1 0 SORT (GROUP BY) (Cost=477 Card=15 Bytes=75)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=967440)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2649 consistent gets
0 physical reads
0 redo size
564 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
15 rows processed

在10g R2里又有了变化,sort group by被hash group by所代替
SQL> select owner from test group by owner;
19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1435881708
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19 | 95 | 164 (5)| 00:00:02 |
| 1 | HASH GROUP BY | | 19 | 95 | 164 (5)| 00:00:02 |
| 2 | TABLE ACCESS FULL| TEST | 49909 | 243K| 158 (2)| 00:00:02 |
---------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
692 consistent gets
0 physical reads
0 redo size
781 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
19 rows processed
联想到上面提到的sort unique也是被hash unique所代替,我们可以肯定10g R2用了更多基于hash算法的东西,我们也期待在数据库其他方面10g R2也带给我们更多的好处。

四.SORT ORDER BY
SORT ORDER BY会发生在有order by子句的时候
SQL> select owner from test order by owner;
193488 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=666 Card=193488 Byte
s=967440)
1 0 SORT (ORDER BY) (Cost=666 Card=193488 Bytes=967440)
2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488 Byte
s=967440)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2649 consistent gets
0 physical reads
0 redo size
2528735 bytes sent via SQL*Net to client
142392 bytes received via SQL*Net from client
12901 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
193488 rows processed



五.SORT JOIN
SORT JOIN发生在出现merge join的情况下,两张关联的表要各自做sort,然后再merge
SQL> select /*+ use_merge(a b)*/ a.owner from test a,test1 b where a.object_id=b.object_id and rownum<10;
9 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1354 Card=9 Bytes=99
)
1 0 COUNT (STOPKEY)
2 1 MERGE JOIN (Cost=1354 Card=3119800512 Bytes=34317805632)
3 2 SORT (JOIN) (Cost=737 Card=193488 Bytes=1547904)
4 3 TABLE ACCESS (FULL) OF 'TEST' (Cost=257 Card=193488
Bytes=1547904)
5 2 SORT (JOIN) (Cost=617 Card=193488 Bytes=580464)
6 5 TABLE ACCESS (FULL) OF 'TEST1' (Cost=256 Card=193488
Bytes=580464)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5251 consistent gets
0 physical reads
0 redo size
444 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
9 rows processed

以下为SQL TUNING(OU)中的解释:

Sort operations result when users specify an operation that requires a sort. Commonly encountered operations include the following:

Sort operations result when users specify an operation that requires a sort. Commonly
encountered operations include the following:
• SORT UNIQUE occurs if a user specifies a DISTINCT clause or if an operation requires unique values for the next step.
• SORT AGGREGATE does not actually involve a sort. It is used when aggregates are being computed across the whole set of rows.
• SORT GROUP BY is used when aggregates are being computed for different groups in the data. The sort is required to separate the rows into different groups.
• SORT JOIN happens during a sort-merge join if the rows need to be sorted by the join key.
• SORT ORDER BY is required when the statement specifies an ORDER BY that cannot be satisfied by one of the indexes.

[@more@]

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/8558093/viewspace-1022313/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/8558093/viewspace-1022313/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值