Oracle 10gR2 group by 不排序问题

 

如何解决10gR2的hash group by的bug

[ 收藏此页] [ 打印]
作者: ITPUB xzh2000  2006-11-12
<script>init_Nav();</script>

    【IT168 技术文档】在 10gR2中,group by由以前的sort group by改成了hash group by,这种算法上的改进,取消了sort group by必须进行的排序操作,即然是用hash算法,就存在碰撞的可能性,itpub的godlessme就碰到这样的问题,应该算是bug吧。

    下面给大家演示一下如何解决这种问题,其实要解决hash group by引起的排序不准确的问题,就是还用以前的sort group by就可以啦,10gR2中引入_gby_hash_aggregation_enabled隐藏参数,该参数默认设置为true,将它改成false即可。

    SQL> select status,count(*) from tmp_object group by status;
    STATUS COUNT(*)
    ---- -----
    INVALID 29
    VALID 10236

    Execution Plan
    -----------------------------
    Plan hash value: 3490974944
    -------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    -------------------------------------
    | 0 | SELECT STATEMENT | | 2 | 12 | 35 (6)| 00:00:01 |
    | 1 | HASH GROUP BY | | 2 | 12 | 35 (6)| 00:00:01 |
    | 2 | TABLE ACCESS FULL| TMP_OBJECT | 10265 | 61590 | 33 (0)| 00:00:01 |
    -------------------------------------
    Statistics
    -----------------------------
    24 recursive calls
    0 db block gets
    136 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    385 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    2 rows processed

    SQL> col ksppinm format a39
    SQL> col ksppstvl format a39
    SQL> select ksppinm, ksppstvl
    2 from x$ksppi pi, x$ksppcv cv
    3 where cv.indx=pi.indx and pi.ksppinm like '_%' escape ''
    4 and pi.ksppinm like '%&parameter%';
    Enter value for parameter: gby
    old 4: and pi.ksppinm like '%&parameter%'
    new 4: and pi.ksppinm like '%gby%'

    KSPPINM KSPPSTVL
    -------------------- ------------
    _gby_onekey_enabled TRUE
    _gby_hash_aggregation_enabled TRUE

    SQL> alter session set "_gby_hash_aggregation_enabled"=false;
    Session altered.
    SQL> select status,count(*) from tmp_object group by status;
    STATUS COUNT(*)
    ---- -----
    INVALID 29
    VALID 10312

    Execution Plan
    -----------------------------
    Plan hash value: 1360369603
    -------------------------------------
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    -------------------------------------
    | 0 | SELECT STATEMENT | | 10860 | 54300 | 32 (7)| 00:00:01 |
    | 1 | SORT GROUP BY | | 10860 | 54300 | 32 (7)| 00:00:01 |
    | 2 | TABLE ACCESS FULL| TMP_OBJECT | 10860 | 54300 | 30 (0)| 00:00:01 |
    -------------------------------------
    Statistics
    -----------------------------
    0 recursive calls
    0 db block gets
    134 consistent gets
    0 physical reads
    0 redo size
    522 bytes sent via SQL*Net to client
    385 bytes received via SQL*Net from client
    2 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
    2 rows processed

 

原帖地址:http://tech.it168.com/db/o/2006-11-12/200611122129197.shtml

 

Change behavior of GROUP BY clause in Oracle 10g.

After successful upgrade of one of the busiest 9i database to 10g, we have received a very first
change of behavior from the users.

Users complained that after upgrading the database, some of the queries which were returning results in an order are now not returning the data in any order now.

The queries have GROUP BY clause, and after some research, we found out that
starting with 10g, Oracle uses the new HASH GROUP BY mechanism which will not guarantee the result in any order unless ORDER BY clause is specified.

HASH GROUP BY algorithm is explained in the following URL (looks old stuff)
http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0901/en/html/dbugen9/00000428.htm

In a nutshell, in Oracle 9i, GROUP BY clause gives result in sort order, this behavior has
changed from 10g onwards.

The general and Oracle recommendation is, to add the ORDER BY clause in order to have 100% guaranteed order by result.

Following is the test which I have carried out on 9i and 10g database (UNIX PLATFORMS):

>>on 9i database

select owner,count(1) from dba_segments group by owner;

OWNER COUNT(1)
------------------------------ ----------
BARABC 42
BASABC 10
BIPABC 82
CPABC 23
DMIABC 40
DMI_TRGTABC 247


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 VIEW OF 'DBA_OBJECTS'
3 2 UNION-ALL
4 3 FILTER
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'OBJ$'



>>on 10g database

OWNER COUNT(1)
------------------------------ ----------
SYSTEM 353
CISABCD 46
OEABCD 35
SCABCD 71

-----------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU)
-----------------------------------------------------------------------------------
0 SELECT STATEMENT 18020 299K 81 (7)
1 HASH GROUP BY 18020 299K 81 (7)
2 VIEW DBA_OBJECTS 18020 299K 79 (4)


As you can see, 9i query returns the result in the ascending order, where as 10g, didn't.

If the ORDER BY clause is added on 10g database, the explain plan looks like below, an extra sorting is being performaned:

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (ORDER BY)
2 1 SORT (GROUP BY)
3 2 VIEW OF 'DBA_OBJECTS'

The default behavior of GROUP BY clause in 9i, provides an internal sorting when the GROUP BY clause is used, avoiding an extra sorting, which in my opinion could improve the performance of the query.

I don't know the internal behavior of HASH GROUP BY algorithm, but, this will be useful when
you don't want data in any order after the GROUP BY clause.

After further investigations, I have found out that HASH GROUP BY has few bugs which
leads to an inaccurate result. As per Oracle, the bug it is resolved in 11g version.

Metalink : Bug 4604970 WRONG RESULTS WITH 'HASH GROUP BY 'AGGREGATION ENABLED .

Alex had discussed about this behavior at his blog:
http://www.oracloid.com/2006/05/hash-group-by-can-give-wrong-result-in-oracle-102/

Disable HASH GROUP SORT in 10g:

Oracle recommend setting the _gby_hash_aggregation_enabled parameter = false OR
optimizer_features_enabled=9.2.0 to get rid of HASH GROUP BY mechanism.

alter session set "_gby_hash_aggregation_enabled" = FALSE;

Id Operation Name Rows Bytes Cost (%CPU)
---------------------------------------------------------------------------------
0 SELECT STATEMENT 2181 37077 997 (2)
1 SORT GROUP BY 2181 37077 997 (2)
2 VIEW SYS_DBA_SEGS 2181 37077 996 (1)

Julian Dyke has a brief explanation note and a example at his site:

http://julian.dyke.users.btopenworld.com/com/Optimisation/Operations/HashGroupBy.html

My colleague did a small test on 9.2.0.7 database where he found out an interesting point that the ordering of the result is different when ORDER BY clause is not specified with the GROUP BY.

drop table test;
create table test(a varchar2(30));

insert into test values ('ahmed');
insert into test values ('zubair');
insert into test values ('11');
insert into test values ('-11');
insert into test values ('Zubair');
insert into test values ('test');

commit;


SQL> set autotrace on exp
SQL> select a, count(*) from test group by a;

A COUNT(*)
------------------------------ ----------
-11 1
11 1
Zubair 1
ahmed 1
test 1
zubair 1

6 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 TABLE ACCESS (FULL) OF 'TEST'



SQL> select a, count(*) from test group by a order by a;

A COUNT(*)
------------------------------ ----------
-11 1
11 1
ahmed 1
test 1
Zubair 1
zubair 1

Please note that the Schema names are modified to show the test case.

References:

Refer Metalink Note : 345048.1 - 'Group By' Doesn't not Sort If you don't use order by in 10g.
http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1251893,00.html
Update : (14 July)

I was reading the Oracle University Introduction to SQL course PPTs and in the 'Aggregate Data using The Group Functions' it did mention that the rows are sorted by ascending order of the columns included in the GROUP BY list.

I personally think that was a great idea, it indeed avoid an extra sorting (comes using by ORDER BY clause.


Happy Reading,

Jaffar

 

原帖地址:http://jaffardba.blogspot.com/2007/07/change-behavior-of-group-by-clause-in.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值