After successful upgrade of one of the busiest 9i database to 10g, we have received a very first
change of behavior from the users.
Users complained that after upgrading the database, some of the queries which were returning results in an order are now not returning the data in any order now.
The queries have GROUP BY clause, and after some research, we found out that
starting with 10g, Oracle uses the new HASH GROUP BY mechanism which will not guarantee the result in any order unless ORDER BY clause is specified.
HASH GROUP BY algorithm is explained in the following URL (looks old stuff)
http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0901/en/html/dbugen9/00000428.htm
In a nutshell, in Oracle 9i, GROUP BY clause gives result in sort order, this behavior has
changed from 10g onwards.
The general and Oracle recommendation is, to add the ORDER BY clause in order to have 100% guaranteed order by result.
Following is the test which I have carried out on 9i and 10g database (UNIX PLATFORMS):
>>on 9i database
select owner,count(1) from dba_segments group by owner;
OWNER COUNT(1)
------------------------------ ----------
BARABC 42
BASABC 10
BIPABC 82
CPABC 23
DMIABC 40
DMI_TRGTABC 247
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 VIEW OF 'DBA_OBJECTS'
3 2 UNION-ALL
4 3 FILTER
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'OBJ$'
>>on 10g database
OWNER COUNT(1)
------------------------------ ----------
SYSTEM 353
CISABCD 46
OEABCD 35
SCABCD 71
-----------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU)
-----------------------------------------------------------------------------------
0 SELECT STATEMENT 18020 299K 81 (7)
1 HASH GROUP BY 18020 299K 81 (7)
2 VIEW DBA_OBJECTS 18020 299K 79 (4)
As you can see, 9i query returns the result in the ascending order, where as 10g, didn't.
If the ORDER BY clause is added on 10g database, the explain plan looks like below, an extra sorting is being performaned:
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (ORDER BY)
2 1 SORT (GROUP BY)
3 2 VIEW OF 'DBA_OBJECTS'
The default behavior of GROUP BY clause in 9i, provides an internal sorting when the GROUP BY clause is used, avoiding an extra sorting, which in my opinion could improve the performance of the query.
I don't know the internal behavior of HASH GROUP BY algorithm, but, this will be useful when
you don't want data in any order after the GROUP BY clause.
After further investigations, I have found out that HASH GROUP BY has few bugs which
leads to an inaccurate result. As per Oracle, the bug it is resolved in 11g version.
Metalink : Bug 4604970 WRONG RESULTS WITH 'HASH GROUP BY 'AGGREGATION ENABLED .
Alex had discussed about this behavior at his blog:
http://www.oracloid.com/2006/05/hash-group-by-can-give-wrong-result-in-oracle-102/
Disable HASH GROUP SORT in 10g:
Oracle recommend setting the _gby_hash_aggregation_enabled parameter = false OR
optimizer_features_enabled=9.2.0 to get rid of HASH GROUP BY mechanism.
alter session set "_gby_hash_aggregation_enabled" = FALSE;
Id Operation Name Rows Bytes Cost (%CPU)
---------------------------------------------------------------------------------
0 SELECT STATEMENT 2181 37077 997 (2)
1 SORT GROUP BY 2181 37077 997 (2)
2 VIEW SYS_DBA_SEGS 2181 37077 996 (1)
Julian Dyke has a brief explanation note and a example at his site:
http://julian.dyke.users.btopenworld.com/com/Optimisation/Operations/HashGroupBy.html
My colleague did a small test on 9.2.0.7 database where he found out an interesting point that the ordering of the result is different when ORDER BY clause is not specified with the GROUP BY.
drop table test;
create table test(a varchar2(30));
insert into test values ('ahmed');
insert into test values ('zubair');
insert into test values ('11');
insert into test values ('-11');
insert into test values ('Zubair');
insert into test values ('test');
commit;
SQL> set autotrace on exp
SQL> select a, count(*) from test group by a;
A COUNT(*)
------------------------------ ----------
-11 1
11 1
Zubair 1
ahmed 1
test 1
zubair 1
6 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 TABLE ACCESS (FULL) OF 'TEST'
SQL> select a, count(*) from test group by a order by a;
A COUNT(*)
------------------------------ ----------
-11 1
11 1
ahmed 1
test 1
Zubair 1
zubair 1
Please note that the Schema names are modified to show the test case.
References:
Refer Metalink Note : 345048.1 - 'Group By' Doesn't not Sort If you don't use order by in 10g.
http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1251893,00.html
Update : (14 July)
I was reading the Oracle University Introduction to SQL course PPTs and in the 'Aggregate Data using The Group Functions' it did mention that the rows are sorted by ascending order of the columns included in the GROUP BY list.
I personally think that was a great idea, it indeed avoid an extra sorting (comes using by ORDER BY clause.
Happy Reading,
Jaffar
原帖地址:http://jaffardba.blogspot.com/2007/07/change-behavior-of-group-by-clause-in.html