前面说完了索引与count(*)的关系,现在来讨论另外一种聚合查询max(),min()与索引的关系,大家觉得这个聚合查询能用的到索引吗? 通过上一小节的学习后,可能有人会回答:“可以用得上,但是索引列必须要建主键或者要写where column is not null就可以用到了。”对于这样的回答应该值得肯定,非常正确!看来前面没白讲了。不过用上了什么样的索引扫描方式呢?上一小节的方式是INDEX FULL SCAN,大家一定有印象,现在如果是要让max()和min()利用上索引,也是走这个INDEX FULL SCAN扫描方式吗? 大家想一想索引的结构是什么样的?索引结构是从root到branch最后到leaf,好象一个金字塔。最下面的叶子层(也就是金字塔的底部)其实是有序的,比如从左到右值是从小到大,或者从大到小。这样一来大家认为取max()或者 min()还需要INDEX FULL SCAN吗,找到头或尾不就找到最大或最小值,还需要遍历leaf吗? 于是ORACLE的另一种索引扫描类型就横空出世了index full scan(max/min)。多了(max/min)的关键字!index full scan(max/min)蕴含着stopkey的机制,从最左边或者最右边的叶子节点开始扫描,读到第一个值后就停止扫描。 查看max()的查询,发现果然是走 INDEX FULL SCAN (MIN/MAX) SQL> explain plan for select max(object_id) from ljb_test where object_id is not null; Explained SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------- Plan hash value: 613051030 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0) | 1 | SORT AGGREGATE | | 1 | 13 | | 2 | FIRST ROW | | 49190 | 624K| 2 (0) |* 3 | INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 | 624K| 2 (0) ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("OBJECT_ID" IS NOT NULL) Note - dynamic sampling used for this statement 19 rows selected 查看min()的查询,发现也走了INDEX FULL SCAN (MIN/MAX) SQL> explain plan for select min(object_id) from ljb_test where object_id is not null; Explained SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------- Plan hash value: 613051030 ----------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) ----------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0) | 1 | SORT AGGREGATE | | 1 | 13 | | 2 | FIRST ROW | | 49190 | 624K| 2 (0) |* 3 | INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 | 624K| 2 (0) ----------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("OBJECT_ID" IS NOT NULL) Note - dynamic sampling used for this statement 19 rows selected 到此大家应该完全明白了max()和min()的时候,执行计划中会走INDEX FULL SCAN (MIN/MAX)的原因了吧,在获取正确的信息后,ORACLE对此类查询自然就会选择这样的扫描方式,希望大家能理解其中选择这样方式扫描的原理!也许有人说,知道这个也没用,ORACLE自己就会选怎么走索引吧,这个NDEX FULL SCAN (MIN/MAX)的知识点知道也没意义。其实我认为,多理解点东西总是有用的,尤其是原理性方面,比如我现在再问这样一个问题:select min(object_id),max(object_id) fromljb_test where object_id is not null 这个语句ORACLE怎么处理?大家怎么回答? 让我实验一下吧(很多人猜还是INDEX FULL SCAN (MIN/MAX)): 下面执行结果出来了,走的索引扫描类型是INDEX FULL SCAN,看不到(MIN/MAX)的关键字,咋回事? SQL> explain plan for select min(object_id),max(object_id) from ljb_test where object_id is not null;
Explained SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------- Plan hash value: 1341606234 ------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time ------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 61 (4)| 00:0 | 1 | SORT AGGREGATE | | 1 | 13 | | |* 2 | INDEX FAST FULL SCAN| IDX_LJB_TEST | 49190 | 624K| 61 (4)| 00:0 ------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OBJECT_ID" IS NOT NULL) Note ----- - dynamic sampling used for this statement 18 rows selected 原来这样的SQL语句是表示ORACLE要利用该索引同时取到这两个值,INDEX FULL SCAN (MIN/MAX)是无法一次取到两个值的,所以ORACLE不得不选择了INDEX FULL SCAN ,把叶子的索引扫了个遍,同时取到了两个值。 明白了原理,处理起问题就简单了,改写代码如下: select (select max(object_id) from test1) c, (select min(object_id) from test1) b from dual;现在终于走了INDEX FULL SCAN (MIN/MAX)索引了,大家看到这个INDEX FULL SCAN (MIN/MAX)威力还是非常大的,走了两次INDEX FULL SCAN (MIN/MAX),居然代价才4,远远低于一次INDEX FULL SCAN的代价61 SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 3189180828 ----------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 1 | 26 | 4 (0)| 00:00:01 | | 2 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 | | 3 | SORT AGGREGATE | | 1 | 13 | | | | 4 | FIRST ROW | | 49190 | 624K| 2 (0)| 00:00:01 | |* 5 | INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 | 624K| 2 (0)| 00:00:01 | | 6 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 | | 7 | SORT AGGREGATE | | 1 | 13 | | | | 8 | FIRST ROW | | 49190 | 624K| 2 (0)| 00:00:01 | |* 9 | INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 | 624K| 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - filter("OBJECT_ID" IS NOT NULL) 9 - filter("OBJECT_ID" IS NOT NULL) PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------- Note - dynamic sampling used for this statement 总结:max() 和 min() 是大家常用的使用频率很高的sql写法,计费项目各种报表中需要这样编写的地方比比皆是!希望大家能对这样的查询建立索引,在保证该列不空的情况下,就有可能利用到INDEX FULL SCAN (MIN/MAX)这个索引扫描方式,能为查询性能带来很大的提高,另外只要善于思考,还可以通过改写SQL的方式,将原本利用不到INDEX FULL SCAN (MIN/MAX)查询方式的语句select min(object_id),max(object_id) from ljb_test where object_id is not null改造后,利用上INDEX FULL SCAN (MIN/MAX)。希望这个能启发开发人员多利用现有的SQL知识,编写出高效的SQL语句。 引申联想:大家记得前面我有提到index full scan(max/min)蕴含着stopkey的机制,有优化基础的朋友一定认识这个stopkey,经常在分页查询的执行计划中,看到有这样的关键字,基本上可以认为这个查询的执行计划是正确的。 比如select * from (select * from table where id= order by name desc) where rownum<11;这样的语句具体的意思就是id为某个值的时候,根据name做排序,然后取前10行.这个语句存在2个部分:id为某个值,name降序。假设我现在存在这一个索引(id,name desc)这个索引的结构也是id相同的情况下按照name的降序排列,这个索引同时满足前面的两个条件,因此就能提高速度,只要从索引中读取出10个rowid,然后根据这10个rowid来回表,这时候速度肯定很快的,因此类似这类的分页语句可以根据sql语句的原意来创建索引,就能提高速度,但是如果where条件里出现非等于的条件,那么不管怎么创建索引都无法满足前面的2个条件(根据索引的结构就很容易明白这点),就必须根据字段的选择性来创建合适的索引. |
索引易错点:索引与max(),min()
最新推荐文章于 2023-04-15 16:28:22 发布