NULL 值与索引(一)

最新推荐文章于 2023-11-06 11:14:05 发布

anjichan4261

最新推荐文章于 2023-11-06 11:14:05 发布

阅读量291

点赞数

文章标签：数据库

原文链接：http://www.cnblogs.com/ajuanabc/archive/2012/04/08/2462704.html

版权

    NULL值是关系数据库系统布尔型(true,false,unknown)中比较特殊类型的一种值，通常称为UNKNOWN或空值，即是未知的，不确定的。由于
NULL存在着无数的可能，因此NULL值也不等于NULL值，所以与NULL值相关的操作同样都为NULL值。正是基于这样一个特性，对于NULL值列上的B
树索引导致了is null/is not null不走索引的情形，下面描述了NULL值与索引以及索引NULL列上的执行计划，如何使得NULL值走索引的情形。
注：本文仅仅讨论的是B树索引上的NULL值，位图索引不在此范围之内。

一、null值与索引的关系
scott@ORCL> create table t1(id number,val varchar2(1)); -->为表t1创建唯一索引 scott@ORCL> create unique index i_t1_id on t1(id); scott@ORCL> insert into t1 select null,'Y' from dual; scott@ORCL> insert into t1 select null,'N' from dual; -->从上面的操作可知，尽管列id上存在唯一索引，但由于null值不等于任一null值，因此能够成功插入 scott@ORCL> commit; -->再次为表添加唯一复合索引，即基于id列与val列 scott@ORCL> create unique index i_t1_id_val on t1(id,val); Index created. -->插入null,'N'的记录时失败，提示违反唯一性约束 scott@ORCL> insert into t1 select null,'N' from dual; insert into t1 select null,'N' from dual * ERROR at line 1: ORA-00001: unique constraint (SCOTT.I_T1_ID_VAL) violated -->插入null,'Y'的记录时同样失败，提示违反唯一性约束 scott@ORCL> insert into t1 select null,'Y' from dual; insert into t1 select null,'Y' from dual * ERROR at line 1: ORA-00001: unique constraint (SCOTT.I_T1_ID_VAL) violated -->插入两个null值成功 scott@ORCL> insert into t1 select null,null from dual; 1 row created. scott@ORCL> insert into t1 select null,null from dual; 1 row created. scott@ORCL> insert into t1 select null,'A' from dual; 1 row created. scott@ORCL> commit; Commit complete. scott@ORCL> set null unknown; scott@ORCL> select * from t1; ID VAL ---------- ------------------------------ unknown Y unknown N unknown unknown unknown unknown unknown A scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true); scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys 2 from user_indexes where table_name='T1'; INDEX_NAME INDEX_TYPE BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DISTINCT_KEYS --------------- ---------- ---------- ----------- ---------- -------- ------------- I_T1_ID NORMAL 0 0 0 VALID 0 I_T1_ID_VAL NORMAL 0 1 3 VALID 3 -->从上面的情形可知， -->基于单列的唯一索引，可以多次插入null值，但其索引上并不存储null值。 -->基于多列的复合索引，尽管全为null值的行可以多次插入，但不全为null的重复行则不能被插入(注,非唯一复合索引不存在此限制，此处不演示)。 -->基于多列的复合索引，对于全为null值的索引值也不会被存储。如上面的情形，尽管插入了5条记录，复合索引中只存储了3条。 -->注：对于唯一性约束，null值不等于null值，同样(null,null)也不等同于(null,null)，所以上面的两次null能够被插入。二、null值与执行计划
scott@ORCL> set autot trace exp; scott@ORCL> select * from t1 where id is null; Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 5 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T1 | 5 | 5 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID" IS NULL) -->从上面的测试可知，由于null值是不被存储的，因此当使用id is null作为谓词时，走了全表扫描 scott@ORCL> select * from t1 where id is not null; Execution Plan ---------------------------------------------------------- Plan hash value: 796913935 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 | 0 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 0 (0)| 00:00:01 | |* 2 | INDEX FULL SCAN | I_T1_ID | 1 | | 0 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("ID" IS NOT NULL) -->从上面的测试可知，尽管当前表上id列上的所有值都为null，但不排除后续记录插入的id不为null的列。 -->故当使用id is not null作为谓词时，此时执行计划中走了索引全扫描。 -->下面来看看复合索引的情形 scott@ORCL> select * from t1 where val is null; Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 2 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T1 | 2 | 2 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("VAL" IS NULL) scott@ORCL> select * from t1 where val is not null; Execution Plan ---------------------------------------------------------- Plan hash value: 1931510411 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 3 | 1 (0)| 00:00:01 | |* 1 | INDEX FULL SCAN | I_T1_ID_VAL | 3 | 3 | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("VAL" IS NOT NULL) -->对于复合唯一索引的情形，当使用单列且非前导列谓词时，使用is null与 is not null等同于单列唯一索引的情形。 -->即原理也是一样的，val is null走全表扫描而val is not null走索引。因为null值不会被存储。 -->下面看看两个列都作为谓词的情形 scott@ORCL> select * from t1 where id is null and val is not null; Execution Plan ---------------------------------------------------------- Plan hash value: 1040510552 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 3 | 1 (0)| 00:00:01 | |* 1 | INDEX RANGE SCAN| I_T1_ID_VAL | 3 | 3 | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("ID" IS NULL) filter("VAL" IS NOT NULL) -->从上面的测试可知，尽管两个谓词列上都存在索引，一个为单列唯一索引，一个为复合唯一索引。Oracle 选择了复合索引I_T1_ID_VAL。 scott@ORCL> select * from t1 where id is not null and val is null; Execution Plan ---------------------------------------------------------- Plan hash value: 796913935 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 | 0 (0)| 00:00:01 | |* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 0 (0)| 00:00:01 | |* 2 | INDEX FULL SCAN | I_T1_ID | 1 | | 0 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("VAL" IS NULL) 2 - filter("ID" IS NOT NULL) -->同样的情形，谓词的顺序与复合索引定义的顺序一样，只不过第一个谓词为id is not null，而第二个谓词为val is null。 -->此时Oracle 选择了单列唯一索引I_T1_ID -->看到此，不知道大家是否已明白，即哪个列为is not null，则会使用该列上的索引，原因还是那句话，索引不存储null值。 -->对于颠倒id列与val列以及id,val列为null或not null的其他不同组合情形不再演示，其执行计划类似。三、使用is null走索引的情形
scott@ORCL> set autot off; --删除原有表上的null值记录 scott@ORCL> delete from t1 where val not in('Y','N') or val is null; 3 rows deleted. scott@ORCL> update t1 set id=1 where val='Y'; 1 row updated. scott@ORCL> update t1 set id=2 where val='N'; 1 row updated. scott@ORCL> commit; Commit complete. -->对原有记录更新后的情形 scott@ORCL> select * from t1; ID VAL ---------- ------------------------------ 1 Y 2 N scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true); PL/SQL procedure successfully completed. -->修改表列id使之具有not null约束的特性 scott@ORCL> alter table t1 modify(id not null); Table altered. scott@ORCL> set autot trace exp; scott@ORCL> select * from t1 where id is null; Execution Plan ---------------------------------------------------------- Plan hash value: 3160894736 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 5 | 0 (0)| | |* 1 | FILTER | | | | | | | 2 | INDEX FULL SCAN| I_T1_ID_VAL | 2 | 10 | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(NULL IS NOT NULL) -->从上面的执行计划中可知，当表t1列id上具有not null 约束时，此时使用id is null选择了索引范围扫描 -->下面来看看列val is null 的情形 scott@ORCL> select * from t1 where val is null; Execution Plan ---------------------------------------------------------- Plan hash value: 48744011 ------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 5 | 2 (0)| 00:00:01 | |* 1 | INDEX FAST FULL SCAN| I_T1_ID_VAL | 1 | 5 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("VAL" IS NULL) -->尽管val列上允许null值存在，但由于列id上具有not null 约束，且id列与val列存在复合唯一索引，因此此时选择了索引快速全扫描 -->其余不同组合情形大致相同，不再演示 -->为表t1新增一条val为null的记录 scott@ORCL> insert into t1 select 3,null from dual; 1 row created. scott@ORCL> commit; Commit complete. scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true); PL/SQL procedure successfully completed. -->下面的查询中可以看出尽管只有列id有not null约束，当所有的索引值都被存储 scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys 2 from user_indexes where table_name='T1'; INDEX_NAME INDEX_TYPE BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DISTINCT_KEYS --------------- ---------- ---------- ----------- ---------- -------- ------------- I_T1_ID NORMAL 0 1 3 VALID 3 I_T1_ID_VAL NORMAL 0 1 3 VALID 3 -->Author : Robinson Cheng -->Blog : http://blog.csdn.net/robinson_0612 四、总结
   无论是单列唯一索引或复合唯一索引，对于可以为null的列或复合null值，Oracle不会为其存储索引值。
   故在基于单列创建B树唯一索引或多列创建B树复合唯一索引的情形下，
   当列上允许为null值时
       where子句使用了基于is null的情形，其执行计划走全表扫描。
       where子句使用了基于is not null的情形，其执行计划走索引扫描(索引范围扫描或索引全扫描)。
   当列上不允许为null值时，存在非null约束
       where子句使用了基于is null的情行，其执行计划走索引扫描。
       where子句使用了基于is not null的情形，其执行计划也是走索引扫描。
   注：此在Oracle 10g R2(linux)下的情形，不同的优化器版本可能会有偏差。

五、更多参考

dbms_xplan之display_cursor函数的使用

dbms_xplan之display函数的使用

执行计划中各字段各模块描述

使用 EXPLAIN PLAN 获取SQL语句执行计划

转载于:https://www.cnblogs.com/ajuanabc/archive/2012/04/08/2462704.html

anjichan4261

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
NULL 值与索引(一)

NULL值是关系数据库系统布尔型(true,false,unknown)中比较特殊类型的一种值，通常称为UNKNOWN或空值，即是未知的，不确定的。由于NULL存在着无数的可能，因此NULL值也不等于NULL值，所以与NULL值相关的操作同样都为NULL值。正是基于这样一个特性，对于NULL值列上的B树索引导致了is null/is not null不走索引的情形，下面描述了NU...
复制链接

扫一扫