数据库性能问题：当表字段重复记录过多，索引可能失去意义

最新推荐文章于 2024-06-02 01:58:49 发布

匿_名_用_户

最新推荐文章于 2024-06-02 01:58:49 发布

阅读量3.8k

点赞数

分类专栏： DB2 文章标签： DB2 索引性能

本文链接：https://blog.csdn.net/qingsong3333/article/details/84947005

版权

DB2 专栏收录该内容

181 篇文章 9 订阅

订阅专栏

今天遇到一个DB2的问题：两条同样的SQL语句，只是where条件指定的字段的值不一样，结果访问计划中，一条SQL走了索引，另一条没有，类似如下：

SQL1： select id, name from t1 where id = 3000 and .... ->使用了索引

SQL2：select id, name from t1 where id = 10001 and ....->没有使用索引

于是研究了一下，发现原因是id = 10001的记录太多，导致DB2认为走索引并非最优执行计划。重现问题的过程如下：

$ for i in {1..10000}; do echo "$i,'aaa'" >> 1.txt; done
$ for i in {1..10000}; do echo "10001,'bbb'" >> 1.txt; done
$ db2 "create table t1(id int, name char(20))"
$ db2 "load from 1.txt of del insert into t1 nonrecoverable"
$ db2 "create index idx1 on t1(id)"
$ db2 "runstats on table t1 and indexes all"
$ db2expln -d dba -g -statement "select id, name from t1 where id = 3000" -terminal

..<skip>..

Statement:
  
  select id, name 
  from t1 
  where id =3000


Section Code Page = 1208

Estimated Cost = 13.542770
Estimated Cardinality = 1.000000

Access Table Name = INST105.T1  ID = 2,4
|  Index Scan:  Name = INST105.IDX1  ID = 1
|  |  Regular Index (Not Clustered)
|  |  Index Columns:
|  |  |  1: ID (Ascending)
|  #Columns = 1
|  Skip Inserted Rows
|  Avoid Locking Committed Data
|  Currently Committed for Cursor Stability
|  Evaluate Predicates Before Locking for Key
|  #Key Columns = 1
|  |  Start Key: Inclusive Value
|  |  |  1: 3000 
|  |  Stop Key: Inclusive Value
|  |  |  1: 3000 
|  Data Prefetch: Sequential(1), Readahead
|  Index Prefetch: Sequential(1), Readahead
|  Lock Intents
|  |  Table: Intent Share
|  |  Row  : Next Key Share
|  Sargable Predicate(s)
|  |  Return Data to Application
|  |  |  #Columns = 2
Return Data Completion

End of section


Optimizer Plan:

        Rows   
      Operator 
        (ID)   
        Cost   
              
         1    
      RETURN  
       ( 1)   
      13.5428 
        |     
         1    
       FETCH  
       ( 2)   
      13.5428 
     /       \
    1      20000  
 IXSCAN   Table:  
  ( 3)    INST105 
 6.77555  T1      
   |     
  10001  
 Index:  
 INST105 
 IDX1    
 
$ db2expln -d dba -g -statement "select id, name from t1 where id = 10001" -terminal

..<skip>..

Statement:
  
  select id, name 
  from t1 
  where id =10001


Section Code Page = 1208

Estimated Cost = 168.612518
Estimated Cardinality = 10000.000000

Access Table Name = INST105.T1  ID = 2,4
|  #Columns = 1
|  Skip Inserted Rows
|  Avoid Locking Committed Data
|  Currently Committed for Cursor Stability
|  May participate in Scan Sharing structures
|  Scan may start anywhere and wrap, for completion
|  Fast scan, for purposes of scan sharing management
|  Scan can be throttled in scan sharing management
|  Relation Scan
|  |  Prefetch: Eligible
|  Lock Intents
|  |  Table: Intent Share
|  |  Row  : Next Key Share
|  Sargable Predicate(s)
|  |  #Predicates = 1
|  |  Return Data to Application
|  |  |  #Columns = 2
Return Data Completion

End of section


Optimizer Plan:

   Rows   
 Operator 
   (ID)   
   Cost   
         
  10000  
 RETURN  
  ( 1)   
 168.613 
   |     
  10000  
 TBSCAN  
  ( 2)   
 168.613 
   |     
  20000  
 Table:  
 INST105 
 T1

测试环境为LINUX，建了一张表，共20000条记录，前10000条记录的ID字段是不同的。后面10000条记录ID字段都为10001，所以当where条件为ID=10001时，有一半的记录都符合条件。这时候DB2认为索引效率可能比全表扫要低，就走了表扫。

匿_名_用_户

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
数据库性能问题：当表字段重复记录过多，索引可能失去意义

今天遇到一个DB2的问题：两条同样的SQL语句，只是where条件指定的字段的值不一样，结果访问计划中，一条SQL走了索引，另一条没有，类似如下：SQL1： select id, name from t1 where id = 3000 and .... -&gt;使用了索引SQL2：select id, name from t1 where id = 10001 and ....-&g...
复制链接

扫一扫

专栏目录