Oracle composite index column ordering

Reposted December 5, 2013, 15:14:26

Question:  I have a SQL statement with multiple columns in my WHERE clause.  I know that Oracle can only choose one index, and I know about multi-column composite indexes, but I do not know how to determine the optimal column order for a composite index on multiple columns.  What is the secret to creating a composite index with the columns in the proper sequence?

Answer: You are correct that the column sequence matters!  This is an empirical question, and you need to run diagnostic scripts against your SQL workload (STATSPACK or AWR) to examine how frequently a specific index column was needed by SQL.  Remember, it's the SQL workload that drives your choice of composite indexes, and the order of the columns within the index.

See these important scripts to display multi-column index usage using AWR.

  • In general, when using a multi-column index, you want to put the most restrictive column first (the column with the most distinct values) because this will trim down the result set.
  • Because Oracle will normally use only one index for each table access, your job is to examine your historical SQL workload and build a single composite index that satisfies the majority of the SQL queries.
  • The Oracle optimizer may try to make single-column indexes behave as if they were a single composite index.  Prior to 10g, this could be requested with the "and_equal" hint.
  • Beware that indexes carry overhead; see my notes on detecting duplicate index columns.
  • You can run scripts to monitor the invocation count for each column in a multiple column composite index (see counting column usage from a SQL workload) 
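
As a rough illustration of the first point in the list above, the sketch below checks how many distinct values each candidate column holds and then builds the index with the more restrictive column leading. The table CUSTOMER, its columns, and the index name are hypothetical, and num_distinct is only as current as the last statistics gathering.

```sql
-- Hypothetical table CUSTOMER with candidate columns CUST_REGION and CUST_TYPE.
-- num_distinct comes from optimizer statistics, so gather statistics first.
select column_name,
       num_distinct
from   user_tab_col_statistics
where  table_name  = 'CUSTOMER'
and    column_name in ('CUST_REGION', 'CUST_TYPE')
order  by num_distinct desc;

-- Assumption: CUST_REGION reported more distinct values than CUST_TYPE,
-- so it is placed first in the composite index.
create index cust_region_type_idx
   on customer (cust_region, cust_type);
```

Distinct-value counts are only a starting point; the workload analysis described in this article should still confirm which columns the SQL actually references.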

I have more complete details on composite index usage monitoring in my book Advanced Oracle SQL Tuning: The Definitive Reference.  Also, see my related notes on tuning with composite bitmap indexes, my scripts to monitor which columns of a composite index are used, and counting index column usage from AWR and STATSPACK.

Large Multi-column Composite Indexes

Multi-column indexes with more than three columns may not provide more efficient access than a two-column index.  The objective of the index is to reduce the number of rows returned from a table access, so each added column must substantially reduce the number of returned rows to be worthwhile.  For example, assume a large table and a query with five or more ANDed predicates in the WHERE clause.  A 5-column index may return only 1 row, a 3-column index may return only 50 rows, and a two-column index may return 200 rows.  The time it takes to extract the one row from the 200 rows using nested loops is negligible.
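
One way to estimate these row counts before building anything is to compare the average number of rows per key for each candidate column prefix. This is only a sketch: the table ORDERS and the columns c1, c2 and c3 are hypothetical, the query scans the whole table, and concatenating with a '|' delimiter assumes that character does not occur in the data.

```sql
-- Hypothetical table ORDERS; c1, c2, c3 are the candidate index columns.
-- Average rows per distinct key for a 2-column and a 3-column prefix.
select count(*) as total_rows,
       round(count(*) /
             nullif(count(distinct c1 || '|' || c2), 0))              as avg_rows_2col,
       round(count(*) /
             nullif(count(distinct c1 || '|' || c2 || '|' || c3), 0)) as avg_rows_3col
from   orders;
```

If avg_rows_3col is not much smaller than avg_rows_2col, the third column is probably adding index maintenance cost without trimming the result set much.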

Thus the two-column index may be almost as efficient (fast) as the 5-column index.  The key is to index the most restrictive columns.  Another tradeoff is a table with multiple column indexes where the leading column(s) are the same.  For instance, a table with four 3-column indexes where the leading two columns are the same may work very efficiently on select statements but cause a heavy penalty on inserts and updates.  Just one 2-column index on the leading two columns may provide acceptable query performance while greatly improving DML.
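
A quick way to spot this situation is to look for tables where several indexes start with the same column. The query below is a minimal sketch against the standard USER_IND_COLUMNS dictionary view; it checks only the first column position, so any table it flags still needs a manual look at the remaining columns.

```sql
-- List tables in the current schema where more than one index
-- shares the same leading column.
select table_name,
       column_name as leading_column,
       count(*)    as index_count
from   user_ind_columns
where  column_position = 1
group  by table_name, column_name
having count(*) > 1
order  by index_count desc;
```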

Small tables with two or three columns may benefit from being rebuilt as an Index Organized Table (IOT).  A 2-column table with a primary key and a two-column index holds 1.5 times as much data in its indexes as in the table itself.  Making the table an Index Organized Table reduces the need for indexes because the table is the index.  IOTs can also have indexes on non-leading columns if required.  Again, this has to be balanced against the overhead of maintaining the IOT.
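
For reference, an IOT is declared with the ORGANIZATION INDEX clause and must have a primary key. The table, column, and index names below are hypothetical and serve only to show the shape of the DDL.

```sql
-- Hypothetical two-column lookup table stored as an Index Organized Table.
create table state_lookup (
   state_code  varchar2(2),
   state_name  varchar2(40),
   constraint state_lookup_pk primary key (state_code)
)
organization index;

-- Optional secondary index on a non-leading column, if queries need it.
create index state_lookup_name_idx
   on state_lookup (state_name);
```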

Lastly, do not be afraid to use temporary indexes.  If you run a nightly report that requires 6 hours to run, but will run in 30 minutes with a specific index, you might want to create the index before running the report and drop it upon completion.  I work with clients that drop certain indexes to expedite the bill run, then recreate them for the normal application.  They create indexes each night and drop them in the morning.  There is nothing wrong with dynamically changing your database to respond to varying tasks if it results in efficiency.
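
The create-and-drop pattern might look like the sketch below. The table BILLING, its columns, and the index name are hypothetical; NOLOGGING is included on the assumption that the index can simply be rebuilt if it is lost, which should be checked against your recovery requirements.

```sql
-- Before the nightly report: build the report-specific index.
-- Hypothetical table BILLING and columns BILL_CYCLE, CUST_ID.
create index bill_rpt_tmp_idx
   on billing (bill_cycle, cust_id)
   nologging;

-- ... run the nightly report here ...

-- After the report: remove the index so daytime DML does not pay for it.
drop index bill_rpt_tmp_idx;
```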


Script for tracking composite index column usage

These scripts will only track SQL that you have directed Oracle to capture via your threshold settings in AWR or STATSPACK. STATSPACK and AWR will not collect "transient SQL" that did not appear in v$sql at snapshot time.  Hence, not all SQL will appear in these reports.  See my notes here on adjusting the SQL capture thresholds.
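
For AWR, the capture threshold can be inspected and changed roughly as sketched below. The value 100 is an arbitrary example rather than a recommendation, the topnsql parameter is assumed to be available (Oracle 10g Release 2 and later), and AWR itself requires the appropriate Oracle license.

```sql
-- Show the current AWR snapshot interval, retention and top-N SQL setting.
select snap_interval,
       retention,
       topnsql
from   dba_hist_wr_control;

-- Capture more SQL statements per snapshot (example value only).
begin
   dbms_workload_repository.modify_snapshot_settings(topnsql => 100);
end;
/
```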


col c1 heading 'Begin|Interval|time' format a20
col c2 heading 'Search Columns'      format 999
col c3 heading 'Invocation|Count'    format 99,999,999
break on c1 skip 2
accept idxname char prompt 'Enter Index Name: '
ttitle 'Invocation Counts for index|&idxname'
-- count plan steps that reference the index, per snapshot hour and number of search columns
select
   to_char(sn.begin_interval_time,'yy-mm-dd hh24')  c1,
   p.search_columns                                 c2,
   count(*)                                         c3
from
   dba_hist_snapshot  sn,
   dba_hist_sql_plan   p,
   dba_hist_sqlstat   st
where
   st.sql_id = p.sql_id
and
   sn.snap_id = st.snap_id
and
   p.object_name = '&idxname'
group by
   to_char(sn.begin_interval_time,'yy-mm-dd hh24'),
   p.search_columns;

The query produces a summary count of how many times the specified index was invoked during each snapshot interval, broken down by the number of index columns used in the search.  This can be compared to the number of times that the table was invoked from SQL.  Here is a sample of the output from the script.

Invocation Counts for cust_index
Interval                             Invocation
time                 Search Columns       Count
-------------------- -------------- -----------
04-10-21 15                       1           3
04-10-10 16                       0           1
04-10-10 19                       1           1
04-10-11 02                       0           2
04-10-11 04                       2           1
04-10-11 06                       3           1
04-10-11 11                       0           1
04-10-11 12                       0           2
04-10-11 13                       2           1
04-10-11 15                       0           3
04-10-11 17                       0          14
04-10-11 18                       4           1
04-10-11 19                       0           1
04-10-11 20                       3           7
04-10-11 21                       0           1

For more complete details on creating a custom composite index monitoring infrastructure, see my latest book Advanced Oracle SQL Tuning: The Definitive Reference.
