oracle索引群集因子,Oracle索引群集因子

weixin_39604819

于 2021-04-06 00:14:58 发布

阅读量54

点赞数

文章标签： oracle索引群集因子

cost =

blevel +

ceiling(leaf_blocks * effective index selectivity) +

ceiling(clustering_factor * effective table selectivity)

Historically it has been common practice to say that a good index has a low clustering_factor, and a bad

index has a high clustering_factor.

There is obviously a degree of truth in this comment, especially in the light of what the clustering_factor

represents. However, I have always had an aversion to words like low, high, small, large, and expressions like close

to zero, when talking about Oracle. After all, is 10,000 a low clustering_factor or a high clustering_

factor? It’s low if you have 10,000 blocks in your table, and high if you have 100 blocks in your table. So you

might want to write a couple of little scripts that join user_tables to user_indexes (and other scripts for

partitioned tables, etc.) so that you can compare the critical figures.

In fact, for reasons I describe in Chapter 5, I often use the column avg_data_blocks_per_key to get

an idea of how good Oracle thinks the index is.

一个良好的cluster的index的clustering_factor应该是越接近table的block数越好

In Figure 4-1, we have a table with four blocks and 20 rows, and an index on the column V1,

whose values are shown. If you start to walk across the bottom of the index, the first rowid

points to the third row in the first block. We haven’t visited any blocks yet, so this is a new

block, so we count 1. Take one step along the index, and the rowid points to the fourth row of

the second block—we’ve changed block, so increment the count. Take one step along the

index, and the rowid points to the second row of the first block—we’ve changed block again, so

increment the count again. Take one step along the index, and the rowid points to the fifth row

of the first block—we haven’t changed blocks, so don’t increment the count.

In the diagram, I have put a number against each row of the table—this is to show the

value of the counter as the walk gets to that row. By the time we get to the end of the index, we

have changed table blocks ten times, so the clustering factor is 10.

Notice how small clumps of data stop the clustering_factor from growing—look at block 2

where the value 8 appears four times because four consecutive entries in the index point to the

same block; the same effect shows up in block 3 to give three rows the value 6.

The table doesn’t have to be completely sorted for this type of thing to happen; it only

needs to have little clumps (or clusters) of rows that are nearly sorted—hence the term

clustering_factor, rather than sort_factor.

Given the way the clustering_factor is calculated, you will appreciate that the smallest

possible value has to be the same as the number of blocks in the table, and the largest possible

value has to be the same as the number of rows in the table—provided you have computed

statistics.

If there are lots of blocks like block 2 in the table, the clustering_factor will turn out to be

quite close to the number of blocks in the table, but if the data in the table is randomly scattered,

the clustering_factor will tend to come out close to the number of rows in the table.

关注