Gathering Statistics for the Cost Based Optimizer (Pre 10g) (Doc ID 114671.1)

Gathering Statistics for the Cost Based Optimizer (Pre 10g) (Doc ID 114671.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 8.0.3.0 to 9.2.0.8 [Release 8.0.3 to 9.2]
Information in this document applies to any platform.
***Checked for relevance on 01-Aug-2013***

PURPOSE

Give the available methods for gathering statistics for the Cost Based Optimizer for Database Administrators.

QUESTIONS AND ANSWERS

Gathering Statistics

For recommendations on later versions see:

Document 1226841.1 How To: Gather Statistics for the Cost Based Optimizer 

The cost-based optimization approach uses statistics to estimate 
the cost of each execution plan. You should gather statistics periodically 
for objects where the statistics become stale over time because of changing
data volumes or changes in column values.
For more information about when to gather new statistics, see 
Note:44961.1 Gathering statistics frequency and strategy guidelines

There are 2 ways to collect statistics on schema objects:

1. DBMS_STATS (Oracle 8.1.5 and higher)
1.1 GATHER_DATABASE_STATS Statistics for all objects in a database
1.2 GATHER_SCHEMA_STATS Statistics for all objects in a schema
1.3 GATHER_TABLE_STATS Table, column, and index statistics
1.4 GATHER_INDEX_STATS Index statistics

2. ANALYZE or PL/SQL package with Recursive ANALYZE statement (old fashion)
2.1 ANALYZE command
2.2 DBMS_UTILITY
2.3 DBMS_DDL


There are new enhancements to dbms_stats but not to analyze.


****************************************
1. DBMS_STATS package (8.1.5 and higher)
****************************************

- Oracle Corporation strongly recommends that you use the DBMS_STATS package 
rather than ANALYZE to collect optimizer statistics.

- The DBMS_STATS package can gather statistics on indexes, tables, columns, 
and partitions, as well as statistics on all schema objects in a schema 
or database. 

- It does not gather cluster statistics, but you can use DBMS_STATS to gather 
statistics on the individual tables instead of the whole cluster. 

- The statistics-gathering operations can run either serially or in parallel. 
Whenever possible, DBMS_STATS calls a parallel query to gather statistics 
with the specified degree of parallelism; 
otherwise, it calls a serial query or the ANALYZE statement. 

o For example ,With 8.1.x DBMS_STATS uses recursive ANALYZE to gather statistics on
- All Index Statistics
- All Histograms

o For example ,With 9.2.x DBMS_STATS uses recursive ANALYZE to gather statistics only on
* CLUSTER INDEX
* DOMAIN INDEX
* JOIN INDEX

- By default DBMS_STATS will generate statistics for the table and not it's 
indexes (By default CASCADE => FALSE).

- Global statistics Note:236935.1

o For partitioned tables and indexes, DBMS_STATS can gather separate 
statistics for each partition as well as global statistics for the entire 
table or index.
 

o Similary, for composite partitioning DBMS_STATS can gather separate 
statistics for subpartitions, partitions, and the entire table or index. 

o Depending on the SQL statement being optimized, the optimizer may choose 
to use either the partition (or subpartition) statistics or the global 
statistics. Unless the query predicate narrows the query to a single 
partition, the optimizer uses the global statistics. 
Because most queries are not likely to be this restrictive, 
it is most important to have accurate global statistics. 


o Therefore, actually gathering global statistics with the DBMS_STATS package 
is highly recommended, rather than calculating them with the ANALYZE statement.

- It can collect statistics only on the "stale" objects when monitoring is 
enabled. See: Note:102334.1

- In 9i and above, it can also do 
o Estimate statistics with automatically determined adequate sample size, 
o Create histograms only on the columns they would be useful. 


1.1 DBMS_STATS.GATHER_DATABASE_STATS
====================================
Can gather statistics on all the tables and indexes in a database.

The GATHER_DATABASE_STATS procedure - some parameters - :

- estimate_percent 
o NULL means compute
o The valid range is [0.000001,100]. 
o DBMS_STATS.AUTO_SAMPLE_SIZE so Oracle determines the 
best sample size for good statistics (Recommended value 9i and above). 

- block_sample 
o Only pertinent when doing an estimate statistics.
o Use or not random block sampling instead of random row sampling.
o Random block sampling is more efficient, 
but if the data is not randomly distributed on disk, 
then the sample values may be somewhat correlated. 

- method_opt 
o DEFAULT (prior to 10g 'FOR ALL COLUMNS SIZE 1', starting with 10g 'FOR ALL COLUMNS SIZE AUTO')
o FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
. INDEXED can be used when predicate columns are all indexed
. HIDDEN for columns in Function-Based Indexes
. ALL when all columns appear in predicates
o FOR COLUMNS [size clause] column|attribute [size_clause] 
[,column|attribute [size_clause]...], 
o size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY} 
. integer Number of histogram buckets. Valid Range [1,254]
. REPEAT Collects histograms only on the columns that 
already have histograms (9i and above). 
. AUTO Oracle determines the columns to collect histograms 
based on data distribution and the workload of the 
columns (9i and above). 
. SKEWONLY Oracle determines the columns to collect histograms 
based on the data distribution of the columns (9i and above).
Note:252361.1 DBMS_STATS always sets AVG_ROW_LEN=100 or left the old value
- degree 
o Degree of parallelism. 
o NULL means use the table default value 
o DBMS_STATS.DEFAULT_DEGREE means value based on the init parameters (9i and above).

- granularity 
o Only pertinent if the table is partitioned. 
o DEFAULT: Gather global- and partition-level statistics. 
o GLOBAL: Gather global statistics. 
o PARTITION: Gather partition-level statistics. 
o SUBPARTITION: Gather subpartition-level statistics. 
o ALL: Gather all (subpartition, partition, and global)stats.
o Analyze table does not replace statistics
o Global statistics - DBMS_STATS versus ANALYZE

- cascade 
o Default (CASCADE => FALSE). No Index Gathering.
o Gathers statistics on the indexes as well (TRUE). 
o Index statistics gathering is not parallelized prior to 9iR2
o Equivalent to gather_index_stats procedure on each of the indexes in 
the database in addition to gathering table and column statistics.
Note:159374.1 DBMS_STATS.GATHER_SCHEMA_STATS don't analyze indexes 

- options 
o GATHER: Default. Gathers statistics on all objects. 
o GATHER AUTO: Gathers all necessary statistics automatically. 
Oracle implicitly determines which objects need new statistics, 
and determines how to gather those statistics (9i and above).
o GATHER STALE: Gathers statistics on stale objects as determined by 
looking at the *_tab_modifications views. 
o GATHER EMPTY: Gathers statistics on objects which currently have no 
statistics. 
Note:102334.1 How to Automate Change Based Statistic Gathering - Monitoring Table
Note:228186.1 Differences between GATHER STALE and GATHER AUTO 

- gather_sys 
o Gathers statistics on the objects owned by the 'SYS' user (TRUE)


1.2 DBMS_STATS.GATHER SCHEMA_STATS
==================================
Can gather statistics on all the tables and indexes in a schema. 

The GATHER SCHEMA_STATS procedure - some parameters - :
o ownname => Schema to analyze (NULL means current schema).
o See GATHER_DATABASE_STATS except GATHER_SYS


1.3 DBMS_STATS.GATHER_TABLE_STATS
=================================
Can gather statistics on a table and its indexes.

The GATHER_TABLE_STATS procedure - some parameters - :
o ownname => Schema to analyze (NULL means current schema).
o tabname => Name of table.
o partname => Name of partition. 
o See DBMS_STATS.GATHER SCHEMA_STATS


1.4 DBMS_STATS.GATHER_INDEX_STATS
=================================
Can gather statistics on an index.

The GATHER_INDEX_STATS procedure - some parameters - :
o ownname => Schema to analyze (NULL means current schema).
o indname => Name of Index.
o partname => Name of partition. 
o See DBMS_STATS.GATHER SCHEMA_STATS





***********
2. ANALYZE
***********

===================
2.1 ANALYZE Command
===================

- Old fashion
- Oracle Corporation strongly recommends that you use the DBMS_STATS package 
rather than ANALYZE to collect optimizer statistics.
- The ANALYZE command gathers statistics in a serial manner. 
- It can be used with an ESTIMATE or COMPUTE statistics. 
- If you specify ANALYZE TABLE ESTIMATE STATISTICS without a sample size, 
Oracle will only gather statistics based on 1064 rows.
 
When using the ANALYZE command, you should always specify a sample size so that 
the default does not take place.
(i.e. analyze table test estimate statistics sample size 20 percent;)
- The COMPUTE statistics option will guarantee that the optimizer has the best 
statistics available in order to determine an execution path for a given query.
(i.e. analyze table test compute statistics;)

- ANALYZE collects only partition level statistics and derives the entire 
table level statistics by aggregation.

- By default the ANALYZE TABLE command will generate statistics for the table 
and all it's indexes provided that the FOR clause is not used.


========================
2.2 DBMS_UTILITY Package
========================

The DBMS_UTILITY package provides 2 procedures to assist in the gathering of 
statistics. It generates ANALYZE command for different objects.

2.2.1. ANALYZE_SCHEMA 
=====================
Gathers statistics on all the tables, clusters and indexes in a schema.

The ANALYZE_SCHEMA procedure accepts 5 arguments:
- schema => the schema to be analyzed
- method => ESTIMATE, COMPUTE or DELETE. 
If ESTIMATE, then either
. estimate_rows or 
. estimate_percent must be non-zero. 
- estimate_rows => Number of rows to estimate. 
- estimate_percent => Percentage of rows to estimate. 
If estimate_rows is specified, 
then this parameter is ignored. 
- method_opt => [ FOR TABLE ] [ FOR ALL [INDEXED] COLUMNS] [SIZE n] 
[ FOR ALL INDEXES ] 

For more information on the ANALYZE_SCHEMA procedure Note:67615.1
For more information on the ANALYZE_SCHEMA procedure Note:1011835.102


2.2.2. ANALYZE_DATABASE 
=======================

Gathers statistics on all the tables, clusters and indexes in a database.

The ANALYZE_DATABASE procedure accepts 4 arguments:
- method => ESTIMATE, COMPUTE or DELETE. 
If ESTIMATE, then either
. estimate_rows or 
. estimate_percent must be non-zero. 
- estimate_rows => Number of rows to estimate. 
- estimate_percent => Percentage of rows to estimate. 
If estimate_rows is specified, 
then this parameter is ignored. 
- method_opt => [ FOR TABLE ] [ FOR ALL [INDEXED] COLUMNS] [SIZE n] 
[ FOR ALL INDEXES ] 

For more information on the ANALYZE_DATABASE procedure Note:67616.1


====================
2.3 DBMS_DDL Package
====================

The DBMS_DDL package provides one procedure which gathers also statistics:

2.3.1 ANALYZE_OBJECT 
==================== 

Equivalent to SQL 

"ANALYZE 
TABLE|CLUSTER|INDEX [.]
[] STATISTICS 
[SAMPLE [ROWS|PERCENT]]"

For more information on the ANALYZE_OBJECT procedure Note:77283.1


  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值