How to Gather Optimizer Statistics on 10g (Doc ID 605439.1)

最新推荐文章于 2022-06-06 11:21:49 发布

songyb

最新推荐文章于 2022-06-06 11:21:49 发布

阅读量655

点赞数

分类专栏： MOS Gather Optimizer Statistic

MOS Gather Optimizer Statistic 专栏收录该内容

11 篇文章 0 订阅

订阅专栏

In this Document

Goal

Solution

References

APPLIES TO:

Oracle Database - Personal Edition - Version 10.1.0.2 to 10.2.0.5 [Release 10.1 to 10.2]
Oracle Database - Standard Edition - Version 10.1.0.2 to 10.2.0.5 [Release 10.1 to 10.2]
Oracle Database - Enterprise Edition - Version 10.1.0.2 to 10.2.0.5 [Release 10.1 to 10.2]
Information in this document applies to any platform.
***Checked for relevance on 29-Jul-2013***

GOAL

This Document outline the recommended method to gather a standard set of optimizer statistics for use by the Cost Based Optimizer under Oracle 10g. For other versions see:

Document 1226841.1 How To: Gather Statistics for the Cost Based Optimizer

SOLUTION

Quick Recreate Recommendation

To achieve a quick delete and recreate of the statistics on an individual table and it's indexes (adding column statistics for any skewed columns) and following the recommendations in this article use:

  exec dbms_stats.delete_table_stats(ownname=>'user_name',- 
 
   tabname=>'table_name',cascade_indexes=>true); 
 
 exec dbms_stats.gather_table_stats(ownname=>'user_name',- 
 
    tabname=>'table_name',- 
 
    estimate_percent => 100,- 
 
    cascade=>true,- 
 
    method_opt=>'for all columns size skewonly');

For explanation of these recommendations, see below. For more usage examples see the end of this article.
Note that, from 10gR2 statistics can be restored using:

Document 452011.1 * Restoring table statistics in 10G onwards

IMPORTANT: PLEASE NOTE:

These recommendations apply to the majority of databases.
The recommendations aim to generate statistics with as much statistical accuracy as possible. To this end 100% sample sizes are suggested since any reduction in sample size is always a concession to accuracy. It is acknowledged that such 100% samples are potentially time consuming and consideration needs to be made to fit the statistics gathering activities within the existing maintenance window.
Gathering new optimizer statistics should maintain or improve existing execution plans, but it is possible that some queries performance may degrade. Note that from 10gR1 previous copies of statistics are maintained by default for the last 30 days and can be restored in the case of problems. See:
Document 452011.1 Restoring table statistics in 10G onwards
Gathering new optimizer statistics may invalidate cursors in the shared pool so it is prudent to restrict all gathering operations execution to periods of low activity in the database, such as the scheduled maintenance windows.
Apart from object statistics, we also recommend that statistics should be gathered on dictionary objects. See:

Document 457926.1 How to Gather Statistics on SYS Objects and 'Fixed' Objects?
For very large systems, the gathering of statistics can be a very time consuming and resource intensive activity. In this environment sample sizes need to be carefully controlled to ensure that gathering completes within acceptable timescale and resource constraints and within the maintenance window. For guidance on this topic See:

Document 44961.1 Statistics Gathering: Frequency and Strategy Guidelines

In these environments, it is also recommended to utilise change based statistics gathering to avoid re-gathering information unnecessarily.Please see:

Document 237901.1 Gathering Schema or Database Statistics Automatically - Examples
Document 377152.1 Best Practices for automatic statistics collection on Oracle 10g

Gathering Object statistics

The Cost Based Optimizer (CBO) uses statistics to determine the execution plan for a particular query. Potentially, with reduced sample sizes, sampling could produce different statistics due to chance groupings of data that may be the result of differing loading methods etc.

On 10g it is recommended to:

Gather statistics using scheduled statistics gathering scripts. In most cases the default scripts provide an adequate level of sampling taking into account the following recommendations:
Use sample size that is large enough. On 10g support suggests an estimate sample size of 100% (if it is possible for this to fit within the maintenance window), even if that means that statistics are gathered on a reduced frequency. If 100% is not feasible, try using at least an estimate of 30%. Generally, the accuracy of the statistics overall outweighs the day to day changes in most applications. This setting is because the default AUTO_SAMPLE_SIZE uses a very small estimate percentage which can result in poor estimates.
Ensure all objects (tables and indexes) have stats gathered. An easy way to achieve this is to use the CASCADE parameter.
Ensuring that any columns with skewed data distribution have histograms collected, and at sufficient resolution using the METHOD_OPT parameter. Support recommends a conservative and more plan-stable approach of "adding a histogram only if it is known to be needed" rather than collecting column statistics on all columns. This can be achieved manually or by using the SKEWONLY option to automatically add column statistics to columns that contain data with a non-uniform distribution. Using the default column statistics setting of AUTO which means that DBMS_STATS will decide which columns to add histogram to where it believes that they may help to produce a better plan. If statistics are not completely up to date then the presence of Histograms can cause trouble when parsing values are out of range, or between values for "frequency" histograms. In these circumstances the optimizer has to make guesses which may be inaccurate and, on occasion, cause poor plans.

Note that in earlier versions the default setting for the METHOD_OPT parameter was "FOR ALL COLUMNS SIZE 1" which would collect only a high and a low value and effectively meant that there were no detailed column statistics. It is known that in some cases, the effect of a histogram is adverse to the generation of a better plan so users moving between versions may initially wish to set this parameter to its pre-upgrade release value, and later adjust to the post-upgrade release default value. See:
Document 465787.1 Managing CBO Stats during an upgrade to 10g or 11g

As ever, testing different values with the application will yield the best results.

Note also that it only makes sense to collect column statistics (histograms) if your application has the facility to use them. Specifically, if you use bind variables in the application but do not peek the values (for example with _OPTIM_PEEK_USER_BINDS = FALSE ), then the optimizer will not have any value information to lookup column data from and will be unable to utilize column statistic information to improve cardinality estimates.
If partitions are in use, gather global statistics if possible due to time constraints. Global stats are very important but gathering is often avoided due to the sizes involved and length of time to required. If 100% samples are not possible then support would recommend going for a minimum of 1%. Gathering with small sample sizes (e.g. 0.001, 0.0001, 0.00001 etc. ) can be very effective but equally, a large proportion of the data will not be examined which could prove decisive to the optimizer's plan choices. Note that the available range for the ESTIMATE_PERCENT parameter is a very flexible [0.000001 -> 100] which can use very small sample sizes suitable for huge partitioned tables. Testing will reveal the most suitable settings for each system.
See:
Document 236935.1 Global statistics - An Explanation
Gather system statistics to reflect the CPU loading of the system and to improve the accuracy of the CBO's estimates by providing the CBO with CPU cost estimates in addition to the normal I/O cost estimates. See:
Document 470316.1 Using Actual System Statistics (Collected CPU and IO information
Document 149560.1 Collect and Display System Statistics (CPU and IO) for CBO usage
Document 153761.1 Scaling the System to Improve CBO optimizer

Note that the defaults for statistics gathering on different versions of Oracle are not necessarily the same, for example:

ESTIMATE_PERCENT: defaults:
- 9i : 100%
- 10g : DBMS_STATS.AUTO_SAMPLE_SIZE (using very small estimate percentage)
- 11g : DBMS_STATS.AUTO_SAMPLE_SIZE (using larger estimate percentage - 100%)
METHOD_OPT: defaults:
- 9i : "FOR ALL COLUMNS SIZE 1" effectively no detailed column statistics.
- 10g and 11g : "FOR ALL COLUMNS SIZE AUTO" - This setting means that DBMS_STATS decides which columns to add histogram to where it believes that they may help to produce a better plan.

Note that on 11g, although using auto size for ESTIMATE_PERCENT tends to default to 100% ,because this is an auto sample, the engine may still decide to use a different sample size for tables and columns. This means that Column statistics could still be gathered with a small sample size and create a histogram that is missing key values. When ESTIMATE_PERCENT is set to a specific numeric value, that value will be used for both the table and columns.

Additionally, even though a 100% sample is collected, the gathering process is really fast since a new hashing algorithm is used to compute the statistics rather than sorting (in 9i and 10g the "slow" part was typically the sorting). In 10g, support experience has shown that the default ESTIMATE_PERCENT sample size was extremely small which often resulted in poor statistics and is therefore not recommended. <<<< 看来100%收集也不是全表扫描，而是使用hashing algorithm

SAMPLE STATISTIC GATHERING COMMANDS

Gathering statistics an individual table

  exec dbms_stats.gather_table_stats(  - 
 
        ownname => '  Schema_name ', - 
 
        tabname => '  Table_name  ', - 
 
        estimate_percent => 100,  - 
 
        cascade => TRUE,  - 
 
        method_opt => 'FOR ALL COLUMNS SIZE 1' );

N.B. replace ' Schema_name ' and ' Table_name ' with the name of the schema
and table to gather statistics for respectively.

Gathering statistics for all objects in a schema

  exec dbms_stats.gather_schema_stats( - 
 
  ownname => '  Schema_name ', - 
 
 cascade => TRUE, - 
 
 method_opt => 'FOR ALL COLUMNS SIZE 1' );

N.B. replace '  Schema_name ' with the name of the desired schema.

Gathering statistics for all objects in the database:

  exec dbms_stats.gather_database_stats( - 
 
 cascade => TRUE, - 
 
 method_opt => 'FOR ALL COLUMNS SIZE 1' ); 

NOTE:

For cases where column data is known to be skewed and column statistics are known to be beneficial, Replace:

method_opt => 'FOR ALL COLUMNS SIZE 1'

with

method_opt => 'FOR ALL COLUMNS SIZE AUTO'

or with

method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY'

to automatically add column statistics to columns that contain data with a non-uniform distribution

songyb

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
How to Gather Optimizer Statistics on 10g (Doc ID 605439.1)

In this Document Goal Solution ReferencesAPPLIES TO:Oracle Database - Personal Edition - Version 10.1.0.2 to 10.2.0.5 [Release 10.1 to 10.2]
复制链接

扫一扫