Managing Statistics
Understanding Statistics
Optimizer statistics are a collection of data that describe more details about the database and the objects in the database.These statistics are used by the query optimizer to choose the best excution plan for each SQL statement.Optimizer statistics include following:
*Table statistics
-Number of rows
-Number of blocks
-Average row length
*Column statistics
-Number of distinct values(NDV)in column
-Number of nulls in column
-Data distribution(histogram)数据分布直方图
-Extended statistics
*Index statistics
-Number of leaf blocks
-Levels
-Clustering factor
*System statistics(相对其他比较容易变化)
-I/O performance and utilization
-CPU performance and utilization
When statistics are updated for a database object,Oracle invalidates(使无效) any currently parsed(已解析) SQL statements that access the object.The next time such a statement executes,the statement is re-parsed and the optimizer automatically chooses a new execution plan based on the new statistics.Distributed statements accessing objects with new statistics on remote database are not invalidated.The new statistics take effect the next time the SQL statement is parsed.
Managing Statistics
Use the dbms_stats package:
gather_table_stats收集某个表的统计信息
gather_index_stats
gather_schema_stats
gather_database_stats
gather_stale_stats收集不新鲜的统计信息
实验一:收集表的统计信息
1.创建一个表t,插入几条记录
2.desc dba_tables
3.select num_rows,blocks from dba_tables owner='HR' and tablename='T';发现没有显示统计信息,因为刚创建完这个表,还没收集统计信息。
4.exec dbms_stats.gather_table_stats('HR','T');收集统计信息
5.再执行第三步,就有统计信息放在dba_tables这个数据字典里了。
When to Gather Statistics
When gathering statistics manually,you not only need to determine how to gather statistics,but also when and how often to gather new statistics.
手动收集信息,需要自己决定何时收集,怎么收集等信息。
For an appliation in which tables are being incrementally modified,you may only need to gather new statistics every week or every month.The simplest way to gather statistics in these enviroment is to use a script or job scheduling tool to regularly run the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures.The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.
对于增量的修改,例如对一个几万行的数据,有几十行的增加,你只需要定期用一个脚本或者过程收集新的统计信息,太频繁执行虽然会带来最新的统计信息,但是也会有一定的额外负载,需要权衡这个收集的频率。
For tables which are being substantially modified in batch operations,such as with bulk loads,statistics should be gathered on those tables as part of the batch operation.The DBMS_STATS procedure should be called as soon as the load operation completes.
对于大规模的数据修改,例如批量导入,那在每次这种大批量操作后都应该收集一次统计信息。
For partitioned tables,there are often cases in which only a single partition is modified.In those cases,statistics can be gathered only on those partitions rather than gathering statistics for the entire table.Howerver,gathering global statistics for the partitioned table may still be necessary.
对于分区表,有时候只需要收集改动了的那个分区的统计信息,而不用所有分区都去收集。有时候表的全局的统计信息也是很重要的。
Understanding Statistics
Optimizer statistics are a collection of data that describe more details about the database and the objects in the database.These statistics are used by the query optimizer to choose the best excution plan for each SQL statement.Optimizer statistics include following:
*Table statistics
-Number of rows
-Number of blocks
-Average row length
*Column statistics
-Number of distinct values(NDV)in column
-Number of nulls in column
-Data distribution(histogram)数据分布直方图
-Extended statistics
*Index statistics
-Number of leaf blocks
-Levels
-Clustering factor
*System statistics(相对其他比较容易变化)
-I/O performance and utilization
-CPU performance and utilization
When statistics are updated for a database object,Oracle invalidates(使无效) any currently parsed(已解析) SQL statements that access the object.The next time such a statement executes,the statement is re-parsed and the optimizer automatically chooses a new execution plan based on the new statistics.Distributed statements accessing objects with new statistics on remote database are not invalidated.The new statistics take effect the next time the SQL statement is parsed.
Managing Statistics
Use the dbms_stats package:
gather_table_stats收集某个表的统计信息
gather_index_stats
gather_schema_stats
gather_database_stats
gather_stale_stats收集不新鲜的统计信息
实验一:收集表的统计信息
1.创建一个表t,插入几条记录
2.desc dba_tables
3.select num_rows,blocks from dba_tables owner='HR' and tablename='T';发现没有显示统计信息,因为刚创建完这个表,还没收集统计信息。
4.exec dbms_stats.gather_table_stats('HR','T');收集统计信息
5.再执行第三步,就有统计信息放在dba_tables这个数据字典里了。
When to Gather Statistics
When gathering statistics manually,you not only need to determine how to gather statistics,but also when and how often to gather new statistics.
手动收集信息,需要自己决定何时收集,怎么收集等信息。
For an appliation in which tables are being incrementally modified,you may only need to gather new statistics every week or every month.The simplest way to gather statistics in these enviroment is to use a script or job scheduling tool to regularly run the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures.The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.
对于增量的修改,例如对一个几万行的数据,有几十行的增加,你只需要定期用一个脚本或者过程收集新的统计信息,太频繁执行虽然会带来最新的统计信息,但是也会有一定的额外负载,需要权衡这个收集的频率。
For tables which are being substantially modified in batch operations,such as with bulk loads,statistics should be gathered on those tables as part of the batch operation.The DBMS_STATS procedure should be called as soon as the load operation completes.
对于大规模的数据修改,例如批量导入,那在每次这种大批量操作后都应该收集一次统计信息。
For partitioned tables,there are often cases in which only a single partition is modified.In those cases,statistics can be gathered only on those partitions rather than gathering statistics for the entire table.Howerver,gathering global statistics for the partitioned table may still be necessary.
对于分区表,有时候只需要收集改动了的那个分区的统计信息,而不用所有分区都去收集。有时候表的全局的统计信息也是很重要的。