表分区性能分析
===========================================
基于成本的优化的统计信息收集/ DBMS_STATS与ANALYZE
基于成本的方法依赖于统计数据,如果基于成本
使用method,然后应合并所有表,群集和SQL语句访问的所有索引类型的统计信息。 如果表的大小和数据分布经常变化,请定期生成统计信息以确保统计信息准确表示表中的数据。
从测试分区中选择*;
这使用全局统计信息,但没有谓词
从s.amount_of_shift> 1000的测试中选择*;
这对多个分区使用谓词,并且可能使用全局统计信息
从测试分区(sep2009)中选择*,其中s.amount_of_shift> 1000;
这使用全局统计信息并断言为一个分区。
使用DBMS_STATS软件包收集全局统计信息更为有用,因为ANALYZE总是串行运行。 DBMS_STATS可以串行或并行运行。 只要有可能,DBMS_STATS就会调用一个并行查询来收集具有指定并行度的统计信息。 否则,它将调用串行查询或ANALYZE语句。 索引统计信息不是并行收集的。
ANALYZE收集各个分区的统计信息,然后根据分区统计信息计算全局统计信息。 DBMS_STATS可以收集每个分区的独立统计信息以及整个表或索引的全局统计信息。 根据要优化的SQL语句,优化器可以选择使用分区(或子分区)统计信息或全局统计信息。
CREATE TABLE PARTTAB(
ordid NUMBER,
PARTCOL DATE,
DETAILS NUMBER,
AMOUNT NUMBER)
PARTITION BY RANGE(PARTCOL)
SUBPARTITION BY HASH(DETAILS) SUBPARTITIONS 2
(PARTITION q1 VALUES LESS THAN(TO_DATE('01-04-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE1,
PARTITION q2 VALUES LESS THAN(TO_DATE('01-07-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE2,
PARTITION q3 VALUES LESS THAN(TO_DATE('01-10-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE3,
PARTITION q4 VALUES LESS THAN(TO_DATE('01-12-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE4
);
A local non prefixed index will be associated with it:
CREATE INDEX IDX_PARTTAB ON PARTTAB (ordid) LOCAL;
The PARTTAB table has been populated before to start the following examples.
GATHER_TABLE_STATS
-------------------
Collects table, column, and index statistics.
Compute, serial mode, without histograms, Default granularity.
============================================================
SQL> execute dbms_stats.gather_table_stats(-
>ownname => 'test',-
>tabname => 'PARTTAB',-
>partname => null,- --> Gather stats on all partitions.
>estimate_percent => null,- --> Compute mode
>block_sample => false,- --> Default value. No Sense in Compute mode
>method_opt => 'FOR ALL COLUMNS SIZE 1',- --> Table and columns statistics. No histogram generated
>degree => null,- --> default parallel degree based on DOP set on PARTTAB.
>granularity => 'default',- --> Gather Global and Partition statistics
>cascade => true ,- --> with index stats generated
>stattab => null,- --> The statistics will be stored in the dictionary.
>statid => null,-
>statown => null);
PL/SQL procedure successfully completed.
Index Statistics won't be calculated by default if CASCADE=>TRUE .
SQL> select table_name, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, AVG_ROW_LEN
, GLOBAL_STATS, USER_STATS, sample_size from user_tables
where table_name = 'PARTTAB';
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN GLOBAL_STATS USER_STATS SAMPLE_SIZE
---------- -------- ------ ------------ --------- ----------- ------------ ---------- -----------
PARTTAB 400 8 0 0 11 YES NO 400
Now that the statistics have been updated.
The column GLOBAL_STATS has been also initialized.
SQL> select partition_name "Partition", NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, AVG_ROW_LEN,
SAMPLE_SIZE, global_stats, user_stats
from user_tab_partitions
where table_name = 'PARTTAB'
order by partition_position
/
Partition NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN GLOBAL_STATS USER_STATS SAMPLE_SIZE
---------- -------- ------ ------------ ---------- ----------- ------------ ---------- -----------
Q1 100 2 0 0 11 YES NO 100
Q2 100 2 0 0 11 YES NO 100
Q3 100 2 0 0 11 YES NO 100
Q4 100 2 0 0 11 YES NO 100
The statistics are again obtained at the table level with the GLOBAL_STATS .
SQL> select partition_name "Partition", subpartition_name "Subpartition", NUM_ROWS, BLOCKS, EMPTY_BLOCKS
SAMPLE_SIZE, global_stats, user_stats
from user_tab_subpartitions
where table_name = 'PARTTAB'
order by partition_name, subpartition_position
/
Partition Subpartition NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN SAMPLE_SIZE GLOBAL_STATS USER_STATS
---------- ------------- -------- ------ ------------ --------- ----------- ----------- ------------ ----------
Q1 SYS_SUBP10365 NO NO
Q1 SYS_SUBP10366 NO NO
Q2 SYS_SUBP10367 NO NO
Q2 SYS_SUBP10368 NO NO
Q3 SYS_SUBP10369 NO NO
Q3 SYS_SUBP10370 NO NO
Q4 SYS_SUBP10371 NO NO
Q4 SYS_SUBP10372 NO NO
The statistics aren't computed at the subpartition level which is in phase
with the 'DEFAULT' granularity.
SQL>select COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED from
user_tab_col_statistics where table_name = 'PARTTAB'
/
COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED
--------------- ------------ ---------- ---------- ----------- -------------
ORDID 0 0 400 1 12-DEC-02
PARTCOL 4 .25 0 1 12-DEC-02
DETAILS 100 .01 0 1 12-DEC-02
AMOUNT 0 0 400 1 12-DEC-02
The NUM_BUCKETS is set to 1 as there is no histogram generation. but, the column
statistics are well initialized
The same result is showed below on each partition columns:
SQL>select partition_name, COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED from user_part_col_statistics
where table_name = 'PARTTAB'
/
PARTITION_ COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED
---------- --------------- ------------ ---------- ---------- ----------- -------------
Q1 ORDID 0 0 100 1 12-DEC-07
Q1 PARTCOL 1 1 0 1 12-DEC-07
Q1 DETAILS 100 .01 0 1 12-DEC-07
Q1 AMOUNT 0 0 100 1 12-DEC-07
Q2 ORDID 0 0 100 1 12-DEC-07
Q2 PARTCOL 1 1 0 1 12-DEC-07
Q2 DETAILS 100 .01 0 1 12-DEC-07
Q2 AMOUNT 0 0 100 1 12-DEC-07
Q3 ORDID 0 0 100 1 12-DEC-07
Q3 PARTCOL 1 1 0 1 12-DEC-07
Q3 DETAILS 100 .01 0 1 12-DEC-07
Q3 AMOUNT 0 0 100 1 12-DEC-07
Q4 ORDID 0 0 100 1 12-DEC-07
Q4 PARTCOL 1 1 0 1 12-DEC-07
Q4 DETAILS 100 .01 0 1 12-DEC-07
Q4 AMOUNT 0 0 100 1 12-DEC-07
the statistics loaded for subpartitions of the PARTTAB table are displayed below:
SQL> select subpartition_name "Subpartition", COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS,
NUM_BUCKETS from dba_subpart_col_statistics where table_name = 'PARTTAB'
order by column_name
/
Subpartition COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
--------------- --------------- ------------ ---------- ---------- -----------
SYS_SUBP10365 PARTCOL
SYS_SUBP10365 ORDID
SYS_SUBP10365 DETAILS
SYS_SUBP10365 AMOUNT
SYS_SUBP10366 PARTCOL
SYS_SUBP10366 ORDID
SYS_SUBP10366 DETAILS
SYS_SUBP10366 AMOUNT
SYS_SUBP10367 PARTCOL
SYS_SUBP10367 ORDID
SYS_SUBP10367 DETAILS
SYS_SUBP10367 AMOUNT
SYS_SUBP10368 PARTCOL
SYS_SUBP10368 ORDID
SYS_SUBP10368 DETAILS
SYS_SUBP10368 AMOUNT
SYS_SUBP10369 PARTCOL
SYS_SUBP10369 ORDID
SYS_SUBP10369 DETAILS
SYS_SUBP10369 AMOUNT
SYS_SUBP10370 PARTCOL
SYS_SUBP10370 ORDID
SYS_SUBP10370 DETAILS
SYS_SUBP10370 AMOUNT
SYS_SUBP10371 PARTCOL
SYS_SUBP10371 ORDID
SYS_SUBP10371 DETAILS
SYS_SUBP10371 AMOUNT
SYS_SUBP10372 PARTCOL
SYS_SUBP10372 ORDID
SYS_SUBP10372 DETAILS
SYS_SUBP10372 AMOUNT
没有统计信息加载到子分区的列上。
分区的对象可以包含多个统计信息。 这是因为可以为整个对象,分区或子分区生成统计信息。
感谢和问候,
维诺德·萨达南丹(Vinod Sadanandan)
甲骨文数据库管理员
From: https://bytes.com/topic/oracle/insights/745896-table-partition-performance-analysis