ORACLE查询优化及gather_plan_statistics hint

最新推荐文章于 2024-10-22 10:00:00 发布

wanglipo

最新推荐文章于 2024-10-22 10:00:00 发布

阅读量1.1k

点赞数 1

本文介绍如何使用gather_plan_statistics hint优化SQL查询，并通过实际案例展示了如何改善执行计划，减少资源消耗。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

转自：http://blog.csdn.net/indexman/article/details/42293007

查询优化手段和gather_plan_statistics hint：

在10g以后我们可以通过利用gather_plan_statistics提示来了解更多的SQL执行统计信息，具体使用方法如下：
如果在statistics_level参数设置为ALL，或者执行的sql使用了gather_plan_statistics hint，则在sql执行后，会在v$sql_plan_statistics_all视图中查到SQL的执行统计信息，例如逻辑读，物理读，基数等等。这些数据对于性能诊断有着非常大的帮助。同时v$sql_plan中的执行计划，与通过EXPLAINPLAN得到的执行计划相比，前者是oracle执行sql时真正使用的执行计划，而后者则可能不是真正的执行计划；同时有的时候，执行过的sql使用了绑定变量，而oracle在解析sql时通常会进行绑定变量窥探，这个时候我们不能使用EXPLAIN PLAN来得到那个sql的执行计划，就算得到的跟那个sql的真实的执行计划是不一样的，所以有时我们更愿意直接从v$sql_plan中得到执行计划。
下面结合tom的文章实例讲一下这个hint得用法：原文：http://www.oracle.com/technetwork/issue-archive/2014/14-nov/o64asktom-2298498.html
---------------------------------------------------------------------------------
一个查询调优的例子：
SQL> create table t
as
select case when mod(rownum,200000) = 0 then 5
else mod(rownum,4)
end X,
rpad( 'x', 100, 'x' ) data
from dual
connect by level <= 1000000
/

--此处为了迷惑优化器而制造了分布不均的数据（skew data）
SQL> create index t_idx on t(x);
Index created.

SQL> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.

SQL> select x, count(*)
from t
group by x
order by x
/

X COUNT(*)
——————————— ———————————
0 249995
1 250000
2 250000
3 250000
5 5

SQL> select /*+ gather_plan_statistics */
count(data)
from t
where x = 5;
/

--查看上面查询执行计划
SQL> select *
from table(
dbms_xplan.display_cursor( format=> 'allstats last' )
)
/

PLAN_TABLE_OUTPUT
——————————————————————————————————
SQL_ID cdwn5mqb0cpg1, child number 0
——————————————————————————————————
select /*+ gather_plan_statistics */
count(data)
from t
where x = 5

Plan hash value: 2966233522

———————————————————————————————————————————————————————————————————————————
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
———————————————————————————————————————————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.08 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.08 |
|* 2 | TABLE ACCESS FULL| T | 1 | 200K| 5 |00:00:00.08 |
———————————————————————————————————————————————————————————————————————————

Predicate Information (identified by operation id):
———————————————————————————————————————————————————————————————————————————

2 - filter("X"=5)

20 rows selected.

--可以看到执行计划走了全表扫描并且E-Rows，即预估基数为200K而A-rows，即实际返回基数为5差距相当大（a huge difference）。原因就是优化器对X列上分布不均的数据没有察觉。
--此时，我们需要提供X列上的直方图（histogram）让优化器掌控全面的基数信息。
SQL> select histogram
from user_tab_columns
where table_name = 'T'
and column_name = 'X';

HISTOGRAM
---------------
NONE

SQL> exec dbms_stats.gather_table_stats( user, 'T', no_invalidate=>false );

SQL> select histogram
from user_tab_columns
where table_name = 'T'
and column_name = 'X';

HISTOGRAM
---------------
FREQUENCY

----此处是用NO_INVALIDATE参数使之前的依赖游标失效，保证下次运行该查询执行硬解析（hard parse）
--再次执行前面的查询操作，查看运行时计划信息
PLAN_TABLE_OUTPUT
——————————————————————————————————
SQL_ID cdwn5mqb0cpg1, child number 0
——————————————————————————————————
select /*+ gather_plan_statistics */
count(data)
from t
where x = 5

Plan hash value: 1789076273

————————————————————————————————————————————————————————————————————————
| Id | Operation | Name | Starts | E-Rows | A-Rows |
————————————————————————————————————————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 1 | 182 | 5 |
|* 3 | INDEX RANGE SCAN | T_IDX | 1 | 182 | 5 |
————————————————————————————————————————————————————————————————————————
--------------
| A-Time |
--------------
|00:00:00.01 |
|00:00:00.01 |
|00:00:00.01 |
|00:00:00.01 |
Predicate Information (identified by operation id):
————————————————————————————————————————————————————————————————————————

3 - access("X"=5)

21 rows selected.

--此时执行计划走的是索引范围扫描且E-Rows已经很接近A-Rows，执行时间也大大缩减。
--这里你可能会犯嘀咕：咋第一次收集表信息时没有生成直方图？
看下tom的原话:
This change is likely due to the default value of the METHOD_OPT parameter used with DBMS_STATS. In Oracle Database 10g, the METHOD_OPT parameter defaults to a SIZE AUTO. After you ran a query, the database remembered the predicates and updated a dictionary table, SYS.COL_USAGE$. Then, the next time you ran DBMS_STATS to gather statistics on these tables, DBMS_STATS queried that table to find out what columns should have histograms collected automatically, based on past query workload. It looked at your predicates and said, "Hmm, these columns are candidates for histograms based on the queries the end users have been running."
听起来很神奇吧？事实却是如此！

△补充1：关于管道函数dbms_xplan.display_cursor( format=> 'allstats last' )要解释下几个参数的取值含义：
--- function display_cursor(sql_id varchar2 default null,
--- cursor_child_no integer default 0,
--- format varchar2 default 'TYPICAL')
---
--- - sql_id:
指定SQL_ID取值为V$SQL.SQL_ID, V$SESSION.SQL_ID,
或者V$SESSION.PREV_SQL_ID，如果不指定则默认为最后执行语句SQL_ID
--- - cursor_child_no:
指定sql游标的子号，取值为V$SQL.CHILD_NUMBER 或者 in V$SESSION.SQL_CHILD_NUMBER,V$SESSION.PREV_CHILD_NUMBER，
此参数只有指定sql_id情况下才有效。如果不指定，则指定sql_id下所有子游标都显示。
--- - format:
--- 指定输出列，值取自视图：v$sql_plan_statistics_all.
---
--- IOSTATS:
假设基本计划统计在SQL执行时已收集（使用gather_plan_statistics提示或设置statistics_level参数为ALL）
--- 此格式将展示所有游标执行的（或仅最后执行游标）IO统计。
--- MEMSTATS:
--- 假设PGA内存管理开启（例如：pga_aggregate_target参数设置为非0值），此格式允许展示内存管理统计
（例如：操作执行模式，内存使用，溢出到磁盘字节数）
--- ALLSTATS: 指定'IOSTATS MEMSTATS'的快捷命令
---
--- LAST:
--- 此格式就是指定仅显示最后执行sql的统计
---
--- Also, the following two formats are still supported for backward
--- compatibility: 另外以下2个参数为了向后兼容而保留
---
--- 'RUNSTATS_TOT': Same as 'IOSTATS', i.e. displays IO statistics
--- for all executions of the specified cursor.
--- 'RUNSTATS_LAST': Same as 'IOSTATS LAST', i.e. displays the runtime
--- statistics for the last execution of the cursor.
--- 权限要求:
必须具有：SELECT ON V$SQL_PLAN_STATISTICS_ALL,V$SQL, 和 V$SQL_PLAN权限。

△补充2：还可以使用另一路径查询SQL运行时执行计划：
HR@orcl> select /*+ gather_plan_statistics emp_plan*/count(1) from employees;

COUNT(1)
----------
107
HR@orcl> SELECT SQL_ID,CHILD_NUMBER FROM V$SQL WHERE SQL_TEXT LIKE '%emp_plan%' and SQL_TEXT NOT LIKE '%V$SQL%';

SQL_ID CHILD_NUMBER
------------- ------------
2c5q2d8489ybt 0

HR@orcl> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('2c5q2d8489ybt',0,'ALL IOSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------

SQL_ID 2c5q2d8489ybt, child number 0
-------------------------------------
select /*+ gather_plan_statistics emp_plan*/count(1) from employees

Plan hash value: 3580537945

----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| | 1 |00:00:00.02 | 1 | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | | | 1 |00:00:00.02 | 1 | 1 |
| 2 | INDEX FULL SCAN| EMP_EMAIL_UK | 1 | 107 | 1 (0)| 00:00:01 | 107 |00:00:00.02 | 1 | 1 |
----------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
2 - SEL$1 / EMPLOYEES@SEL$1

Column Projection Information (identified by operation id):

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------

1 - (#keys=0) COUNT(*)[22]

已选择25行。

---------------------------------------

Dylan Presents.