ORACLE SQL 优化相关概念

[oracle@MaxwellDBA admin]$ sqlplus sys/sys as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Oct 18 07:13:55 2022
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

sys@cdb$root:orclcdb> 

sys@cdb$root:orclcdb> select * from v$version where rownum=1;

BANNER                                                                           BANNER_FULL
-------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------
BANNER_LEGACY                                                                        CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production           Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
                                                                                 Version 19.3.0.0.0
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production                    0


1 row selected.

sys@cdb$root:orclcdb> show user
USER is "SYS"
sys@cdb$root:orclcdb> grant dba to scott;
grant dba to scott
             *
ERROR at line 1:
ORA-01917: user or role 'SCOTT' does not exist


sys@cdb$root:orclcdb> ALTER SESSION SET CONTAINER=ORCLPDB1;

Session altered.

sys@cdb$root:orclcdb> grant dba to SCOTT;

Grant succeeded.

sys@cdb$root:orclcdb> conn SCOTT/TIGER@ORCLPDB1;
Connected.
scott@orclpdb1:orclcdb> create table test as select * from dba_objects;

Table created.

scott@orclpdb1:orclcdb> 

概念1.基数(CARDINALITY)

某个列唯一键(Distinct Keys)的数量叫作基数。

主键列的基数等于表的总行数。基数的高低影响列的数据分布。

返回表中5%以内的数据走索引,返回超过5%的数据走全表扫描。

scott@orclpdb1:orclcdb> select count(*) from test where owner = 'SYS';

  COUNT(*)
----------
     52430

1 row selected.

scott@orclpdb1:orclcdb> select 52430/73516*100 "percent" from dual;

   percent
----------
71.3178084

1 row selected.

scott@orclpdb1:orclcdb> select count(*) from test where owner = 'SCOTT';

  COUNT(*)
----------
        30

1 row selected.

scott@orclpdb1:orclcdb> select 30/73516*100 "percent" from dual;

   percent
----------
.040807443

1 row selected.

scott@orclpdb1:orclcdb> 

如何查看数据分布:

select column_name, count(*) from table_name group by column_name order by 2 desc;

概念2: 选择性(Selectivity)

基数/总函数 * 100% 就是某个列的选择性

为了查看选择性,必须先收集统计信息。

以下为收集统计信息的脚本:

BEGIN
  DBMS_STATS.gather_table_stats(ownname => 'SCOTT',
                                tabname => 'TEST',
                                estimate_percent => 100,
                                method_opt => 'for all columns size 1',
                                no_invalidate => FALSE,
                                degree => 1,
                                cascade => TRUE);
END;
/

查看表中每个列的基数与选择性:

/*查看表中列的基数与选择性*/

select a.column_name,
       b.num_rows,
       a.num_distinct Cardinality,
       round(a.num_distinct/ b.num_rows * 100, 2) selectivity,
       a.histogram,
       a.num_buckets
from dba_tab_col_statistics a,dba_tables b
where a.owner = b.owner
and a.table_name = b.table_name
and a.owner = 'SCOTT'
and a.table_name = 'TEST';

优化思想1: 什么样的列必须要创建索引呢? 当一个列出现在where 条件中,该列没有创建索引并且选择性大于20%,那么该列就必须创建索引,从而优化SQL查询性能。

SQL优化的核心观点1: 只有大表才会产生性能问题。

抓住必须创建索引的列(调整后,可用于prod environment)

抓出表中where条件中的列可通过以下两种方法:

  • V$SQL_PLAN
  • 自动化脚本如下:
select r.name owner,
       o.name table_name,
       c.name column_name,
       equality_preds, ---等值过滤
       equijoin_preds, ---等值JOIN 比如 where a.id = b.id
       nonequijoin_preds, ---不等JOIN
       range_preds,  --- 范围过滤次数 > >= < <= between and
       like_preds, --- LIKE 过滤
       null_preds, --- NULL 过滤
       timestamp
from sys.col_usage$ u,sys.obj$ o, sys.col$ c, sys.user$ r
where o.obj# = u.obj#
  and c.obj# = u.obj#
  and c.col# = u.intcol#
  and r.name = 'SCOTT'
  and o.name = 'TEST';

自动化脚本抓取必须要建索引的列:

select owner,
       column_name,
       num_rows,
       Cardinality,
       Selectivity,
       'Need index' as notice
  from (select b.owner,
               a.column_name,
               b.num_rows,
               a.num_distinct Cardinality,
               round(a.num_distinct / b.num_rows * 100, 2) selectivity
          from dba_tab_col_statistics a, dba_tables b
         where a.owner = b.owner
           and a.table_name = b.table_name
           and a.owner = 'SCOTT'
           and a.table_name = 'TEST')
 where selectivity >= 20
   and column_name not in (select column_name
                        from dba_ind_columns
                       where table_owner = 'SCOTT'
                         and table_name = 'TEST')
   and column_name in
       (select c.name
          from sys.col_usage$ u, sys.obj$ o, sys.col$ c, sys.user$ r
         where o.obj# = u.obj#
           and c.obj# = u.obj#
           and c.col# = u.intcol#
           and r.name = 'SCOTT'
           and o.name = 'TEST')

概念3 直方图(Histogram)

如果没有对基数低的列收集直方图统计信息,基于成本的优化器(CBO)会认为该列数据分布是均衡的。

/*收集统计信息 for all columns size 1 表示对所有列都不收集直方图*/

BEGIN
  DBMS_STATS.gather_table_stats(ownname => 'SCOTT',
                                tabname => 'TEST',
                                estimate_percent => 100,
                                method_opt => 'for all columns size 1',
                                no_invalidate => FALSE,
                                degree => 1,
                                cascade => TRUE);
END;
/

Histogram 为none表示没有收集直方图

-------------查看表中每个列的基数和选择性---------------------------------

select a.column_name,
       b.num_rows,
       a.num_distinct Cardinality,
       round(a.num_distinct/ b.num_rows * 100, 2) selectivity,
       a.histogram,
       a.num_buckets
from dba_tab_col_statistics a,dba_tables b
where a.owner = b.owner
and a.table_name = b.table_name
and a.owner = 'SCOTT'
and a.table_name = 'TEST';

 

sys@cdb$root:orclcdb> conn SCOTT/TIGER@ORCLPDB1;
Connected.
scott@orclpdb1:orclcdb> set autot trace
scott@orclpdb1:orclcdb> 
scott@orclpdb1:orclcdb> select * from test where owner='SCOTT';

30 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1935 |   249K|   398   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST |  1935 |   249K|   398   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OWNER"='SCOTT')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1432  consistent gets
          0  physical reads
          0  redo size
       4947  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         30  rows processed

scott@orclpdb1:orclcdb> create index idx_owner on test(owner);

Index created.

scott@orclpdb1:orclcdb> select * from test where owner='SCOTT';

30 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3011311971

-------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |           |  1935 |   249K|    57   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST      |  1935 |   249K|    57   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IDX_OWNER |  1935 |       |     5   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"='SCOTT')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         10  consistent gets
          1  physical reads
          0  redo size
       6244  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         30  rows processed

scott@orclpdb1:orclcdb> select * from test where owner='SYS';

52430 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3011311971

-------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |           |  1935 |   249K|    57   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST      |  1935 |   249K|    57   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IDX_OWNER |  1935 |       |     5   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"='SYS')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       8218  consistent gets
        111  physical reads
          0  redo size
    8096719  bytes sent via SQL*Net to client
      38843  bytes received via SQL*Net from client
       3497  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      52430  rows processed
scott@orclpdb1:orclcdb> set autot off
scott@orclpdb1:orclcdb> select round(73516/38) from dual;

ROUND(73516/38)
---------------
           1935

1 row selected.

scott@orclpdb1:orclcdb> 

执行计划里面的Rows是假的。在做SQL优化的时候,经常需要做的工作就是帮助CBO计算比较准确的Rows.

如果CBO每次都能计算得到精确的Rows,那么我们只需要关心业务逻辑、表设计、SQL写法以及如果建立索引了。不用担心SQL会走错执行计划了。

scott@orclpdb1:orclcdb> 
scott@orclpdb1:orclcdb> 
scott@orclpdb1:orclcdb> BEGIN
  2    DBMS_STATS.gather_table_stats(ownname => 'SCOTT',
                                tabname => 'TEST',
  4                                  estimate_percent => 100,
  5                                  method_opt => 'for all columns size skewonly',
  6                                  no_invalidate => FALSE,
                                degree => 1,
  8                                  cascade => TRUE);
  9  END;
 10  /

PL/SQL procedure successfully completed.

scott@orclpdb1:orclcdb> 
scott@orclpdb1:orclcdb> select a.column_name,
  2         b.num_rows,
       a.num_distinct Cardinality,
       round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5         a.histogram,
  6         a.num_buckets
  from dba_tab_col_statistics a, dba_tables b
  8   where a.owner = b.owner
  9     and a.table_name = b.table_name
 10     and a.owner = 'SCOTT'
 11     and a.table_name = 'TEST';

COLUMN_NAME                                                                                                                        NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS
-------------------------------------------------------------------------------------------------------------------------------- ---------- ----------- ----------- --------------- -----------
OWNER                                                                                                                                 73516          38         .05 FREQUENCY                38
OBJECT_NAME                                                                                                                           73516       61420       83.55 HEIGHT BALANCED         254
SUBOBJECT_NAME                                                                                                                        73516         417         .57 HEIGHT BALANCED         254
OBJECT_ID                                                                                                                             73516       73514         100 NONE                      1
DATA_OBJECT_ID                                                                                                                        73516        6585        8.96 HEIGHT BALANCED         254
OBJECT_TYPE                                                                                                                           73516          47         .06 FREQUENCY                47
CREATED                                                                                                                               73516         990        1.35 HEIGHT BALANCED         254
LAST_DDL_TIME                                                                                                                         73516        1260        1.71 HEIGHT BALANCED         254
TIMESTAMP                                                                                                                             73516        1178         1.6 HEIGHT BALANCED         254
STATUS                                                                                                                                73516           2           0 FREQUENCY                 2
TEMPORARY                                                                                                                             73516           2           0 FREQUENCY                 2

COLUMN_NAME                                                                                                                        NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS
-------------------------------------------------------------------------------------------------------------------------------- ---------- ----------- ----------- --------------- -----------
GENERATED                                                                                                                             73516           2           0 FREQUENCY                 2
SECONDARY                                                                                                                             73516           2           0 FREQUENCY                 2
NAMESPACE                                                                                                                             73516          23         .03 FREQUENCY                23
EDITION_NAME                                                                                                                          73516           0           0 NONE                      0
SHARING                                                                                                                               73516           4         .01 FREQUENCY                 4
EDITIONABLE                                                                                                                           73516           2           0 FREQUENCY                 2
ORACLE_MAINTAINED                                                                                                                     73516           2           0 FREQUENCY                 2
APPLICATION                                                                                                                           73516           1           0 FREQUENCY                 1
DEFAULT_COLLATION                                                                                                                     73516           1           0 FREQUENCY                 1
DUPLICATED                                                                                                                            73516           1           0 FREQUENCY                 1
SHARDED                                                                                                                               73516           1           0 FREQUENCY                 1

COLUMN_NAME                                                                                                                        NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM       NUM_BUCKETS
-------------------------------------------------------------------------------------------------------------------------------- ---------- ----------- ----------- --------------- -----------
CREATED_APPID                                                                                                                         73516           0           0 NONE                      0
CREATED_VSNID                                                                                                                         73516           0           0 NONE                      0
MODIFIED_APPID                                                                                                                        73516           0           0 NONE                      0
MODIFIED_VSNID                                                                                                                        73516           0           0 NONE                      0

26 rows selected.

scott@orclpdb1:orclcdb> set autot trace
scott@orclpdb1:orclcdb> select * from test where owner='SCOTT';

30 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3011311971

-------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |           |    30 |  3960 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST      |    30 |  3960 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IDX_OWNER |    30 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"='SCOTT')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       6244  bytes sent via SQL*Net to client
        411  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         30  rows processed

scott@orclpdb1:orclcdb> select * from test where owner='SYS';

52430 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 52430 |  6758K|   399   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST | 52430 |  6758K|   399   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       4856  consistent gets
          0  physical reads
          0  redo size
    3453492  bytes sent via SQL*Net to client
      38843  bytes received via SQL*Net from client
       3497  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      52430  rows processed

scott@orclpdb1:orclcdb> 

直方图是用来帮助CBO在对基数很低、数据分布不均衡的列进行Rows估算的时候,可以得到更精确的Rows就够了。

什么样的列需要收集直方图呢?当列出现在where条件中,列的选择性小于1%并且该列没有收集过直方图,这样的列就应该收集直方图

注意:千万不能对没有出现在where条件中的列收集直方图。对没有出现在where条件中的列收集直方图完全是做无用功,浪费数据库资源。

抓出必须创建直方图的列:

select a.owner,
       a.table_name,
       a.column_name,
       b.num_rows,
       a.num_distinct Cardinality,
       trunc(a.num_distinct / b.num_rows * 100, 2) selectivity,
       'Need Gather Histogram' notice
  from dba_tab_col_statistics a, dba_tables b
 where a.owner = b.owner
   and a.table_name = b.table_name
   and a.owner = 'SCOTT'
   and a.table_name = 'TEST'
   and a.table_name = b.table_name
   and num_distinct / num_rows < 0.01
   and (a.owner, a.table_name, a.column_name) in
       (select r.name owner, o.name table_name, c.name column_name
          from sys.col_usage$ u, sys.obj$ o, sys.col$ c, sys.user$ r
         where o.obj# = u.obj#
           and c.obj# = u.obj#
           and c.col# = u.intcol#
           and r.name = 'SCOTT'
           and o.name = 'TEST')
   and a.histogram = 'NONE';

概念4 回表(TABLE ACCESS BY INDEX ROWID)

当对一个列创建索引之后,索引会包含该列的键值以及键值所对应行所在的rowid.

回表:通过索引中记录的rowid访问回表中的数据

回表属于单块读,回表次数太多会严重影响SQL性能,如果回表次数太多,就不应该走索引扫描,应该走全表扫描。

在进行SQL优化的时候,一定要注意回表次数,特别是要注意回表的物理I/O次数。

scott@orclpdb1:orclcdb> select * from test where owner='SYS';

52430 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3011311971

-------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |           |  1935 |   249K|    57   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST      |  1935 |   249K|    57   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IDX_OWNER |  1935 |       |     5   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"='SYS')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       8218  consistent gets
        111  physical reads
          0  redo size
    8096719  bytes sent via SQL*Net to client
      38843  bytes received via SQL*Net from client
       3497  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      52430  rows processed

scott@orclpdb1:orclcdb>

 TABLE ACCESS BY INDEX ROWID BATCHED 就是回表,索引返回多少行数据,回表就要回多少次,每次回表都是单块读。该SQL语句返回了52430行数据,那么回表就回了52430行数据。

 SQL的性能确实大部分损失在回表中,是否能回答返回表中5%内的数据走索引,超过表中5%的数据走全表扫描?根本原因在于回表

如何消除回表?

通过建立组合索引

概念5 集群因子(CLUSTER FACTOR)

集群因子用于判断索引回表需要消耗的物理I/O次数。

当索引范围扫描,索引全扫描不回表或者返回数据量很少的时候,不管集群因子多大,对SQL查询性能几乎没有任何影响。再次强调一遍,在进行SQL优化的时候,往往回建立合适的组合索引消除回表。或者建立组合索引尽量减少回表次数。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值