Trafodion执行计划之SCAN Operator

Trafodion对数据的扫描方式主要有五种:

  1. 全表扫描
  2. 基于主键的扫描
  3. 基于主键的MDAM扫描
  4. 索引扫描
  5. 基于主键的VSBB扫描

全表扫描

全表扫描,即要扫描表的全部数据,非基于Key的过滤扫描,也非基于索引的过滤扫描。典型的情况是基于表上的非主键列或索引列的条件过滤。
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where CREATE_NUM=0;
其中,CREATE_NUM字段为表中非主键且非索引字段,其详细执行计划如下,

TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ......... 4,682,005
EST_OPER_COST .......... 230.72
EST_TOTAL_COST ......... 230.72
DESCRIPTION
  max_card_est ........... 4.682e+06
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ...... 1,678
  scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
                             TAT_NOPRAT
  object_type ............ Trafodion
  columns ................ all
  begin_keys(incl)
  end_keys(incl)
  cache_size ......... 1,024
  probes ................. 1
  rows_accessed .......... 5.03481e+06
  column_retrieved ....... #1:1
  key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
  executor_predicates .... (CREATE_NUM = %(0))

简化的执行计划如下,

SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where CREATE_NUM=0;

 
LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD   
---- ---- ---- --------------------  --------  --------------------  ---------
 
1    .    2    root                                                  4.68E+006
.    .    1    trafodion_scan                  DMA_ENTTYPE_STAT_NOP  4.68E+006

基于主键的扫描

基于Key的扫描,即基于Trafodion表的主键/聚集键的扫描,通常是基于主键/聚集键的第一个字段(如果有多个字段的话)的条件过滤。
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON_YEAR=‘2018’;
其中,STAT_MON_YEAR为表的主键的第一个字段,其详细执行计划如下,

TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ........... 600,600
EST_OPER_COST ........... 27.53
EST_TOTAL_COST .......... 27.53
DESCRIPTION
  max_card_est ..... 600,600
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ...... 1,672
  scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
                             TAT_NOPRAT
  object_type ............ Trafodion
  cache_size ......... 1,024
  probes ................. 1
  rows_accessed .... 600,600
  column_retrieved ....... #1:1
  key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
  begin_key .............. (STAT_MON_YEAR = %('2018')), (STAT_MON = '<min>'),
                             (DATA_ID = <min>)
  end_key ................ (STAT_MON_YEAR = %('2018')), (STAT_MON = '<max>'),
                             (DATA_ID = <max>)

简化的执行计划如下,

SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON_YEAR='2018';

 
LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD   
---- ---- ---- --------------------  --------  --------------------  ---------
 
1    .    2    root                                                  6.00E+005
.    .    1    trafodion_scan                  DMA_ENTTYPE_STAT_NOP  6.00E+005

基于主键的MDAM扫描

MDAM扫描,本质上也是基于Key的扫描的一种形式,MDAM是Trafodion特有的一项专利技术,关于MDAM的介绍,读者可参考https://blog.csdn.net/Post_Yuan/article/details/79277974及https://blog.csdn.net/Post_Yuan/article/details/79280406中有关于MDAM的一点介绍。与上述基于Key的扫描不同的是,MDAM扫描一般是基于主键的非前导列(非第一个主键列)
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON=‘201801’;
其中,STAT_MON是表的主键的第二个字段,详细执行计划如下,

TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ............ 76,200
EST_OPER_COST ............ 3.78
EST_TOTAL_COST ........... 3.78
DESCRIPTION
  max_card_est ...... 76,200
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ...... 1,688
  scan_type .............. subset scan limited by mdam of table
                             TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
  object_type ............ Trafodion
  cache_size ......... 1,024
  probes ................. 1
  rows_accessed ..... 76,200
  column_retrieved ....... #1:1
  key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
  mdam_disjunct .......... (STAT_MON = %('201801'))

简化的执行计划如下,

SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON='201801';

 
LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD   
---- ---- ---- --------------------  --------  --------------------  ---------
 
1    .    2    root                                                  7.61E+004
.    .    1    trafodion_scan                  DMA_ENTTYPE_STAT_NOP  7.61E+004

索引扫描

索引扫描,即基于表中的索引列做条件过滤,比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
ENTTYPE字段上建有相应的索引IDX_ENTTYPE,详细执行计划如下,

NESTED_JOIN ===============================  SEQ_NO 3        CHILDREN 1, 2
REQUESTS_IN .............. 1
ROWS_OUT ............ 35,963
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ......... 618.64
DESCRIPTION
  max_card_est ...... 35,963
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ...... 1,688
  join_type .............. inner
  join_method ............ nested
 
 
TRAFODION_VSBB_SCAN =======================  SEQ_NO 2        NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST .......... 607.28
EST_TOTAL_COST ......... 607.28
DESCRIPTION
  max_card_est ........... 5.03481e+06
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ...... 1,604
  scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
                             TAT_NOPRAT
  object_type ............ Trafodion
  cache_size ......... 1,024
  probes ............ 35,963
  successful_probes   35,963
  unique_probes ..... 35,963
  rows_accessed ..... 35,963
  column_retrieved ....... #1:1
  key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
  begin_key .............. (STAT_MON_YEAR = TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_
                             MON_YEAR), (STAT_MON = TRAFODION.DAAS_GX.IDX_ENTTY
                             PE.STAT_MON), (DATA_ID =
                             TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID)
  end_key ................ (STAT_MON_YEAR = TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_
                             MON_YEAR), (STAT_MON = TRAFODION.DAAS_GX.IDX_ENTTY
                             PE.STAT_MON), (DATA_ID =
                             TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID)
 
 
TRAFODION_INDEX_SCAN ======================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ............ 35,963
EST_OPER_COST ........... 11.35
EST_TOTAL_COST .......... 11.35
DESCRIPTION
  max_card_est ...... 35,963
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length ......... 84
  scan_type .............. subset scan of index TRAFODION.DAAS_GX.IDX_ENTTYPE(T
                             RAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT)
  object_type ............ Trafodion
  columns ................ all
  begin_keys(incl)
  end_keys(incl)
  cache_size ......... 1,024
  probes ................. 1
  rows_accessed .......... 5.03481e+06
  column_retrieved ....... #1:1
  key_columns ............ TRAFODION.DAAS_GX.IDX_ENTTYPE.ENTTYPE,
                             TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_MON_YEAR,
                             TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_MON,
                             TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID
  executor_predicates .... (cast(TRAFODION.DAAS_GX.IDX_ENTTYPE.ENTTYPE) =
                             cast(1234))

简化的执行计划如下,

SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;

 
LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD   
---- ---- ---- --------------------  --------  --------------------  ---------
 
3    .    4    root                                                  3.59E+004
1    2    3    nested_join                                           3.59E+004
.    .    2    trafodion_vsbb_scan             DMA_ENTTYPE_STAT_NOP  1.00E+000
.    .    1    trafodion_index_scan            IDX_ENTTYPE           3.59E+004

基于主键的VSBB扫描

基于主键的VSBB扫描,一般发生在两表采用NESTED_JOINS关联方式且关联字段为主键的情况下,也时常发生在通常索引扫描并回表的情况下(如上述INDEX扫描中所示的执行计划)。例如如下语句,

>>create table tw (a int not null primary key);

--- SQL operation complete.
>>control query shape nested_join(cut,cut) ;

--- SQL operation complete.
>>explain options 'f' select * from tw a, tw b where a.a = b.a ;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

3    .    4    root                                                  1.00E+002
1    2    3    nested_join                                           1.00E+002
.    .    2    trafodion_vsbb_scan             TW                    1.00E+000
.    .    1    trafodion_scan                  TW                    1.00E+002

--- SQL operation complete.

通过以下执行计划可以发现,对TW表做了trafodion_vsbb_scan。详细的执行计划如下,

TRAFODION_VSBB_SCAN =======================  SEQ_NO 2        NO CHILDREN
TABLE_NAME ............... TW
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST ............ 0.02
EST_TOTAL_COST ........... 0.02
DESCRIPTION
  max_card_est ......... 100
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  record_length .......... 0
  scan_type .............. subset scan of table TRAFODION.SEABASE.TW A
  object_type ............ Trafodion
  cache_size ........... 100
  small_scanner .......... ON
  probes ............... 100
  successful_probes .... 100
  unique_probes ........ 100
  rows_accessed ........ 100
  column_retrieved ....... #1:1
  key_columns ............ A
  begin_key .............. (A = A)
  end_key ................ (A = A)

以上几种执行计划的结果显示,前三种执行计划(包括全表扫描、基于主键的扫描和MDAM扫描)的简化执行计划都显示的是trafodion_scan,而具体属于哪一种需要通过详细的执行计划才能看得出。第四种执行计划(索引扫描)区别较为明显,它是trafodion_index_scan和trafodion_vsbb_scan做nested_join的形式,意味着数据库会基于索引字段直接从索引表扫描找到对应的行集,并基于这些行集中的主键列信息快速定位到主表中的相应行信息。
对于索引扫描这个执行计划,由于语句是select *,需要获取表的所有字段信息,因此需要通过索引与主表的结合才能找出所有字段信息,如若把语句改为以下,
select ENTTYPE from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
那么执行计划就可以不必再回溯到主表,因为索引中已经包含了查询需要的所有信息,

SQL>explain options 'f' select ENTTYPE from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;

 
LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD   
---- ---- ---- --------------------  --------  --------------------  ---------
 
1    .    2    root                                                  3.59E+004
.    .    1    trafodion_index_scan            IDX_ENTTYPE           3.59E+004

以下附属表的DDL结构,

SQL>showddl DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT;

 
CREATE TABLE TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
  ( 
    DATA_ID                          LARGEINT NO DEFAULT NOT NULL NOT DROPPABLE
      NOT SERIALIZED
  , STAT_MON                         VARCHAR(8 CHARS) CHARACTER SET UTF8
      COLLATE DEFAULT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
  , STAT_MON_YEAR                    CHAR(4 CHARS) CHARACTER SET UTF8 COLLATE
      DEFAULT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
  , ENTTYPE                          CHAR(6 CHARS) CHARACTER SET UTF8 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ENTTYPESUB                       CHAR(1 CHAR) CHARACTER SET UTF8 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , INDUSTRYPHY                      VARCHAR(128 CHARS) CHARACTER SET UTF8
      COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
  , DOMDISTRICT                      VARCHAR(128 CHARS) CHARACTER SET UTF8
      COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
  , ENT_SCALE                        VARCHAR(128 CHARS) CHARACTER SET UTF8
      COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
  , END_NUM                          LARGEINT DEFAULT NULL NOT SERIALIZED
  , CREATE_NUM                       LARGEINT DEFAULT NULL NOT SERIALIZED
  , EXIT_NUM                         LARGEINT DEFAULT NULL NOT SERIALIZED
  , REGCAP                           DECIMAL(18, 6) DEFAULT NULL NOT SERIALIZED
  , PRIMARY KEY (STAT_MON_YEAR ASC, STAT_MON ASC, DATA_ID ASC)
  )
 ATTRIBUTES ALIGNED FORMAT NAMESPACE 'TRAF_1500000' 
  HBASE_OPTIONS 
  ( 
    DATA_BLOCK_ENCODING = 'FAST_DIFF',
    COMPRESSION = 'SNAPPY',
    MEMSTORE_FLUSH_SIZE = '1073741824' 
  ) 
;

CREATE INDEX IDX_ENTTYPE ON TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
  ( 
    ENTTYPE ASC
  )
 ATTRIBUTES ALIGNED FORMAT 
  HBASE_OPTIONS 
  ( 
    DATA_BLOCK_ENCODING = 'FAST_DIFF',
    COMPRESSION = 'SNAPPY',
    MEMSTORE_FLUSH_SIZE = '1073741824' 
  ) 
;
  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

数据源的港湾

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值