Trafodion对数据的扫描方式主要有五种:
- 全表扫描
- 基于主键的扫描
- 基于主键的MDAM扫描
- 索引扫描
- 基于主键的VSBB扫描
全表扫描
全表扫描,即要扫描表的全部数据,非基于Key的过滤扫描,也非基于索引的过滤扫描。典型的情况是基于表上的非主键列或索引列的条件过滤。
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where CREATE_NUM=0;
其中,CREATE_NUM字段为表中非主键且非索引字段,其详细执行计划如下,
TRAFODION_SCAN ============================ SEQ_NO 1 NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ......... 4,682,005
EST_OPER_COST .......... 230.72
EST_TOTAL_COST ......... 230.72
DESCRIPTION
max_card_est ........... 4.682e+06
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ...... 1,678
scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
TAT_NOPRAT
object_type ............ Trafodion
columns ................ all
begin_keys(incl)
end_keys(incl)
cache_size ......... 1,024
probes ................. 1
rows_accessed .......... 5.03481e+06
column_retrieved ....... #1:1
key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
executor_predicates .... (CREATE_NUM = %(0))
简化的执行计划如下,
SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where CREATE_NUM=0;
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
1 . 2 root 4.68E+006
. . 1 trafodion_scan DMA_ENTTYPE_STAT_NOP 4.68E+006
基于主键的扫描
基于Key的扫描,即基于Trafodion表的主键/聚集键的扫描,通常是基于主键/聚集键的第一个字段(如果有多个字段的话)的条件过滤。
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON_YEAR=‘2018’;
其中,STAT_MON_YEAR为表的主键的第一个字段,其详细执行计划如下,
TRAFODION_SCAN ============================ SEQ_NO 1 NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ........... 600,600
EST_OPER_COST ........... 27.53
EST_TOTAL_COST .......... 27.53
DESCRIPTION
max_card_est ..... 600,600
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ...... 1,672
scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
TAT_NOPRAT
object_type ............ Trafodion
cache_size ......... 1,024
probes ................. 1
rows_accessed .... 600,600
column_retrieved ....... #1:1
key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
begin_key .............. (STAT_MON_YEAR = %('2018')), (STAT_MON = '<min>'),
(DATA_ID = <min>)
end_key ................ (STAT_MON_YEAR = %('2018')), (STAT_MON = '<max>'),
(DATA_ID = <max>)
简化的执行计划如下,
SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON_YEAR='2018';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
1 . 2 root 6.00E+005
. . 1 trafodion_scan DMA_ENTTYPE_STAT_NOP 6.00E+005
基于主键的MDAM扫描
MDAM扫描,本质上也是基于Key的扫描的一种形式,MDAM是Trafodion特有的一项专利技术,关于MDAM的介绍,读者可参考https://blog.csdn.net/Post_Yuan/article/details/79277974及https://blog.csdn.net/Post_Yuan/article/details/79280406中有关于MDAM的一点介绍。与上述基于Key的扫描不同的是,MDAM扫描一般是基于主键的非前导列(非第一个主键列)
比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON=‘201801’;
其中,STAT_MON是表的主键的第二个字段,详细执行计划如下,
TRAFODION_SCAN ============================ SEQ_NO 1 NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ............ 76,200
EST_OPER_COST ............ 3.78
EST_TOTAL_COST ........... 3.78
DESCRIPTION
max_card_est ...... 76,200
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ...... 1,688
scan_type .............. subset scan limited by mdam of table
TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
object_type ............ Trafodion
cache_size ......... 1,024
probes ................. 1
rows_accessed ..... 76,200
column_retrieved ....... #1:1
key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
mdam_disjunct .......... (STAT_MON = %('201801'))
简化的执行计划如下,
SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where STAT_MON='201801';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
1 . 2 root 7.61E+004
. . 1 trafodion_scan DMA_ENTTYPE_STAT_NOP 7.61E+004
索引扫描
索引扫描,即基于表中的索引列做条件过滤,比如以下语句,
select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
ENTTYPE字段上建有相应的索引IDX_ENTTYPE,详细执行计划如下,
NESTED_JOIN =============================== SEQ_NO 3 CHILDREN 1, 2
REQUESTS_IN .............. 1
ROWS_OUT ............ 35,963
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ......... 618.64
DESCRIPTION
max_card_est ...... 35,963
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ...... 1,688
join_type .............. inner
join_method ............ nested
TRAFODION_VSBB_SCAN ======================= SEQ_NO 2 NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST .......... 607.28
EST_TOTAL_COST ......... 607.28
DESCRIPTION
max_card_est ........... 5.03481e+06
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ...... 1,604
scan_type .............. subset scan of table TRAFODION.DAAS_GX.DMA_ENTTYPE_S
TAT_NOPRAT
object_type ............ Trafodion
cache_size ......... 1,024
probes ............ 35,963
successful_probes 35,963
unique_probes ..... 35,963
rows_accessed ..... 35,963
column_retrieved ....... #1:1
key_columns ............ STAT_MON_YEAR, STAT_MON, DATA_ID
begin_key .............. (STAT_MON_YEAR = TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_
MON_YEAR), (STAT_MON = TRAFODION.DAAS_GX.IDX_ENTTY
PE.STAT_MON), (DATA_ID =
TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID)
end_key ................ (STAT_MON_YEAR = TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_
MON_YEAR), (STAT_MON = TRAFODION.DAAS_GX.IDX_ENTTY
PE.STAT_MON), (DATA_ID =
TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID)
TRAFODION_INDEX_SCAN ====================== SEQ_NO 1 NO CHILDREN
TABLE_NAME ............... DMA_ENTTYPE_STAT_NOPRAT
REQUESTS_IN .............. 1
ROWS_OUT ............ 35,963
EST_OPER_COST ........... 11.35
EST_TOTAL_COST .......... 11.35
DESCRIPTION
max_card_est ...... 35,963
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length ......... 84
scan_type .............. subset scan of index TRAFODION.DAAS_GX.IDX_ENTTYPE(T
RAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT)
object_type ............ Trafodion
columns ................ all
begin_keys(incl)
end_keys(incl)
cache_size ......... 1,024
probes ................. 1
rows_accessed .......... 5.03481e+06
column_retrieved ....... #1:1
key_columns ............ TRAFODION.DAAS_GX.IDX_ENTTYPE.ENTTYPE,
TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_MON_YEAR,
TRAFODION.DAAS_GX.IDX_ENTTYPE.STAT_MON,
TRAFODION.DAAS_GX.IDX_ENTTYPE.DATA_ID
executor_predicates .... (cast(TRAFODION.DAAS_GX.IDX_ENTTYPE.ENTTYPE) =
cast(1234))
简化的执行计划如下,
SQL>explain options 'f' select * from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 3.59E+004
1 2 3 nested_join 3.59E+004
. . 2 trafodion_vsbb_scan DMA_ENTTYPE_STAT_NOP 1.00E+000
. . 1 trafodion_index_scan IDX_ENTTYPE 3.59E+004
基于主键的VSBB扫描
基于主键的VSBB扫描,一般发生在两表采用NESTED_JOINS关联方式且关联字段为主键的情况下,也时常发生在通常索引扫描并回表的情况下(如上述INDEX扫描中所示的执行计划)。例如如下语句,
>>create table tw (a int not null primary key);
--- SQL operation complete.
>>control query shape nested_join(cut,cut) ;
--- SQL operation complete.
>>explain options 'f' select * from tw a, tw b where a.a = b.a ;
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 1.00E+002
1 2 3 nested_join 1.00E+002
. . 2 trafodion_vsbb_scan TW 1.00E+000
. . 1 trafodion_scan TW 1.00E+002
--- SQL operation complete.
通过以下执行计划可以发现,对TW表做了trafodion_vsbb_scan。详细的执行计划如下,
TRAFODION_VSBB_SCAN ======================= SEQ_NO 2 NO CHILDREN
TABLE_NAME ............... TW
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST ............ 0.02
EST_TOTAL_COST ........... 0.02
DESCRIPTION
max_card_est ......... 100
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
record_length .......... 0
scan_type .............. subset scan of table TRAFODION.SEABASE.TW A
object_type ............ Trafodion
cache_size ........... 100
small_scanner .......... ON
probes ............... 100
successful_probes .... 100
unique_probes ........ 100
rows_accessed ........ 100
column_retrieved ....... #1:1
key_columns ............ A
begin_key .............. (A = A)
end_key ................ (A = A)
以上几种执行计划的结果显示,前三种执行计划(包括全表扫描、基于主键的扫描和MDAM扫描)的简化执行计划都显示的是trafodion_scan,而具体属于哪一种需要通过详细的执行计划才能看得出。第四种执行计划(索引扫描)区别较为明显,它是trafodion_index_scan和trafodion_vsbb_scan做nested_join的形式,意味着数据库会基于索引字段直接从索引表扫描找到对应的行集,并基于这些行集中的主键列信息快速定位到主表中的相应行信息。
对于索引扫描这个执行计划,由于语句是select *,需要获取表的所有字段信息,因此需要通过索引与主表的结合才能找出所有字段信息,如若把语句改为以下,
select ENTTYPE from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
那么执行计划就可以不必再回溯到主表,因为索引中已经包含了查询需要的所有信息,
SQL>explain options 'f' select ENTTYPE from DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT where ENTTYPE=1234;
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
1 . 2 root 3.59E+004
. . 1 trafodion_index_scan IDX_ENTTYPE 3.59E+004
以下附属表的DDL结构,
SQL>showddl DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT;
CREATE TABLE TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
(
DATA_ID LARGEINT NO DEFAULT NOT NULL NOT DROPPABLE
NOT SERIALIZED
, STAT_MON VARCHAR(8 CHARS) CHARACTER SET UTF8
COLLATE DEFAULT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
, STAT_MON_YEAR CHAR(4 CHARS) CHARACTER SET UTF8 COLLATE
DEFAULT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
, ENTTYPE CHAR(6 CHARS) CHARACTER SET UTF8 COLLATE
DEFAULT DEFAULT NULL NOT SERIALIZED
, ENTTYPESUB CHAR(1 CHAR) CHARACTER SET UTF8 COLLATE
DEFAULT DEFAULT NULL NOT SERIALIZED
, INDUSTRYPHY VARCHAR(128 CHARS) CHARACTER SET UTF8
COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
, DOMDISTRICT VARCHAR(128 CHARS) CHARACTER SET UTF8
COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
, ENT_SCALE VARCHAR(128 CHARS) CHARACTER SET UTF8
COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
, END_NUM LARGEINT DEFAULT NULL NOT SERIALIZED
, CREATE_NUM LARGEINT DEFAULT NULL NOT SERIALIZED
, EXIT_NUM LARGEINT DEFAULT NULL NOT SERIALIZED
, REGCAP DECIMAL(18, 6) DEFAULT NULL NOT SERIALIZED
, PRIMARY KEY (STAT_MON_YEAR ASC, STAT_MON ASC, DATA_ID ASC)
)
ATTRIBUTES ALIGNED FORMAT NAMESPACE 'TRAF_1500000'
HBASE_OPTIONS
(
DATA_BLOCK_ENCODING = 'FAST_DIFF',
COMPRESSION = 'SNAPPY',
MEMSTORE_FLUSH_SIZE = '1073741824'
)
;
CREATE INDEX IDX_ENTTYPE ON TRAFODION.DAAS_GX.DMA_ENTTYPE_STAT_NOPRAT
(
ENTTYPE ASC
)
ATTRIBUTES ALIGNED FORMAT
HBASE_OPTIONS
(
DATA_BLOCK_ENCODING = 'FAST_DIFF',
COMPRESSION = 'SNAPPY',
MEMSTORE_FLUSH_SIZE = '1073741824'
)
;