GBase 8a 日志之 trc 日志详解

gcluster 日志分为三类: trc 日志记录查询计划; express.log 记录 SQL执行过程以及执行过程中的警告和错误; system.log 主要记录 gcluster 的启停以及 crash 信息,日志存放路径为: /opt/gcluster/log/gcluster。  

       当需要分析 SQL 的查询计划时请查看 trc 日志, trc 日志详细记录了 SQL的查询和执行计划,通过分析这些计划来确定 SQL 的计划是否是最优的。  

       在 gcluster 配置文件中设置
       gbase_sql_trace = 1
       gbase_sql_trace_level = 3  

       后,查询计划可输出到.trc 文件中。  

日志内容  

       

gbase> SELECT b,SUM(a) FROM t1 GROUP BY b HAVING(SUM(a)>100);
//按照 b 进行动态 hash 的查询计划如下。
==============================================
QueryPlan:
++++++++++++++++++++++++++++++++++++++++++++++
Step: 0
Leaf type: REGULAR STEP
Need combiner: false
Is Broadcast: false
Target temp table:
_tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68
Temp table definition: CREATE TABLE
`gctmpdb`._tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68(`_col_1`
VARCHAR(100), `_col_2` DECIMAL(18,0)) ENGINE = express nolock
Optimization: {hash redist}
Hash Redist Indexes: 1
Query String: SELECT `zsy1.t1`.`b` AS `_col_1`,
SUM(`zsy1.t1`.`a`) AS `_col_2` FROM `zsy1`.t1 AS `zsy1.t1` GROUP BY
`_col_1`
++++++++++++++++++++++++++++++++++++++++++++++
Step: 1
Leaf type: REGULAR STEP
Need combiner: false
Is Broadcast: false
Target temp table: _tmp_n1_t12_2_356618_0x45a00b90_0x12eebd68
Temp table definition: CREATE TABLE
`gctmpdb`._tmp_n1_t12_2_356618_0x45a00b90_0x12eebd68(`b`
VARCHAR(100), `sum(a)` DECIMAL(18,0)) ENGINE = express nolock
Optimization:
Query String: SELECT _col_1 AS `b`, SUM(_col_2) AS `sum(a)`
FROM `gctmpdb`._tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68 GROUP
BY `b` HAVING (SUM(_col_2) > 100)
==============================================
2012-07-18 19:06:09.356 [M: 0B, 0B,D: 0B] [DC: 0,
0]
CExecStep
--------------------------
isFinalStep = 0
m_nRequestId = 0
StepNo = 0
DestType = 1
aStepDetail
---------------
isProducer = 1
isConsumer = 1
isGroupHashRedist = 0
isAllTableAreHashTmpDist = 0
queryString = SELECT `zsy1.t1`.`b` AS `_col_1`, SUM(`zsy1.t1`.`a`)
AS `_col_2` FROM `zsy1`.t1 AS `zsy1.t1` GROUP BY `_col_1`
targetTable =
`gctmpdb`._tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68
targetSchema = CREATE TABLE
`gctmpdb`._tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68(`_col_1`
VARCHAR(100), `_col_2` DECIMAL(18,0)) ENGINE = express nolock
DropList =
CExecStep
--------------------------
isFinalStep = 0
m_nRequestId = 0
StepNo = 1
DestType = 0
aStepDetail
---------------
isProducer = 1
isConsumer = 0
isGroupHashRedist = 0
isAllTableAreHashTmpDist = 1
queryString = SELECT _col_1 AS `b`, SUM(_col_2) AS `sum(a)` FROM
`gctmpdb`._tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68 GROUP BY `b`
HAVING (SUM(_col_2) > 100)
DropList =
2012-07-18 19:06:09.419 [M: 0B, 0B,D: 0B] [DC: 0, 0]
Executor success.

日志项目  

      ( 一)中间表的命名方式:

       _tmp_n1_t11_1_405747_0x45a002a0_0x12eebd68  

       _tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68  

       对于一个集群临时表名, _tmp_rht 前缀表示这是一个动态重分布临时表,仅有_tmp 前缀时,表示这是一个普通临时表(复制表)。其余部分,动态重分布临时表和普通临时表一样,因此这里以动态重分布产生的临时表名_tmp_rht_n1_t12_1_356508_0x45a00ad0_0x12eebd68 来统一说明。  

       _tmp_rht:动态重分布临时表
       n1:发起这条查询的节点后缀,这里表示此查询是从 n1 节点发起的
       t12: 查询计数, gcluster server 启动后,执行的第 n 条查询, n 从 1 开始。
       1:该查询产生的第 m 个临时表名, m 从 1 开始(因为查询会分解为多步执行,因此可能会产生多个临时表)
       356508: 产生临时表名时,系统时间的微秒部分
       0x45a00ad0:产生临时表名时,时间结构的内存地址
       0x12eebd68:产生临时表名的 thd 对象的内存地址  

      (二) Optimization
       A. {single hash node}
       Select * from t where hash_col = '123';
       B. {hash group by}
       Select * from t1 group by a, hash_col, c;  

       C. {distinct group hash}
       select count(distinct hash_col) from x1;
       select sum(distinct hash_col) from x1 group by id3;
       D. {distinct hash}
       select distinct a, hash_col, c from x1;
       select distinct id2, hash_col, id3 from x1 group by id3
       E. {one step hash join complex}  

       在多步计划中,其中一步可以使用 hash join complex 优化(主查询与相关子查询,或非相关子查询存在 hash 关系)
       测试用例如下:(id4 是 hash 列)  

gbase> SELECT x3.* FROM x1, x3 WHERE x1.id3 = x3.id3 AND x1.id4 IN (SELECT id4
FROM x2);
==============================================
2012-07-19 18:31:14.243
JOIN ----test.x3
|
test.x1
2012-07-19 18:31:14.243
==============================================
2012-07-19 18:31:14.244
JOIN {(`test.x1`.`id3` =
`test.x3`.`id3`)}---------------------------------------------------
--test.x3{}
|
test.x1{`test.x1`.`id4` IN ( SELECT `test.x2`.`id4` as `id4` From
`test`.`x2` as `test.x2`)}
2012-07-19 18:31:14.244 >>>>> using one step hash join complex
optimization
2012-07-19 18:31:14.244
==============================================
==============================================
QueryPlan:
++++++++++++++++++++++++++++++++++++++++++++++
Step: 0
Leaf type: REGULAR STEP
Need combiner: false
Is Broadcast: false
Target temp table: _tmp_n1_t15_1_244477_0x45cfca40_0x13c53788
Temp table definition: CREATE TABLE
`gctmpdb`._tmp_n1_t15_1_244477_0x45cfca40_0x13c53788(`id3` INT)
ENGINE = express nolock
Optimization: {one step hash join complex}
Query String: SELECT `test.x1`.`id3` AS `id3` FROM
`test`.x1 AS `test.x1` WHERE `test.x1`.`id4` IN ( SELECT `test.x2`.`id4`
as `id4` From `test`.`x2` as `test.x2`)
++++++++++++++++++++++++++++++++++++++++++++++
Step: 1
Leaf type: REGULAR STEP
Need combiner: false
Is Broadcast: false
Target temp table: _tmp_n1_t15_2_244595_0x45cfcaf0_0x13c53788
Temp table definition: CREATE TABLE
`gctmpdb`._tmp_n1_t15_2_244595_0x45cfcaf0_0x13c53788(`entry_id` INT,
`id2` INT, `id3` INT, `id4` INT) ENGINE = express nolock
Optimization:
Query String: SELECT `test.x3`.`entry_id` AS `entry_id`,
`test.x3`.`id2` AS `id2`, `test.x3`.`id3` AS `id3`, `test.x3`.`id4` AS
`id4` FROM `gctmpdb`._tmp_n1_t15_1_244477_0x45cfca40_0x13c53788 INNER
JOIN `test`.x3 AS `test.x3` ON
(`_tmp_n1_t15_1_244477_0x45cfca40_0x13c53788`.`id3` =
`test.x3`.`id3`)
==============================================
2012-07-19 18:31:14.244
CExecStep
--------------------------
isFinalStep = 0
m_nRequestId = 0
StepNo = 0
DestType = 1
aStepDetail
---------------
isProducer = 1
isConsumer = 1
isGroupHashRedist = 0
isAllTableAreHashTmpDist = 0
queryString = SELECT `test.x1`.`id3` AS `id3` FROM `test`.x1 AS
`test.x1` WHERE `test.x1`.`id4` IN ( SELECT `test.x2`.`id4` as `id4`
From `test`.`x2` as `test.x2`)
targetTable =
`gctmpdb`._tmp_n1_t15_1_244477_0x45cfca40_0x13c53788
targetSchema = CREATE TABLE
`gctmpdb`._tmp_n1_t15_1_244477_0x45cfca40_0x13c53788(`id3` INT)
ENGINE = express nolock
DropList =
CExecStep
--------------------------
isFinalStep = 0
m_nRequestId = 0
StepNo = 1
DestType = 0
aStepDetail
---------------
isProducer = 1
isConsumer = 0
isGroupHashRedist = 0
isAllTableAreHashTmpDist = 0
queryString = SELECT `test.x3`.`entry_id` AS `entry_id`,
`test.x3`.`id2` AS `id2`, `test.x3`.`id3` AS `id3`, `test.x3`.`id4` AS
`id4` FROM `gctmpdb`._tmp_n1_t15_1_244477_0x45cfca40_0x13c53788 INNER
JOIN `test`.x3 AS `test.x3` ON
(`_tmp_n1_t15_1_244477_0x45cfca40_0x13c53788`.`id3` =
`test.x3`.`id3`)
DropList =
2012-07-19 18:31:15.171 Executor success.
F. {hash redist}
Hash Redist Indexes: 1

       hash 重分布步骤, Hash Redist Indexes 表示按 projection 的第 n 列进行
       hash 重分布(n 从 1 开始)
       例如: select 执行结果按照 projection 的第 1 列(`zsy1.t1`.`b` AS`_col_1`)进行动态 hash 后,把结果发送到各个节点,日志如下:  

Optimization: {hash redist}
Hash Redist Indexes: 1
Query String: SELECT `zsy1.t1`.`b` AS `_col_1`,
SUM(`zsy1.t1`.`a`) AS `_col_2`
FROM `zsy1`.t1 AS `zsy1.t1` GROUP BY `_col_1`

       (三) DestType(中间结果的汇聚方式)
       DEST_TYPE_NOSET = 0, ///< not set  

       DEST_TYPE_BROADCAST = 1,///< 广播到所有的 Node : 8a -> 8a 或者是 Cluster -> 8a
       DEST_TYPE_COORD = 2, ///< 发送结果到 Cluster
       DEST_TYPE_ONE = 3, ///< 发送结果到一个 Node 目前为当前Cluster 所在的 8a  

       (四) isProducer
       生产者
       (五) isConsumer
       消费者
       (六) JOIN 树
       Inner join 评估把较小的表放在左面,较大的表放到右面,如果要拉复制表,则把左面的表拉成复制表, 如果是 left join 则不能改变顺序,只能把右边的表拉成复制表。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值