Greenplum explain analyze 解读 + 深度明细开关

最新推荐文章于 2023-08-24 10:31:11 发布

weixin_34195546

最新推荐文章于 2023-08-24 10:31:11 发布

阅读量1.8k

点赞数

文章标签：数据库 runtime

原文链接：https://yq.aliyun.com/articles/277981

版权

背景

解读Greenplum 的explain analyze输出。同时介绍两个深度明细开关explain_memory_verbosity , gp_enable_explain_allstat 。

例子

如果不打开详细开关，那么统计信息中，rows out后面的明细，实际上输出的是返回行数最多的SEGMENT，并不一定是最慢的segment（请务必注意）。因此我们看到上面的offset甚至有一些比下面的offset更小的，原因就在此。

解释请参考如下：

其中，start offset 表示从query dispatcher(QD)发起，到返回这个NODE的第一条有效tuple的时间。

可以说明几个问题：

1、query dispatch下发执行计划到segment，启动segment 的worker process。这之间有一定的时间差，体现在start offset层面。

开始时间是qd下发分布式执行计划，如果有一些offset比较大，可能是worker process fork较慢引起。

2、结束时间是获取到当前NODE的第一条有效记录是结束点，如果有一些offset比较大，也可能是当前worker处理较慢，获得第一条有效记录时间较久。

3、又或者，数据较倾斜，得到第一条有效记录，需要过滤很多无效记录。

postgres=# explain analyze select count(*),c1 from a group by c1 limit 1;  
                                                                            QUERY PLAN                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=1561155.53..1561155.57 rows=1 width=12)  
   Rows out:  1 rows with 0.003 ms to first row, 0.031 ms to end, start offset by 522 ms.  
   ->  Gather Motion 48:1  (slice2; segments: 48)  (cost=1561155.53..1561155.57 rows=1 width=12)  
         Rows out:  1 rows at destination with 3994 ms to end, start offset by 522 ms.  
         ->  Limit  (cost=1561155.53..1561155.55 rows=1 width=12)  
               Rows out:  Avg 1.0 rows x 41 workers.  Max 1 rows (seg0) with 0.006 ms to end, start offset by 538 ms.  
               ->  HashAggregate  (cost=1561155.53..1561156.80 rows=3 width=12)  
                     Group By: a.c1  
                     Rows out:  Avg 1.0 rows x 41 workers.  Max 1 rows (seg0) with 0.003 ms to first row, 1252 ms to end, start offset by 538 ms.  
                     ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=1561152.00..1561154.02 rows=3 width=12)  
                           Hash Key: a.c1  
                           -- 平均返回多少行，返回最多行的节点，Rows out:  Avg 118.2 rows x 41 workers at destination.  Max 192 rows  
			   -- 从这个worker第一次Entry to this plan node时间到返回这个NODE的第一条有效tuple，with xxx ms to first row   
			   -- 从这个worker第一次Entry to this plan node时间到返回这个NODE的最后一条有效tuple，(seg9) with 2669 ms to end  
			   -- 从query dispatcher(QD)发起，到返回这个NODE的第一条有效tuple，start offset by 595 ms.  
			   Rows out:  Avg 118.2 rows x 41 workers at destination.  Max 192 rows (seg9) with 2669 ms to end, start offset by 595 ms.  
                           ->  HashAggregate  (cost=1561152.00..1561152.00 rows=3 width=12)  
                                 Group By: a.c1  
                                -- appendStringInfo(str,  
                                --                               "Rows in:  Avg %.1f rows x %d workers."  
                                --                               "  Max %.0f rows%s",  
                                --                               ntuples_avg,  
                                --                               ns->ntuples.vcnt,  
                                --                               ns->ntuples.vmax,  
                                --                               segbuf);  
			        -- Rows out:  {noRowRequested} rows{ns->ntuples.vmax}    
			        -- with {Time from this worker's first InstrStartNode() to its first result row} to first row  
				-- {Time from this worker's first InstrStartNode() to end of its results} to end  
				-- start offset by {Time from start of query on qDisp to this worker's first result row}   
				 Rows out:  Avg 101.0 rows x 48 workers.  Max 101 rows (seg0) with 0.005 ms to first row, 2706 ms to end, start offset by 572 ms.  
                                 ->  Append-only Columnar Scan on a  (cost=0.00..1061152.00 rows=2083334 width=4)  
                                       -- 不同的planstate->type，输出的内容可能不一样，参考 cdbexplain_showExecStats   
				       -- Rows out:  {noRowRequested} rows{ns->ntuples.vmax}    
				       -- with {Time from this worker's first InstrStartNode() to end of its results} to end  
				       -- start offset by {Time from start of query on qDisp to this worker's first result row}  
				       Rows out:  0 rows (seg0) with 34 ms to end, start offset by 568 ms.  
 -- 每一个分片的内存使用情况, 一个分片指在整个执行计划中，在segment内可以独立运行的连续的步骤集，后面有术语介绍  
 Slice statistics:  
   (slice0)    Executor memory: 327K bytes.  
   (slice1)    Executor memory: 828K bytes avg x 48 workers, 828K bytes max (seg0).  
   (slice2)    Executor memory: 353K bytes avg x 48 workers, 356K bytes max (seg0).  
 -- 当gp_resqueue_memory_policy != RESQUEUE_MEMORY_POLICY_NONE时，打印内存统计信息  
 Statement statistics:  
   --   /* What is the memory reserved for this query's execution? */  
   --    uint64          query_mem;  
   --   PlannedStmt @ pgsql/src/include/nodes/plannodes.h   
   Memory used: 128000K bytes  
 -- 设置了哪些影响执行计划的非默认参数  
 Settings:  enable_bitmapscan=off; enable_seqscan=off; optimizer=off  
 Optimizer status: legacy query optimizer  
 -- 总执行时长  
 Total runtime: 4518.864 ms  
(26 rows)

以上解释，详见代码 cdbexplain_showExecStats

最低0.47元/天解锁文章

weixin_34195546

关注

0
点赞
踩
2

收藏

觉得还不错? 一键收藏
0
评论
Greenplum explain analyze 解读 + 深度明细开关

标签PostgreSQL , Greenplum , explain analyze , 解读 , 深度明细 , explain_memory_verbosity , gp_enable_explain_allstat背景解读Greenplum 的explain analyze输出。同时介绍两个深度明细开关explain_memory_verb...
复制链接

扫一扫