In Part 3 of this blog series, I will keep the focus on the Query Plan and the Execution Summary.
Now, let's look at the Planner Timeline and Query Timeline sections of the profile:
    Planner Timeline: 21.773ms
       - Analysis finished: 1.432ms (1.432ms)
       - Equivalence classes computed: 1.478ms (46.487us)
       - Single node plan created: 21.005ms (19.526ms)
       - Runtime filters computed: 21.020ms (14.599us)
       - Distributed plan created: 21.096ms (76.103us)
       - Lineage info computed: 21.156ms (59.924us)
       - Planning finished: 21.773ms (617.260us)
    Query Timeline: 14m8s
       - Query submitted: 250.799us (250.799us)
       - Planning finished: 24.359ms (24.108ms)
       - Submit for admission: 25.375ms (1.016ms)
       - Completed admission: 25.565ms (189.521us)
       - Ready to start on 6 backends: 26.997ms (1.432ms)
       - All 6 execution backends (7 fragment instances) started: 33.022ms (6.024ms)
       - Rows available: 14m8s (14m8s)
       - First row fetched: 14m8s (695.112ms)
       - Unregister query: 14m8s (5.330ms)
     - ComputeScanRangeAssignmentTimer: 110.741us
Each line is easy to understand. We can see that planning the query took 21.773ms, the query was submitted for admission at the 25.375ms mark, execution on the worker nodes started at the 33.022ms mark, the last row of data was ready at 14m8s, and the first row was fetched by the client at 14m8s. This gives you a very clear picture of how much time each phase took; if any phase is slow it will stand out immediately, and we can then dig deeper to understand what might have happened.
From the Query Timeline we can clearly see that it took almost 14 minutes (14m8s) from the start of execution (All 6 execution backends started) until data became available (Rows available). Because this query is just a count(*), there is no separate "Last row fetched" entry.
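To make the per-phase durations easier to compare, the timeline events can also be pulled out with a small script. Below is a minimal sketch in Python, assuming each timeline line has the form "- <event>: <cumulative> (<delta>)" as shown above; the parse_duration() helper, the regular expressions and the file name query_profile.txt are my own assumptions for illustration, not anything shipped with Impala.

    import re

    # Rough parser for duration strings such as "14m8s", "24.359ms", "250.799us".
    UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

    def parse_duration(text):
        total = 0.0
        for value, unit in re.findall(r"([\d.]+)(h|ms|us|ns|m|s)", text):
            total += float(value) * UNIT_SECONDS[unit]
        return total

    def timeline_phases(profile_text, section="Query Timeline"):
        """Yield (event, cumulative_seconds, delta_seconds) for one timeline section."""
        in_section = False
        for line in profile_text.splitlines():
            stripped = line.strip()
            if stripped.startswith(section + ":"):
                in_section = True
                continue
            if in_section:
                m = re.match(r"- (.+?): (\S+) \((\S+)\)$", stripped)
                if not m:
                    break  # end of the timeline block
                event, cumulative, delta = m.groups()
                yield event, parse_duration(cumulative), parse_duration(delta)

    # Print the phases with the slowest one first.
    with open("query_profile.txt") as f:   # hypothetical file holding the saved profile text
        text = f.read()
    for event, cum, delta in sorted(timeline_phases(text), key=lambda x: -x[2]):
        print(f"{delta:12.3f}s  {event}")

Run against the timeline above, the biggest delta by far would be the 14m8s attributed to "Rows available", i.e. the gap between execution starting on the backends and the result becoming available.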
For example, take the following case (from another profile):
    Query Compilation: 16.268ms
       - Metadata of all 1 tables cached: 1.786ms (1.786ms)
       - Analysis finished: 6.162ms (4.376ms)
       - Value transfer graph computed: 6.537ms (374.918us)
       - Single node plan created: 7.955ms (1.417ms)
       - Runtime filters computed: 8.274ms (318.815us)
       - Distributed plan created: 8.430ms (156.307us)
       - Lineage info computed: 9.664ms (1.234ms)
       - Planning finished: 16.268ms (6.603ms)
    Query Timeline: 35m46s
       - Query submitted: 0.000ns (0.000ns)
       - Planning finished: 22.001ms (22.001ms)
       - Submit for admission: 23.001ms (1.000ms)
       - Completed admission: 23.001ms (0.000ns)
       - Ready to start on 2 backends: 24.001ms (1.000ms)
       - All 2 execution backends (2 fragment instances) started: 36.001ms (12.000ms)
       - Rows available: 5m51s (5m51s)
       - First row fetched: 5m52s (950.045ms)
       - Last row fetched: 35m46s (29m53s)
       - Released admission control resources: 35m46s (1.000ms)
       - Unregister query: 35m46s (30.001ms)
     - ComputeScanRangeAssignmentTimer: 0.000ns
An Impala query had been running for a very long time and the customer wanted to find out why. From the Query Timeline we can clearly see that it took almost 6 minutes (5m51s) from the start of execution (All 2 execution backends started) until data became available (Rows available). Those 6 minutes of execution might be perfectly normal; for a query that joins many large datasets, running for several minutes is quite common.
However, we can also see that Impala spent about 30 minutes delivering the data back to the client: the first row was fetched at around the 6-minute mark, while the last row was not fetched until the 36-minute mark (Last row fetched: 35m46s, i.e. 29m53s after the first row). From this we can suspect a network issue between the Impala coordinator and the client, when the client (such as impala-shell or Hue) fetches data from the Impala coordinator host. Another possibility is that the client was doing other work while fetching the results, such as printing them to the screen; since the returned data can be large, that alone can take a long time.
So this part of the profile can point us in the right direction when we are looking for the bottleneck.
A few key points for reading the Summary table below (a small parsing sketch follows the table):
1) Read it from the bottom up, because that is the order in which Impala executes the operators.
2) Compare the "Avg Time" and "Max Time" columns.
3) Compare the "#Rows" and "Est. #Rows" columns.
4) Check the "Detail" column to see the join type of each operator.
    Operator          #Hosts   Avg Time   Max Time   #Rows   Est. #Rows   Peak Mem   Est. Peak Mem   Detail
    ----------------------------------------------------------------------------------------------------------------------------------
    03:AGGREGATE           1  316.053us  316.053us       1            1   40.00 KB        10.00 MB   FINALIZE
    02:EXCHANGE            1      14m8s      14m8s       6            1          0               0   UNPARTITIONED
    01:AGGREGATE           6    2.746ms    5.205ms       6            1   67.00 KB        10.00 MB
    00:SCAN KUDU           6      2m33s      14m8s   6.10M           -1  299.00 KB               0   bms_ldgdb.ldg_690marketization_ledger t
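Tips 2) and 3) can also be checked mechanically. The sketch below is a rough illustration only (it reuses parse_duration() from the earlier snippet): it scans a Summary table like the one above and flags operators whose Max Time is far above Avg Time, which usually points to skew, i.e. one host doing much more work or running much slower than the others. The assumption that columns are separated by at least two spaces and the 3x threshold are my own choices, not Impala rules.

    import re
    # parse_duration() is the helper defined in the timeline sketch above.

    def report_skew(summary_text, ratio=3.0):
        """Print operators whose Max Time is at least `ratio` times the Avg Time."""
        for line in summary_text.splitlines():
            cols = re.split(r"\s{2,}", line.strip())
            if len(cols) < 6 or ":" not in cols[0]:
                continue  # skip the header and separator lines
            operator, hosts, avg, max_ = cols[0], cols[1], cols[2], cols[3]
            avg_s, max_s = parse_duration(avg), parse_duration(max_)
            if avg_s > 0 and max_s / avg_s >= ratio:
                print(f"possible skew at {operator}: avg={avg}, max={max_} across {hosts} hosts")

For the table above this flags only 00:SCAN KUDU (avg 2m33s vs max 14m8s across 6 hosts), which matches what I noticed by eye, as described next.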
Very quickly I noticed a big gap between "Avg Time" and "Max Time" for the SCAN KUDU operator: the average time was 2 minutes 33 seconds, but the maximum time, on one of the 6 hosts, was 14 minutes 8 seconds.
So my next thought was to work out which host was slower than the others, and whether it was always the same host. To do that I searched the profile file for the string "id=0", where "0" is the operator number at the beginning of the "00:SCAN KUDU" row in the Summary section.
Searching from the beginning of the file for "id=0" led me to the sections shown below (I have kept only the parts I am interested in).
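I did the search in a text editor, but the same filtering can be scripted. Below is a rough sketch, assuming that every fragment-instance section starts with a line of the form "Instance ... (host=...)" and that the scan node appears as "KUDU_SCAN_NODE (id=0)", as in the output that follows; the function name and the way the profile text is passed in are illustrative only.

    import re

    def scan_node_times(profile_text, node_id=0):
        """Print (host, total time) for every fragment instance of the given scan node id."""
        host = None
        for line in profile_text.splitlines():
            m = re.search(r"Instance \S+ \(host=([^)]+)\)", line)
            if m:
                host = m.group(1)   # remember the host of the current fragment instance
                continue
            m = re.search(rf"KUDU_SCAN_NODE \(id={node_id}\):\(Total: ([^,]+),", line)
            if m and host:
                print(f"{host:30s} {m.group(1)}")

On this profile it prints six lines, one per host, with rrs-hdp-dn04 at 14m8s and the rest between roughly 6 and 40 seconds. The filtered profile sections are: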
    ......
    Instance 2945a77ff619defe:b658730000000005 (host=rrs-hdp-dn04:22000):(Total: 14m8s, non-child: 2.048ms, % non-child: 0.00%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 14m8s, non-child: 14m8s, % non-child: 100.00%)
         BytesRead(16s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 299.00 KB (306176)
         - RowsRead: 1.02M (1016239)
         - RowsReturned: 1.02M (1016239)
         - RowsReturnedRate: 1.20 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 1 (1)
         - ScannerThreadsTotalWallClockTime: 14m8s
           - MaterializeTupleTime(*): 14m8s
           - ScannerThreadsSysTime: 49.992ms
           - ScannerThreadsUserTime: 313.952ms
         - ScannerThreadsVoluntaryContextSwitches: 1.67K (1669)
         - TotalKuduScanRoundTrips: 1.56K (1560)
         - TotalReadThroughput: 0.00 /sec
    ......
    Instance 2945a77ff619defe:b658730000000002 (host=rrs-hdp-dn13:22000):(Total: 39s554ms, non-child: 3.704ms, % non-child: 0.01%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 39s477ms, non-child: 39s477ms, % non-child: 100.00%)
         BytesRead(1s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 122.00 KB (124928)
         - RowsRead: 1.02M (1016532)
         - RowsReturned: 1.02M (1016532)
         - RowsReturnedRate: 25.75 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 3 (3)
         - ScannerThreadsTotalWallClockTime: 39s478ms
           - MaterializeTupleTime(*): 39s474ms
           - ScannerThreadsSysTime: 6.998ms
           - ScannerThreadsUserTime: 137.979ms
         - ScannerThreadsVoluntaryContextSwitches: 100 (100)
         - TotalKuduScanRoundTrips: 89 (89)
         - TotalReadThroughput: 0.00 /sec
    ......
    Instance 2945a77ff619defe:b658730000000006 (host=rrs-hdp-dn05:22000):(Total: 10s311ms, non-child: 2.554ms, % non-child: 0.02%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 10s235ms, non-child: 10s235ms, % non-child: 100.00%)
         BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 122.00 KB (124928)
         - RowsRead: 1.02M (1017341)
         - RowsReturned: 1.02M (1017341)
         - RowsReturnedRate: 99.39 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 2 (2)
         - ScannerThreadsTotalWallClockTime: 10s235ms
           - MaterializeTupleTime(*): 10s231ms
           - ScannerThreadsSysTime: 20.996ms
           - ScannerThreadsUserTime: 107.983ms
         - ScannerThreadsVoluntaryContextSwitches: 34 (34)
         - TotalKuduScanRoundTrips: 31 (31)
         - TotalReadThroughput: 0.00 /sec
    ......
    Instance 2945a77ff619defe:b658730000000001 (host=rrs-hdp-dn03:22000):(Total: 9s700ms, non-child: 2.187ms, % non-child: 0.02%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 9s632ms, non-child: 9s632ms, % non-child: 100.00%)
         BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 122.00 KB (124928)
         - RowsRead: 1.02M (1017071)
         - RowsReturned: 1.02M (1017071)
         - RowsReturnedRate: 105.58 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 1 (1)
         - ScannerThreadsTotalWallClockTime: 9s632ms
           - MaterializeTupleTime(*): 9s628ms
           - ScannerThreadsSysTime: 10.998ms
           - ScannerThreadsUserTime: 110.983ms
         - ScannerThreadsVoluntaryContextSwitches: 33 (33)
         - TotalKuduScanRoundTrips: 31 (31)
         - TotalReadThroughput: 0.00 /sec
    ......
    Instance 2945a77ff619defe:b658730000000004 (host=rrs-hdp-dn11:22000):(Total: 8s165ms, non-child: 2.354ms, % non-child: 0.03%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 8s090ms, non-child: 8s090ms, % non-child: 100.00%)
         BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 122.00 KB (124928)
         - RowsRead: 1.02M (1018748)
         - RowsReturned: 1.02M (1018748)
         - RowsReturnedRate: 125.92 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 2 (2)
         - ScannerThreadsTotalWallClockTime: 8s091ms
           - MaterializeTupleTime(*): 8s085ms
           - ScannerThreadsSysTime: 9.998ms
           - ScannerThreadsUserTime: 114.982ms
         - ScannerThreadsVoluntaryContextSwitches: 31 (31)
         - TotalKuduScanRoundTrips: 28 (28)
         - TotalReadThroughput: 0.00 /sec
    ......
    Instance 2945a77ff619defe:b658730000000003 (host=rrs-hdp-dn08:22000):(Total: 6s024ms, non-child: 2.396ms, % non-child: 0.04%)
      ...
      KUDU_SCAN_NODE (id=0):(Total: 5s952ms, non-child: 5s952ms, % non-child: 100.00%)
         BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
         - BytesRead: 0
         - KuduRemoteScanTokens: 0 (0)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 122.00 KB (124928)
         - RowsRead: 1.01M (1014934)
         - RowsReturned: 1.01M (1014934)
         - RowsReturnedRate: 170.50 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
         - ScannerThreadsTotalWallClockTime: 5s952ms
           - MaterializeTupleTime(*): 5s948ms
           - ScannerThreadsSysTime: 8.998ms
           - ScannerThreadsUserTime: 88.986ms
         - ScannerThreadsVoluntaryContextSwitches: 25 (25)
         - TotalKuduScanRoundTrips: 22 (22)
         - TotalReadThroughput: 0.00 /sec
Here you can see six KUDU_SCAN_NODE entries with id=0. Why six? Because #Hosts for this operator is 6:
    Operator          #Hosts   Avg Time   Max Time   #Rows   Est. #Rows   Peak Mem   Est. Peak Mem   Detail
    ----------------------------------------------------------------------------------------------------------------------------------
    03:AGGREGATE           1  316.053us  316.053us       1            1   40.00 KB        10.00 MB   FINALIZE
    02:EXCHANGE            1      14m8s      14m8s       6            1          0               0   UNPARTITIONED
    01:AGGREGATE           6    2.746ms    5.205ms       6            1   67.00 KB        10.00 MB
    00:SCAN KUDU           6      2m33s      14m8s   6.10M           -1  299.00 KB               0   bms_ldgdb.ldg_690marketization_ledger t
Now I can clearly see which server node is the bottleneck: host rrs-hdp-dn04 took 14 minutes 8 seconds, while every other host took roughly 5 to 40 seconds. And the average across these 6 hosts comes out at almost exactly 2m33s, matching the "Avg Time" in the Summary.
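As a quick sanity check on that average, here is a back-of-the-envelope calculation using the per-host KUDU_SCAN_NODE totals listed above (values rounded to one decimal place):

    # Per-host KUDU_SCAN_NODE totals (seconds), taken from the instances above.
    times = {
        "rrs-hdp-dn04": 14 * 60 + 8,   # 14m8s = 848s
        "rrs-hdp-dn13": 39.5,
        "rrs-hdp-dn05": 10.2,
        "rrs-hdp-dn03": 9.6,
        "rrs-hdp-dn11": 8.1,
        "rrs-hdp-dn08": 6.0,
    }
    avg = sum(times.values()) / len(times)
    print(f"average = {avg:.1f}s = {int(avg // 60)}m{int(avg % 60)}s")   # average = 153.6s = 2m33s

So the 2m33s "Avg Time" in the Summary is dominated by the single slow host; without rrs-hdp-dn04 the average would be well under 15 seconds.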
From this I concluded that something is wrong with server rrs-hdp-dn04 and it needs to be fixed.
Below is a screenshot of the page; it shows that the table has 6 partitions.
Host rrs-hdp-dn04 is the one we just identified as taking the longest, so the next step was to go to rrs-hdp-dn04 and take a look.
That is as far as the analysis goes. The cluster has a dedicated operations team, so I only wanted to understand why the query was so slow. I had actually intended to tune the cluster, adjust some parameters and test again, but my permissions are limited and I cannot freely log in to the servers, so I had to stop here. I will follow up with parameter tuning and SQL optimization when I get the chance.