[Spark] Spark Web UI - SQL

Spark SQL jobs that run slowly or fail outright come up regularly at work, and troubleshooting them requires knowing how to read the Spark Web UI. The official documentation is a good reference: https://spark.apache.org/docs/3.2.1/web-ui.html#content. The Web UI has quite a few tabs, and we will work through them one by one.

[Screenshot: the tabs of the Spark Web UI]
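The live UI disappears when an application exits, so if you also want to inspect finished applications, it helps to turn on event logging and view them through the Spark History Server. A minimal sketch (the app name and log directory are just example values):

import org.apache.spark.sql.SparkSession

// Write event logs so the History Server (sbin/start-history-server.sh,
// http://localhost:18080 by default) can replay the Web UI afterwards.
val spark = SparkSession.builder()
  .appName("sql-tab-demo")
  .master("local[*]")
  .config("spark.eventLog.enabled", "true")
  .config("spark.eventLog.dir", "file:///tmp/spark-events")
  .getOrCreate()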

Today we look at one of the most commonly used tabs: SQL.

SQL Tab

If the application executes Spark SQL queries, the SQL tab displays information, such as the duration, jobs, and physical and logical plans for the queries. Here we include a basic example to illustrate this tab:


scala> val df = Seq((1, "andy"), (2, "bob"), (2, "andy")).toDF("count", "name")
df: org.apache.spark.sql.DataFrame = [count: int, name: string]

scala> df.count
res0: Long = 3                                                                  

scala> df.createGlobalTempView("df")

scala> spark.sql("select name,sum(count) from global_temp.df group by name").show
+----+----------+
|name|sum(count)|
+----+----------+
|andy|         3|
| bob|         2|
+----+----------+

[Screenshot: the SQL tab listing the three completed queries]

Now the above three dataframe/SQL operators are shown in the list. If we click the ‘show at <console>: 24’ link of the last query, we will see the DAG and details of the query execution.


[Screenshot: the execution DAG on the query details page]

The query details page displays information about the query execution time, its duration, the list of associated jobs, and the query execution DAG. The first block ‘WholeStageCodegen (1)’ compiles multiple operators (‘LocalTableScan’ and ‘HashAggregate’) together into a single Java function to improve performance, and metrics like number of rows and spill size are listed in the block. The annotation ‘(1)’ in the block name is the code generation id. The second block ‘Exchange’ shows the metrics on the shuffle exchange, including number of written shuffle records, total data size, etc.


[Screenshot: query details]
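To see which operators end up inside each WholeStageCodegen block, and the Java code Spark generates for it, you can print the codegen explain output straight from the shell; a small sketch reusing the query above:

// Prints the whole-stage-codegen subtrees (with ids matching the
// '(1)' annotations in the UI blocks) followed by the generated Java code.
spark.sql("select name,sum(count) from global_temp.df group by name")
  .explain("codegen")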

Clicking the ‘Details’ link on the bottom displays the logical plans and the physical plan, which illustrate how Spark parses, analyzes, optimizes and executes the query. Steps in the physical plan that are subject to whole-stage code generation are prefixed by a star followed by the code generation id, for example: ‘*(1) LocalTableScan’.

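The same plans can also be printed without the UI; a minimal sketch using the extended explain mode, whose physical plan section shows the starred, id-prefixed codegen steps described above:

// Prints the parsed, analyzed and optimized logical plans plus the
// physical plan, the same output the 'Details' link shows.
spark.sql("select name,sum(count) from global_temp.df group by name")
  .explain("extended")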

SQL metrics

The metrics of SQL operators are shown in the block of physical operators. The SQL metrics can be useful when we want to dive into the execution details of each operator. For example, “number of output rows” can answer how many rows are output after a Filter operator, “shuffle bytes written total” in an Exchange operator shows the number of bytes written by a shuffle.

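As a concrete example of reading ‘number of output rows’, the query below adds a Filter over the view from earlier; after running it, compare the row counts reported on the scan block and on the Filter block in the SQL tab (the predicate is just an illustration):

// Two of the three rows have count > 1, so the Filter block should
// report 2 output rows while the scan reports 3.
spark.sql("select * from global_temp.df where count > 1").show()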

Here is the list of SQL metrics:

SQL metric | Meaning | Operators
number of output rows | the number of output rows of the operator | Aggregate operators, Join operators, Sample, Range, Scan operators, Filter, etc.
data size | the size of broadcast/shuffled/collected data of the operator | BroadcastExchange, ShuffleExchange, Subquery
time to collect | the time spent on collecting data | BroadcastExchange, Subquery
scan time | the time spent on scanning data | ColumnarBatchScan, FileSourceScan
metadata time | the time spent on getting metadata like number of partitions, number of files | FileSourceScan
shuffle bytes written | the number of bytes written | CollectLimit, TakeOrderedAndProject, ShuffleExchange
shuffle records written | the number of records written | CollectLimit, TakeOrderedAndProject, ShuffleExchange
shuffle write time | the time spent on shuffle writing | CollectLimit, TakeOrderedAndProject, ShuffleExchange
remote blocks read | the number of blocks read remotely | CollectLimit, TakeOrderedAndProject, ShuffleExchange
remote bytes read | the number of bytes read remotely | CollectLimit, TakeOrderedAndProject, ShuffleExchange
remote bytes read to disk | the number of bytes read from remote to local disk | CollectLimit, TakeOrderedAndProject, ShuffleExchange
local blocks read | the number of blocks read locally | CollectLimit, TakeOrderedAndProject, ShuffleExchange
local bytes read | the number of bytes read locally | CollectLimit, TakeOrderedAndProject, ShuffleExchange
fetch wait time | the time spent on fetching data (local and remote) | CollectLimit, TakeOrderedAndProject, ShuffleExchange
records read | the number of read records | CollectLimit, TakeOrderedAndProject, ShuffleExchange
sort time | the time spent on sorting | Sort
peak memory | the peak memory usage in the operator | Sort, HashAggregate
spill size | number of bytes spilled to disk from memory in the operator | Sort, HashAggregate
time in aggregation build | the time spent on aggregation | HashAggregate, ObjectHashAggregate
avg hash probe bucket list iters | the average bucket list iterations per lookup during aggregation | HashAggregate
data size of build side | the size of built hash map | ShuffledHashJoin
time to build hash map | the time spent on building hash map | ShuffledHashJoin
task commit time | the time spent on committing the output of a task after the writes succeed | any write operation on a file-based table
job commit time | the time spent on committing the output of a job after the writes succeed | any write operation on a file-based table
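If you need these execution details programmatically rather than through the UI, one option is Spark's QueryExecutionListener; a minimal sketch that logs each query's duration and executed physical plan:

import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Called after every successful/failed Dataset action; qe.executedPlan is
// the same physical plan the SQL tab renders as a DAG.
spark.listenerManager.register(new QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
    println(s"$funcName took ${durationNs / 1e6} ms\n${qe.executedPlan}")
  override def onFailure(funcName: String, qe: QueryExecution, error: Exception): Unit =
    println(s"$funcName failed: $error")
})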
