Analyzing and Tracking Down a Spark Memory Leak

The analysis in the original post is very clear, so it is reposted here as a case study in diagnosing Spark memory leak problems. (See the note and references at the end of the article for the source.)

[Abstract]

I recently encountered an OOM error in a PageRank application (org.apache.spark.examples.SparkPageRank). After profiling the application, I found that the OOM error is related to memory contention in the shuffle spill phase. Memory contention here means that a task tries to evict some old memory consumers from memory to make room for new memory consumers. After analyzing the OOM heap dump, I found that the root cause is a memory leak in TaskMemoryManager. Since memory contention is common in the shuffle phase, this is a critical bug/defect. In the following sections, I use the application dataflow, execution log, heap dump, and source code to identify the root cause.

[Application]

This is a PageRank application from Spark’s example library. The following figure shows the application dataflow. The source code is available at [1].
[Figure: PageRank application dataflow]
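For context, the core of the job looks roughly like the following condensed sketch, modeled on Spark's SparkPageRank example (the input path, iteration count, and object name are placeholders; see [1] for the actual source). The groupByKey() call produces the cached links ShuffledRDD, and each iteration's join() plus reduceByKey() adds the iterative shuffle stages discussed below.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Condensed PageRank sketch (illustrative, not the exact code from [1]).
object PageRankSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PageRank"))

    // Map stage: parse "src dst" edge lines; groupByKey() is the shuffle that
    // materializes the links ShuffledRDD, which the job tries to cache.
    val lines = sc.textFile("hdfs://.../edges.txt") // placeholder input path
    val links = lines.map { line =>
      val parts = line.split("\\s+")
      (parts(0), parts(1))
    }.distinct().groupByKey().cache()

    var ranks = links.mapValues(_ => 1.0)

    // Iterative reduce stages: every iteration shuffles again through
    // join() and reduceByKey(), whose reduce-side aggregation is what builds
    // the ExternalAppendOnlyMaps discussed in this report.
    for (_ <- 1 to 10) { // placeholder iteration count
      val contribs = links.join(ranks).values.flatMap { case (urls, rank) =>
        val size = urls.size
        urls.map(url => (url, rank / size))
      }
      ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
    }

    ranks.saveAsTextFile("hdfs://.../ranks") // placeholder output path
    sc.stop()
  }
}
```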

[Failure symptoms]

This application has a map stage and many iterative reduce stages. An OOM error occurs in a reduce task (Task-28) as follows.
[Figures: OOM error reported for the reduce task (Task-28)]

[OOM root cause identification]

Each executor has 1 CPU core and 6.5GB memory, so it runs only one task at a time. After analyzing the application dataflow, error log, heap dump, and source code, I found that the following steps lead to the OOM error.

=> The MemoryManager finds that there is not enough memory to cache the links ShuffledRDD (rdd-5-28, the red circles in the dataflow figure).
[Figure: log showing that rdd-5-28 cannot be cached]

=> The task needs to shuffle twice (the 1st shuffle and the 2nd shuffle in the dataflow figure).
=> The task needs to generate two ExternalAppendOnlyMaps (E1 for the 1st shuffle and E2 for the 2nd shuffle) in sequence.
=> The 1st shuffle begins and ends. E1 aggregates all the shuffled data of the 1st shuffle and grows to 3.3GB.
[Figure: E1 reaches 3.3GB after the 1st shuffle]
=> The 2nd shuffle begins. E2 starts aggregating the shuffled data of the 2nd shuffle and finds that there is not enough memory left. This triggers the memory contention.
[Figure: memory contention during the 2nd shuffle]
=> To handle the memory contention, the TaskMemoryManager releases E1 (spills it to disk) and assumes that the 3.3GB of space is now free (a simplified sketch of this eviction logic follows the step list).
[Figure: TaskMemoryManager spills E1 to disk]
=> E2 continues to aggregate the shuffled records of the 2nd shuffle. However, E2 hits an OOM error while doing so.
[Figures: OOM error thrown while E2 aggregates the 2nd shuffle's records]
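To make the eviction step concrete, here is a deliberately simplified model of the contention logic described above (illustrative only; the class and method names are invented, and this is not Spark's TaskMemoryManager). When a new consumer (E2) cannot get memory, the manager asks the other consumers (E1) to spill and subtracts whatever they report as freed from its bookkeeping, regardless of whether the JVM heap actually shrank.

```scala
// Simplified model of execution-memory contention inside one task.
// Not Spark code; SimpleTaskMemoryManager and Consumer are invented for illustration.
trait Consumer {
  def spill(): Long // spill contents to disk, return bytes believed to be freed
}

class SimpleTaskMemoryManager(limit: Long) {
  private var consumers = List.empty[Consumer]
  private var acquired = 0L

  def register(c: Consumer): Unit = consumers ::= c

  // Called by a consumer (e.g. E2) that needs `bytes` more memory.
  def acquire(bytes: Long, requester: Consumer): Long = {
    if (acquired + bytes > limit) {
      // Memory contention: ask the other consumers (e.g. E1) to spill ...
      consumers.filter(_ ne requester).foreach { other =>
        // ... and assume their memory is free again. If the spilled map is
        // still reachable on the JVM heap, this assumption is wrong and the
        // task can hit an OOM even though the bookkeeping looks fine.
        acquired -= other.spill()
      }
    }
    val granted = math.max(0L, math.min(bytes, limit - acquired))
    acquired += granted
    granted
  }
}
```

The sections below show that this is exactly what happened here: E1 was spilled from the manager's point of view, but its 3.3GB map stayed reachable on the heap.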

[Guess]

The task memory usage below reveals that there is no drop in memory usage after the spill. So the cause may be that the 3.3GB ExternalAppendOnlyMap (E1) is not actually released by the TaskMemoryManager.
[Figure: task memory usage over time, showing no drop after E1 is spilled]

[Root cause]

After analyzing the heap dump, I found that the guess is right: the 3.3GB ExternalAppendOnlyMap (E1) is indeed not released. The 1.6GB object is the other ExternalAppendOnlyMap (E2).
[Figure: heap dump showing the unreleased 3.3GB ExternalAppendOnlyMap (E1) and the 1.6GB ExternalAppendOnlyMap (E2)]

[Question]

Why is the released ExternalAppendOnlyMap still in memory?
The source code of ExternalAppendOnlyMap shows that currentMap (the AppendOnlyMap) is set to null when the spill action finishes.
[Figure: ExternalAppendOnlyMap source where currentMap is set to null after the spill]

[Root cause in the source code]

I further analyzed the reference chain of the unreleased ExternalAppendOnlyMap. It shows that the 3.3GB ExternalAppendOnlyMap is still referenced by upstream/readingIterator, which is in turn referenced by the TaskMemoryManager, as shown below. So the root cause in the source code is that the ExternalAppendOnlyMap is still referenced by other iterators; setting currentMap to null is not enough.
[Figure: reference chain from TaskMemoryManager through upstream/readingIterator to the 3.3GB ExternalAppendOnlyMap]
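The leak pattern itself is easy to reproduce outside Spark. The following self-contained sketch only mirrors the situation (the class and field names echo the Spark ones, but this is not Spark code): clearing the currentMap field does not make the big map collectable as long as a reading iterator created from it is still reachable, for example through the task memory manager.

```scala
// Minimal illustration of the leak pattern (not Spark code).
object LeakPatternDemo {

  // Stands in for the large in-memory AppendOnlyMap (E1's 3.3GB of records).
  class BigMap {
    val data: Array[Byte] = new Array[Byte](100 * 1024 * 1024) // 100MB for the demo
    def iterator: Iterator[Byte] = data.iterator
  }

  // Stands in for ExternalAppendOnlyMap.
  class ExternalMapLike {
    private var currentMap: BigMap = new BigMap
    private var readingIterator: Iterator[Byte] = null

    def startReading(): Unit = {
      // The reading/upstream iterator keeps a reference to the underlying map.
      readingIterator = currentMap.iterator
    }

    def forceSpill(): Unit = {
      // Pretend the contents were written to disk, then drop the field,
      // which is what setting currentMap to null achieves in the report.
      currentMap = null
      // The 100MB array is still reachable through readingIterator, so the
      // garbage collector cannot reclaim it: the "released" memory never
      // actually returns to the heap.
    }
  }

  def main(args: Array[String]): Unit = {
    val e1 = new ExternalMapLike // in Spark, kept alive via the TaskMemoryManager
    e1.startReading()
    e1.forceSpill()
    System.gc()
    // A heap dump taken here would still show the 100MB array retained via
    // readingIterator, just like the 3.3GB ExternalAppendOnlyMap in the dump above.
    println("spilled, but the map's data is still reachable through readingIterator")
  }
}
```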

[Potential solution]

Set upstream/readingIterator to null after the forceSpill() action. I will try this solution in the coming days.
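In terms of the small demo above, the change amounts to also clearing the extra reference once the spill completes (a sketch of the idea applied to the ExternalMapLike demo, not a reviewed patch against Spark):

```scala
// Sketch of the proposed fix, applied to the ExternalMapLike demo above.
def forceSpill(): Unit = {
  // ... write the contents to disk ...
  currentMap = null
  readingIterator = null // also drop the reading/upstream reference so the map can be GCed
}
```

With that extra assignment, the 100MB array in the demo becomes unreachable after forceSpill() and can be collected, which is what should happen to the 3.3GB map in this report.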

[References]
[1] PageRank source code. https://github.com/JerryLead/SparkGC/blob/master/src/main/scala/applications/graph/PageRank.scala
[2] Task execution log. https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/log/TaskExecutionLog.txt

Reposted from SPARK-22713; the original author is JerryLead (Xu Lijie).
