Error communicating with MapOutputTracker: Problem Analysis 1

04:36:09,043  INFO MapOutputTrackerWorker:59 - Doing the fetch; tracker actor = Actor[akka.tcp://sparkDriver@titannew135:45806/user/MapOutputTracker#526256448]
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:39,046 ERROR MapOutputTrackerWorker:96 - Error communicating with MapOutputTracker
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
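When a large volume of data is shuffled, GC efficiency degrades, and the MapOutputTrackerWorker's request for map outputs cannot be answered within the akka timeout. The result is the "Futures timed out after [30 seconds]" exception.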
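As a quick check on that reading, the gap between the "Doing the fetch" line and the ERROR line in the excerpt above can be computed directly. This is a minimal sketch: the two timestamps are copied from the log, and the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; the log prints time of day only.
FETCH_STARTED = "04:36:09,043"  # "Doing the fetch; tracker actor = ..."
FETCH_FAILED = "04:36:39,046"   # "Error communicating with MapOutputTracker"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS,mmm timestamps from the same day."""
    fmt = "%H:%M:%S,%f"  # %f reads the millisecond digits as a fraction of a second
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


waited = elapsed_seconds(FETCH_STARTED, FETCH_FAILED)
print(f"executor waited {waited:.3f} s before the future timed out")
```

The wait comes out at just over 30 seconds, which matches the timeout named in the exception. That suggests the executor received no reply from the driver at all, rather than a late or partial one.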

Solution: multiply the akka timeouts and heartbeats by 10 so that akka waits longer, and lengthen akka's ask and lookup timeouts to avoid the MapOutputTracker timeout.

# akka timeouts/heartbeats settings multiplied by 10 to avoid problems
spark.akka.timeout 1000
spark.akka.heartbeat.pauses 60000
spark.akka.failure-detector.threshold 3000.0
spark.akka.heartbeat.interval 10000

# Hidden akka conf to avoid MapOutputTracker timeouts
# See https://github.com/apache/spark/blob/branch-1.3/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
spark.akka.askTimeout 300
spark.akka.lookupTimeout 300
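
The "multiplied by 10" arithmetic behind these settings can be sketched as follows. The baseline values are an assumption: they are simply the settings above divided by 10, and the 30-second ask/lookup baseline is consistent with the "30 seconds" in the exception. The names `BASELINE`, `scale`, and `render` are hypothetical and not part of Spark.

```python
# Assumed baseline: the settings listed above divided by 10.
# Check these against the defaults of the Spark version actually in use.
BASELINE = {
    "spark.akka.timeout": 100,
    "spark.akka.heartbeat.pauses": 6000,
    "spark.akka.failure-detector.threshold": 300.0,
    "spark.akka.heartbeat.interval": 1000,
    "spark.akka.askTimeout": 30,
    "spark.akka.lookupTimeout": 30,
}


def scale(settings, factor):
    """Return a copy of the settings with every value multiplied by factor."""
    return {key: value * factor for key, value in settings.items()}


def render(settings):
    """Format settings as 'key value' lines, one property per line."""
    return "\n".join(f"{key} {value}" for key, value in settings.items())


scaled = scale(BASELINE, 10)
print(render(scaled))
```

The printed lines reproduce the six properties listed above. A different factor can be tried by changing the second argument to `scale`.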
 
 
 
Ref: https://mail-archives.apache.org/mod_mbox/spark-user/201505.mbox/%3CCAGHU-i0L9VBxM+auAi4XDECchaLurvUPaJa_MZXc+mAq_2JjAg@mail.gmail.com%3E

Version: Hadoop 1.2.1, Spark 1.3.1 standalone

The full error log is as follows:

04:36:08,876  INFO TorrentBroadcast:59 - Reading broadcast variable 4 took 117 ms

04:36:08,878  INFO MemoryStore:59 - ensureFreeSpace(8464) called with curMem=150220, maxMem=88905917399
04:36:08,878  INFO MemoryStore:59 - Block broadcast_4 stored as values in memory (estimated size 8.3 KB, free 82.8 GB)
04:36:09,038  INFO CacheManager:59 - Partition rdd_2_5 not found, computing it
04:36:09,039  INFO CacheManager:59 - Partition rdd_2_8 not found, computing it
04:36:09,038  INFO CacheManager:59 - Partition rdd_2_17 not found, computing it
04:36:09,038  INFO CacheManager:59 - Partition rdd_2_14 not found, computing it
04:36:09,038  INFO CacheManager:59 - Partition rdd_2_11 not found, computing it
04:36:09,038  INFO CacheManager:59 - Partition rdd_2_2 not found, computing it
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:09,043  INFO MapOutputTrackerWorker:59 - Doing the fetch; tracker actor = Actor[akka.tcp://sparkDriver@titannew135:45806/user/MapOutputTracker#526256448]
04:36:09,042  INFO MapOutputTrackerWorker:59 - Don't have map outputs for shuffle 1, fetching them
04:36:39,046 ERROR MapOutputTrackerWorker:96 - Error communicating with MapOutputTracker
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:112)
at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:163)
at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:245)
at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
04:36:39,059  INFO MapOutputTrackerWorker:59 - Doing the fetch; tracker actor = Actor[akka.tcp://sparkDriver@titannew135:45806/user/MapOutputTracker#526256448]
04:36:39,060 ERROR Executor:96 - Exception in task 17.0 in stage 1.0 (TID 5640)