requirement failed: Block broadcast_487 is already present in the MemoryStore

Scenario:

A Spark SQL job that had always run without problems began failing today.

The first run reported the following error:

Caused by: java.sql.SQLException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1382 in stage 320.0 failed 4 times, most recent failure: Lost task 1382.3 in stage 320.0 (TID 165140, leaptest-dn03.bjev.com): java.io.IOException: java.lang.IllegalArgumentException: requirement failed: Block broadcast_487 is already present in the MemoryStore
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1258)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:174)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:65)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:65)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:89)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:72)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:283)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: requirement failed: Block broadcast_487 is already present in the MemoryStore
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:182)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
    at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:700)
    at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1213)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:194)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1251)
    ... 12 more

Driver stacktrace:
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:279)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.executeOneSql(SparkExecDao.java:370)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.executeOneSqlProxy(SparkExecDao.java:341)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.sparkExecute(SparkExecDao.java:255)
    ... 82 more
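
The IllegalArgumentException comes from a require() guard in MemoryStore.putIteratorAsValues (MemoryStore.scala:182 in the trace above), which refuses to insert a block whose id is already registered. It typically fires when a retried task re-registers a broadcast block that an earlier, failed attempt already left in the store; in this case it is plausibly a secondary symptom of the same node problem behind the second error below. A minimal, self-contained sketch of that guard (a toy class written for illustration, not Spark's actual MemoryStore):

    import scala.collection.mutable

    // Toy stand-in for Spark's MemoryStore, reduced to the one check
    // that produces "Block ... is already present in the MemoryStore".
    class ToyMemoryStore {
      private val entries = mutable.LinkedHashMap[String, Vector[Any]]()

      def contains(blockId: String): Boolean =
        entries.synchronized(entries.contains(blockId))

      def putIteratorAsValues(blockId: String, values: Iterator[Any]): Unit = {
        // Same require() pattern as the guard seen in the stack trace above;
        // require prepends "requirement failed: " to the message.
        require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
        entries.synchronized { entries.put(blockId, values.toVector) }
      }
    }

    object Demo extends App {
      val store = new ToyMemoryStore
      store.putIteratorAsValues("broadcast_487", Iterator(1, 2, 3))
      // A retried task re-registering the same broadcast id hits the guard:
      store.putIteratorAsValues("broadcast_487", Iterator(1, 2, 3)) // IllegalArgumentException
    }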

On repeated runs, it occasionally failed with a different error:

Caused by: java.sql.SQLException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 342.0 failed 4 times, most recent failure: Lost task 3.3 in stage 342.0 (TID 166270, leaptest-dn03.bjev.com): java.io.FileNotFoundException: /swap/hadoop/yarn/local/usercache/hive/appcache/application_1557227972884_8104/blockmgr-927821e7-2b49-465d-86de-9d2707e2517b/03/temp_shuffle_be51c50d-54ac-4a5b-b840-59129fed41fb (No space left on device)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:88)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:181)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:150)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:283)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:279)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.executeOneSql(SparkExecDao.java:370)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.executeOneSqlProxy(SparkExecDao.java:341)
    at com.lenovo.lps.farseer.priest2.ext.SparkExecDao.sparkExecute(SparkExecDao.java:255)
    ... 100 more
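
The "(No space left on device)" in the path makes the immediate cause clear: the temp_shuffle file lives under /swap/hadoop/yarn/local/..., and that mount is full. A quick way to confirm which of the configured directories ran out of space is a check like the following (a minimal sketch; the two paths are taken from this cluster's yarn-site.xml shown below):

    import java.io.File

    // Report usable space for each candidate YARN local dir.
    object CheckFreeSpace extends App {
      val dirs = Seq("/swap/hadoop/yarn/local", "/data/hadoop/yarn/local")
      for (d <- dirs) {
        // getUsableSpace returns 0 for paths that do not exist.
        val gib = new File(d).getUsableSpace.toDouble / (1L << 30)
        println(f"$d%-30s free: $gib%.1f GiB")
      }
    }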

The second exception reveals the root cause: when the Hadoop cluster was deployed, the installers configured a directory on the swap partition as one of YARN's local and log directories, via these two settings in yarn-site.xml (yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs):

    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/swap/hadoop/yarn/local,/data/hadoop/yarn/local</value>
    </property>

    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/swap/hadoop/yarn/log,/data/hadoop/yarn/log</value>
    </property>

The fix is to remove the swap-partition directories from both settings and restart YARN (a quick way to verify the result is sketched after the corrected configuration):

    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/data/hadoop/yarn/local</value>
    </property>

    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/data/hadoop/yarn/log</value>
    </property>
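
After the restart, you can read the two settings back through Hadoop's Configuration API to verify that the new directory list is in effect. A small sketch, assuming the Hadoop client jars and the updated configuration files are on the classpath of the node where you run it:

    import org.apache.hadoop.yarn.conf.YarnConfiguration

    // Prints the effective values of the two directory settings.
    // YarnConfiguration loads yarn-site.xml from the classpath.
    object CheckYarnDirs extends App {
      val conf = new YarnConfiguration()
      // NM_LOCAL_DIRS == "yarn.nodemanager.local-dirs",
      // NM_LOG_DIRS   == "yarn.nodemanager.log-dirs"
      println("local-dirs: " + conf.getTrimmedStrings(YarnConfiguration.NM_LOCAL_DIRS).mkString(","))
      println("log-dirs:   " + conf.getTrimmedStrings(YarnConfiguration.NM_LOG_DIRS).mkString(","))
    }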
