Spark Troubleshooting and Performance Tuning

- Spark master server -> give the daemon more memory and retain less UI history

In spark-env.sh:

export SPARK_DAEMON_MEMORY=5g

In spark-defaults.conf:

spark.ui.retainedJobs 500   # both default to 1000
spark.ui.retainedStages 500

spark.history.retainedApplications sets how many applications the history server keeps in memory.


- Runtime errors such as the following -> give each executor more memory and CPU

  • missing output location
  • failed to connect to host

- Executor lost, task lost, and timeout errors -> increase spark.network.timeout (default 120s)
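
As a rough illustration of the two remedies above, the sketch below assembles a spark-submit command line with larger executors and a longer network timeout. The helper name `build_submit_args`, the sizes (8g, 4 cores, 300s), and the application file `app.py` are hypothetical placeholders; `--executor-memory`, `--executor-cores`, and `--conf` are standard spark-submit options.

```python
import shlex


def build_submit_args(app, executor_memory="8g", executor_cores=4,
                      network_timeout="300s"):
    """Return a spark-submit argument list (hypothetical helper).

    The default sizes are illustrative assumptions, not recommendations.
    """
    return [
        "spark-submit",
        "--executor-memory", executor_memory,
        "--executor-cores", str(executor_cores),
        # Raise the network timeout above Spark's 120s default.
        "--conf", f"spark.network.timeout={network_timeout}",
        app,
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line.
    print(shlex.join(build_submit_args("app.py")))
```

Suitable values depend on the cluster and the job, so treat the numbers as starting points to adjust while watching the executor logs.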

==========================================

- Idle executors -> use more partitions and lower the values below

spark.locality.wait
spark.locality.wait.process
spark.locality.wait.node
spark.locality.wait.rack
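
One way to lower these four settings together is to generate the corresponding spark-defaults.conf lines, as in the sketch below. The helper `locality_conf` and the values `1s` and `0s` are assumptions for illustration only. In Spark, `spark.locality.wait` defaults to 3s and the three level-specific settings default to whatever `spark.locality.wait` is set to.

```python
LOCALITY_KEYS = (
    "spark.locality.wait",
    "spark.locality.wait.process",
    "spark.locality.wait.node",
    "spark.locality.wait.rack",
)


def locality_conf(wait="1s", overrides=None):
    """Build spark-defaults.conf lines for the locality waits.

    `wait` is applied to every level; `overrides` maps a key to a
    level-specific value. Both are illustrative choices.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(LOCALITY_KEYS)
    if unknown:
        raise KeyError(f"not a locality setting: {sorted(unknown)}")
    return [f"{key} {overrides.get(key, wait)}" for key in LOCALITY_KEYS]


if __name__ == "__main__":
    # Wait 1s at most levels, but do not wait at all for rack locality.
    print("\n".join(locality_conf("1s", {"spark.locality.wait.rack": "0s"})))
```

Lower waits let the scheduler give up on data locality sooner and hand tasks to idle executors, at the cost of more data being read over the network.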

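- Spark tasks keep failing

spark.scheduler.executorTaskBlacklistTime 30000

The value is in milliseconds: after a task fails, it is not rescheduled on the same executor for 30 s. This setting comes from older Spark releases; newer releases use the spark.blacklist.* (later spark.excludeOnFailure.*) settings instead.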

- Prefer the higher-level APIs (DataFrame, Dataset, Spark SQL) over the lower-level RDD API

The higher-level APIs are optimized by Catalyst and Tungsten.

- Balance YARN resources so that each executor can run multiple tasks

- Choose a proper partition size

- Avoid shuffling too much data
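
The resource-balancing and partition-size points above can be turned into quick arithmetic. The sketch below is a rule of thumb, not a Spark API: the function names, the 5 cores per executor, the 1 core and 1 GB reserved per node, the 10% overhead allowance, and the 128 MB target partition size are all assumptions to adjust for the cluster at hand.

```python
import math


def executors_per_node(node_cores, node_memory_gb, cores_per_executor=5,
                       reserved_cores=1, reserved_memory_gb=1,
                       overhead_fraction=0.10):
    """Split one YARN node into multi-core executors (rule-of-thumb sketch).

    Returns (executor count, heap size in whole GB per executor).
    """
    usable_cores = node_cores - reserved_cores
    count = usable_cores // cores_per_executor
    if count < 1:
        raise ValueError("node too small for the requested executor size")
    memory_per_container = (node_memory_gb - reserved_memory_gb) / count
    # Leave room for off-heap overhead inside each YARN container.
    heap_gb = math.floor(memory_per_container * (1 - overhead_fraction))
    return count, heap_gb


def partition_count(input_bytes, target_partition_bytes=128 * 1024 ** 2,
                    total_cores=1):
    """Pick a partition count from data size, but never fewer than the cores."""
    by_size = math.ceil(input_bytes / target_partition_bytes)
    return max(by_size, total_cores)


if __name__ == "__main__":
    count, heap_gb = executors_per_node(node_cores=16, node_memory_gb=64)
    print(f"{count} executors per node, {heap_gb}g heap each")
    print(partition_count(10 * 1024 ** 3, total_cores=15), "partitions")
```

Under these assumptions, a node with 16 cores and 64 GB yields 3 executors of 5 cores each, so every executor can run several tasks at once, and a 10 GiB input is split into 80 partitions of about 128 MB.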
