I. Spark on YARN supports two modes for submitting jobs
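1. cluster mode:
The ApplicationMaster process handles resource allocation and launches the executors; the driver runs inside the ApplicationMaster container.
Submit command: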
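--master yarn \
--deploy-mode cluster \
--driver-memory 5G \ memory allocated to the container hosting the ApplicationMaster
--driver-cores 5 \ number of cores allocated to the container hosting the ApplicationMaster
--executor-memory 3G \ memory of each executor container
--executor-cores 2 \ number of cores of each executor container
--num-executors 20 \ number of executors requested (fewer may actually start if the cluster lacks resources)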
Total driver memory (all values in MB):
ceil((driver-memory + max(driver-memory * 0.1, 384)) / yarn.scheduler.minimum-allocation-mb) * yarn.scheduler.minimum-allocation-mb
Total executor memory (all values in MB):
ceil((executor-memory + max(executor-memory * 0.1, 384)) / yarn.scheduler.minimum-allocation-mb) * yarn.scheduler.minimum-allocation-mb * (number of executors actually started)
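As a worked example of the two formulas above, the following minimal sketch plugs in the values from the cluster-mode command (5G driver, 3G executors, 20 executors). The function name container_mb and its parameter names are made up for illustration, and yarn.scheduler.minimum-allocation-mb is assumed to be 1024, which is the YARN default; a real cluster may be configured differently.

```python
def container_mb(requested_mb, min_alloc_mb=1024,
                 overhead_factor=0.10, overhead_floor_mb=384):
    """Estimate the YARN container size in MB for a requested heap size.

    min_alloc_mb stands in for yarn.scheduler.minimum-allocation-mb
    (assumed to be 1024 here; check the cluster's yarn-site.xml).
    """
    # Overhead is 10% of the requested memory, but never less than 384 MB.
    overhead_mb = max(int(requested_mb * overhead_factor), overhead_floor_mb)
    needed_mb = requested_mb + overhead_mb
    # Integer ceiling division, then scale back up to a multiple of min_alloc_mb.
    return -(-needed_mb // min_alloc_mb) * min_alloc_mb


driver_mb = container_mb(5 * 1024)    # --driver-memory 5G
executor_mb = container_mb(3 * 1024)  # --executor-memory 3G
executors_total_mb = executor_mb * 20  # assumes all 20 executors actually start

print(driver_mb, executor_mb, executors_total_mb)  # 6144 4096 81920
```

Under these assumptions a 5G driver occupies a 6 GB container (5120 + 512 = 5632 MB, rounded up), and each 3G executor occupies a 4 GB container (3072 + 384 = 3456 MB, rounded up).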
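2. client mode:
The ExecutorLauncher process handles resource allocation and launches the executors; the driver runs in the client process that submitted the job, outside YARN.
Submit command: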
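--master yarn \
--deploy-mode client \
--conf spark.yarn.am.memory=1g \ memory of the container hosting the ExecutorLauncher
--conf spark.yarn.am.cores=3 \ number of cores of the container hosting the ExecutorLauncher
--executor-memory 3G \ memory of each executor container
--executor-cores 2 \ number of cores of each executor container
--num-executors 20 \ number of executors requested (fewer may actually start if the cluster lacks resources)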
Total ExecutorLauncher (ApplicationMaster) memory (all values in MB; in client mode the driver itself does not occupy a YARN container):
ceil((spark.yarn.am.memory + max(spark.yarn.am.memory * 0.1, 384)) / yarn.scheduler.minimum-allocation-mb) * yarn.scheduler.minimum-allocation-mb
Total executor memory (all values in MB):
ceil((executor-memory + max(executor-memory * 0.1, 384)) / yarn.scheduler.minimum-allocation-mb) * yarn.scheduler.minimum-allocation-mb * (number of executors actually started)
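To illustrate the client-mode formulas, here is a rough sketch that estimates the total YARN memory footprint of the client-mode command above (1g ExecutorLauncher, 3G executors, 20 executors). The helper names round_up_container_mb and client_mode_footprint_mb are hypothetical, and yarn.scheduler.minimum-allocation-mb is again assumed to be 1024 (the YARN default).

```python
def round_up_container_mb(requested_mb, min_alloc_mb):
    """Requested memory plus max(10%, 384 MB) overhead, rounded up to min_alloc_mb."""
    overhead_mb = max(int(requested_mb * 0.10), 384)
    needed_mb = requested_mb + overhead_mb
    return -(-needed_mb // min_alloc_mb) * min_alloc_mb  # ceiling to a multiple


def client_mode_footprint_mb(am_mb, executor_mb, running_executors,
                             min_alloc_mb=1024):
    """Estimate YARN memory used by a client-mode application, in MB.

    The driver is not counted: in client mode it runs outside YARN.
    min_alloc_mb is an assumed value for yarn.scheduler.minimum-allocation-mb.
    """
    am_container = round_up_container_mb(am_mb, min_alloc_mb)
    executor_container = round_up_container_mb(executor_mb, min_alloc_mb)
    return am_container + executor_container * running_executors


# spark.yarn.am.memory=1g, --executor-memory 3G, 20 executors actually started
total_mb = client_mode_footprint_mb(1024, 3 * 1024, 20)
print(total_mb)  # 83968
```

Note that for the 1g ExecutorLauncher the 384 MB floor dominates the 10% term (102 MB), so the container needs 1408 MB and is rounded up to 2048 MB under the assumed 1024 MB allocation unit.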