Spark runtime processes explained
- Application => driver program + n executors on the cluster (all of them are processes)
- Driver program => main() + SparkContext
- Cluster manager => standalone | mesos | YARN
- Deploy mode => client | cluster
- Worker node => any cluster node that can run application code
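- Executor => a process that runs tasks and caches data; each application has its own executors
- Task => a unit of work, sent to an executor for execution
- Job => created each time an action is encountered (e.g. save, collect)
- Stage => a shuffle starts a new stage (e.g. reduceByKey)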
Spark applications: each application runs as its own independent set of processes, and each application has its own executors.
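
The Job and Stage rules in the list above can be sketched with a small toy model. This is plain Python, not the Spark API: `plan_jobs`, `SHUFFLE_OPS` and `ACTIONS` are hypothetical names made up for illustration, and the operation sets are only a hand-picked subset. The model assumes a single linear chain of operations and ignores caching and stage reuse.

```python
# Toy model (NOT the Spark API): how actions create jobs and shuffles split stages.
# Assumption: a hand-picked subset of operation names, for illustration only.
SHUFFLE_OPS = {"reduceByKey", "groupByKey", "repartition"}
ACTIONS = {"collect", "count", "saveAsTextFile"}


def plan_jobs(ops):
    """Return one entry per job; each job is a list of stages (lists of op names)."""
    jobs = []
    lineage = []  # transformations recorded so far; nothing runs until an action
    for op in ops:
        if op not in ACTIONS:
            lineage.append(op)  # transformations are lazy: just remember them
            continue
        # An action triggers a job over the whole lineage.
        stages = [[]]
        for transformation in lineage:
            if transformation in SHUFFLE_OPS:
                stages.append([])  # a shuffle starts a new stage
            stages[-1].append(transformation)
        stages[-1].append(op)  # the action runs in the last stage
        jobs.append(stages)
    return jobs


if __name__ == "__main__":
    word_count = ["textFile", "flatMap", "map", "reduceByKey", "collect"]
    for i, stages in enumerate(plan_jobs(word_count)):
        print(f"job {i}: {len(stages)} stage(s) -> {stages}")
```

In this model, a word-count style chain ending in `collect` yields one job with two stages, because `reduceByKey` is the only shuffle. A chain with no action yields no job at all. In a real application the same breakdown should be visible in the Spark web UI.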