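Compared with its predecessor, it addresses the JobTracker's problems: the JobTracker was a single point of failure; it was overloaded (it both scheduled work and tracked status), which limited how far a cluster could scale out; its allocation policy was coarse (resource consumption was estimated from the number of map/reduce tasks rather than from compute and memory); and upgrades could not be localized, so they affected the whole cluster.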
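To make the "coarse allocation" point concrete, here is a minimal sketch that contrasts counting task slots with checking actual memory and CPU. The `Task` type, the node sizes, and both helper functions are hypothetical illustrations, not Hadoop code.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    memory_mb: int
    vcores: int


def fits_by_slots(tasks, slots):
    """Slot-based check: only the number of tasks matters."""
    return len(tasks) <= slots


def fits_by_resources(tasks, memory_mb, vcores):
    """Resource-based check: the summed memory and CPU must both fit."""
    return (sum(t.memory_mb for t in tasks) <= memory_mb
            and sum(t.vcores for t in tasks) <= vcores)


# Hypothetical node: 4 task slots, 8192 MB of memory, 4 vcores.
tasks = [
    Task("map-0", 4096, 1),
    Task("map-1", 4096, 1),
    Task("reduce-0", 4096, 2),
]

slot_ok = fits_by_slots(tasks, slots=4)
resource_ok = fits_by_resources(tasks, memory_mb=8192, vcores=4)
```

With these made-up numbers the slot check accepts three tasks on a four-slot node, while the resource check rejects them because together they need 12288 MB of an 8192 MB node.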
YARN breaks the whole into parts: it separates the JobTracker's responsibilities and also shards them.
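ResourceManager: the central component. It accepts job requests, collects resource status from the NodeManagers, and schedules, launches, and fails over the ApplicationMaster that belongs to each job.
ApplicationMaster: roughly the counterpart of the old JobTracker, but it manages only a single job.
NodeManager: manages containers and stays in communication with the RM (ResourceManager).
Container: an isolated execution unit under a NodeManager. It is meant to take on more roles later, for example holding compute and memory resources much like a VM, rather than being a simple m/r task slot.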
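The division of labour among these four components can be sketched as a toy model. This is only an illustration under stated assumptions: the class and method names mirror the roles above but are not the real YARN API, placement is a naive first-fit, the default ApplicationMaster size is invented, and failover is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    name: str
    free_memory_mb: int
    free_vcores: int
    containers: list = field(default_factory=list)

    def report(self):
        """Status the node sends to the ResourceManager."""
        return {"memory_mb": self.free_memory_mb, "vcores": self.free_vcores}

    def launch(self, container):
        """Start a container and reserve its resources on this node."""
        self.free_memory_mb -= container.memory_mb
        self.free_vcores -= container.vcores
        self.containers.append(container)


@dataclass
class ApplicationMaster:
    job_id: str            # one ApplicationMaster per job
    container: Container   # the container the master itself runs in


class ResourceManager:
    def __init__(self, nodes):
        self.nodes = nodes
        self.masters = {}  # job_id -> ApplicationMaster
        self._next_id = 0

    def allocate(self, memory_mb, vcores):
        """First-fit placement based on each node's reported status."""
        for node in self.nodes:
            status = node.report()
            if status["memory_mb"] >= memory_mb and status["vcores"] >= vcores:
                self._next_id += 1
                container = Container(self._next_id, memory_mb, vcores)
                node.launch(container)
                return container
        return None

    def submit(self, job_id, am_memory_mb=1024, am_vcores=1):
        """Accept a job and launch its ApplicationMaster in a container."""
        container = self.allocate(am_memory_mb, am_vcores)
        if container is None:
            raise RuntimeError("no capacity for an ApplicationMaster")
        master = ApplicationMaster(job_id, container)
        self.masters[job_id] = master
        return master


# Hypothetical two-node cluster.
nodes = [NodeManager("nm-1", 2048, 2), NodeManager("nm-2", 4096, 4)]
rm = ResourceManager(nodes)
am_a = rm.submit("job-a")
am_b = rm.submit("job-b")
am_c = rm.submit("job-c")
```

In this run the first two masters fill `nm-1` and the third spills over to `nm-2`. Each job gets its own master, which is the sharding of the old JobTracker role.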
Other material:
The ResourceManager has two main components: Scheduler and ApplicationsManager.
The Scheduler has a pluggable policy plug-in, which is responsible for partitioning the cluster resources among the various queues, applications etc. The current Map-Reduce schedulers such as the CapacityScheduler and the FairScheduler would be some examples of the plug-in.
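The idea of a pluggable policy can be sketched as a scheduler that delegates the partitioning decision to an interchangeable function. Everything here is an assumption made for illustration: the `Scheduler` class, the two policies, the queue names, and the numbers are invented, and the policies are far simpler than the real CapacityScheduler and FairScheduler.

```python
from typing import Callable, Dict

# A policy maps (total cluster memory, per-queue demand) to per-queue grants.
Policy = Callable[[int, Dict[str, int]], Dict[str, int]]


def capacity_style(fractions: Dict[str, float]) -> Policy:
    """Each queue gets at most a configured fraction of the cluster."""
    def policy(total_mb: int, demand_mb: Dict[str, int]) -> Dict[str, int]:
        return {
            queue: min(demand_mb.get(queue, 0), int(total_mb * fraction))
            for queue, fraction in fractions.items()
        }
    return policy


def fair_style(total_mb: int, demand_mb: Dict[str, int]) -> Dict[str, int]:
    """Queues with demand split the cluster equally (leftovers are not redistributed)."""
    active = [queue for queue, demand in demand_mb.items() if demand > 0]
    if not active:
        return {}
    share = total_mb // len(active)
    return {queue: min(demand_mb[queue], share) for queue in active}


class Scheduler:
    """Holds no partitioning logic of its own; the policy is plugged in."""

    def __init__(self, policy: Policy):
        self.policy = policy

    def partition(self, total_mb: int, demand_mb: Dict[str, int]) -> Dict[str, int]:
        return self.policy(total_mb, demand_mb)


# Hypothetical queues and demand on an 8192 MB cluster.
demand = {"prod": 9000, "dev": 2000}
cap = Scheduler(capacity_style({"prod": 0.75, "dev": 0.25})).partition(8192, demand)
fair = Scheduler(fair_style).partition(8192, demand)
```

Swapping the policy changes the outcome without touching `Scheduler`: the capacity-style policy grants `prod` 6144 MB, the fair-style one grants it 4096 MB, and `dev` receives its full 2000 MB in both cases.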