Hadoop YARN

Article link: https://blog.csdn.net/lucklilili/article/details/119867549

The fundamental idea of YARN is to split the functions of resource management and job scheduling/monitoring into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.


The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent that is responsible for containers: it monitors their resource usage (CPU, memory, disk, network) and reports it to the ResourceManager/Scheduler.
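The reporting relationship described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyResourceManager`, `ToyNodeManager`, and the shape of the usage report are hypothetical names invented for illustration, not the YARN API, and only memory is tracked.

```python
from typing import Dict


class ToyResourceManager:
    """Hypothetical stand-in for the RM: keeps the latest report per node."""

    def __init__(self) -> None:
        self.node_reports: Dict[str, Dict[str, int]] = {}

    def heartbeat(self, node_id: str, report: Dict[str, int]) -> None:
        # A newer report from the same node replaces the older one.
        self.node_reports[node_id] = report

    def cluster_memory_used_mb(self) -> int:
        return sum(r["memory_mb"] for r in self.node_reports.values())


class ToyNodeManager:
    """Hypothetical per-machine agent: owns containers and reports usage."""

    def __init__(self, node_id: str, rm: ToyResourceManager) -> None:
        self.node_id = node_id
        self.rm = rm
        self.containers: Dict[str, int] = {}  # container id -> memory in MB

    def launch(self, container_id: str, memory_mb: int) -> None:
        self.containers[container_id] = memory_mb

    def report(self) -> None:
        usage = {
            "containers": len(self.containers),
            "memory_mb": sum(self.containers.values()),
        }
        self.rm.heartbeat(self.node_id, usage)
```

The point of the sketch is the direction of information flow: each node agent measures locally and pushes a summary, so the central component only ever sees aggregated per-node state.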


The per-application ApplicationMaster is, in effect, a framework-specific library. It is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.


The ResourceManager has two main components: Scheduler and ApplicationsManager.


The Scheduler is responsible for allocating resources to the various running applications, subject to familiar constraints such as capacities and queues. The Scheduler is a pure scheduler in the sense that it performs no monitoring or tracking of application status. It also offers no guarantees about restarting tasks that fail, whether because of application failure or hardware failure. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network.
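To make the Container abstraction and the "pure scheduler" idea concrete, here is a minimal sketch. All names (`Resource`, `Request`, `ToyScheduler`) are hypothetical, the model only covers memory and vcores, and the real YARN schedulers are far more elaborate.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Resource:
    """Hypothetical container size: memory and virtual cores only."""
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores

    def minus(self, other: "Resource") -> "Resource":
        return Resource(self.memory_mb - other.memory_mb, self.vcores - other.vcores)


@dataclass(frozen=True)
class Request:
    app_id: str
    queue: str
    resource: Resource


class ToyScheduler:
    """Pure scheduler: grants containers against queue capacity and
    keeps no record of application status or task failures."""

    def __init__(self, queue_capacity: Dict[str, Resource]) -> None:
        self.free = dict(queue_capacity)  # queue name -> remaining capacity

    def allocate(self, request: Request) -> Optional[Tuple[str, Resource]]:
        free = self.free[request.queue]
        if not request.resource.fits_in(free):
            return None  # the request exceeds what the queue has left
        self.free[request.queue] = free.minus(request.resource)
        return (request.app_id, request.resource)


scheduler = ToyScheduler({"prod": Resource(8192, 4), "dev": Resource(2048, 2)})
granted = scheduler.allocate(Request("app_1", "prod", Resource(4096, 2)))
denied = scheduler.allocate(Request("app_2", "dev", Resource(4096, 1)))
```

Here `granted` is a container for `app_1`, while `denied` is `None` because the `dev` queue does not have 4096 MB left. Note that nothing in `ToyScheduler` knows whether `app_1` later succeeds or fails, which is the separation of concerns the paragraph describes.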


The Scheduler has a pluggable policy that is responsible for partitioning the cluster resources among the various queues, applications, and so on. The current schedulers, such as the CapacityScheduler and the FairScheduler, are examples of such plug-ins.
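One way to picture a pluggable policy is as a function that the scheduler calls to divide capacity, so that swapping the function changes the behavior without touching the scheduler. The sketch below is a loose illustration only: `capacity_policy` and `fair_policy` are hypothetical simplifications inspired by the names above and do not reproduce the actual CapacityScheduler or FairScheduler algorithms.

```python
from typing import Callable, Dict

# A policy maps (total memory, per-queue demand) to per-queue grants.
Policy = Callable[[int, Dict[str, int]], Dict[str, int]]


def capacity_policy(percentages: Dict[str, int]) -> Policy:
    """Build a policy that caps each queue at a fixed share of the cluster."""
    def policy(total_mb: int, demand_mb: Dict[str, int]) -> Dict[str, int]:
        return {
            queue: min(demand_mb.get(queue, 0), total_mb * pct // 100)
            for queue, pct in percentages.items()
        }
    return policy


def fair_policy(total_mb: int, demand_mb: Dict[str, int]) -> Dict[str, int]:
    """Split the cluster equally among the queues that currently have demand."""
    active = [queue for queue, demand in demand_mb.items() if demand > 0]
    if not active:
        return {}
    share = total_mb // len(active)
    return {queue: min(demand_mb[queue], share) for queue in active}


class PluggableScheduler:
    """Hypothetical scheduler shell: all partitioning logic lives in the policy."""

    def __init__(self, total_mb: int, policy: Policy) -> None:
        self.total_mb = total_mb
        self.policy = policy

    def partition(self, demand_mb: Dict[str, int]) -> Dict[str, int]:
        return self.policy(self.total_mb, demand_mb)
```

With 10000 MB and a demand of 9000 MB from `prod` and 1000 MB from `dev`, a 70/30 capacity policy would grant 7000 and 1000, while the equal-split policy would grant 5000 and 1000. Neither toy policy redistributes unused share, which real schedulers typically do.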



The ApplicationsManager is responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure. The per-application ApplicationMaster is responsible for negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring their progress.
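The restart-on-failure service can be sketched as a bounded retry loop around launching the ApplicationMaster. This is a toy model: `ToyApplicationsManager`, its `submit` method, and the `launch_am` callback are hypothetical, and the limit of two attempts is an assumed value chosen for the example (in a real cluster the bound is set through configuration such as `yarn.resourcemanager.am.max-attempts`).

```python
from typing import Callable, Dict


class ToyApplicationsManager:
    """Hypothetical sketch: accept a submission, start the AM, retry on failure."""

    def __init__(self, max_attempts: int = 2) -> None:
        self.max_attempts = max_attempts      # assumed limit for illustration
        self.attempts: Dict[str, int] = {}    # app id -> attempts used so far

    def submit(self, app_id: str, launch_am: Callable[[int], None]) -> str:
        for attempt in range(1, self.max_attempts + 1):
            self.attempts[app_id] = attempt
            try:
                # Stands in for "negotiate the first container and run the AM in it".
                launch_am(attempt)
                return "FINISHED"
            except RuntimeError:
                continue  # AM container failed: try again in a fresh attempt
        return "FAILED"


def flaky_am(attempt: int) -> None:
    """Example AM that crashes on its first attempt only."""
    if attempt == 1:
        raise RuntimeError("AM container lost")
```

Calling `ToyApplicationsManager().submit("app_1", flaky_am)` succeeds on the second attempt. Only the ApplicationMaster itself is restarted here; recovering the tasks it was running is left to the ApplicationMaster, consistent with the division of responsibilities described above.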


MapReduce in hadoop-2.x maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of YARN with just a recompile.


YARN supports the notion of resource reservation via the ReservationSystem, a component that allows users to specify a profile of resources over time together with temporal constraints (e.g., deadlines), and to reserve resources to ensure the predictable execution of important jobs. The ReservationSystem tracks resources over time, performs admission control for reservations, and dynamically instructs the underlying scheduler to ensure that each reservation is fulfilled.
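Admission control over time can be illustrated with a toy plan that tracks free capacity per time slot. Everything here is an assumption made for the example: the `ToyPlan` class, the discrete time slots, the constant-size reservation, and the "earliest feasible start" placement rule. The actual ReservationSystem supports richer resource profiles and placement strategies.

```python
from typing import Optional


class ToyPlan:
    """Hypothetical plan: free container count for each discrete time slot."""

    def __init__(self, capacity: int, horizon: int) -> None:
        self.free = [capacity] * horizon

    def reserve(self, containers: int, duration: int,
                arrival: int, deadline: int) -> Optional[int]:
        """Admit a reservation of `containers` for `duration` consecutive slots
        that starts no earlier than `arrival` and ends by `deadline` (exclusive).
        Returns the chosen start slot, or None if the request is rejected."""
        last_start = min(deadline, len(self.free)) - duration
        for start in range(arrival, last_start + 1):
            window = range(start, start + duration)
            if all(self.free[t] >= containers for t in window):
                for t in window:
                    self.free[t] -= containers  # commit capacity over time
                return start
        return None  # admission control: no window meets the deadline


plan = ToyPlan(capacity=10, horizon=6)
first = plan.reserve(containers=8, duration=3, arrival=0, deadline=6)   # 0
second = plan.reserve(containers=5, duration=2, arrival=0, deadline=6)  # 3
third = plan.reserve(containers=6, duration=2, arrival=0, deadline=5)   # None
```

The third request is rejected because the only slots with enough room before its deadline were partly committed to the second reservation. Rejecting at admission time, rather than failing later, is what makes the accepted reservations predictable.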


To scale YARN beyond a few thousand nodes, YARN supports the notion of Federation via the YARN Federation feature. Federation makes it possible to transparently wire together multiple YARN (sub-)clusters and make them appear as a single massive cluster. This can be used to achieve larger scale, and/or to allow multiple independent clusters to be used together for very large jobs, or for tenants who have capacity across all of them.
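The "single massive cluster" view can be sketched as a thin layer that aggregates sub-cluster capacity and routes each submission. This is a deliberately simplified model: `ToyFederation` and its "most free capacity" routing rule are hypothetical, and unlike the real feature this toy never lets one application span several sub-clusters.

```python
from typing import Dict, Optional


class ToyFederation:
    """Hypothetical front end that presents several sub-clusters as one."""

    def __init__(self, subclusters: Dict[str, int]) -> None:
        self.free = dict(subclusters)  # sub-cluster name -> free containers

    def total_free(self) -> int:
        # What a client sees: one cluster with the combined capacity.
        return sum(self.free.values())

    def submit(self, containers: int) -> Optional[str]:
        """Route the application to the sub-cluster with the most free capacity.
        Returns the chosen sub-cluster name, or None if none can host it."""
        name = max(self.free, key=self.free.get)
        if self.free[name] < containers:
            return None
        self.free[name] -= containers
        return name
```

For example, with sub-clusters `sc1` (100 free) and `sc2` (300 free), a request for 250 containers would be routed to `sc2`, and the client never needs to know which sub-cluster was picked.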
