- What is YARN
- How a YARN application runs
- Resource requests
all requests made up front (Spark), or made dynamically (MapReduce: map task requests are made up front, but reduce task requests are made later, as needed)
- Application lifespan: one application per job (MapReduce); one application per workflow or user session (Spark, more efficient since containers are reused between jobs); or a long-running application shared by different users
- YARN compared to MapReduce 1
scalability, availability, utilization, multitenancy
- Scheduling:
delay scheduling to meet data locality;
preemption when a queue has been below its minimum share, or below half of its fair share, for longer than the configured timeout
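As a sketch of where those preemption knobs live with the Fair Scheduler (the property and element names are real; the timeout values here are illustrative, not recommendations): preemption is enabled in yarn-site.xml, and the timeouts go in the allocation file.

```xml
<!-- yarn-site.xml: turn preemption on -->
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>

<!-- fair scheduler allocation file: cluster-wide defaults, in seconds -->
<allocations>
  <defaultMinSharePreemptionTimeout>60</defaultMinSharePreemptionTimeout>
  <defaultFairSharePreemptionTimeout>300</defaultFairSharePreemptionTimeout>
</allocations>
```

Per-queue `minSharePreemptionTimeout` and `fairSharePreemptionTimeout` elements can override these defaults inside a `<queue>` element.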
- FIFO
- Capacity
- Fair
<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="prod">
    <weight>40</weight>
    <schedulingPolicy>fifo</schedulingPolicy>
  </queue>
  <queue name="dev">
    <weight>60</weight>
    <queue name="eng" />
    <queue name="science" />
  </queue>
  <queuePlacementPolicy>
    <!-- use the queue named at submit time, but don't create it if it doesn't exist -->
    <rule name="specified" create="false" />
    <!-- else try a queue named after the user's primary Unix group, without creating it -->
    <rule name="primaryGroup" create="false" />
    <!-- else fall back to dev.eng -->
    <rule name="default" queue="dev.eng" />
  </queuePlacementPolicy>
</allocations>
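The steady-state shares implied by the weights above can be worked out by normalizing weights level by level down the queue tree. A minimal sketch (the helper below is hypothetical, not part of YARN; child queues default to weight 1):

```python
def fair_shares(tree, share=1.0, prefix=""):
    """tree maps queue name -> (weight, child-tree-or-None).
    Returns each queue's steady-state share of the cluster."""
    total = sum(w for w, _ in tree.values())
    out = {}
    for name, (weight, children) in tree.items():
        q_share = share * weight / total  # normalize weights among siblings
        path = prefix + name
        out[path] = q_share
        if children:
            out.update(fair_shares(children, q_share, path + "."))
    return out

# The hierarchy from the allocation file above:
queues = {
    "prod": (40, None),
    "dev": (60, {"eng": (1, None), "science": (1, None)}),  # default weight 1
}
shares = fair_shares(queues)
# prod gets 40%, dev 60%, split evenly into dev.eng and dev.science (30% each)
```

These are entitlements when both queues are busy; an idle queue's capacity is lent to the others until it needs it back.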
- Dominant Resource Fairness
Imagine a cluster with a total of 100 CPUs and 10 TB of memory. Application A requests
containers of (2 CPUs, 300 GB), and application B requests containers of (6 CPUs, 100
GB). A’s request is (2%, 3%) of the cluster, so memory is dominant since its proportion
(3%) is larger than CPU’s (2%). B’s request is (6%, 1%), so CPU is dominant. Since B’s
container requests are twice as big in the dominant resource (6% versus 3%), it will be
allocated half as many containers under fair sharing.
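The arithmetic in that example can be checked directly. A small sketch (illustrative code, not YARN's implementation): compute each request's share of every resource, take the largest as the dominant share, and note that fair sharing equalizes dominant shares, so container counts are inversely proportional to per-container dominant share.

```python
# Cluster from the example: 100 CPUs, 10 TB (= 10,000 GB) of memory.
CLUSTER = {"cpu": 100, "mem_gb": 10_000}

def dominant_share(request):
    """Return (dominant resource, its fraction of the cluster)."""
    shares = {r: request[r] / CLUSTER[r] for r in CLUSTER}
    dom = max(shares, key=shares.get)
    return dom, shares[dom]

a_dom, a_share = dominant_share({"cpu": 2, "mem_gb": 300})  # memory dominant, 3%
b_dom, b_share = dominant_share({"cpu": 6, "mem_gb": 100})  # CPU dominant, 6%

# B's dominant share per container is twice A's, so under DRF
# B is allocated half as many containers as A.
ratio = b_share / a_share
```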