

ccah-500 Question 5: How will the Fair Scheduler handle these two jobs?

5. You have a cluster running with the Fair Scheduler enabled. There are currently no jobs running on the cluster, and you submit Job A, so that only Job A is running on the cluster. A while later, you submit Job B. Now Job A and Job B are running on the cluster at the same time. How will the Fair Scheduler handle these two jobs?

A. When Job B gets submitted, it will get assigned tasks, while Job A continues to run with fewer tasks.

B. When Job B gets submitted, Job A has to finish first, before Job B can get scheduled.

C. When Job A gets submitted, it doesn't consume all the task slots.

D. When Job A gets submitted, it consumes all the task slots. 

Answer: B --> A (the circulated answer key gives B; the correct answer is A)

Explanation: A



With the Fair Scheduler (iii in Figure 4-3), there is no need to reserve a set amount of capacity, since it will dynamically balance resources between all running jobs. Just after the first (large) job starts, it is the only job running, so it gets all the resources in the cluster. When the second (small) job starts, it is allocated half of the cluster resources so that each job is using its fair share of resources.

Note that there is a lag between the time the second job starts and when it receives its fair share, since it has to wait for resources to free up as containers used by the first job complete. After the small job completes and no longer requires resources, the large job goes back to using the full cluster capacity again. The overall effect is both high cluster utilization and timely small job completion.
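The behavior described above, including the lag before Job B reaches its fair share, can be illustrated with a small toy simulation. This is only a sketch and not Hadoop's actual scheduler code: the function `simulate`, its parameters, and the assumption that every task occupies one slot for a fixed number of time steps (with no preemption) are all made up for illustration.

```python
def simulate(capacity, jobs, task_steps, total_steps):
    """Toy fair scheduler without preemption (illustrative only).

    jobs maps a job name to (submit_step, number_of_tasks).
    Every task holds one slot for `task_steps` time steps.
    Returns, for each step, the number of slots each job is using.
    """
    pending = {name: count for name, (_, count) in jobs.items()}
    running = []   # list of (job_name, finish_step)
    timeline = []

    for step in range(total_steps):
        # Release the slots of tasks that have finished by this step.
        running = [(job, finish) for job, finish in running if finish > step]
        free = capacity - len(running)

        while free > 0:
            # Jobs that are already submitted and still have work to hand out.
            candidates = [name for name, (submit, _) in jobs.items()
                          if submit <= step and pending[name] > 0]
            if not candidates:
                break
            # Give the free slot to the job currently holding the fewest slots.
            counts = {name: sum(1 for job, _ in running if job == name)
                      for name in candidates}
            pick = min(candidates, key=lambda name: counts[name])
            running.append((pick, step + task_steps))
            pending[pick] -= 1
            free -= 1

        timeline.append({name: sum(1 for job, _ in running if job == name)
                         for name in jobs})
    return timeline


# Job A (large) is submitted at step 0, Job B (small) at step 1.
timeline = simulate(capacity=4,
                    jobs={"A": (0, 40), "B": (1, 4)},
                    task_steps=2,
                    total_steps=8)
for step, usage in enumerate(timeline):
    print(step, usage)
```

In this run Job A holds all four slots at steps 0 and 1, even though Job B has already been submitted at step 1. Once A's tasks complete at step 2, the slots are split two and two, and after B's four tasks are done A takes the whole cluster again. This matches option A: Job B gets tasks assigned while Job A keeps running with fewer tasks.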


Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also a reasonable way to share a cluster between a number of users.
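The idea of "an equal share of resources" can be made concrete with a small calculation. The sketch below computes a max-min fair split of task slots, where a job that needs less than its equal share gives the surplus back to the others. The function name `fair_shares` and its inputs are hypothetical and chosen for illustration. The real Fair Scheduler also takes pools, weights, and minimum shares into account, which this sketch ignores.

```python
def fair_shares(demands, capacity):
    """Max-min fair split of `capacity` slots (illustrative only).

    demands maps a job name to the number of slots it could use.
    Returns a dict mapping each job name to its allocated share.
    """
    shares = {job: 0 for job in demands}
    remaining = capacity
    active = {job for job, demand in demands.items() if demand > 0}

    while remaining > 0 and active:
        per_job = remaining / len(active)
        # Jobs whose outstanding demand fits inside an equal split.
        satisfied = {job for job in active
                     if demands[job] - shares[job] <= per_job}
        if not satisfied:
            # Everyone wants more than an equal split: divide evenly and stop.
            for job in active:
                shares[job] += per_job
            break
        # Fully serve the small jobs, then redistribute what is left over.
        for job in satisfied:
            remaining -= demands[job] - shares[job]
            shares[job] = demands[job]
        active -= satisfied
    return shares


print(fair_shares({"A": 100}, 10))            # a single job uses the whole cluster
print(fair_shares({"A": 100, "B": 100}, 10))  # two large jobs split it evenly
print(fair_shares({"A": 100, "B": 2}, 10))    # a short job takes only what it needs
```

With 10 slots, a lone Job A receives all 10, two large jobs receive 5 each, and a Job B that only needs 2 slots leaves the other 8 to Job A.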

