MapReduce YARN Memory Parameters

Environment

Product: PHD 2.x - 3.x
OS: RHEL 6.x

Overview

This is an informational article that explains the Hadoop parameters used to manage memory allocation for MapReduce jobs executed in the YARN framework.

What is a container?

A container is a YARN JVM process. In MapReduce, the ApplicationMaster service and the map and reduce tasks all run as containers inside the YARN framework. You can view running container statistics through the ResourceManager web interface at "http://<resource_manager_host>:8088/cluster/scheduler".

The Basics

The YARN ResourceManager (RM) allocates resources to applications through logical queues of memory, CPU, and disk resources. By default, the RM will allow up to 8192MB ("yarn.scheduler.maximum-allocation-mb") per container allocation request made by an ApplicationMaster (AM). The default minimum allocation is 1024MB ("yarn.scheduler.minimum-allocation-mb"). The AM can only request resources from the RM in increments of ("yarn.scheduler.minimum-allocation-mb"), up to ("yarn.scheduler.maximum-allocation-mb"). The AM is responsible for rounding ("mapreduce.map.memory.mb") and ("mapreduce.reduce.memory.mb") up to a value divisible by ("yarn.scheduler.minimum-allocation-mb"). With these defaults, the RM will deny any allocation greater than 8192MB or not divisible by 1024MB.
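As a minimal sketch, these limits are set in yarn-site.xml on the ResourceManager; the values shown are the defaults described above:

<!-- yarn-site.xml: container allocation limits enforced by the RM -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value> <!-- smallest increment the RM will grant -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value> <!-- largest single container the RM will grant -->
</property>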

The Params

  • YARN
    • yarn.scheduler.minimum-allocation-mb
    • yarn.scheduler.maximum-allocation-mb
    • yarn.nodemanager.vmem-pmem-ratio
    • yarn.nodemanager.resource.memory-mb
  • MapReduce
    • Map Memory
      • mapreduce.map.java.opts
      • mapreduce.map.memory.mb
    • Reduce Memory
      • mapreduce.reduce.java.opts
      • mapreduce.reduce.memory.mb

The diagram above shows an example of a map container, a reduce container, and an ApplicationMaster container (AM JVM). The JVM rectangle represents the server process. The Max Heap and Max Virtual rectangles represent the maximum logical memory constraints for the JVM process, which are enforced by the NodeManager.

The map container memory allocation ("mapreduce.map.memory.mb") is set to 1536MB in this example. Because the minimum allocation ("yarn.scheduler.minimum-allocation-mb") is set to 1024MB, the AM will request 2048MB from the RM for each map container. This is a logical allocation used by the NodeManager to monitor the process memory usage; if the map task's memory usage exceeds 2048MB, the NodeManager will kill the task. The JVM heap size is set to 1024MB ("mapreduce.map.java.opts=-Xmx1024m"), which fits well inside the 2048MB logical allocation. The same is true for the reduce container, where ("mapreduce.reduce.memory.mb") is set to 3072MB.
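The following mapred-site.xml sketch matches this example. The reduce heap setting of -Xmx2560m is an assumption chosen to fit inside the 3072MB allocation, since the example above does not specify it:

<!-- mapred-site.xml: per-task memory settings for the example above -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1536</value> <!-- AM rounds this request up to 2048MB -->
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m</value> <!-- heap fits inside the 2048MB allocation -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>3072</value> <!-- already a multiple of 1024MB, so no rounding -->
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx2560m</value> <!-- assumed value, not from the original example -->
</property>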

When a MapReduce job completes, you will see several counters dumped at the end of the job output. The three memory counters below show how much physical memory was allocated versus how much virtual memory.

Physical memory (bytes) snapshot=21850116096
Virtual memory (bytes) snapshot=40047247360
Total committed heap usage (bytes)=22630105088

Virtual Memory

By default ("yarn.nodemanager.vmem-pmem-ratio") is set to 2.1.  This means that a map or reduce container can allocate up to 2.1 times the ("mapreduce.reduce.memory.mb") or ("mapreduce.map.memory.mb")  of virtual memory before the NM will kill the container.  If the ("mapreduce.map.memory.mb") is set to 1536MB then the total allowed virtual memory is 2.1 * 1536 = 3225.6MB.

When the NodeManager kills a container due to memory oversubscription, the log message looks similar to the example below.

Current usage: 2.1gb of 2.0gb physical memory used; 1.6gb of 3.15gb virtual memory used. Killing container.