Preface
- hadoop: 2.7.7
- The content of this article comes from Section 1.10 of the HDP Command Line Installation guide (2.6.5).
- HDP (Hortonworks Data Platform) is one of the most common third-party Hadoop distributions; similar distributions include CDH and MapR.
- YARN and MapReduce here refer specifically to Hadoop's built-in distributed resource-scheduling framework and distributed computing framework, respectively; by default, MapReduce runs on top of YARN.
Calculating YARN and MapReduce Memory Requirements

In a Hadoop cluster, YARN manages the resources available on each node and grants them to the applications running in the cluster (such as MapReduce). The Container is YARN's smallest unit of resource allocation and an encapsulation of hardware resources. How RAM (amount of memory), CORES (number of CPU cores), and DISKS (number of disks) are allocated is critical to balancing utilization across the cluster's nodes.

Besides what YARN uses, each node must also reserve part of its memory for system processes and for other Hadoop processes (such as HBase), so the memory allocated to YARN on a node should be smaller than its physical memory. The table below lists recommended reservations for different node memory sizes:
Physical RAM per node | Reserved for system | Reserved for HBase |
---|---|---|
4GB | 1GB | 1GB |
8GB | 2GB | 1GB |
16GB | 2GB | 2GB |
24GB | 4GB | 4GB |
48GB | 6GB | 8GB |
64GB | 8GB | 8GB |
72GB | 8GB | 8GB |
96GB | 12GB | 16GB |
128GB | 24GB | 24GB |
256GB | 32GB | 32GB |
512GB | 64GB | 64GB |
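The two reservation tables above can be turned into a small lookup helper. This is only a sketch: the `reserved_memory` function is my own, and falling back to the next-smaller listed size for in-between values is an assumption, not something the HDP guide specifies.

```python
import bisect

# Recommended reservations from the tables above, in GB.
SIZES = [4, 8, 16, 24, 48, 64, 72, 96, 128, 256, 512]
STACK = {4: 1, 8: 2, 16: 2, 24: 4, 48: 6, 64: 8, 72: 8, 96: 12,
         128: 24, 256: 32, 512: 64}
HBASE = {4: 1, 8: 1, 16: 2, 24: 4, 48: 8, 64: 8, 72: 8, 96: 16,
         128: 24, 256: 32, 512: 64}

def reserved_memory(total_gb, hbase_installed=False):
    """Return (system_reserved_gb, hbase_reserved_gb) for a node.

    Sizes between the listed rows fall back to the next-smaller row
    (an assumption; the guide only lists these discrete sizes).
    """
    idx = max(0, bisect.bisect_right(SIZES, total_gb) - 1)
    key = SIZES[idx]
    return STACK[key], (HBASE[key] if hbase_installed else 0)

print(reserved_memory(48))         # node without HBase
print(reserved_memory(128, True))  # node also running HBase
```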
YARN and MapReduce Memory Parameters and Formulas

Maximum number of containers per node:

containers = min(2 * CORES, 1.8 * DISKS, (total available RAM) / MIN_CONTAINER_SIZE)

- total available RAM: Total available RAM = Total RAM - Reserved Memory (i.e., the node's physical memory minus the reserved memory)
- CORES: the number of CPU cores on the node
- DISKS: the number of distinct disks backing the paths configured in dfs.datanode.data.dir (or the deprecated dfs.data.dir) in the node's hdfs-site.xml
- MIN_CONTAINER_SIZE: the minimum container memory allocation on the node

Recommended values for MIN_CONTAINER_SIZE:
Physical RAM per node | Recommended MIN_CONTAINER_SIZE |
---|---|
Less than 4 GB | 256 MB |
Between 4 GB and 8 GB | 512 MB |
Between 8 GB and 24 GB | 1024 MB |
Above 24 GB | 2048 MB |
Memory allocated to each container:

RAM-per-container = max(MIN_CONTAINER_SIZE, (total available RAM) / containers)
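The two formulas can be checked with a few lines of Python. This is a sketch: the function names are mine, and rounding the disk term up with math.ceil mirrors what the yarn-utils.py script later in this article does.

```python
import math

def yarn_containers(cores, disks, available_ram_gb, min_container_size_mb):
    """containers = min(2*CORES, 1.8*DISKS, total available RAM / MIN_CONTAINER_SIZE)"""
    available_mb = available_ram_gb * 1024
    return int(min(2 * cores,
                   math.ceil(1.8 * disks),  # rounded up, as yarn-utils.py does
                   available_mb // min_container_size_mb))

def ram_per_container_mb(available_ram_gb, containers, min_container_size_mb):
    """RAM-per-container = max(MIN_CONTAINER_SIZE, total available RAM / containers)"""
    return max(min_container_size_mb, (available_ram_gb * 1024) // containers)

# 12 cores, 12 disks, 42 GB available after reservations, 2048 MB minimum container:
containers = yarn_containers(12, 12, 42, 2048)
print(containers, ram_per_container_mb(42, containers, 2048))
```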
Formulas for the YARN and MapReduce parameters:

Configuration file | Parameter | Formula |
---|---|---|
yarn-site.xml | yarn.nodemanager.resource.memory-mb | = containers * RAM-per-container |
yarn-site.xml | yarn.scheduler.minimum-allocation-mb | = RAM-per-container |
yarn-site.xml | yarn.scheduler.maximum-allocation-mb | = containers * RAM-per-container |
mapred-site.xml | mapreduce.map.memory.mb | = RAM-per-container |
mapred-site.xml | mapreduce.reduce.memory.mb | = 2 * RAM-per-container |
mapred-site.xml | mapreduce.map.java.opts | = 0.8 * RAM-per-container |
mapred-site.xml | mapreduce.reduce.java.opts | = 0.8 * 2 * RAM-per-container |
mapred-site.xml | yarn.app.mapreduce.am.resource.mb | = 2 * RAM-per-container |
mapred-site.xml | yarn.app.mapreduce.am.command-opts | = 0.8 * 2 * RAM-per-container |
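Once containers and RAM-per-container are known, the formulas in the table are plain arithmetic. A sketch that derives all nine values (the property names are the real Hadoop ones from the table; the derive_settings helper is illustrative):

```python
def derive_settings(containers, ram_mb):
    """Apply the per-parameter formulas from the table above."""
    return {
        "yarn.nodemanager.resource.memory-mb": containers * ram_mb,
        "yarn.scheduler.minimum-allocation-mb": ram_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * ram_mb,
        "mapreduce.map.memory.mb": ram_mb,
        "mapreduce.reduce.memory.mb": 2 * ram_mb,
        "mapreduce.map.java.opts": "-Xmx%dm" % int(0.8 * ram_mb),
        "mapreduce.reduce.java.opts": "-Xmx%dm" % int(0.8 * 2 * ram_mb),
        "yarn.app.mapreduce.am.resource.mb": 2 * ram_mb,
        "yarn.app.mapreduce.am.command-opts": "-Xmx%dm" % int(0.8 * 2 * ram_mb),
    }

# Values for the worked example below: 21 containers of 2048 MB each.
for key, value in derive_settings(21, 2048).items():
    print("%s=%s" % (key, value))
```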
Other parameters:
- yarn.nodemanager.resource.cpu-vcores: the number of virtual CPUs YARN may use on this node; the default is 8. YARN does not detect the node's physical core count on its own, so it is currently recommended to set this to the number of physical cores; if the node has fewer than 8 cores, lower this value accordingly. Note: it is advisable to leave a few cores for the operating system.
- yarn.scheduler.minimum-allocation-vcores: the minimum number of virtual CPUs a single task may request; the default is 1. A request for fewer CPUs than this is rounded up to this value.
- yarn.scheduler.maximum-allocation-vcores: the maximum number of virtual CPUs a single task may request; the default is 32. Since a single container cannot span nodes, this should be set no larger than yarn.nodemanager.resource.cpu-vcores.
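As a yarn-site.xml fragment, the vcore settings might look like this (illustrative values only, taken from a 12-core node that leaves one core for the operating system):

```xml
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>11</value> <!-- 12 physical cores, 1 left for the OS -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
```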
Worked Example

Assume every node in the cluster has CORES = 12, RAM = 48 GB, and DISKS = 12, and HBase is not installed. Applying the reserved-memory recommendations and the formulas above:

Reserved RAM = 6 GB (reserved for the operating system)
MIN_CONTAINER_SIZE = 2 GB
Total available RAM = 48 - 6 = 42 GB
containers = min(2*12, 1.8*12, 42/2) = min(24, 21.6, 21) = 21
RAM-per-container = max(2, 42/21) = max(2, 2) = 2 GB

Parameter values:
Configuration file | Parameter | Calculation |
---|---|---|
yarn-site.xml | yarn.nodemanager.resource.memory-mb | = containers * RAM-per-container = 43008MB |
yarn-site.xml | yarn.scheduler.minimum-allocation-mb | = RAM-per-container = 2048MB |
yarn-site.xml | yarn.scheduler.maximum-allocation-mb | = containers * RAM-per-container = 43008MB |
mapred-site.xml | mapreduce.map.memory.mb | = RAM-per-container = 2048MB |
mapred-site.xml | mapreduce.reduce.memory.mb | = 2 * RAM-per-container = 4096MB |
mapred-site.xml | mapreduce.map.java.opts | = 0.8 * RAM-per-container = 1638MB |
mapred-site.xml | mapreduce.reduce.java.opts | = 0.8 * 2 * RAM-per-container = 3276MB |
mapred-site.xml | yarn.app.mapreduce.am.resource.mb | = 2 * RAM-per-container = 4096MB |
mapred-site.xml | yarn.app.mapreduce.am.command-opts | = 0.8 * 2 * RAM-per-container = 3276MB |
Other parameters:

Configuration file | Parameter | Value |
---|---|---|
yarn-site.xml | yarn.scheduler.minimum-allocation-vcores | 1 |
yarn-site.xml | yarn.scheduler.maximum-allocation-vcores | 11 |
yarn-site.xml | yarn.nodemanager.resource.cpu-vcores | 11 |
Parameter Calculation Script (yarn-utils.py)
- python3
```python
#!/usr/bin/env python
'''
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''
import optparse
import logging
import sys
import math
import ast

# Reserved for OS + DN + NM. Map: Memory (GB) => Reservation (GB)
reservedStack = {4: 1, 8: 2, 16: 2, 24: 4, 48: 6, 64: 8, 72: 8, 96: 12,
                 128: 24, 256: 32, 512: 64}
# Reserved for HBase. Map: Memory (GB) => Reservation (GB)
reservedHBase = {4: 1, 8: 1, 16: 2, 24: 4, 48: 8, 64: 8, 72: 8, 96: 16,
                 128: 24, 256: 32, 512: 64}
GB = 1024


def getMinContainerSize(memory):
    if memory <= 4:
        return 256
    elif memory <= 8:
        return 512
    elif memory <= 24:
        return 1024
    else:
        return 2048


def getReservedStackMemory(memory):
    if memory in reservedStack:
        return reservedStack[memory]
    if memory <= 4:
        return 1
    elif memory >= 512:
        return 64
    else:
        return 1


def getReservedHBaseMem(memory):
    if memory in reservedHBase:
        return reservedHBase[memory]
    if memory <= 4:
        return 1
    elif memory >= 512:
        return 64
    else:
        return 2


def main():
    log = logging.getLogger(__name__)
    out_hdlr = logging.StreamHandler(sys.stdout)
    out_hdlr.setFormatter(logging.Formatter(' %(message)s'))
    out_hdlr.setLevel(logging.INFO)
    log.addHandler(out_hdlr)
    log.setLevel(logging.INFO)

    parser = optparse.OptionParser()
    parser.add_option('-c', '--cores', default=16,
                      help='Number of cores on each host')
    parser.add_option('-m', '--memory', default=64,
                      help='Amount of Memory on each host in GB')
    parser.add_option('-d', '--disks', default=4,
                      help='Number of disks on each host')
    parser.add_option('-k', '--hbase', default="True",
                      help='True if HBase is installed, False if not')
    (options, args) = parser.parse_args()

    cores = int(options.cores)
    memory = int(options.memory)
    disks = int(options.disks)
    hbaseEnabled = ast.literal_eval(options.hbase)
    log.info("Using cores=" + str(cores) + " memory=" + str(memory) + "GB" +
             " disks=" + str(disks) + " hbase=" + str(hbaseEnabled))

    minContainerSize = getMinContainerSize(memory)
    reservedStackMemory = getReservedStackMemory(memory)
    reservedHBaseMemory = 0
    if hbaseEnabled:
        reservedHBaseMemory = getReservedHBaseMem(memory)
    reservedMem = reservedStackMemory + reservedHBaseMemory
    usableMem = memory - reservedMem
    memory -= reservedMem
    if memory < 2:
        memory = 2
        reservedMem = max(0, memory - reservedMem)
    memory *= GB

    containers = int(min(2 * cores,
                         min(math.ceil(1.8 * float(disks)),
                             memory // minContainerSize)))
    if containers <= 2:
        containers = 3
    log.info("Profile: cores=" + str(cores) + " memory=" + str(memory) + "MB"
             + " reserved=" + str(reservedMem) + "GB" + " usableMem="
             + str(usableMem) + "GB" + " disks=" + str(disks))

    container_ram = memory // containers
    if container_ram > GB:
        container_ram = int(math.floor(container_ram / 512)) * 512
    log.info("Num Container=" + str(containers))
    log.info("Container Ram=" + str(container_ram) + "MB")
    log.info("Used Ram=" + str(int(containers * container_ram / float(GB))) + "GB")
    log.info("Unused Ram=" + str(reservedMem) + "GB")
    log.info("yarn.scheduler.minimum-allocation-mb=" + str(container_ram))
    log.info("yarn.scheduler.maximum-allocation-mb=" + str(containers * container_ram))
    log.info("yarn.nodemanager.resource.memory-mb=" + str(containers * container_ram))

    map_memory = container_ram
    reduce_memory = 2 * container_ram if container_ram <= 2048 else container_ram
    am_memory = max(map_memory, reduce_memory)
    log.info("mapreduce.map.memory.mb=" + str(map_memory))
    log.info("mapreduce.map.java.opts=-Xmx" + str(int(0.8 * map_memory)) + "m")
    log.info("mapreduce.reduce.memory.mb=" + str(reduce_memory))
    log.info("mapreduce.reduce.java.opts=-Xmx" + str(int(0.8 * reduce_memory)) + "m")
    log.info("yarn.app.mapreduce.am.resource.mb=" + str(am_memory))
    log.info("yarn.app.mapreduce.am.command-opts=-Xmx" + str(int(0.8 * am_memory)) + "m")
    log.info("mapreduce.task.io.sort.mb=" + str(int(0.4 * map_memory)))


if __name__ == '__main__':
    try:
        main()
    except (KeyboardInterrupt, EOFError):
        print("\nAborting ... Keyboard Interrupt.")
        sys.exit(1)
```
- python2
```python
#!/usr/bin/env python
'''
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''
import optparse
import logging
import sys
import math
import ast

# Reserved for OS + DN + NM. Map: Memory (GB) => Reservation (GB)
reservedStack = {4: 1, 8: 2, 16: 2, 24: 4, 48: 6, 64: 8, 72: 8, 96: 12,
                 128: 24, 256: 32, 512: 64}
# Reserved for HBase. Map: Memory (GB) => Reservation (GB)
reservedHBase = {4: 1, 8: 1, 16: 2, 24: 4, 48: 8, 64: 8, 72: 8, 96: 16,
                 128: 24, 256: 32, 512: 64}
GB = 1024


def getMinContainerSize(memory):
    if memory <= 4:
        return 256
    elif memory <= 8:
        return 512
    elif memory <= 24:
        return 1024
    else:
        return 2048


def getReservedStackMemory(memory):
    if reservedStack.has_key(memory):
        return reservedStack[memory]
    if memory <= 4:
        return 1
    elif memory >= 512:
        return 64
    else:
        return 1


def getReservedHBaseMem(memory):
    if reservedHBase.has_key(memory):
        return reservedHBase[memory]
    if memory <= 4:
        return 1
    elif memory >= 512:
        return 64
    else:
        return 2


def main():
    log = logging.getLogger(__name__)
    out_hdlr = logging.StreamHandler(sys.stdout)
    out_hdlr.setFormatter(logging.Formatter(' %(message)s'))
    out_hdlr.setLevel(logging.INFO)
    log.addHandler(out_hdlr)
    log.setLevel(logging.INFO)

    parser = optparse.OptionParser()
    parser.add_option('-c', '--cores', default=16,
                      help='Number of cores on each host')
    parser.add_option('-m', '--memory', default=64,
                      help='Amount of Memory on each host in GB')
    parser.add_option('-d', '--disks', default=4,
                      help='Number of disks on each host')
    parser.add_option('-k', '--hbase', default="True",
                      help='True if HBase is installed, False if not')
    (options, args) = parser.parse_args()

    cores = int(options.cores)
    memory = int(options.memory)
    disks = int(options.disks)
    hbaseEnabled = ast.literal_eval(options.hbase)
    log.info("Using cores=" + str(cores) + " memory=" + str(memory) + "GB" +
             " disks=" + str(disks) + " hbase=" + str(hbaseEnabled))

    minContainerSize = getMinContainerSize(memory)
    reservedStackMemory = getReservedStackMemory(memory)
    reservedHBaseMemory = 0
    if hbaseEnabled:
        reservedHBaseMemory = getReservedHBaseMem(memory)
    reservedMem = reservedStackMemory + reservedHBaseMemory
    usableMem = memory - reservedMem
    memory -= reservedMem
    if memory < 2:
        memory = 2
        reservedMem = max(0, memory - reservedMem)
    memory *= GB

    containers = int(min(2 * cores,
                         min(math.ceil(1.8 * float(disks)),
                             memory / minContainerSize)))
    if containers <= 2:
        containers = 3
    log.info("Profile: cores=" + str(cores) + " memory=" + str(memory) + "MB"
             + " reserved=" + str(reservedMem) + "GB" + " usableMem="
             + str(usableMem) + "GB" + " disks=" + str(disks))

    container_ram = memory / containers
    if container_ram > GB:
        container_ram = int(math.floor(container_ram / 512)) * 512
    log.info("Num Container=" + str(containers))
    log.info("Container Ram=" + str(container_ram) + "MB")
    log.info("Used Ram=" + str(int(containers * container_ram / float(GB))) + "GB")
    log.info("Unused Ram=" + str(reservedMem) + "GB")
    log.info("yarn.scheduler.minimum-allocation-mb=" + str(container_ram))
    log.info("yarn.scheduler.maximum-allocation-mb=" + str(containers * container_ram))
    log.info("yarn.nodemanager.resource.memory-mb=" + str(containers * container_ram))

    map_memory = container_ram
    reduce_memory = 2 * container_ram if container_ram <= 2048 else container_ram
    am_memory = max(map_memory, reduce_memory)
    log.info("mapreduce.map.memory.mb=" + str(map_memory))
    log.info("mapreduce.map.java.opts=-Xmx" + str(int(0.8 * map_memory)) + "m")
    log.info("mapreduce.reduce.memory.mb=" + str(reduce_memory))
    log.info("mapreduce.reduce.java.opts=-Xmx" + str(int(0.8 * reduce_memory)) + "m")
    log.info("yarn.app.mapreduce.am.resource.mb=" + str(am_memory))
    log.info("yarn.app.mapreduce.am.command-opts=-Xmx" + str(int(0.8 * am_memory)) + "m")
    log.info("mapreduce.task.io.sort.mb=" + str(int(0.4 * map_memory)))


if __name__ == '__main__':
    try:
        main()
    except (KeyboardInterrupt, EOFError):
        print("\nAborting ... Keyboard Interrupt.")
        sys.exit(1)
```
Script Options

Option | Description |
---|---|
-c, --cores | The number of cores on each host; default = 16 |
-m, --memory | The amount of memory on each host, in GB; default = 64 |
-d, --disks | The number of disks on each host; default = 4 |
-k, --hbase | "True" if HBase is installed, "False" if not; default = "True" |
Usage Example

```shell
python yarn-utils.py -c 12 -m 48 -d 12 -k False
```

Output:
Using cores=12 memory=48GB disks=12 hbase=False
Profile: cores=12 memory=43008MB reserved=6GB usableMem=42GB disks=12
Num Container=21
Container Ram=2048MB
Used Ram=42GB
Unused Ram=6GB
yarn.scheduler.minimum-allocation-mb=2048
yarn.scheduler.maximum-allocation-mb=43008
yarn.nodemanager.resource.memory-mb=43008
mapreduce.map.memory.mb=2048
mapreduce.map.java.opts=-Xmx1638m
mapreduce.reduce.memory.mb=4096
mapreduce.reduce.java.opts=-Xmx3276m
yarn.app.mapreduce.am.resource.mb=4096
yarn.app.mapreduce.am.command-opts=-Xmx3276m
mapreduce.task.io.sort.mb=819