Container
A container is a virtualized slice of a node's resources; its contents are memory + vcores.
Containers are responsible for running tasks.
How to tune container parameters in production:
Assume a node with 128G of memory and 16 physical cores; start by allocating the memory.
Installing CentOS alone consumes about 1G of memory.
Reserve 15%-20% of total memory for the system (this includes the memory CentOS itself needs), so that full usage cannot hang the machine or trigger the OOM killer, and so there is headroom for components deployed later.
Reserved: 128 * 20% = 25.6G ≈ 26G
Assuming only a DataNode (DN) and NodeManager (NM) run on this node, the remaining memory is 128 - 26 = 102G.
Give the DN 2G and the NM 4G, which leaves 102 - 2 - 4 = 96G for containers.
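The memory budgeting above can be sketched as a small calculation. The function name and defaults below are illustrative only, not a YARN API:

```python
import math

# Illustrative helper: memory left for YARN containers after the system
# reserve (rounded up), the DataNode, the NodeManager, and any co-located
# components.
def usable_container_memory(total_gb, reserve_pct=0.20, dn_gb=2, nm_gb=4, other_gb=0):
    reserved = math.ceil(total_gb * reserve_pct)   # e.g. 128 * 20% = 25.6 -> 26
    return total_gb - reserved - dn_gb - nm_gb - other_gb

print(usable_container_memory(128))               # 128 - 26 - 2 - 4 = 96
print(usable_container_memory(128, other_gb=30))  # with a 30G RegionServer: 66
print(usable_container_memory(256))               # 256 - 52 - 2 - 4 = 198
```

The same function also covers the HBase RegionServer and 256G cases discussed later in these notes.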
Container memory
Partition the remaining 96G among containers:
```
yarn.nodemanager.resource.memory-mb   96G
yarn.scheduler.minimum-allocation-mb  1G    # minimum scheduled memory; extreme case: 96 containers of 1G each
yarn.scheduler.maximum-allocation-mb  96G   # maximum scheduled memory, usually set as large as the available memory; extreme case: a single 96G container
# container memory grows automatically, in 1G increments by default,
# so under these settings the container count ranges from 1 to 96
```
Container vcores
The usual physical-to-virtual core ratio is 1:2, so 16 physical cores give 32 vcores.
```
yarn.nodemanager.resource.pcores-vcores-multiplier  2
yarn.nodemanager.resource.cpu-vcores                32
yarn.scheduler.minimum-allocation-vcores            1    # extreme case: 32 containers
yarn.scheduler.maximum-allocation-vcores            32   # extreme case: a single container
# so under these settings the container count ranges from 1 to 32
```
Official recommendation
Cloudera recommends that a single container have no more than 5 vcores, so we set 4 (this is the key figure).
```
yarn.scheduler.maximum-allocation-vcores  4   # extreme case: 32 / 4 = 8 containers
```
Combining memory + vcores
With vcores fixed at 4, the container count is 32 / 4 = 8.
```
yarn.nodemanager.resource.memory-mb       96G
yarn.scheduler.minimum-allocation-mb      1G
yarn.scheduler.maximum-allocation-mb      12G   # extreme case: 96 / 12 = 8 containers
# when Spark jobs need more memory this value must be raised, which breaks
# the ideal container count -- memory takes priority
yarn.nodemanager.resource.cpu-vcores      32
yarn.scheduler.minimum-allocation-vcores  1
yarn.scheduler.maximum-allocation-vcores  4    # extreme case: 8 containers
# so with these settings the ideal is 8 containers, each with 12G and 4 vcores
```
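If these values are written into yarn-site.xml, note that the memory properties take plain integers in megabytes, not "96G"; a sketch under that assumption (96G = 98304 MB, 12G = 12288 MB):

```xml
<!-- yarn-site.xml sketch; memory values are in MB -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>98304</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>12288</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>32</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>4</value>
</property>
```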
Another example:
```
yarn.nodemanager.resource.memory-mb       96G
yarn.scheduler.minimum-allocation-mb      1G
yarn.scheduler.maximum-allocation-mb      8G
yarn.nodemanager.resource.cpu-vcores      32
yarn.scheduler.minimum-allocation-vcores  1
yarn.scheduler.maximum-allocation-vcores  2
```
With these settings, memory dominates: 96G / 8G = 12 containers, which use 12 * 2 = 24 vcores, yet the node has 32 vcores, so the CPUs can never be 100% utilized.
You also cannot run 16 containers to soak up more vcores, because 16 * 8G = 128G would exceed the 96G available for containers.
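The container-count reasoning above can be sketched as a small calculation; the effective count is the smaller of the memory-bound and the vcore-bound counts (the helper function is illustrative, not a YARN API):

```python
# Illustrative helper: how many containers fit, limited by whichever of
# memory or vcores runs out first.
def container_count(mem_gb, max_alloc_gb, vcores, max_alloc_vcores):
    by_memory = mem_gb // max_alloc_gb
    by_vcores = vcores // max_alloc_vcores
    return min(by_memory, by_vcores)

print(container_count(96, 12, 32, 4))  # 8  -> memory and vcores are balanced
print(container_count(96, 8, 32, 2))   # 12 -> memory-bound; 32 - 12*2 = 8 vcores sit idle
```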
Exercise: given 256G of memory and 56 cores, how should the parameters be set?
Reserve 256 * 20% = 51.2G ≈ 52G, which leaves 256 - 52 = 204G.
Give the DN 2G and the NM 4G, leaving 204 - 2 - 4 = 198G for containers.
```
yarn.nodemanager.resource.memory-mb       198G
yarn.scheduler.minimum-allocation-mb      1G
yarn.scheduler.maximum-allocation-mb      24G   # extreme case: 198 / 24 ≈ 8 containers
yarn.nodemanager.resource.cpu-vcores      112   # 56 physical cores * 2
yarn.scheduler.minimum-allocation-vcores  1
yarn.scheduler.maximum-allocation-vcores  4    # by vcores alone this would allow 112 / 4 = 28 containers,
                                               # so memory (8 containers) is the limiting factor here
```
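The exercise numbers can be checked directly; memory, not vcores, is the binding constraint in this configuration:

```python
# Checking the 256G / 56-core exercise (illustrative arithmetic only)
mem_bound   = 198 // 24    # 8 containers by memory
vcore_bound = 112 // 4     # 28 containers by vcores
containers  = min(mem_bound, vcore_bound)
print(mem_bound, vcore_bound, containers)  # 8 28 8
```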
What if other components also run on the node, e.g. an HBase RegionServer process? How should that change the settings?
If the RegionServer takes 30G, the container memory shrinks accordingly: 102 - 2 - 4 - 30 = 66G.
vcores are a concept introduced by YARN itself.
The original motivation was that CPU performance differs across nodes: each CPU's computing power is not the same.
For example, if one physical CPU is twice as fast as another, the faster node can be assigned more vcores per physical core to compensate:
the powerful machine gets pcore : vcore = 1 : 2
the weaker machine gets pcore : vcore = 1 : 1
Today's CPUs all perform similarly, so in the xml configuration every node uses pcore : vcore = 1 : 2.
Reviewing the architecture
A client submits a job to the ResourceManager. The RM asks a NodeManager for a container in which to run the job's ApplicationMaster. Once the ApplicationMaster has started, it requests resources from the RM and then runs the job's tasks in containers on the allocated nodes.
Schedulers
There are three schedulers:
FIFO (first in, first out)
In the diagram, the mark at point 2 shows job2 waiting until job1 completes before it is run.
Capacity
A dedicated queue runs small jobs, but reserving a queue for small jobs pre-allocates part of the cluster's resources, so large jobs finish later than they would under FIFO scheduling.
Fair
In Apache Hadoop the default is the Capacity scheduler:
```
yarn.resourcemanager.scheduler.class
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
```
CDH defaults to the Fair scheduler:
This can be seen in the web UI; CDH's position is that the Capacity scheduler's dedicated small-job queue (say, 20% of the cluster) sits wasted whenever no small jobs are running.
In CDH the Fair scheduler is configured through dynamic resource pools and placement rules.
Common commands
yarn jar
```
[mao@JD root]$ yarn
Usage: yarn [--config confdir] COMMAND
where COMMAND is one of:
  resourcemanager -format-state-store   deletes the RMStateStore
  resourcemanager                       run the ResourceManager
                                        Use -format-state-store for deleting the RMStateStore.
                                        Use -remove-application-from-state-store <appId> for
                                        removing application from RMStateStore.
  nodemanager                           run a nodemanager on each slave
  timelineserver                        run the timeline server
  rmadmin                               admin tools
  version                               print the version
  jar <jar>                             run a jar file
  application                           prints application(s) report/kill application
  applicationattempt                    prints applicationattempt(s) report
  container                             prints container(s) report
  node                                  prints node report(s)
  queue                                 prints queue information
  logs                                  dump container logs
  classpath                             prints the class path needed to get the
                                        Hadoop jar and the required libraries
  daemonlog                             get/set the log level for each daemon
  top                                   run cluster usage tool
 or
  CLASSNAME                             run the class named CLASSNAME

Most commands print help when invoked w/o parameters.
```
yarn application -kill <Application ID> (used often when permission control is strict)
```
[mao@JD root]$ yarn application
19/12/17 17:53:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/12/17 17:53:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Invalid Command Usage :
usage: application
 -appStates <States>             Works with -list to filter applications
                                 based on input comma-separated list of
                                 application states. The valid application
                                 state can be one of the following:
                                 ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUN
                                 NING,FINISHED,FAILED,KILLED
 -appTypes <Types>               Works with -list to filter applications
                                 based on input comma-separated list of
                                 application types.
 -help                           Displays help for all commands.
 -kill <Application ID>          Kills the application.
 -list                           List applications. Supports optional use
                                 of -appTypes to filter applications based
                                 on application type, and -appStates to
                                 filter applications based on application
                                 state.
 -movetoqueue <Application ID>   Moves the application to a different
                                 queue.
 -queue <Queue Name>             Works with the movetoqueue command to
                                 specify which queue to move an
                                 application to.
 -status <Application ID>        Prints the status of the application.
```
There is also yarn logs:
```
[mao@JD root]$ yarn logs
Retrieve logs for completed YARN applications.
usage: yarn logs -applicationId <application ID> [OPTIONS]

general options are:
 -appOwner <Application Owner>   AppOwner (assumed to be current user if
                                 not specified)
 -containerId <Container ID>     ContainerId (must be specified if node
                                 address is specified)
 -nodeAddress <Node Address>     NodeAddress in the format nodename:port
                                 (must be specified if container id is
                                 specified)
```
The barrel effect:
However tall a barrel is, the amount of water it holds is determined by its shortest stave.
With the default 128M split size:
f1  130M  2 tasks: 128M -> 00:55, 2M -> 00:03
f2   14M  1 task:   14M -> 00:05
f3   20M  1 task:   20M -> 00:09
The job's 4 map tasks are dominated by the 128M map task, so the whole job takes 55 seconds; the splits are uneven.
The fix is to split the files evenly so that every task does a similar amount of work:
164M total / 3 files ≈ 55M each
f1  55M  1 task
f2  55M  1 task
f3  55M  1 task
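The split arithmetic can be sketched as follows (illustrative only; real split sizes are governed by the HDFS block size and InputFormat configuration, not computed by hand like this):

```python
import math

sizes_mb = [130, 14, 20]                 # f1, f2, f3
total = sum(sizes_mb)                    # 164M in total
even = math.ceil(total / len(sizes_mb))  # ~55M per task if split evenly

# with the default 128M split, the task count per file:
default_tasks = sum(math.ceil(s / 128) for s in sizes_mb)  # 2 + 1 + 1 = 4
print(total, even, default_tasks)  # 164 55 4
```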