1. Operating YARN from the Command Line
1.1 Viewing YARN Status in the Web UI
URL
http://hadoop103:8088
Start by submitting a WordCount job:
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output1
1.2 yarn application: Viewing Running Applications
- List all applications: yarn application -list
[atguigu@hadoop102 ~]$ yarn application -list
2022-10-22 12:05:24,481 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
[atguigu@hadoop102 ~]$ yarn application -list
2022-10-22 12:05:56,314 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1666410750861_0002 word count MAPREDUCE atguigu default ACCEPTED UNDEFINED 0% N/A
[atguigu@hadoop102 ~]$ yarn application -list
2022-10-22 12:06:06,044 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1666410750861_0002 word count MAPREDUCE atguigu default RUNNING UNDEFINED 50% http://hadoop104:38677
[atguigu@hadoop102 ~]$ yarn application -list
2022-10-22 12:06:09,257 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
This prints the same application information shown in the web UI to the console.
- Filter by application state
All states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED
Example: list all applications that have finished: yarn application -list -appStates FINISHED
[atguigu@hadoop102 ~]$ yarn application -list -appStates FINISHED
2022-10-22 12:13:58,928 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [FINISHED] and tags: []):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1666410750861_0001 word count MAPREDUCE atguigu default FINISHED SUCCEEDED 100% http://hadoop102:19888/jobhistory/job/job_1666410750861_0001
application_1666410750861_0002 word count MAPREDUCE atguigu default FINISHED SUCCEEDED 100% http://hadoop102:19888/jobhistory/job/job_1666410750861_0002
- Kill an application: yarn application -kill <ApplicationId>
[atguigu@hadoop102 ~]$ yarn application -kill application_1666410750861_0003
2022-10-22 12:22:13,796 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Killing application application_1666410750861_0003
2022-10-22 12:22:14,522 INFO impl.YarnClientImpl: Killed application application_1666410750861_0003
[atguigu@hadoop102 ~]$ yarn application -list -appStates KILLED
2022-10-22 12:23:37,002 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [KILLED] and tags: []):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1666410750861_0003 word count MAPREDUCE atguigu default KILLED KILLED 100% http://hadoop103:8088/cluster/app/application_1666410750861_0003
1.3 yarn logs: Viewing Logs
- Print an application's logs to the console (commonly used in production): yarn logs -applicationId <ApplicationId>
[atguigu@hadoop102 ~]$ yarn logs -applicationId application_1666410750861_0002
- Print the logs of a single container within an application, identified by its container ID: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>
[atguigu@hadoop102 ~]$ yarn logs -applicationId application_1666515493135_0004 -containerId container_1666515493135_0004_01_000001
1.4 yarn applicationattempt: Viewing Application Attempts
- List an application's attempts: yarn applicationattempt -list <ApplicationId>
Mainly used to look up the attempt ID and the AM container information.
Works not only for RUNNING applications, but also for FINISHED and KILLED ones.
[atguigu@hadoop102 ~]$ yarn applicationattempt -list application_1666515493135_0002
2022-10-23 17:15:25,144 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of application attempts :1
ApplicationAttempt-Id State AM-Container-Id Tracking-URL
appattempt_1666515493135_0002_000001 FINISHED container_1666515493135_0002_01_000001 http://hadoop103:8088/proxy/application_1666515493135_0002/
[atguigu@hadoop102 ~]$ yarn applicationattempt -list application_1666515493135_0003
2022-10-23 17:15:28,579 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of application attempts :1
ApplicationAttempt-Id State AM-Container-Id Tracking-URL
appattempt_1666515493135_0003_000001 KILLED container_1666515493135_0003_01_000001 http://hadoop103:8088/cluster/app/application_1666515493135_0003
- Print the status of an application attempt: yarn applicationattempt -status <ApplicationAttemptId>
[atguigu@hadoop102 ~]$ yarn applicationattempt -status appattempt_1666518343848_0003_000001
2022-10-23 17:53:48,570 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Application Attempt Report :
ApplicationAttempt-Id : appattempt_1666518343848_0003_000001
State : RUNNING
AMContainer : container_1666518343848_0003_01_000001
Tracking-URL : http://hadoop103:8088/proxy/application_1666518343848_0003/
RPC Port : 37555
AM Host : hadoop103
Diagnostics :
[atguigu@hadoop102 ~]$ yarn applicationattempt -status appattempt_1666518343848_0003_000001
2022-10-23 17:58:56,823 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Application Attempt Report :
ApplicationAttempt-Id : appattempt_1666518343848_0003_000001
State : FINISHED
AMContainer : container_1666518343848_0003_01_000001
Tracking-URL : http://hadoop103:8088/proxy/application_1666518343848_0003/
RPC Port : 37555
AM Host : hadoop103
Diagnostics :
1.5 yarn container: Viewing Containers
- List all containers of an attempt: yarn container -list <ApplicationAttemptId>
Container information is only visible while the application is running; otherwise nothing is shown,
because containers are released as soon as the application finishes.
[atguigu@hadoop102 ~]$ yarn container -list appattempt_1666518343848_0004_000001
2022-10-23 18:07:16,321 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of containers :1
Container-Id Start Time Finish Time State Host Node Http Address LOG-URL
container_1666518343848_0004_01_000001 星期日 十月 23 18:07:02 +0800 2022 N/A RUNNING hadoop102:34951 http://hadoop102:8042 http://hadoop102:8042/node/containerlogs/container_1666518343848_0004_01_000001/atguigu
[atguigu@hadoop102 ~]$ yarn container -list appattempt_1666518343848_0004_000001
2022-10-23 18:10:07,820 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of containers :0
Container-Id Start Time Finish Time State Host Node Http Address LOG-URL
- Print a container's status: yarn container -status <ContainerId>
[atguigu@hadoop102 ~]$ yarn container -status container_1666515493135_0006_01_000001
2022-10-23 17:35:20,188 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Container Report :
Container-Id : container_1666515493135_0006_01_000001
Start-Time : 1666517710188
Finish-Time : 0
State : RUNNING
Execution-Type : GUARANTEED
LOG-URL : http://hadoop102:8042/node/containerlogs/container_1666515493135_0006_01_000001/atguigu
Host : hadoop102:34218
NodeHttpAddress : http://hadoop102:8042
Diagnostics : null
[atguigu@hadoop102 ~]$ yarn container -status container_1666515493135_0004_01_000001
2022-10-23 17:31:59,399 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Container with id 'container_1666515493135_0004_01_000001' doesn't exist in RM or Timeline Server.
1.6 yarn node: Viewing Node Status
The cluster has three NodeManagers, all running: yarn node -list -all
[atguigu@hadoop102 ~]$ yarn node -list -all
2022-10-23 18:22:39,998 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total Nodes:3
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
hadoop103:35552 RUNNING hadoop103:8042 0
hadoop102:34951 RUNNING hadoop102:8042 0
hadoop104:36756 RUNNING hadoop104:8042 0
1.7 yarn rmadmin: Refreshing Configuration
- Refresh queue parameters: yarn rmadmin -refreshQueues
After modifying queue-related parameters, you would normally restart YARN.
To avoid a restart, run this command instead: it re-reads the configuration and applies the queue changes
dynamically, with no downtime.
[atguigu@hadoop102 ~]$ yarn rmadmin -refreshQueues
2022-10-23 18:23:03,910 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8033
1.8 yarn queue: Viewing Queues
Check a queue's resource usage: state, capacity, current usage, and so on:
yarn queue -status <QueueName>
[atguigu@hadoop102 ~]$ yarn queue -status default
2022-10-23 18:23:43,909 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Queue Information :
Queue Name : default
State : RUNNING
Capacity : 100.0%
Current Capacity : 0.0%
Maximum Capacity : 100.0%
Default Node Label expression : <DEFAULT_PARTITION>
Accessible Node Labels : *
Preemption : disabled
Intra-queue Preemption : disabled
The same information is also available in the web UI, in more detail.
2. Core YARN Parameters for Production
2.1 ResourceManager Parameters
2.1.1 Scheduler class: Capacity Scheduler by default
- Large companies, or workloads with high concurrency requirements: choose the Fair Scheduler.
- Small and mid-sized companies, or workloads with low concurrency requirements: choose the Capacity Scheduler.
- Do not use FIFO.
Defaults in yarn-default.xml:
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
2.1.2 Number of threads handling scheduler requests, default 50
- This means the ResourceManager handles at most 50 submitted requests concurrently; how to tune it is covered later.
<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
</property>
2.2 NodeManager Parameters (per node)
2.2.1 Whether YARN auto-detects hardware to configure itself
- Default is false, and false is the usual choice: let operators tune the parameters by hand based on the hardware.
- Think of false as a "custom install" and true as an "automatic install" when installing software.
<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>
2.2.2 Whether to count logical processors (hyperthreads) as CPU cores
- Default is false, and false is the usual choice.
- Enable it only when the servers have different CPU configurations,
- e.g. NodeManager1 has an i7, NodeManager2 an i5, NodeManager3 an i3. In practice servers are usually purchased together with near-identical specs.
<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>
2.2.3 Ratio of vcores to physical cores
- Default is 1.0, i.e. one thread per core.
- For a 4-core, 8-thread machine, set it to 2.0.
<property>
<description>Multiplier to determine how to convert phyiscal cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The
number of vcores will be calculated as
number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
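As a quick sanity check on the multiplier (a sketch; the 4-core/8-thread figures are just the example above), the vcore count advertised under auto-detection is simply the physical CPU count times the multiplier:

```shell
# vcores = physical CPUs * pcores-vcores-multiplier
# (only applies when cpu-vcores is -1 and hardware auto-detection is enabled)
awk 'BEGIN { printf "%d vcores\n", 4 * 2.0 }'
```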
2.2.4 Memory available to the NodeManager
- Default 8 GB; must be tuned in production, where servers typically start at 128 GB of RAM.
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>-1</value>
</property>
2.2.5 Memory the NodeManager reserves for the system
- Configure either 2.2.4 or 2.2.5, not both.
- Total system memory = NodeManager memory (2.2.4) + memory reserved for the system (2.2.5).
<property>
<description>Amount of physical memory, in MB, that is reserved
for non-YARN processes. This configuration is only used if
yarn.nodemanager.resource.detect-hardware-capabilities is set
to true and yarn.nodemanager.resource.memory-mb is -1. If set
to -1, this amount is calculated as
20% of (system memory - 2*HADOOP_HEAPSIZE)
</description>
<name>yarn.nodemanager.resource.system-reserved-memory-mb</name>
<value>-1</value>
</property>
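The 20% fallback in the description above can be checked by hand. Assuming a hypothetical node with 8192 MB of RAM and HADOOP_HEAPSIZE=1024 MB (both values made up for illustration):

```shell
# reserved = 20% of (system memory - 2 * HADOOP_HEAPSIZE), per the property description
awk 'BEGIN { printf "%d MB reserved\n", 0.2 * (8192 - 2 * 1024) }'
```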
2.2.6 CPU vcores available to the NodeManager
- Default 8.
<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>-1</value>
</property>
2.2.7 Whether to enforce physical memory limits on containers
- Enabled by default.
- Checks whether the NodeManager on the current node is using more physical memory than it was allotted.
- Example: if the NodeManager is allotted 100 GB on a server with 128 GB of RAM, exceeding 100 GB triggers enforcement, protecting the memory the Linux system itself needs.
<property>
<description>Whether physical memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>true</value>
</property>
2.2.8 Whether to enforce virtual memory limits on containers
- Enabled by default.
- Checks whether the NodeManager on the current node is using more virtual memory than its logical allotment.
- Example: if the NodeManager's virtual memory allotment is 200 GB on a system with 256 GB of virtual memory, exceeding 200 GB triggers enforcement, protecting system memory (covered in more detail later).
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
</property>
2.2.9 Ratio of virtual memory to physical memory
- Default 2.1.
<property>
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage
is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
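As a worked example (the 1024 MB container size is just an assumption), a container allocated 1 GB of physical memory may use up to 2.1 times that in virtual memory before the vmem check terminates it:

```shell
# vmem limit = physical allocation * yarn.nodemanager.vmem-pmem-ratio
awk 'BEGIN { printf "%.1f MB\n", 1024 * 2.1 }'
```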
2.3 Container Parameters
2.3.1 Minimum container memory
- Default 1 GB.
<property>
<description>The minimum allocation for every container request at the RM
in MBs. Memory requests lower than this will be set to the value of this
property. Additionally, a node manager that is configured to have less memory
than this value will be shut down by the resource manager.</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
2.3.2 Maximum container memory
- Default 8 GB; it must not exceed the memory allotted to the NodeManager.
<property>
<description>The maximum allocation for every container request at the RM
in MBs. Memory requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>
2.3.3 Minimum container vcores
- Default 1.
<property>
<description>The minimum allocation for every container request at the RM
in terms of virtual CPU cores. Requests lower than this will be set to the
value of this property. Additionally, a node manager that is configured to
have fewer virtual cores than this value will be shut down by the resource
manager.</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
2.3.4 Maximum container vcores
- Default 4; must be tuned in production, where servers typically have 20 cores / 40 threads.
<property>
<description>The maximum allocation for every container request at the RM
in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
</property>
3. A Production Parameter Tuning Example
3.1 Requirements Analysis
Requirement
- Count the occurrences of each word in 1 GB of data.
Servers
- 3 machines, each with 4 GB of RAM and a 4-core, 4-thread CPU.
- With a better CPU, 8 threads per machine would also work.
Checking core and thread counts
- Example (from the original screenshot): 4 cores, 8 threads.
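One way to check on Linux (standard tools; output varies by machine):

```shell
# Show sockets, cores per socket, and threads per core
lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket)'
# Total logical processors in a single number
nproc
```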
Analysis
- The data can be split:
- 1 GB / 128 MB per split = 8 splits,
- so 8 MapTasks are needed,
- plus 1 ReduceTask
- and 1 MRAppMaster.
- 10 containers in total, averaging 10 / 3 ≈ 3 tasks per node (distributed as 4 / 3 / 3).
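The container count above can be sketched as a quick calculation (128 MB is HDFS's default block size):

```shell
DATA_MB=1024                        # 1 GB of input
SPLIT_MB=128                        # default HDFS block size
MAPS=$(( DATA_MB / SPLIT_MB ))      # one MapTask per split
TOTAL=$(( MAPS + 1 + 1 ))           # + 1 ReduceTask + 1 MRAppMaster
echo "$MAPS map tasks, $TOTAL containers"
```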
3.2 Core Configuration
3.2.1 ResourceManager Parameters
- Scheduler class: Capacity Scheduler (the default)
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
- Number of threads handling scheduler requests, default 50.
With more than 50 concurrent submissions this value could be raised, but here it should not exceed 3 nodes * 4 threads = 12 threads,
since the cluster can run at most 12 tasks at once.
These are the ResourceManager threads that accept submissions; this workload opens at most 8 MapTasks at a time, so 8 is used. Also, hadoop103 runs other daemons besides the ResourceManager, so the value is kept low.
<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>8</value>
</property>
3.2.2 NodeManager Parameters (per node)
- Whether YARN auto-detects hardware to configure itself.
Default is false, and false is the usual choice: let operators tune the parameters by hand based on the hardware.
If this is set to true, none of the manual settings below take effect.
<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>
- Whether to count logical processors as cores; default false, i.e. use physical core counts.
Here every server has the same CPU configuration. If one server had a better CPU, you would set this to true in that server's yarn-site.xml
and adjust the vcore-to-physical-core multiplier below accordingly:
2.0 if each core should count as 2 vcores, 3.0 if it should count as 3.
<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>
- Ratio of vcores to physical cores; default 1.0.
<property>
<description>Multiplier to determine how to convert phyiscal cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
- Memory available to the NodeManager; default 8 GB, changed to 4 GB here
because each server in this example has 4 GB of RAM.
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
- CPU vcores available to the NodeManager; default 8, changed to 4.
<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
</property>
3.2.3 Container Parameters
- Minimum container memory; default 1 GB, left unchanged.
Each node has 4 GB, so it can run up to 4 minimum-sized containers.
<property>
<description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
- Maximum container memory; default 8 GB, changed to 2 GB.
Each node has 4 GB, so it can run 2 maximum-sized containers for larger tasks.
<property>
<description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.
</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
- Minimum container vcores; default 1, left unchanged.
<property>
<description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
- Maximum container vcores; default 4, changed to 2.
<property>
<description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
</property>
- Virtual memory check; enabled by default, disabled here
because of a known issue between CentOS 7+ and JDK 1.8 that inflates the JVM's virtual memory reservation.
The physical memory check stays enabled with its default settings.
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
- Ratio of virtual to physical memory; default 2.1, left unchanged.
<property>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
3.3 Configuration Steps
- Go to the directory containing yarn-site.xml:
cd /opt/module/hadoop-3.1.3/etc/hadoop
- Edit the configuration (paste the properties above into the XML file).
- Distribute the configuration (note: if the nodes have different hardware, configure each NodeManager individually).
- Restart the cluster.
The web UI now shows the updated values.
- Run a WordCount job and observe that it uses 2 GB of memory:
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output5