Hadoop--YARN

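YARN is a resource scheduling platform that provides server compute resources to computation programs. It acts like a distributed operating system, and computation programs such as MapReduce are like applications running on top of that operating system.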

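YARN Basic Architecture

The ResourceManager (RM) is the boss of resource management for the whole cluster, and a NodeManager (NM) is the boss of a single node. The ApplicationMaster (AM) manages resources for the Map Tasks and Reduce Tasks: it requests resources from the RM and hands them to those tasks. A Container is like a small computer.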

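How YARN Works

0. When the main method of the Driver class reaches job.waitForCompletion();, a YARNRunner is started.

1. The client asks the RM for an Application.

2. The RM returns the resource submission path and an application_id.

3. The client submits the resources the job needs to run: the splits, the XML file (the job's own configuration parameters), and the jar.

4. The client asks the RM to run the MRAppMaster.

5. The RM turns the user's request into a Task and puts it in a queue.

6, 7. A NM picks up the Task and creates a Container.

8. The job resources are downloaded from the submission path to the local node.

9. The AM asks the RM for containers to run the MapTasks.

10. The containers are created (they can be on the same NM or on different NMs, but each task gets its own container).

11. The AM sends the program start-up script.

12. After the Map phase finishes, the AM asks the RM for containers to run the Reduce Tasks.

14. When the program finishes, the AM asks the RM to unregister it.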
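
The numbered steps above can be replayed with a toy model. This is only a sketch to make the order of events concrete: `ToyResourceManager`, `run_job`, and the `/staging/...` path layout are hypothetical names invented here, not part of the Hadoop API.

```python
from collections import deque


class ToyResourceManager:
    """Hypothetical stand-in for the RM; not the Hadoop API."""

    def __init__(self, cluster_timestamp):
        self.cluster_timestamp = cluster_timestamp
        self.next_seq = 1
        self.queue = deque()  # pending requests to start an AM

    def new_application(self):
        # Steps 1-2: hand back an application_id and a submission path.
        # The "/staging/..." layout is invented for this sketch.
        app_id = f"application_{self.cluster_timestamp}_{self.next_seq:04d}"
        self.next_seq += 1
        return app_id, f"/staging/{app_id}"


def run_job(rm, num_maps, num_reduces):
    """Walk one job through the numbered steps and return an event log."""
    log = []
    app_id, path = rm.new_application()
    log.append(f"1-2: RM returned {app_id} and path {path}")
    log.append("3: client uploaded splits, job XML and jar")
    rm.queue.append(app_id)  # the AM request waits in the RM's queue
    log.append("4-5: AM request queued as a Task")
    task = rm.queue.popleft()  # a NM picks the Task up
    log.append(f"6-7: a NM took {task} and created a container")
    log.append(f"8: job resources downloaded from {path}")
    log.append(f"9-11: AM got {num_maps} map container(s) and sent start scripts")
    log.append(f"12: AM got {num_reduces} reduce container(s)")
    log.append("14: AM unregistered from the RM")
    return log
```

Calling `run_job(ToyResourceManager(1653902284486), 2, 1)` returns the events in the same order as the list above; the step labels in the log follow the original numbering.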

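YARN Schedulers and Scheduling Algorithms

Hadoop currently has three main job schedulers: FIFO, the Capacity Scheduler, and the Fair Scheduler. The default resource scheduler in Apache Hadoop 3.1.3 is the Capacity Scheduler; the default scheduler in CDH is the Fair Scheduler.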

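First-In-First-Out Scheduler (FIFO)

        FIFO scheduler (First In First Out): a single queue; jobs are served in the order they were submitted, first come, first served.

        Drawback: it does not support multiple queues, so it is rarely used in production.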
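
A minimal sketch of the first-come, first-served idea, assuming each job is just a name plus a resource demand in abstract units. The function name `fifo_allocate` and the sample jobs are made up for illustration.

```python
def fifo_allocate(jobs, capacity):
    """Grant resources strictly in submission order.

    jobs: list of (name, demand) tuples, earliest submission first.
    capacity: total resource units in the single queue.
    """
    remaining = capacity
    allocation = {}
    for name, demand in jobs:
        granted = min(demand, remaining)  # later jobs only get what is left
        allocation[name] = granted
        remaining -= granted
    return allocation
```

With `fifo_allocate([("a", 6), ("b", 6), ("c", 2)], 10)` the small job `c` gets nothing until the earlier jobs release resources, which is the behaviour that makes a single FIFO queue awkward on a shared cluster.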

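The Capacity Scheduler is a multi-user scheduler developed by Yahoo

 1. Multiple queues: each queue can be configured with a certain amount of resources, and each queue uses a FIFO scheduling policy.

 2. Capacity guarantee: the administrator can set a guaranteed minimum and an upper limit on resource usage for each queue.
 3. Flexibility: if a queue has spare resources, they can be lent temporarily to queues that need them; once a new application is submitted to the lending queue, the resources borrowed by other queues are returned to it.
 4. Multi-tenancy:
        Multiple users can share the cluster and multiple applications can run at the same time.
        To prevent one user's jobs from monopolizing the resources of a queue, the scheduler limits the amount of resources that jobs submitted by the same user can occupy.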
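
The capacity guarantee and the borrowing rule can be sketched as a two-pass allocation. This is a simplified model rather than the real Capacity Scheduler: queue ordering, per-user limits, and preemption are left out, and `capacity_allocate` and the queue names are hypothetical.

```python
def capacity_allocate(queues, total):
    """queues: name -> (guaranteed, maximum, demand), all in absolute units."""
    # Pass 1: every queue gets its demand, up to its guaranteed minimum.
    alloc = {name: min(demand, guaranteed)
             for name, (guaranteed, _maximum, demand) in queues.items()}
    spare = total - sum(alloc.values())
    # Pass 2: lend idle resources to queues that still want more,
    # never going past a queue's configured maximum.
    for name, (_guaranteed, maximum, demand) in queues.items():
        extra = min(demand - alloc[name], maximum - alloc[name], spare)
        if extra > 0:
            alloc[name] += extra
            spare -= extra
    return alloc
```

For example, with 100 units, a `default` queue of `(40, 60, 10)` and a `hive` queue of `(60, 80, 90)`, `default` uses only 10 and `hive` borrows up to its maximum of 80.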

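The Fair Scheduler is a multi-user scheduler developed by Facebook

Because resources are meant to be split evenly, the circled area in the figure is the deficit of job15.

The design goal of the Fair Scheduler is that, over time, all jobs receive a fair share of resources. At any given moment, the gap between the resources a job should receive and the resources it actually holds is called its "deficit".
The scheduler gives priority to jobs with a larger deficit when allocating resources.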
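
The deficit rule above fits in a few lines. The sketch follows the description in the text, not the actual Fair Scheduler source; `next_job_by_deficit` and every job name other than job15 are hypothetical.

```python
def next_job_by_deficit(jobs):
    """jobs: name -> (should_get, currently_has), in resource units.

    Returns the job with the largest deficit plus all computed deficits.
    """
    deficits = {name: should_get - has
                for name, (should_get, has) in jobs.items()}
    chosen = max(deficits, key=deficits.get)
    return chosen, deficits
```

If three jobs are each entitled to 4 units and job15 holds only 1, job15 has the largest deficit and is served first.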

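How the Fair Scheduler allocates queue resources: in every case the first step is an even split. Use the Fair Scheduler for workloads that need high concurrency.

 Allocation across queues

 Allocation across jobs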
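
"Split evenly first" can be read as a max-min fair share: divide what is available equally, let anyone who needs less than an equal share keep only what they need, and split the surplus among the rest. The sketch below assumes unweighted jobs (or queues) and abstract resource units; `fair_share` and the sample demands are illustrative, and weights and minimum shares are ignored.

```python
def fair_share(demands, total):
    """demands: name -> requested units. Returns name -> granted units."""
    alloc = {}
    active = dict(demands)  # entries still waiting for a share
    remaining = total
    while active:
        share = remaining / len(active)  # step one: split evenly
        satisfied = [n for n, d in active.items() if d <= share]
        if not satisfied:
            # Everyone wants more than an equal share: give each exactly that.
            for name in active:
                alloc[name] = share
            break
        for name in satisfied:
            # Small requests are met in full; their surplus is re-split.
            alloc[name] = active.pop(name)
            remaining -= alloc[name]
    return alloc
```

With 12 units and demands of 1, 2, 6, and 5, the first even split is 3 each; the first two jobs keep 1 and 2, and the 9 units left are split evenly between the other two. The same function can be applied once across queues and again across the jobs inside a queue.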

 

 Common YARN Commands

Besides the web page at hadoop103:8088, YARN status can also be queried from the command line.

// Run the WordCount example
[xwt@hadoop102 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.1.3.jar wordcount /input /outputyarnwc
[xwt@hadoop102 mapreduce]$ pwd
/opt/module/hadoop-3.1.3/share/hadoop/mapreduce

List all Applications: yarn application -list

[xwt@hadoop102 mapreduce]$ yarn application -list
2022-05-30 19:08:22,105 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
                Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL

Filter by Application state: yarn application -list -appStates <States> (all states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED)

[xwt@hadoop102 mapreduce]$ yarn application -list -appStates FINISHED
2022-05-30 19:05:02,986 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of applications (application-types: [], states: [FINISHED] and tags: []):1
                Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
application_1653902284486_0002	          word count	           MAPREDUCE	       xwt	   default	          FINISHED	         SUCCEEDED	           100%	http://hadoop102:19888/jobhistory/job/job_1653902284486_0002
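
The IDs in this output share a structure: `application_<cluster timestamp>_<sequence number>`, and the MapReduce job ID in the Tracking-URL reuses the same two numbers. A small sketch that pulls them apart, assuming the timestamp is the RM start time in epoch milliseconds (`parse_application_id` is a hypothetical helper, not a YARN tool):

```python
import re
from datetime import datetime, timezone

APP_ID = re.compile(r"^application_(\d+)_(\d+)$")


def parse_application_id(app_id):
    """Split an application ID into its parts and derive the job ID."""
    match = APP_ID.match(app_id)
    if match is None:
        raise ValueError(f"not an application ID: {app_id!r}")
    timestamp_ms, sequence = match.groups()
    return {
        # Assumed to be the RM start time, in milliseconds since the epoch.
        "cluster_started": datetime.fromtimestamp(
            int(timestamp_ms) / 1000, tz=timezone.utc),
        "sequence": int(sequence),
        # Same numbers, different prefix, as seen in the Tracking-URL.
        "job_id": f"job_{timestamp_ms}_{sequence}",
    }
```

For `application_1653902284486_0002` this yields sequence 2 and `job_1653902284486_0002`, matching the job history link in the listing.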

Kill an Application: yarn application -kill application_1653902284486_0002

[xwt@hadoop102 mapreduce]$ yarn application -kill application_1653902284486_0002
2022-05-30 19:10:11,758 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Application application_1653902284486_0002 has already finished 

yarn logs: view logs

(1) Query Application logs: yarn logs -applicationId <ApplicationId>

[xwt@hadoop102 ~]$ yarn logs -applicationId application_1653902284486_0002

(2) Query Container logs: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>

yarn applicationattempt: view application attempts

(1) List all attempts of an Application: yarn applicationattempt -list <ApplicationId>

[xwt@hadoop102 ~]$ yarn applicationattempt -list application_1653902284486_0002
2022-05-30 19:22:00,808 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of application attempts :1
         ApplicationAttempt-Id	               State	                    AM-Container-Id	                       Tracking-URL
appattempt_1653902284486_0002_000001	            FINISHED	container_1653902284486_0002_01_000001	http://hadoop103:8088/proxy/application_1653902284486_0002/

(2) Print ApplicationAttempt status: yarn applicationattempt -status <ApplicationAttemptId>

[xwt@hadoop102 ~]$ yarn applicationattempt -list application_1653902284486_0002
2022-05-30 19:25:55,825 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of application attempts :1
         ApplicationAttempt-Id	               State	                    AM-Container-Id	                       Tracking-URL
appattempt_1653902284486_0002_000001	            FINISHED	container_1653902284486_0002_01_000001	http://hadoop103:8088/proxy/application_1653902284486_0002/
[xwt@hadoop102 ~]$ yarn applicationattempt -status appattempt_1653902284486_0002_000001
2022-05-30 19:27:25,945 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Application Attempt Report : 
	ApplicationAttempt-Id : appattempt_1653902284486_0002_000001
	State : FINISHED
	AMContainer : container_1653902284486_0002_01_000001
	Tracking-URL : http://hadoop103:8088/proxy/application_1653902284486_0002/
	RPC Port : 33783
	AM Host : hadoop102
	Diagnostics : 

yarn container: view containers. Only a running program has containers; they are released as soon as it finishes.

(1) List all Containers: yarn container -list <ApplicationAttemptId>

[xwt@hadoop102 ~]$ yarn container -list appattempt_1653902284486_0002_000001
2022-05-30 19:29:58,897 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total number of containers :0
                  Container-Id	          Start Time	         Finish Time	               State	                Host	   Node Http Address	                            LOG-URL

(2) Print Container status: yarn container -status <ContainerId>. Note: a container's status is visible only while the task is running.

[xwt@hadoop102 ~]$ yarn container -status container_1653902284486_0002_01_000001
2022-05-30 19:33:17,243 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Container with id 'container_1653902284486_0002_01_000001' doesn't exist in RM or Timeline Server.

yarn node: view node status. List all nodes: yarn node -list -all

[xwt@hadoop102 ~]$ yarn node -list -all
2022-05-30 19:34:02,601 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Total Nodes:3
         Node-Id	     Node-State	Node-Http-Address	Number-of-Running-Containers
 hadoop104:37056	        RUNNING	   hadoop104:8042	                           0
 hadoop102:41254	        RUNNING	   hadoop102:8042	                           0
 hadoop103:33917	        RUNNING	   hadoop103:8042	                           0
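
Listings like this one are easy to turn into records for a monitoring script. The sketch assumes the columns are tab-separated with space padding, as they appear in the transcript above, and that log and "Total ..." lines contain no tabs; `parse_yarn_table` is a hypothetical helper.

```python
def parse_yarn_table(output):
    """Turn a tab-separated yarn listing into a list of dicts.

    Lines without a tab (log lines, "Total Nodes:3") are skipped; the first
    tabbed line is taken as the header.
    """
    rows = [line.split("\t") for line in output.splitlines() if "\t" in line]
    if not rows:
        return []
    header = [cell.strip() for cell in rows[0]]
    return [dict(zip(header, (cell.strip() for cell in row)))
            for row in rows[1:]]
```

The captured text of `yarn node -list -all` can be passed in as a string; each node then becomes a dict keyed by `Node-Id`, `Node-State`, and so on. All values stay strings, so convert the container count with `int()` if needed.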

yarn rmadmin: update configuration. Reload the queue configuration: yarn rmadmin -refreshQueues

[xwt@hadoop102 ~]$ yarn rmadmin -refreshQueues
2022-05-30 19:34:46,577 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8033

yarn queue: view queues. Print queue information: yarn queue -status <QueueName>

[xwt@hadoop102 ~]$ yarn queue -status default
2022-05-30 19:35:36,406 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Queue Information : 
Queue Name : default
	State : RUNNING
	Capacity : 100.0%
	Current Capacity : .0%
	Maximum Capacity : 100.0%
	Default Node Label expression : <DEFAULT_PARTITION>
	Accessible Node Labels : *
	Preemption : disabled
	Intra-queue Preemption : disabled

The same information can also be viewed on the 8088 web page.

 Core YARN Parameters for Production
