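YARN Notes

Contents

YARN Basic Architecture

YARN Job Submission Flow

YARN Schedulers and Scheduling Algorithms

YARN Key Points Summary:

1.5 Common YARN Commands

1.5.1 yarn application: View Applications

1.5.2 yarn logs: View Logs

1.5.3 yarn applicationattempt: View Application Attempts

1.5.4 yarn container: View Containers

1.5.5 yarn node: View Node Status

1.5.6 yarn rmadmin: Reload Configuration

1.5.7 yarn queue: View Queues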


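YARN is a resource scheduling platform that provides server computing resources to computation programs. It is comparable to a distributed operating system, and computation programs such as MapReduce are comparable to applications running on top of that operating system.

YARN Basic Architecture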

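YARN Job Submission Flow

(1) The MR program is submitted on the node where the client runs.

(2) YarnRunner requests an Application from the ResourceManager (RM).

(3) The RM returns the application's resource path to YarnRunner.

(4) The program uploads the resources it needs to run to HDFS.

(5) Once the resources are uploaded, the program requests to run the MRAppMaster.

(6) The RM initializes the user's request into a Task.

(7) One of the NodeManagers picks up the Task.

(8) That NodeManager creates a Container and starts the MRAppMaster in it.

(9) The Container copies the resources from HDFS to the local node.

(10) The MRAppMaster asks the RM for resources to run the MapTasks.

(11) The RM assigns the MapTasks to two other NodeManagers; each of them picks up its task and creates a container.

(12) The MRAppMaster sends the program start script to the two NodeManagers that received tasks. Each NodeManager starts its MapTask, and the MapTasks partition and sort the data.

(13) After all MapTasks have finished, the MRAppMaster requests containers from the RM to run the ReduceTasks.

(14) Each ReduceTask fetches the data of its partition from the MapTasks.

(15) When the program has finished, the MRAppMaster asks the RM to unregister it.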
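Steps (12) to (14) can be sketched as a single-process toy example. This is only an illustration of the partition, sort, and fetch idea, not how Hadoop implements it: the names `partition`, `map_task`, `reduce_task`, and `NUM_REDUCERS` are hypothetical, `zlib.crc32` merely stands in for Hadoop's hash partitioner, and no real containers or NodeManagers are involved.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 2  # assumed number of ReduceTasks for this toy example


def partition(key, num_reducers=NUM_REDUCERS):
    """Pick the reducer for a key (crc32 is a stand-in for a hash partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def map_task(split, num_reducers=NUM_REDUCERS):
    """Emit (word, 1) pairs, grouped by partition and sorted within each partition."""
    partitions = defaultdict(list)
    for word in split.split():
        partitions[partition(word, num_reducers)].append((word, 1))
    return {pid: sorted(pairs) for pid, pairs in partitions.items()}


def reduce_task(partition_id, map_outputs):
    """Fetch this reducer's partition from every map output and sum the counts."""
    counts = defaultdict(int)
    for output in map_outputs:
        for word, one in output.get(partition_id, []):
            counts[word] += one
    return dict(counts)


splits = ["hello yarn hello hdfs", "yarn schedules containers"]

# Step (12): each MapTask partitions and sorts its own input split.
map_outputs = [map_task(split) for split in splits]

# Steps (13)-(14): each ReduceTask pulls only its own partition.
results = {}
for reducer_id in range(NUM_REDUCERS):
    results.update(reduce_task(reducer_id, map_outputs))

print(results)
```

Because every occurrence of a word lands in the same partition, each word is counted by exactly one reducer, so merging the per-reducer results cannot double count.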

 

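It helps to draw one large diagram that links the HDFS read/write flow with MapReduce and then adds the YARN workflow; doing so makes Hadoop's underlying flow much easier to understand. Hadoop consists of HDFS for storing, reading, and writing files, MapReduce as the data computation framework, and YARN as the resource scheduling framework.

YARN Schedulers and Scheduling Algorithms

Refresh the queues: yarn rmadmin -refreshQueues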

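YARN Key Points Summary:

1.5 Common YARN Commands

Besides the web page at hadoop103:8088, YARN status can also be queried from the command line. Common commands are shown below.

Requirement: run the WordCount example and use YARN commands to inspect how the job runs.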

[atguigu@hadoop102 hadoop-3.1.3]$ myhadoop.sh start

[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

1.5.1 yarn application: View Applications

(1) List all Applications:

[atguigu@hadoop102 hadoop-3.1.3]$ yarn application -list

2021-02-06 10:21:19,238 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0

                Application-Id     Application-Name     Application-Type       User      Queue              State        Final-State        Progress                        Tracking-URL

(2) Filter by Application state: yarn application -list -appStates (valid states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED)

[atguigu@hadoop102 hadoop-3.1.3]$ yarn application -list -appStates FINISHED

2021-02-06 10:22:20,029 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Total number of applications (application-types: [], states: [FINISHED] and tags: []):1

                Application-Id     Application-Name     Application-Type       User      Queue              State        Final-State        Progress                        Tracking-URL

application_1612577921195_0001           word count            MAPREDUCE    atguigu    default           FINISHED          SUCCEEDED            100% http://hadoop102:19888/jobhistory/job/job_1612577921195_0001
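For scripting, the listing above can be turned into records. The following is a minimal sketch that assumes the column layout shown in this transcript; the function name `parse_application_list` and the field names are hypothetical, and real output may differ between Hadoop versions (for example, tab-separated columns or an empty Tracking-URL).

```python
import re

# Sample text copied from the listing above.
SAMPLE = """\
Total number of applications (application-types: [], states: [FINISHED] and tags: []):1
                Application-Id     Application-Name     Application-Type       User      Queue              State        Final-State        Progress                        Tracking-URL
application_1612577921195_0001           word count            MAPREDUCE    atguigu    default           FINISHED          SUCCEEDED            100% http://hadoop102:19888/jobhistory/job/job_1612577921195_0001
"""

# The application name may contain spaces, so it is matched lazily and the
# seven remaining whitespace-free columns are anchored to the end of the line.
ROW = re.compile(
    r"^(?P<app_id>application_\d+_\d+)\s+"
    r"(?P<name>.*?)\s+"
    r"(?P<type>\S+)\s+(?P<user>\S+)\s+(?P<queue>\S+)\s+"
    r"(?P<state>\S+)\s+(?P<final_state>\S+)\s+"
    r"(?P<progress>\d+(?:\.\d+)?%)\s+(?P<tracking_url>\S+)$"
)


def parse_application_list(text):
    """Return one dict per application row; header and log lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


apps = parse_application_list(SAMPLE)
print(apps[0]["app_id"], apps[0]["name"], apps[0]["final_state"])
```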

(3) Kill an Application:

[atguigu@hadoop102 hadoop-3.1.3]$ yarn application -kill application_1612577921195_0001

2021-02-06 10:23:48,530 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Application application_1612577921195_0001 has already finished

1.5.2 yarn logs: View Logs

(1) Query Application logs: yarn logs -applicationId <ApplicationId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn logs -applicationId application_1612577921195_0001

(2) Query Container logs: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn logs -applicationId application_1612577921195_0001 -containerId container_1612577921195_0001_01_000001
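When these commands are generated from a script, it is easy to pair a container ID with the wrong application. The sketch below builds the argument list for the two `yarn logs` forms shown above and checks that the container belongs to the application. The helper names `application_id_of` and `yarn_logs_command` are hypothetical, and the check relies on the ID layout visible in this transcript (`container_<clusterTimestamp>_<appSequence>_<attempt>_<container>`).

```python
def application_id_of(container_id):
    """Derive the owning application ID from a container ID."""
    parts = container_id.split("_")
    # Some clusters insert an epoch field such as "e17" after the prefix.
    if len(parts) == 6 and parts[1].startswith("e"):
        parts = [parts[0]] + parts[2:]
    if len(parts) != 5 or parts[0] != "container":
        raise ValueError(f"not a container ID: {container_id!r}")
    _, cluster_ts, app_seq, _attempt, _container_seq = parts
    return f"application_{cluster_ts}_{app_seq}"


def yarn_logs_command(application_id, container_id=None):
    """Build the argv for `yarn logs`, optionally narrowed to one container."""
    argv = ["yarn", "logs", "-applicationId", application_id]
    if container_id is not None:
        if application_id_of(container_id) != application_id:
            raise ValueError(
                f"{container_id} does not belong to {application_id}"
            )
        argv += ["-containerId", container_id]
    return argv


cmd = yarn_logs_command(
    "application_1612577921195_0001",
    "container_1612577921195_0001_01_000001",
)
print(" ".join(cmd))
```

On a host where the `yarn` client is installed, the resulting list could be passed to `subprocess.run(cmd)`.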

1.5.3 yarn applicationattempt: View Application Attempts

(1) List all attempts of an Application: yarn applicationattempt -list <ApplicationId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn applicationattempt -list application_1612577921195_0001

2021-02-06 10:26:54,195 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Total number of application attempts :1

         ApplicationAttempt-Id                State                     AM-Container-Id                        Tracking-URL

appattempt_1612577921195_0001_000001             FINISHED container_1612577921195_0001_01_000001 http://hadoop103:8088/proxy/application_1612577921195_0001/
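The IDs in this output share a common prefix: the application, its attempt, and the AM container all carry `1612577921195_0001`. The following sketch decomposes an attempt ID under the assumption that the first number is the ResourceManager start time in milliseconds since the epoch and the second is the application's sequence number; the function name `parse_attempt_id` and the returned keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_attempt_id(attempt_id):
    """Split appattempt_<clusterTimestamp>_<appSequence>_<attemptNumber>."""
    prefix, cluster_ts, app_seq, attempt_no = attempt_id.split("_")
    if prefix != "appattempt":
        raise ValueError(f"not an application attempt ID: {attempt_id!r}")
    millis = int(cluster_ts)
    # Integer arithmetic avoids float rounding on the millisecond part.
    started = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    started += timedelta(milliseconds=millis % 1000)
    return {
        "application_id": f"application_{cluster_ts}_{app_seq}",
        "attempt_number": int(attempt_no),
        "cluster_start_utc": started,
    }


info = parse_attempt_id("appattempt_1612577921195_0001_000001")
print(info["application_id"])
print(info["attempt_number"])
print(info["cluster_start_utc"].isoformat())
```

For the ID above this gives a start time of 02:18:41 UTC on 2021-02-06, a few minutes before the log timestamps in this transcript if the cluster clock runs at UTC+8.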

(2) Print the status of an ApplicationAttempt: yarn applicationattempt -status <ApplicationAttemptId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn applicationattempt -status appattempt_1612577921195_0001_000001

2021-02-06 10:27:55,896 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Application Attempt Report :

ApplicationAttempt-Id : appattempt_1612577921195_0001_000001

State : FINISHED

AMContainer : container_1612577921195_0001_01_000001

Tracking-URL : http://hadoop103:8088/proxy/application_1612577921195_0001/

RPC Port : 34756

AM Host : hadoop104

Diagnostics :
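Reports like the one above use a simple `Key : Value` layout, which makes them straightforward to read into a dictionary. This is a minimal sketch based only on the layout shown here; `parse_report` is a hypothetical helper, and it assumes the first `Key :` line with nothing after it is the report title.

```python
# Sample text copied from the report above, including the client log line.
SAMPLE = """\
2021-02-06 10:27:55,896 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
Application Attempt Report :
ApplicationAttempt-Id : appattempt_1612577921195_0001_000001
State : FINISHED
AMContainer : container_1612577921195_0001_01_000001
Tracking-URL : http://hadoop103:8088/proxy/application_1612577921195_0001/
RPC Port : 34756
AM Host : hadoop104
Diagnostics :
"""


def parse_report(text):
    """Return (title, fields) for a `Key : Value` style report."""
    title = None
    fields = {}
    for line in text.splitlines():
        # Split on the first " :" only, so colons inside URLs are preserved.
        key, sep, value = line.strip().partition(" :")
        if not sep:
            continue  # log lines and blank lines have no " :" separator
        if title is None:
            title = key
            continue
        fields[key] = value.strip()
    return title, fields


title, fields = parse_report(SAMPLE)
print(title)
print(fields["State"], fields["AM Host"], fields["RPC Port"])
```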

1.5.4 yarn container: View Containers

(1) List all Containers of an attempt: yarn container -list <ApplicationAttemptId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn container -list appattempt_1612577921195_0001_000001

2021-02-06 10:28:41,396 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Total number of containers :0

                  Container-Id           Start Time          Finish Time                State                 Host    Node Http Address

(2) Print the status of a Container: yarn container -status <ContainerId>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn container -status container_1612577921195_0001_01_000001

2021-02-06 10:29:58,554 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Container with id 'container_1612577921195_0001_01_000001' doesn't exist in RM or Timeline Server.

    Note: a container's status can be seen only while the job is running.

1.5.5 yarn node: View Node Status

List all nodes: yarn node -list -all

[atguigu@hadoop102 hadoop-3.1.3]$ yarn node -list -all

2021-02-06 10:31:36,962 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Total Nodes:3

         Node-Id      Node-State Node-Http-Address Number-of-Running-Containers

 hadoop103:38168         RUNNING    hadoop103:8042                            0

 hadoop102:42012         RUNNING    hadoop102:8042                            0

 hadoop104:39702         RUNNING    hadoop104:8042                            0
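A quick health check can be built on this table. The sketch below parses the four columns shown above and reports nodes that are not in the RUNNING state; `parse_nodes` and the field names are hypothetical, and the parsing assumes the plain `-list -all` layout of this transcript.

```python
# Sample text copied from the listing above.
SAMPLE = """\
Total Nodes:3
         Node-Id      Node-State Node-Http-Address Number-of-Running-Containers
 hadoop103:38168         RUNNING    hadoop103:8042                            0
 hadoop102:42012         RUNNING    hadoop102:8042                            0
 hadoop104:39702         RUNNING    hadoop104:8042                            0
"""


def parse_nodes(text):
    """Return one dict per node row; the header and summary lines are skipped."""
    nodes = []
    for line in text.splitlines():
        cols = line.split()
        # Data rows have exactly four columns and end with a container count.
        if len(cols) == 4 and cols[3].isdigit():
            nodes.append({
                "node_id": cols[0],
                "state": cols[1],
                "http_address": cols[2],
                "running_containers": int(cols[3]),
            })
    return nodes


nodes = parse_nodes(SAMPLE)
not_running = [n["node_id"] for n in nodes if n["state"] != "RUNNING"]
total_containers = sum(n["running_containers"] for n in nodes)
print(len(nodes), not_running, total_containers)
```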

1.5.6 yarn rmadmin: Reload Configuration

Reload the queue configuration: yarn rmadmin -refreshQueues

[atguigu@hadoop102 hadoop-3.1.3]$ yarn rmadmin -refreshQueues

2021-02-06 10:32:03,331 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8033

1.5.7 yarn queue: View Queues

Print queue information: yarn queue -status <QueueName>

[atguigu@hadoop102 hadoop-3.1.3]$ yarn queue -status default

2021-02-06 10:32:33,403 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032

Queue Information :

Queue Name : default

State : RUNNING

Capacity : 100.0%

Current Capacity : .0%

Maximum Capacity : 100.0%

Default Node Label expression : <DEFAULT_PARTITION>

Accessible Node Labels : *

Preemption : disabled

Intra-queue Preemption : disabled
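The capacity values in this report are printed as percentages, and an idle queue shows `.0%` without a leading zero. A small sketch of converting them to numbers follows; `parse_percent` is a hypothetical helper and the dictionary simply repeats three values from the report above.

```python
def parse_percent(text):
    """Convert strings such as '100.0%' or '.0%' to a float."""
    return float(text.strip().rstrip("%"))


# Values copied from the queue report above.
report = {
    "Capacity": "100.0%",
    "Current Capacity": ".0%",
    "Maximum Capacity": "100.0%",
}

capacities = {key: parse_percent(value) for key, value in report.items()}
queue_is_idle = capacities["Current Capacity"] == 0.0
print(capacities, queue_is_idle)
```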
