Running a Linux bash shell application on YARN: a distributedshell programming guide

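Hadoop YARN is a standalone scheduling framework. Using the APIs that YARN provides, we can write our own programs and have YARN schedule and run them.

Writing a YARN application generally requires:

A client, which defines how the ApplicationMaster is launched and submits the application to the ResourceManager (RM).

Your own ApplicationMaster (AM), which creates clients for the RM and the NodeManagers (NM), requests resources from the RM, and launches containers on the NMs to run tasks.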

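Interfaces

Low-level interfaces

ApplicationClientProtocol

The low-level protocol between clients and the ResourceManager, used to submit or kill jobs and to fetch information about applications, cluster metrics, nodes, queues, and ACLs.

ApplicationMasterProtocol

The low-level protocol between the AM and the RM. ApplicationMasterProtocol.allocate also serves as the heartbeat from the AM to the RM.

ContainerManagementProtocol

The low-level protocol between the AM and the NM, used to start and stop containers and to query the status of running containers.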

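High-level interfaces

Client to ResourceManager

Uses a YarnClient object.

ApplicationMaster to ResourceManager

Used to request containers from the RM. Uses an AMRMClientAsync object, with AMRMClientAsync.CallbackHandler handling events asynchronously.

ApplicationMaster to NodeManager

Used to launch containers on a NodeManager. Uses NMClientAsync to talk to the NodeManager, with NMClientAsync.CallbackHandler handling events asynchronously.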
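
Both asynchronous clients follow a callback pattern: events gathered on a heartbeat are handed to a handler that the application supplies. The following is a minimal, stdlib-only sketch of that pattern and not the Hadoop implementation. `AllocationHandler`, `ToyAsyncClient`, `grant`, and `heartbeatOnce` are hypothetical names chosen for illustration, and the real clients run their heartbeat on a background thread rather than on demand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for a handler such as AMRMClientAsync.CallbackHandler. */
interface AllocationHandler {
    void onContainersAllocated(List<String> containerIds);
}

/** Toy client: queues "allocations" and delivers them to the handler on each heartbeat. */
final class ToyAsyncClient {
    private final Queue<String> pending = new ArrayDeque<>();
    private final AllocationHandler handler;

    ToyAsyncClient(AllocationHandler handler) {
        this.handler = handler;
    }

    /** Simulates the RM granting a container between two heartbeats. */
    void grant(String containerId) {
        pending.add(containerId);
    }

    /** One heartbeat: drain everything granted so far and notify the handler once. */
    void heartbeatOnce() {
        if (pending.isEmpty()) {
            return; // nothing to report on this heartbeat
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        handler.onContainersAllocated(batch);
    }
}
```

The point of the design is that application logic lives only in the handler, so the code that requests resources never blocks waiting for a grant.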

distributedshell

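Let's look at how distributedshell, one of Hadoop's example programs, is written.

Writing the Client

org.apache.hadoop.yarn.applications.distributedshell.Client is the client entry point; it interacts with the RM.

The program first initializes a YarnClient: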

yarnClient = YarnClient.createYarnClient();

yarnClient.init(conf);

It then calls yarnClient.createApplication to create the application and obtain an application id. Underneath, this uses ApplicationClientProtocol.

// Get a new application id

YarnClientApplication app = yarnClient.createApplication();

GetNewApplicationResponse appResponse = app.getNewApplicationResponse();

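Using the application id obtained from the RM, the client builds an ApplicationSubmissionContext, which holds everything the RM needs to launch the ApplicationMaster. The client must set the following in this context:

Application info: id, name.

Queue and priority info: the queue the application is submitted to, and its priority.

User: the user submitting the application.

ContainerLaunchContext: defines everything about the container in which the AM will run, such as local resources (binaries, jars, files, etc.), environment settings (CLASSPATH, etc.), the command to execute, and security tokens (RECT).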

ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();

ApplicationId appId = appContext.getApplicationId();

...

// Set the application name

appContext.setApplicationName(appName);

...

// Prepare the local resources, environment, and command

...

// Build the application master's ContainerLaunchContext from the prepared information

ContainerLaunchContext amContainer = ContainerLaunchContext.newInstance(localResources, env, commands, null, null, null);

...

// Set the resources needed to start the AM

Resource capability = Resource.newInstance(amMemory, amVCores);

appContext.setResource(capability);

// Set the ContainerLaunchContext

appContext.setAMContainerSpec(amContainer);

// Set the priority

Priority pri = Priority.newInstance(amPriority);

appContext.setPriority(pri);

// Set the queue

appContext.setQueue(amQueue);

// Submit the application

yarnClient.submitApplication(appContext);

An example to make this concrete

Suppose the user test runs hadoop-yarn-applications-distributedshell with the following command:

hadoop jar hadoop/yarn/hadoop-yarn-applications-distributedshell-2.6.0-cdh5.15.2.jar -jar hadoop/yarn/hadoop-yarn-applications-distributedshell-2.6.0-cdh5.15.2.jar -queue root.download -shell_script /tmp/a.sh

Suppose yarnClient.createApplication() returns the application id application_1576067711791_1132781.

The Client takes the jar path passed via -jar, hadoop/yarn/hadoop-yarn-applications-distributedshell-2.6.0-cdh5.15.2.jar, and uploads it as a local resource to the following HDFS path belonging to user test:

/user/test/DistributedShell/application_1576067711791_1132781/AppMaster.jar

The script /tmp/a.sh is likewise uploaded to the application's directory under the user's HDFS home directory:

/user/test/DistributedShell/application_1576067711791_1132781/ExecScript.sh
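
The two paths above share one layout: home directory, application name, application id, then the target file name. A minimal sketch of that composition follows. It uses plain strings rather than the Hadoop filesystem API, `StagingPaths` and `stagedPath` are hypothetical names, and it assumes the HDFS home directory is /user/<user>, as the example paths suggest.

```java
/** Sketch of how the staging paths in the example are composed (plain strings only). */
final class StagingPaths {
    private StagingPaths() {}

    /**
     * Joins home directory, application name, application id, and file name.
     * Assumption: the user's HDFS home directory is /user/<user>.
     */
    static String stagedPath(String user, String appName, String appId, String fileName) {
        String home = "/user/" + user;
        return String.join("/", home, appName, appId, fileName);
    }

    public static void main(String[] args) {
        String appId = "application_1576067711791_1132781";
        System.out.println(stagedPath("test", "DistributedShell", appId, "AppMaster.jar"));
        System.out.println(stagedPath("test", "DistributedShell", appId, "ExecScript.sh"));
    }
}
```

Because the application id is part of the path, files staged by different runs of the same user do not collide.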

The command that launches the ApplicationMaster is:

org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 1>/AppMaster.stdout 2>/AppMaster.stderr
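
A command line like this is assembled from individual arguments and then joined with spaces. The sketch below reproduces only the part shown above. `AmCommand`, `build`, and the `logDir` parameter are hypothetical; the command in the text has nothing in front of /AppMaster.stdout, so passing an empty `logDir` reproduces it exactly, while on a real cluster a log directory would presumably appear there (an assumption, not something the text confirms).

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: assemble the ApplicationMaster launch command from its parts. */
final class AmCommand {
    private AmCommand() {}

    static String build(String mainClass, int containerMemory, int containerVCores,
                        int numContainers, int priority, String logDir) {
        List<String> parts = new ArrayList<>();
        parts.add(mainClass);
        parts.add("--container_memory " + containerMemory);
        parts.add("--container_vcores " + containerVCores);
        parts.add("--num_containers " + numContainers);
        parts.add("--priority " + priority);
        // Redirect stdout and stderr into files under logDir (hypothetical parameter).
        parts.add("1>" + logDir + "/AppMaster.stdout");
        parts.add("2>" + logDir + "/AppMaster.stderr");
        return String.join(" ", parts);
    }
}
```

Keeping the arguments in a list until the last moment makes it easy to add optional flags conditionally before joining.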

[Figure: client and RM interaction diagram]

[Figure: structure of ApplicationSubmissionContext]

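Writing the ApplicationMaster (AM)

The AM identifies itself to the RM by its ApplicationAttemptId.

The ApplicationAttemptId can be derived from the container information found in the environment variables passed to the AM: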

ContainerId containerId = ConverterUtils.toContainerId(envs

.get(Environment.CONTAINER_ID.name()));

appAttemptID = containerId.getApplicationAttemptId();
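
To see why the container id is enough, it helps to look at the textual form of the ids. The sketch below does the same derivation on strings only. It assumes the layout container_[e<epoch>_]<clusterTimestamp>_<appSequence>_<attempt>_<container> for container ids and appattempt_<clusterTimestamp>_<appSequence>_<attempt> for attempt ids; that layout is an assumption about how these ids are printed, and `AttemptIds` is a hypothetical helper, not a Hadoop class.

```java
import java.util.Locale;

/** Sketch: derive the application attempt id string from a container id string. */
final class AttemptIds {
    private AttemptIds() {}

    static String attemptIdFromContainerId(String containerId) {
        String[] parts = containerId.split("_");
        // An optional epoch field such as "e17" shifts the remaining fields by one.
        int offset = parts.length == 6 ? 2 : 1;
        if (!"container".equals(parts[0]) || parts.length - offset != 4) {
            throw new IllegalArgumentException("Unexpected container id: " + containerId);
        }
        long clusterTimestamp = Long.parseLong(parts[offset]);
        int appSequence = Integer.parseInt(parts[offset + 1]);
        int attempt = Integer.parseInt(parts[offset + 2]);
        // The last field (the container number) is not part of the attempt id.
        return String.format(Locale.ROOT, "appattempt_%d_%04d_%06d",
                clusterTimestamp, appSequence, attempt);
    }
}
```

In real code the ContainerId object already exposes getApplicationAttemptId(), as the snippet above shows, so string parsing like this is only useful for understanding logs.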

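Once the AM has fully initialized itself, it can start two clients, one for the RM and one for the NM, and register the corresponding event handlers.

// Client for the RM, with its event handler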

AMRMClientAsync.CallbackHandler allocListener = new RMCallbackHandler();

amRMClient = AMRMClientAsync.createAMRMClientAsync(1000, allocListener);

amRMClient.init(conf);

amRMClient.start();

// Client for the NM, with its event handler

containerListener = createNMCallbackHandler();

nmClientAsync = new NMClientAsyncImpl(containerListener);

nmClientAsync.init(conf);

nmClientAsync.start();

The AM must send heartbeats to the RM; registering with the RM starts them:

// Register self with ResourceManager

// This will start heartbeating to the RM

appMasterHostname = NetUtils.getHostname();

RegisterApplicationMasterResponse response = amRMClient

.registerApplicationMaster(appMasterHostname, appMasterRpcPort,

appMasterTrackingUrl);

The AM requests container resources from the RM

// Dump out information about cluster capability as seen by the

// resource manager

int maxMem = response.getMaximumResourceCapability().getMemory();

LOG.info("Max mem capabililty of resources in this cluster " + maxMem);

int maxVCores = response.getMaximumResourceCapability().getVirtualCores();

LOG.info("Max vcores capabililty of resources in this cluster " + maxVCores);

// A resource ask cannot exceed the max.

if (containerMemory > maxMem) {

LOG.info("Container memory specified above max threshold of cluster."

+ " Using max value." + ", specified=" + containerMemory + ", max="

+ maxMem);

containerMemory = maxMem;

}

if (containerVirtualCores > maxVCores) {

LOG.info("Container virtual cores specified above max threshold of cluster."

+ " Using max value." + ", specified=" + containerVirtualCores + ", max="

+ maxVCores);

containerVirtualCores = maxVCores;

}

List<Container> previousAMRunningContainers =

response.getContainersFromPreviousAttempts();

LOG.info(appAttemptID + " received " + previousAMRunningContainers.size()

+ " previous attempts' running containers on AM registration.");

numAllocatedContainers.addAndGet(previousAMRunningContainers.size());

int numTotalContainersToRequest =

numTotalContainers - previousAMRunningContainers.size();

// Setup ask for containers from RM

// Send request for containers to RM

// Until we get our fully allocated quota, we keep on polling RM for

// containers

// Keep looping until all the containers are launched and shell script

// executed on them ( regardless of success/failure).

// Request the containers

for (int i = 0; i < numTotalContainersToRequest; ++i) {

ContainerRequest containerAsk = setupContainerAskForRM();

amRMClient.addContainerRequest(containerAsk);

}

numRequestedContainers.set(numTotalContainers);
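
The arithmetic in the code above comes down to two rules: a per-container ask is capped at the cluster maximum, and containers still running from a previous attempt are subtracted from the number to request. A minimal stdlib-only sketch of those two rules follows; `ContainerAskPlanner` and its method names are hypothetical, and the sample numbers in `main` are made up for illustration.

```java
/** Sketch of the sizing rules the AM applies before asking the RM for containers. */
final class ContainerAskPlanner {
    private ContainerAskPlanner() {}

    /** Caps a requested amount at the cluster maximum (same rule for memory and vcores). */
    static int capAtMax(int requested, int clusterMax) {
        return Math.min(requested, clusterMax);
    }

    /**
     * Containers still to request once those carried over from previous attempts are
     * subtracted. Like the code above, this does not clamp at zero; a non-positive
     * result simply means the request loop does not run.
     */
    static int remainingToRequest(int numTotalContainers, int previouslyRunning) {
        return numTotalContainers - previouslyRunning;
    }

    public static void main(String[] args) {
        System.out.println(capAtMax(16384, 8192));    // ask above the cluster maximum
        System.out.println(capAtMax(10, 8192));       // ask within the cluster maximum
        System.out.println(remainingToRequest(5, 2)); // two containers survived a restart
    }
}
```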

AMRMClientAsync.CallbackHandler

The onContainersAllocated callback starts a LaunchContainerRunnable thread for each allocated container:

// containerListener was created with containerListener = createNMCallbackHandler();

// and is of type NMCallbackHandler

LaunchContainerRunnable runnableLaunchContainer =

new LaunchContainerRunnable(allocatedContainer, containerListener);

Thread launchThread = new Thread(runnableLaunchContainer);

launchThread.start();
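
Launching each container on its own thread keeps the callback from blocking while a launch is in progress. The sketch below shows that one-thread-per-container pattern with plain Java threads. `LaunchPerContainer` and `launchAll` are hypothetical names, and the body of each thread is a placeholder for the real work of building a launch context and contacting the NM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: start one thread per allocated container and wait for all of them. */
final class LaunchPerContainer {
    private LaunchPerContainer() {}

    static Set<String> launchAll(List<String> containerIds) {
        // Thread-safe set, because every launch thread records its own result.
        Set<String> launched = ConcurrentHashMap.newKeySet();
        List<Thread> threads = new ArrayList<>();
        for (String id : containerIds) {
            // Placeholder for the real launch work done by a runnable per container.
            Thread t = new Thread(() -> launched.add(id));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting for launches", e);
            }
        }
        return launched;
    }
}
```

This sketch joins the threads only so the result can be inspected; a callback in a real AM would return right after starting them.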

LaunchContainerRunnable

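The run method of LaunchContainerRunnable builds the ContainerLaunchContext for the shell command that will run in the container, registers the container with the containerListener callback handler, and then uses nmClientAsync to start the container asynchronously.

// Build the launch context used to start the shell script

ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(

localResources, shellEnv, commands, null, allTokens.duplicate(), null);

// Register the container with the NMCallbackHandler

containerListener.addContainer(container.getId(), container);

// Start the container asynchronously with nmClientAsync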

nmClientAsync.startContainerAsync(container, ctx);

[Figure: overall YARN interaction flow]
