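#0. First, the flow of the test demo
a. Create a YARN client (YarnClient) and connect it to the ResourceManager.
b. Use the YARN client to create an application, obtain its application submission context, and set the relevant properties.
- In particular, call setAMContainerSpec; otherwise the later getTokensConf call throws a NullPointerException.
- Also call setUnmanagedAM(true) so that the RM does not manage the AM (that is, it does not allocate and launch a Container for it). Once this is true, the requested resource size no longer needs to be set (an unmanaged AM, UAM, is mainly used for testing).
c. Submit the application through the YARN client.
d. Fetch and monitor the application's status (ApplicationReport).
e. Check the application state. If it is ACCEPTED, go on to monitor the application attempt until it has launched, and obtain the application attempt ID.
f. Use the application ID carried by the attempt ID to obtain the communication token (the AMRM Token).
g. Register the ApplicationMaster with that token.
h. Finally, print the response, RegisterApplicationMasterResponse.
Test code snippet: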
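// Imports used by this snippet (JUnit 4 assumed for @Test).
import java.security.PrivilegedExceptionAction;
import java.util.EnumSet;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptReport;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;
import org.apache.hadoop.yarn.util.Records;
import org.junit.Test;

// LOG, TestCallbackHandler, monitorApplication and monitorCurrentAppAttempt
// belong to the test class and are not shown in this snippet.
@Test
public void testAMRMClientAsync() throws Exception {
    YarnConfiguration yarnConfig = new YarnConfiguration();
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(yarnConfig);
    // a. Start the YARN client, which connects it to the ResourceManager
    yarnClient.start();
    // b. Create the application and get its submission context
    ApplicationSubmissionContext appContext = yarnClient.createApplication().getApplicationSubmissionContext();
    ApplicationId appId = appContext.getApplicationId();
    // Set the application name
    appContext.setApplicationName("zlxiaoxiang_test");
    // Set the priority for the ApplicationMaster
    Priority pri = Records.newRecord(Priority.class);
    pri.setPriority(1);
    appContext.setPriority(pri);
    // Set the scheduler queue name (the default queue is "default")
    appContext.setQueue("default");
    // Set the Container launch context for the AM
    ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
    // Prevents an NPE in: ByteBuffer tokenConf = submissionContext.getAMContainerSpec().getTokensConf();
    appContext.setAMContainerSpec(amContainer);
    // Unmanaged AM: with true, the RM neither allocates a Container for the AM nor launches it
    appContext.setUnmanagedAM(true);
    // Requested resource size (not needed for an unmanaged AM)
    // appContext.setResource(Resource.newInstance(512, 1));
    // c. Submit the application through the YARN client
    yarnClient.submitApplication(appContext);
    // d. Fetch and monitor the application's status (ApplicationReport)
    ApplicationReport appReport = monitorApplication(yarnClient, appId, EnumSet.of(YarnApplicationState.ACCEPTED,
            YarnApplicationState.KILLED, YarnApplicationState.FAILED,
            YarnApplicationState.FINISHED));
    // e. If the application is ACCEPTED, monitor the current attempt until it is LAUNCHED and get its ID
    if (appReport.getYarnApplicationState() == YarnApplicationState.ACCEPTED) {
        // Monitor the application attempt state
        ApplicationAttemptReport attemptReport = monitorCurrentAppAttempt(yarnClient, appId, YarnApplicationAttemptState.LAUNCHED);
        // Application attempt ID, used to start the ApplicationMaster
        ApplicationAttemptId attemptId = attemptReport.getApplicationAttemptId();
        // f. Obtain the communication token using the application ID carried by the attempt ID
        Token<AMRMTokenIdentifier> token = yarnClient.getAMRMToken(attemptId.getApplicationId());
        UserGroupInformation userGroupInformation = UserGroupInformation.getCurrentUser();
        userGroupInformation.addToken(token);
        // g. Register the AM with the token; without it the call never reaches the RM server side
        userGroupInformation.doAs((PrivilegedExceptionAction<Void>) () -> {
            // Create an asynchronous AM-RM client
            AMRMClientAsync<AMRMClient.ContainerRequest> resourceManagerClient = AMRMClientAsync.createAMRMClientAsync(
                    5000, new TestCallbackHandler());
            resourceManagerClient.init(yarnConfig);
            // Connect to the RM
            resourceManagerClient.start();
            // Call registerApplicationMaster
            final RegisterApplicationMasterResponse registerApplicationMasterResponse =
                    resourceManagerClient.registerApplicationMaster("localhost", 10088, "https://blog.icocoro.me/");
            // h. Print the response
            LOG.info("registerApplicationMasterResponse: " + registerApplicationMasterResponse);
            return null;
        });
    }
    yarnClient.stop();
}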
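The test relies on monitorApplication and monitorCurrentAppAttempt, whose bodies are not shown. A minimal sketch of the polling pattern such a helper might follow is given here as a standalone model: AppState is a stand-in enum for YarnApplicationState, the Supplier stands in for whatever call fetches the current state, and pollUntil and StatePoller are hypothetical names, not part of the test or of Hadoop.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

// Stand-in for YarnApplicationState (illustrative subset, not the Hadoop enum).
enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

class StatePoller {
    // Polls the probe until it reports one of the wanted states.
    // Throws IllegalStateException if maxAttempts polls pass without a match.
    static AppState pollUntil(Supplier<AppState> probe, Set<AppState> wanted,
                              long intervalMillis, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            AppState current = probe.get();
            if (wanted.contains(current)) {
                return current;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
        throw new IllegalStateException("state not reached after " + maxAttempts + " polls");
    }

    public static void main(String[] args) {
        // Simulated sequence of states reported for a freshly submitted application.
        Iterator<AppState> reported =
                Arrays.asList(AppState.NEW, AppState.SUBMITTED, AppState.ACCEPTED).iterator();
        AppState result = pollUntil(reported::next,
                EnumSet.of(AppState.ACCEPTED, AppState.KILLED, AppState.FAILED, AppState.FINISHED),
                10, 5);
        System.out.println(result); // prints ACCEPTED
    }
}
```

Including the terminal states in the wanted set, as the test does, keeps the loop from waiting forever on an application that has already failed or been killed.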
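The whole DEBUG process:
a. Start the ResourceManager in DEBUG mode in IntelliJ IDEA and set breakpoints at the key locations; you can also add breakpoints while debugging.
b. Run testAMRMClientAsync in DEBUG mode in IntelliJ IDEA and step through from the first location of interest.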
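#1. Run the ResourceManager in DEBUG mode
#2. Open the default port 8088 in a browser
#3. Start debugging and jump to the breakpoint where the application is submitted
#4. The call reaches ClientRMService on the RM side; this is where the NPE can occur
Setting the relevant property on the client (setAMContainerSpec) avoids it.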
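To make the NPE concrete, here is a small standalone model of the chained call quoted in the test's comment, getAMContainerSpec().getTokensConf(). LaunchSpec and SubmissionContext are simplified stand-ins for ContainerLaunchContext and ApplicationSubmissionContext, and NpeDemo is a hypothetical name; this is a sketch of the failure mode, not the actual ClientRMService code.

```java
import java.nio.ByteBuffer;

// Stand-in for ContainerLaunchContext: only the tokens configuration is modelled.
class LaunchSpec {
    private ByteBuffer tokensConf; // stays null unless configured

    ByteBuffer getTokensConf() {
        return tokensConf;
    }
}

// Stand-in for ApplicationSubmissionContext.
class SubmissionContext {
    private LaunchSpec amContainerSpec; // null until the client sets it

    void setAMContainerSpec(LaunchSpec spec) {
        this.amContainerSpec = spec;
    }

    LaunchSpec getAMContainerSpec() {
        return amContainerSpec;
    }
}

class NpeDemo {
    // Mirrors the chained call: fails if no container spec was set.
    static ByteBuffer readTokensConf(SubmissionContext ctx) {
        return ctx.getAMContainerSpec().getTokensConf();
    }

    public static void main(String[] args) {
        SubmissionContext ctx = new SubmissionContext();
        try {
            readTokensConf(ctx);
        } catch (NullPointerException e) {
            System.out.println("NPE without a container spec");
        }
        // An empty spec is enough: the chain no longer dereferences null.
        ctx.setAMContainerSpec(new LaunchSpec());
        System.out.println("tokensConf = " + readTokensConf(ctx));
    }
}
```

The point is that the spec itself must be non-null; its tokens configuration may still be empty.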
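#5. The call reaches RMAppManager on the RM side, which registers the application
#6. Application registration finishes and control returns to the test client
The client checks the application's current state.
#7. Check the YARN web UI
There is now one application, in the ACCEPTED state.
#8. Obtain the communication token
The Token has the following fields: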
private byte[] identifier;
private byte[] password;
private Text kind;
private Text service;
private TokenRenewer renewer;
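As a rough illustration of how these fields are used, the sketch below models a token as plain data and picks one out of a user's credentials by its kind. SimpleToken and SimpleCredentials are hypothetical stand-ins (String replaces Text, and the renewer is left out), and the kind string "YARN_AM_RM_TOKEN" is an assumption about what the AMRM token uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified stand-in for a security token: identifier, password, kind, service.
class SimpleToken {
    final byte[] identifier;
    final byte[] password;
    final String kind;
    final String service;

    SimpleToken(byte[] identifier, byte[] password, String kind, String service) {
        this.identifier = identifier;
        this.password = password;
        this.kind = kind;
        this.service = service;
    }
}

// Stand-in for the set of tokens attached to the current user.
class SimpleCredentials {
    private final List<SimpleToken> tokens = new ArrayList<>();

    void addToken(SimpleToken token) {
        tokens.add(token);
    }

    // Returns the first token whose kind matches, if any.
    Optional<SimpleToken> selectByKind(String kind) {
        return tokens.stream().filter(t -> t.kind.equals(kind)).findFirst();
    }
}

class TokenDemo {
    public static void main(String[] args) {
        SimpleCredentials creds = new SimpleCredentials();
        creds.addToken(new SimpleToken(
                "attempt-1".getBytes(StandardCharsets.UTF_8),
                "secret".getBytes(StandardCharsets.UTF_8),
                "YARN_AM_RM_TOKEN",   // assumed kind name
                "localhost:8030"));   // illustrative service address
        String service = creds.selectByKind("YARN_AM_RM_TOKEN")
                .map(t -> t.service)
                .orElse("none");
        System.out.println(service); // prints localhost:8030
    }
}
```

This mirrors why the test adds the token to the current user before calling doAs: the client side can only present a token that is attached to the user making the call.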
#9. About to call the registerApplicationMaster method of AMRMClientAsync
#10. Enter AMRMClientAsyncImpl
#11. Enter AMRMClientImpl
It uses rmClient, which is an ApplicationMasterProtocolPBClientImpl object.
#12. Enter ApplicationMasterProtocolPBClientImpl
It uses proxy, an RPC proxy whose calls are handled on the RM side by an ApplicationMasterProtocolPBServiceImpl object.
#13. Enter ApplicationMasterProtocolPBServiceImpl
It uses real, which is an ApplicationMasterService object.
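The layering in steps #11 to #13 (client, proxy, server-side translator, real service) can be modelled with a JDK dynamic proxy. This is only an in-process sketch under stated assumptions: AmProtocol, RealService, ServerSideTranslator and ProxyDemo are hypothetical names, and the handler invokes the target directly where a real RPC proxy would serialize the request and send it over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Stand-in for the AM-to-RM protocol; one method is enough for the sketch.
interface AmProtocol {
    String register(String host, int port);
}

// Plays the role of "real": the service that does the work.
class RealService implements AmProtocol {
    public String register(String host, int port) {
        return "registered " + host + ":" + port;
    }
}

// Plays the role of the server-side translator that holds "real".
class ServerSideTranslator implements AmProtocol {
    private final AmProtocol real;

    ServerSideTranslator(AmProtocol real) {
        this.real = real;
    }

    public String register(String host, int port) {
        // A real translator would convert wire messages to records here.
        return real.register(host, port);
    }
}

class ProxyDemo {
    // Builds a client-side proxy that forwards every call to the server side.
    static AmProtocol clientProxy(AmProtocol serverSide) {
        InvocationHandler handler = (proxy, method, args) -> method.invoke(serverSide, args);
        return (AmProtocol) Proxy.newProxyInstance(
                AmProtocol.class.getClassLoader(),
                new Class<?>[] { AmProtocol.class },
                handler);
    }

    public static void main(String[] args) {
        AmProtocol client = clientProxy(new ServerSideTranslator(new RealService()));
        System.out.println(client.register("localhost", 10088)); // prints registered localhost:10088
    }
}
```

Because every layer exposes the same interface, the debugger steps through what looks like the same method several times before it lands in the class that does the work.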
#14. Enter ApplicationMasterService
It uses amsProcessingChain, an AMSProcessingChain object, which implements the ApplicationMasterServiceProcessor interface:
public interface ApplicationMasterServiceProcessor {
    /**
     * Initialize with and ApplicationMasterService Context as well as the
     * next processor in the chain.
     * @param amsContext AMSContext.
     * @param nextProcessor next ApplicationMasterServiceProcessor
     */
    void init(ApplicationMasterServiceContext amsContext,
            ApplicationMasterServiceProcessor nextProcessor);

    /**
     * Register AM attempt.
     * @param applicationAttemptId applicationAttemptId.
     * @param request Register Request.
     * @param response Register Response.
     * @throws IOException IOException.
     * @throws YarnException in critical situation where invalid
     *         profiles/resources are added.
     */
    void registerApplicationMaster(ApplicationAttemptId applicationAttemptId,
            RegisterApplicationMasterRequest request,
            RegisterApplicationMasterResponse response)
            throws IOException, YarnException;

    /**
     * Allocate call.
     * @param appAttemptId appAttemptId.
     * @param request Allocate Request.
     * @param response Allocate Response.
     * @throws YarnException YarnException.
     */
    void allocate(ApplicationAttemptId appAttemptId,
            AllocateRequest request, AllocateResponse response) throws YarnException;

    /**
     * Finish AM.
     * @param applicationAttemptId applicationAttemptId.
     * @param request Finish AM Request.
     * @param response Finish AM Response.
     */
    void finishApplicationMaster(
            ApplicationAttemptId applicationAttemptId,
            FinishApplicationMasterRequest request,
            FinishApplicationMasterResponse response);
}
#15. Enter AMSProcessingChain
Here head is a DisabledPlacementProcessor, which is of course also an ApplicationMasterServiceProcessor.
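The processing chain is a chain of responsibility: each processor is initialized with the next one and either handles the call or forwards it. The sketch below is a reduced model with hypothetical names (Processor, PassThroughProcessor, FinalProcessor, ProcessingChain); it keeps only init and a register method, and it prepends new processors at the head, which is a simplification and may differ from how the real chain orders its processors.

```java
import java.util.ArrayList;
import java.util.List;

// Reduced stand-in for the processor interface: init plus one operation.
interface Processor {
    void init(Processor next);

    void register(String attemptId, List<String> trace);
}

// Forwards the call unchanged, as a disabled step would.
class PassThroughProcessor implements Processor {
    private Processor next;

    public void init(Processor next) {
        this.next = next;
    }

    public void register(String attemptId, List<String> trace) {
        trace.add("pass-through");
        next.register(attemptId, trace);
    }
}

// Last link in the chain: does the work and never forwards.
class FinalProcessor implements Processor {
    public void init(Processor next) {
        // End of the chain, nothing to link.
    }

    public void register(String attemptId, List<String> trace) {
        trace.add("final:" + attemptId);
    }
}

// The chain is itself a Processor and delegates to its current head.
class ProcessingChain implements Processor {
    private Processor head;

    ProcessingChain(Processor last) {
        last.init(null);
        this.head = last;
    }

    // Puts a processor in front of the current head.
    void addProcessor(Processor processor) {
        processor.init(head);
        head = processor;
    }

    public void init(Processor next) {
        // The chain is the entry point, so it has no successor.
    }

    public void register(String attemptId, List<String> trace) {
        head.register(attemptId, trace);
    }

    public static void main(String[] args) {
        ProcessingChain chain = new ProcessingChain(new FinalProcessor());
        chain.addProcessor(new PassThroughProcessor());
        List<String> trace = new ArrayList<>();
        chain.register("attempt_1", trace);
        System.out.println(trace); // prints [pass-through, final:attempt_1]
    }
}
```

With this shape, extra processing steps can be slotted in front of the default behaviour without touching the service that owns the chain.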
#16. Finally, enter DefaultAMSProcessor
DefaultAMSProcessor is the last processor in the AMSProcessingChain.
It contains this piece of core code:
getRmContext().getDispatcher().getEventHandler()
.handle(
new RMAppAttemptRegistrationEvent(applicationAttemptId, request
.getHost(), request.getRpcPort(), request.getTrackingUrl()));
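What this call does, in outline, is hand a registration event to a dispatcher, which routes it to the handler that owns the attempt's state. A minimal synchronous model follows; all names (EventType, AttemptState, AttemptEvent, Attempt, Dispatcher) are hypothetical stand-ins, the real dispatcher is assumed to queue events rather than call handlers inline, and the LAUNCHED to RUNNING transition is an assumption drawn from the states seen in this walkthrough.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Consumer;

enum EventType { REGISTERED }

enum AttemptState { LAUNCHED, RUNNING }

// Carries the same data as the registration event: host, RPC port, tracking URL.
class AttemptEvent {
    final EventType type;
    final String host;
    final int rpcPort;
    final String trackingUrl;

    AttemptEvent(EventType type, String host, int rpcPort, String trackingUrl) {
        this.type = type;
        this.host = host;
        this.rpcPort = rpcPort;
        this.trackingUrl = trackingUrl;
    }
}

// Tiny state machine for one application attempt.
class Attempt implements Consumer<AttemptEvent> {
    AttemptState state = AttemptState.LAUNCHED;
    String host;
    int rpcPort;
    String trackingUrl;

    public void accept(AttemptEvent event) {
        // Only a launched attempt can become running by registering.
        if (event.type == EventType.REGISTERED && state == AttemptState.LAUNCHED) {
            host = event.host;
            rpcPort = event.rpcPort;
            trackingUrl = event.trackingUrl;
            state = AttemptState.RUNNING;
        }
    }
}

// Routes each event to the handler registered for its type.
class Dispatcher {
    private final Map<EventType, Consumer<AttemptEvent>> handlers = new EnumMap<>(EventType.class);

    void register(EventType type, Consumer<AttemptEvent> handler) {
        handlers.put(type, handler);
    }

    void handle(AttemptEvent event) {
        Consumer<AttemptEvent> handler = handlers.get(event.type);
        if (handler == null) {
            throw new IllegalStateException("no handler for " + event.type);
        }
        handler.accept(event);
    }

    public static void main(String[] args) {
        Attempt attempt = new Attempt();
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register(EventType.REGISTERED, attempt);
        dispatcher.handle(new AttemptEvent(EventType.REGISTERED,
                "localhost", 10088, "https://blog.icocoro.me/"));
        System.out.println(attempt.state); // prints RUNNING
    }
}
```

The processor itself does not change any state; it only publishes the event, and the state change happens in whichever component handles it.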
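#17. Return to the registration method in AMRMClientAsyncImpl
#18. Check the YARN web UI
The application's state is now RUNNING.
Running Containers is 0, meaning no actual Container has been allocated and is running.
The registration response returned: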
INFO test.MyTestAMRMClientAsync (MyTestAMRMClientAsync.java:lambda$testAMRMClientAsync$0(97)) - registerApplicationMasterResponse: maximumCapability { memory: 8192 virtual_cores: 4 resource_value_map { key: "memory-mb" value: 8192 units: "Mi" type: COUNTABLE } resource_value_map { key: "vcores" value: 4 units: "" type: COUNTABLE } } queue: "default" scheduler_resource_types: MEMORY resource_profiles { } resource_types { name: "memory-mb" units: "Mi" type: COUNTABLE } resource_types { name: "vcores" units: "" type: COUNTABLE }
Disconnected from the target VM, address: '127.0.0.1:55513', transport: 'socket'
That is the overall, high-level process by which an AM registers itself.