YARN distributedShell client: a walkthrough of the source code flow

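Prerequisites

  1. The Hadoop version is 3.1.1
  2. Log statements were added to the source to make it easier to follow

 A log statement was added to RetryInvocationHandler, so every RPC call shows up in the log, which makes the flow easy to trace. These are the lines containing "==>hander 被执行method:" (roughly "handler invoked, method:") in the log output.

Adding the log statement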
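Why a single log statement is enough: Hadoop's RPC client proxies are JDK dynamic proxies, so every call on a protocol interface passes through one InvocationHandler.invoke method. The sketch below is not Hadoop's RetryInvocationHandler. It is a stdlib-only illustration, and ClusterProtocol, LoggingHandler and ProxyLoggingDemo are hypothetical names made up for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProxyLoggingDemo {
    // Wraps the handler in a JDK dynamic proxy that implements ClusterProtocol.
    static ClusterProtocol wrap(LoggingHandler handler) {
        return (ClusterProtocol) Proxy.newProxyInstance(
                ClusterProtocol.class.getClassLoader(),
                new Class<?>[] {ClusterProtocol.class},
                handler);
    }

    public static void main(String[] args) {
        ClusterProtocol real = state -> 1;
        ClusterProtocol proxy = wrap(new LoggingHandler(real));
        proxy.getNodeCount("RUNNING");
        proxy.toString(); // Object methods such as toString pass through the handler too
    }
}

// Hypothetical protocol interface standing in for an RPC protocol.
interface ClusterProtocol {
    int getNodeCount(String state);
}

// Hypothetical handler: records and prints every call, then delegates to the target.
class LoggingHandler implements InvocationHandler {
    final List<String> calls = new ArrayList<>();
    private final Object target;

    LoggingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // args is null for methods that take no parameters
        String line = "method:" + method.getName() + ","
                + (args == null ? "{}" : Arrays.toString(args));
        calls.add(line);
        System.out.println("==> handler invoked " + line);
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the target's own exception
        }
    }
}
```

Because toString on a proxy is also routed through the handler, this is consistent with the "method:toString" lines that show up in the log output.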

Now to the main topic:

Command executed: it simply prints the current directory

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1.3.0.1.0-187.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1.3.0.1.0-187.jar -shell_command  'echo ---- %cd%' -num_containers 2 -container_memory 300 -master_memory 400

Log output

08/13 14:22:37 [main] Client(254):Initializing Client
08/13 14:22:37 [main] Client(632):Running Client
08/13 14:22:37 [main] RMProxy(133):Connecting to ResourceManager at /0.0.0.0:8032
08/13 14:31:55 [main] RetryInvocationHandler(355):==>hander 被执行method:getResourceTypeInfo
08/13 14:31:55 [main] RetryInvocationHandler(355):==>hander 被执行method:getClusterMetrics,{}
08/13 14:31:55 [main] Client(636):Got Cluster metric info from ASM, numNodeManagers=1
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getClusterNodes,{nodeStates: NS_RUNNING}
08/13 14:31:56 [main] Client(641):Got Cluster node info from ASM
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getQueueInfo,{queueName: "default" includeApplications: true includeChildQueues: false recursive: false}
08/13 14:31:56 [main] Client(651):Queue info, queueName=default, queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getQueueUserAcls,{}
08/13 14:31:56 [main] Client(661):User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
08/13 14:31:56 [main] Client(661):User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
08/13 14:31:56 [main] Client(661):User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
08/13 14:31:56 [main] Client(661):User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getResourceProfiles,{org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetAllResourceProfilesRequestPBImpl@5c86a017}
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getNewApplication,{}
08/13 14:31:56 [main] Client(706):Max mem capability of resources in this cluster 8192
08/13 14:31:56 [main] Client(717):Max virtual cores capability of resources in this cluster 4
08/13 14:31:56 [main] RetryInvocationHandler(355):==>hander 被执行method:getResourceTypeInfo,{org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetAllResourceTypeInfoRequestPBImpl@da6efc73}
08/13 14:31:56 [main] Client(1184):AM vcore not specified, use 1 mb as AM vcores
08/13 14:31:56 [main] Client(1191):AM Resource capability=<memory:400, vCores:1>
08/13 14:31:56 [main] Client(765):Copy App Master jar from local filesystem and add to local environment
08/13 14:31:56 [main] FileSystem(3296):Loading filesystems[viewfs=class org.apache.hadoop.fs.viewfs.ViewFileSystem, swebhdfs=class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem, file=class org.apache.hadoop.fs.LocalFileSystem, har=class org.apache.hadoop.fs.HarFileSystem, http=class org.apache.hadoop.fs.http.HttpFieSystem, hdfs=class org.apache.hadoop.hdfs.DistributedFileSystem, webhdfs=class org.apache.hadoop.hdfs.web.WebHdfsFileSystem, https=class org.apache.hadoop.fs.http.HttpsFileSystem],8
08/13 14:31:57 [main] RetryInvocationHandler(355):==>hander 被执行method:toString,{}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:getFileInfo,{/user/***/DistributedShell/application_1597216395121_0007/AppMaster.jar}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:toString,{}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:create,{/user/***/DistributedShell/application_1597216395121_0007/AppMaster.jar,{ masked: rw-r--r--, unmasked: rw-rw-rw- },DFSClient_NONMAPREDUCE_1707719962_1,[CREATE, OVERWRITE],true,1,134217728,{CryptoProtocolVersion{description='Encry
08/13 14:31:58 [Thread-6] RetryInvocationHandler(355):==>hander 被执行method:addBlock,{/user/***/DistributedShell/application_1597216395121_0007/AppMaster.jar,DFSClient_NONMAPREDUCE_1707719962_1,<null>,<null>,17203,<null>,[]}
08/13 14:31:58 [Thread-6] RetryInvocationHandler(355):==>hander 被执行method:getServerDefaults,{}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:complete,{/user/***/DistributedShell/application_1597216395121_0007/AppMaster.jar,DFSClient_NONMAPREDUCE_1707719962_1,BP-1150083184-10.180.201.39-1587535791595:blk_1073742314_1492,17203}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:getFileInfo,{/user/***/DistributedShell/application_1597216395121_0007/AppMaster.jar}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:toString,{}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:create,{/user/***/DistributedShell/application_1597216395121_0007/shellCommands,{ masked: rw-r--r--, unmasked: rw-rw-rw- },DFSClient_NONMAPREDUCE_1707719962_1,[CREATE, OVERWRITE],true,1,134217728,{CryptoProtocolVersion{description='Encry
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:setPermission,{/user/***/DistributedShell/application_1597216395121_0007/shellCommands,rwx--x---}
08/13 14:31:58 [Thread-9] RetryInvocationHandler(355):==>hander 被执行method:addBlock,{/user/***/DistributedShell/application_1597216395121_0007/shellCommands,DFSClient_NONMAPREDUCE_1707719962_1,<null>,<null>,17204,<null>,[]}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:complete,{/user/***/DistributedShell/application_1597216395121_0007/shellCommands,DFSClient_NONMAPREDUCE_1707719962_1,BP-1150083184-10.180.201.39-1587535791595:blk_1073742315_1493,17204}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:getFileInfo,{/user/***/DistributedShell/application_1597216395121_0007/shellCommands}
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:toString,{}
08/13 14:31:58 [main] Client(814):Set the environment for the application master
08/13 14:31:58 [main] Client(856):Setting up app master command
08/13 14:31:58 [main] Client(918):Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx400m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_type GUARANTEED --container_memory 300 --container_vcores 1 --num_containers 2 --priority 0 1><LOG_DIR>/AppMaster.stdout 2><LOG
08/13 14:31:58 [main] Client(985):Submitting application to ASM
08/13 14:31:58 [main] RetryInvocationHandler(355):==>hander 被执行method:submitApplication,
08/13 14:31:59 [main] YarnClientImpl(306):Submitted application application_1597216395121_0007
08/13 14:32:00 [main] RetryInvocationHandler(355):==>hander 被执行method:getApplicationReport,{application_id { id: 7 cluster_timestamp: 1597216395121 }}
08/13 14:32:00 [main] Client(1021):Got application report from ASM for, appId=7, clientToAMToken=null, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1597300318994, yarnAppState=ACCEPTED, distributedFinalSate=UNDEFINED, appTrackingUrl=http://***.home.langchao.com:8088/proxy/application_1597216395121_0007/, appUser=***
08/13 14:32:01 [main] RetryInvocationHandler(355):==>hander 被执行method:getApplicationReport,{application_id { id: 7 cluster_timestamp: 1597216395121 }}
08/13 14:32:01 [main] Client(1021):Got application report from ASM for, appId=7, clientToAMToken=null, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1597300318994, yarnAppState=ACCEPTED, distributedFinalSate=UNDEFINED, appTrackingUrl=http://***.home.langchao.com:8088/proxy/application_1597216395121_0007/, appUser=***
08/13 14:32:28 [main] RetryInvocationHandler(355):==>hander 被执行method:getApplicationReport,{application_id { id: 7 cluster_timestamp: 1597216395121 }}
08/13 14:32:28 [main] Client(1021):Got application report from ASM for, appId=7, clientToAMToken=null, appDiagnostics=, appMasterHost=***/10.180.201.39, appQueue=default, appMasterRpcPort=-1, appStartTime=1597300318994, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://***home.langchao.com:8088/proxy/application_1597216395121_0007/, appUser=***
08/13 14:32:29 [main] RetryInvocationHandler(355):==>hander 被执行method:getApplicationReport,{application_id { id: 7 cluster_timestamp: 1597216395121 }}

08/13 14:32:37 [main] Client(1021):Got application report from ASM for, appId=7, clientToAMToken=null, appDiagnostics=, appMasterHost=***/10.180.201.39, appQueue=default, appMasterRpcPort=-1, appStartTime=1597300318994, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://***home.langchao.com:8088/proxy/application_1597216395121_0007/, appUser=***
08/13 14:32:37 [main] Client(1057):Reached client specified timeout for application. Killing application
08/13 14:32:37 [main] RetryInvocationHandler(355):==>hander 被执行method:forceKillApplication,{application_id { id: 7 cluster_timestamp: 1597216395121 }}
08/13 14:32:37 [main] RetryInvocationHandler(355):==>hander 被执行method:forceKillApplication,{application_id { id: 7 cluster_timestamp: 1597216395121 }}
08/13 14:32:37 [main] YarnClientImpl(474):Killed application application_1597216395121_0007
08/13 14:32:37 [main] Client(274):Application failed to complete successfully

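From the log, the overall logic breaks down into the following steps:

  • 1. Get cluster metrics
  • 2. Get cluster node info from ASM
  • 3. Get queue info
  • 4. Get user ACL info for the queue
  • 5. Get the resource profiles available in the RM
  • 6. Get a new application id
  • 7. Get available resource types supported by RM
  • 8. Prepare the files
  • 9. Submit the application
  • 10. Poll for application reports in a loop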
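The step list can also be checked mechanically. Every line printed by the extra statement in RetryInvocationHandler contains "method:<name>", so a small regex pass over the log yields the ordered sequence of RPC calls. This is a minimal sketch: the class name RpcSequence is made up for illustration, and the log file path is whatever file the output was saved to.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RpcSequence {
    // Matches only lines written by the added log statement and captures the method name.
    private static final Pattern METHOD =
            Pattern.compile("RetryInvocationHandler\\(\\d+\\):.*?method:(\\w+)");

    // Returns RPC method names in log order, collapsing immediate repeats
    // (for example the repeated getApplicationReport polls).
    static List<String> extract(List<String> logLines) {
        List<String> methods = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = METHOD.matcher(line);
            if (!m.find()) {
                continue; // not an RPC line
            }
            String name = m.group(1);
            if (methods.isEmpty() || !methods.get(methods.size() - 1).equals(name)) {
                methods.add(name);
            }
        }
        return methods;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: RpcSequence <log file>");
            return;
        }
        for (String name : extract(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(name);
        }
    }
}
```

In the log above, getNewApplication appears between getResourceProfiles and the second getResourceTypeInfo, which corresponds to the "Get a new application id" step in the source.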

Source code

 1. Entry point: main

public static void main(String[] args) {
    boolean result = false;
    try {
      Client client = new Client();
      LOG.info("Initializing Client");
      try {
        // init() processes the command-line arguments
        boolean doRun = client.init(args);
        if (!doRun) {
          System.exit(0);
        }
      } catch (IllegalArgumentException e) {
        System.err.println(e.getLocalizedMessage());
        client.printUsage();
        System.exit(-1);
      }
      // run() submits the application and monitors it
      result = client.run();
    } catch (Throwable t) {
      LOG.error("Error running Client", t);
      System.exit(1);
    }
    if (result) {
      LOG.info("Application completed successfully");
      System.exit(0);
    }
    LOG.error("Application failed to complete successfully");
    System.exit(2);
  }
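For anyone scripting around the client, the exit codes in main are worth spelling out. The sketch below restates the branches of main as a pure function. The enum Outcome and the class ClientExit are hypothetical names for illustration and do not exist in Hadoop.

```java
class ClientExit {
    // Hypothetical outcomes, one per branch of main() above.
    enum Outcome { NOTHING_TO_RUN, BAD_ARGUMENTS, UNEXPECTED_ERROR, APP_SUCCEEDED, APP_FAILED }

    // Maps each outcome to the value passed to System.exit in main().
    static int exitCode(Outcome outcome) {
        switch (outcome) {
            case NOTHING_TO_RUN:
                return 0;   // init() returned false
            case BAD_ARGUMENTS:
                return -1;  // init() threw IllegalArgumentException
            case UNEXPECTED_ERROR:
                return 1;   // any other Throwable
            case APP_SUCCEEDED:
                return 0;   // run() returned true
            default:
                return 2;   // run() returned false
        }
    }

    public static void main(String[] args) {
        for (Outcome o : Outcome.values()) {
            System.out.println(o + " -> " + exitCode(o));
        }
    }
}
```

The run logged above ends with "Application failed to complete successfully", so it takes the last branch and exits with 2. Note that on POSIX systems an exit value of -1 is reported to the shell as 255, because only the low 8 bits of the status are kept.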

2. The main logic: run()

 The code is long, so omitted parts are marked with "...."

 public boolean run() throws IOException, YarnException {

    LOG.info("Running Client");
    yarnClient.start();
    // 1. Get cluster metrics
    YarnClusterMetrics clusterMetrics = yarnClient.getYarnClusterMetrics();
    ....
    // 2. Get cluster node info from ASM
    List<NodeReport> clusterNodeReports = yarnClient.getNodeReports(
        NodeState.RUNNING);
    LOG.info("Got Cluster node info from ASM");

    ....
    // 3. Get queue info
    QueueInfo queueInfo = yarnClient.getQueueInfo(this.amQueue);
    ....
    // 4. Get user ACL info for the queue
    List<QueueUserACLInfo> listAclInfo = yarnClient.getQueueAclsInfo();
    ....
    // 5. Get the resource profiles available in the RM
    Map<String, Resource> profiles;
    try {
      profiles = yarnClient.getResourceProfiles();
    } catch (YARNFeatureNotEnabledException re) {
      profiles = null;
    }

    List<String> appProfiles = new ArrayList<>(2);
    appProfiles.add(amResourceProfile);
    appProfiles.add(containerResourceProfile);

    ....
    // 6. Get a new application id
    YarnClientApplication app = yarnClient.createApplication();
    GetNewApplicationResponse appResponse = app.getNewApplicationResponse();

    ....

    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
    ApplicationId appId = appContext.getApplicationId();

    // Set up resource type requirements
    // For now, both memory and vcores are supported, so we set memory and
    // vcores requirements

    // 7. Get available resource types supported by RM

    List<ResourceTypeInfo> resourceTypes = yarnClient.getResourceTypeInfo();
    ....
    // 8. Prepare the files and parameters
    //    Three parts: resource, commands, env
    //    Token handling is also done here
    ....

    // 9. Submit the application
    yarnClient.submitApplication(appContext);

    // 10. Poll for application reports in a loop

    return monitorApplication(appId);

  }
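monitorApplication itself is not shown here. Judging from the log, it repeatedly calls getApplicationReport, and once the client-side timeout is reached it calls forceKillApplication and reports failure. The sketch below models such a loop with stdlib types only. MonitorSketch, AppState, ReportSource and the injected clock are hypothetical stand-ins, not the Hadoop API, and the handling of terminal states is a simplifying assumption (the real report also carries a final status, visible as distributedFinalState in the log).

```java
import java.util.function.LongSupplier;

class MonitorSketch {
    enum AppState { ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Hypothetical stand-in for the two RM calls seen in the log.
    interface ReportSource {
        AppState getState(); // models getApplicationReport
        void forceKill();    // models forceKillApplication
    }

    // Polls until a terminal state is seen or until more than timeoutMs has
    // elapsed since startMs. Returns true only if the application finished;
    // on timeout the application is killed and false is returned.
    static boolean monitor(ReportSource rm, LongSupplier clockMs, long startMs,
                           long timeoutMs, long pollIntervalMs) {
        while (true) {
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
            AppState state = rm.getState();
            if (state == AppState.FINISHED) {
                return true;
            }
            if (state == AppState.FAILED || state == AppState.KILLED) {
                return false;
            }
            if (clockMs.getAsLong() - startMs > timeoutMs) {
                rm.forceKill();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        ReportSource alwaysRunning = new ReportSource() {
            public AppState getState() { return AppState.RUNNING; }
            public void forceKill() { System.out.println("Killing application"); }
        };
        long start = System.currentTimeMillis();
        boolean ok = monitor(alwaysRunning, System::currentTimeMillis, start, 50, 10);
        System.out.println("completed successfully: " + ok);
    }
}
```

In the run logged above, the kill happens exactly ten minutes after "Initializing Client" (14:22:37 to 14:32:37), and connecting to the ResourceManager alone took more than nine of those minutes. If the timeout is measured from client creation, that would explain why the application was killed while still RUNNING. This is an inference from the timestamps, not something the excerpted source confirms.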

 
