Hadoop 2.8.5 RPC Mechanism (Part 2)

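From the previous part we know that Hadoop builds its own RPC mechanism on top of ProtoBuf and Proxy, and that on the client side the mechanism is implemented through Proxy. Proxy (java.lang.reflect.Proxy) is a class provided by the JDK for creating dynamic proxies, which makes it a natural fit for RMI-style remote invocation. Creating a Proxy object always involves an InvocationHandler. InvocationHandler is an interface defined in the JDK with a single method, invoke(). Before creating a Proxy object you must define a class that implements InvocationHandler, for example Invoker, and create an instance of it, for example invoker. Only then can the Proxy object be created; the arguments include the interface or interfaces being proxied and the invoker object.

When creating the Proxy object, the JDK dynamically (at run time) generates a hidden class and an instance of it. This class implements every method defined on the given interfaces, but the body of each method is the same: it passes the Method object describing itself, together with the arguments of the call, to Invoker.invoke(). From the client's point of view nothing differs from an ordinary direct call, because the Proxy offers every method defined on the interface, yet all calls end up in Invoker.invoke(). In that method we can therefore encode the information about the call and send it to the JVM hosting the server, where the corresponding code decodes it and turns it into a call on the local service.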
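The Proxy/InvocationHandler relationship described above can be illustrated with a minimal, self-contained sketch. The Greeter interface and the string returned by the handler are made up for this illustration; a real RPC invoker would serialize the call and send it over the network instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

public class ProxyDemo {
    // Hypothetical protocol interface, standing in for something like
    // ApplicationClientProtocol.
    public interface Greeter {
        String greet(String name);
    }

    // Every call made on the proxy lands here. This sketch only describes
    // the call; a real invoker would encode it and send it to the server.
    public static class Invoker implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return "would send: " + method.getName() + Arrays.toString(args);
        }
    }

    // Create the proxy: class loader, proxied interfaces, and the handler.
    public static Greeter createProxy() {
        return (Greeter) Proxy.newProxyInstance(
            Greeter.class.getClassLoader(),
            new Class<?>[] {Greeter.class},
            new Invoker());
    }

    public static void main(String[] argv) {
        // Looks like a direct call, but ends up in Invoker.invoke().
        System.out.println(createProxy().greet("yarn"));
    }
}
```

The caller only sees the Greeter interface; which handler sits behind it is decided entirely at the point where the proxy is created.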

1. The server-side RPC protocol stack

Again we use ResourceManager and ApplicationClientProtocol as the entry point.

public class ResourceManager extends CompositeService implements Recoverable {
	public static void main(String argv[]) {
		......
		resourceManager.start();
		......
	}
	protected void serviceInit(Configuration conf) throws Exception {
		......
		createAndInitActiveServices();
		......
    }
    
    @Private
    public class RMActiveServices extends CompositeService {
    	@Override
    	protected void serviceInit(Configuration configuration) throws Exception {
    		......
    		clientRM = createClientRMService();
    		......
    	}
    }
    // Create the ClientRMService
    protected ClientRMService createClientRMService() {
    return new ClientRMService(this.rmContext, scheduler, this.rmAppManager,
        this.applicationACLsManager, this.queueACLsManager,
        this.rmContext.getRMDelegationTokenSecretManager());
   }
}

ClientRMService implements ApplicationClientProtocol.

public class ClientRMService extends AbstractService implements ApplicationClientProtocol {
	 @Override
  protected void serviceStart() throws Exception {
    Configuration conf = getConfig();
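    // YarnRPC is an abstract class; create() returns the configured implementation
    YarnRPC rpc = YarnRPC.create(conf);
    // Obtain the RPC server for this protocol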
    this.server = rpc.getServer(ApplicationClientProtocol.class, this,
            clientBindAddress,
            conf, this.rmDTSecretManager,
            conf.getInt(YarnConfiguration.RM_CLIENT_THREAD_COUNT, 
                YarnConfiguration.DEFAULT_RM_CLIENT_THREAD_COUNT));
    super.serviceStart();
  }
}

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\ipc\YarnRPC.java
This class defines the RPC operations for both the client side and the server side.

public abstract class YarnRPC {
  // Client-side calls, left for subclasses to implement
  public abstract Object getProxy(Class protocol, InetSocketAddress addr, Configuration conf);
  public abstract void stopProxy(Object proxy, Configuration conf);
  // Server-side call, left for subclasses to implement
  public abstract Server getServer(Class protocol, Object instance,
      InetSocketAddress addr, Configuration conf,
      SecretManager<? extends TokenIdentifier> secretManager,
      int numHandlers, String portRangeConfig);

  public Server getServer(Class protocol, Object instance,
      InetSocketAddress addr, Configuration conf,
      SecretManager<? extends TokenIdentifier> secretManager,
      int numHandlers) {
    return getServer(protocol, instance, addr, conf, secretManager, numHandlers, null);
  }
  // Create the implementation class (HadoopYarnProtoRPC in this walkthrough)
  public static YarnRPC create(Configuration conf) {
    String clazzName = conf.get(YarnConfiguration.IPC_RPC_IMPL);
    if (clazzName == null) {
      clazzName = YarnConfiguration.DEFAULT_IPC_RPC_IMPL;
    }
    try {
      return (YarnRPC) Class.forName(clazzName).newInstance();
    } catch (Exception e) {
      throw new YarnRuntimeException(e);
    }
  }
}
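The pattern used by YarnRPC.create(), reading a class name from configuration, falling back to a default, and instantiating it reflectively, can be sketched as follows. The Rpc classes and the key demo.rpc.impl are hypothetical, and java.util.Properties stands in for Hadoop's Configuration.

```java
import java.util.Properties;

public class RpcFactoryDemo {
    // Hypothetical abstract base, playing the role of YarnRPC.
    public abstract static class Rpc {
        public abstract String name();
    }

    public static class DefaultRpc extends Rpc {
        @Override public String name() { return "default"; }
    }

    public static class CustomRpc extends Rpc {
        @Override public String name() { return "custom"; }
    }

    // Hypothetical configuration key, not a real Hadoop property.
    static final String IMPL_KEY = "demo.rpc.impl";

    public static Rpc create(Properties conf) {
        String clazzName = conf.getProperty(IMPL_KEY);
        if (clazzName == null) {
            // Nothing configured: fall back to the default implementation.
            clazzName = DefaultRpc.class.getName();
        }
        try {
            return (Rpc) Class.forName(clazzName)
                .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Callers depend only on the abstract type, so the implementation can be swapped through configuration without touching the calling code.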

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\ipc\HadoopYarnProtoRPC.java
This class extends YarnRPC.

public class HadoopYarnProtoRPC extends YarnRPC {

  // Client-side calls
  @Override
  public Object getProxy(Class protocol, InetSocketAddress addr,
      Configuration conf) {
    return RpcFactoryProvider.getClientFactory(conf).getClient(protocol, 1,
        addr, conf);
  }

  @Override
  public void stopProxy(Object proxy, Configuration conf) {
    RpcFactoryProvider.getClientFactory(conf).stopClient(proxy);
  }
  
  // Server-side call
  @Override
  public Server getServer(Class protocol, Object instance,
      InetSocketAddress addr, Configuration conf,
      SecretManager<? extends TokenIdentifier> secretManager,
      int numHandlers, String portRangeConfig) {
    return RpcFactoryProvider.getServerFactory(conf).getServer(protocol, 
        instance, addr, conf, secretManager, numHandlers, portRangeConfig);

  }

}

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\factories\impl\pb\RpcServerFactoryPBImpl.java
It uses reflection to build and return the server for the given protocol.

@Private
public class RpcServerFactoryPBImpl implements RpcServerFactory {
  @Override
  public Server getServer(Class<?> protocol, Object instance,
      InetSocketAddress addr, Configuration conf,
      SecretManager<? extends TokenIdentifier> secretManager, int numHandlers,
      String portRangeConfig) {
    // Check the cache
    Constructor<?> constructor = serviceCache.get(protocol);
    if (constructor == null) {
      Class<?> pbServiceImplClazz = null;
      try {
        pbServiceImplClazz = conf
            .getClassByName(getPbServiceImplClassName(protocol));
      } catch (ClassNotFoundException e) {
      		.......
      }
      try {
        constructor = pbServiceImplClazz.getConstructor(protocol);
        constructor.setAccessible(true);
        serviceCache.putIfAbsent(protocol, constructor);
      } catch (NoSuchMethodException e) {
       	......
      }
    }
    // Create the service object
    Object service = null;
    try {
      service = constructor.newInstance(instance);
    } catch (InvocationTargetException e) {
      ......
    }
    // Look up the service interface and its factory method
    Class<?> pbProtocol = service.getClass().getInterfaces()[0];
    Method method = protoCache.get(protocol);
    if (method == null) {
      Class<?> protoClazz = null;
      try {
        protoClazz = conf.getClassByName(getProtoClassName(protocol));
      } catch (ClassNotFoundException e) {
        ......
      }
      try {
        // Reflectively look up the blocking-service factory method generated by protobuf
        method = protoClazz.getMethod("newReflectiveBlockingService",
            pbProtocol.getInterfaces()[0]);
        method.setAccessible(true);
        protoCache.putIfAbsent(protocol, method);
      } catch (NoSuchMethodException e) {
        throw new YarnRuntimeException(e);
      }
    }
    
    try {
      return createServer(pbProtocol, addr, conf, secretManager, numHandlers,
          (BlockingService)method.invoke(null, service), portRangeConfig);
    } catch (InvocationTargetException e) {
      ......
    }
  }
  
  // Create the actual server
  private Server createServer(Class<?> pbProtocol, InetSocketAddress addr, Configuration conf, 
      SecretManager<? extends TokenIdentifier> secretManager, int numHandlers, 
      BlockingService blockingService, String portRangeConfig) throws IOException {
    // Associate the protocol with its RPC engine
    RPC.setProtocolEngine(conf, pbProtocol, ProtobufRpcEngine.class);
    // Build the actual server
    RPC.Server server = new RPC.Builder(conf).setProtocol(pbProtocol)
        .setInstance(blockingService).setBindAddress(addr.getHostName())
        .setPort(addr.getPort()).setNumHandlers(numHandlers).setVerbose(false)
        .setSecretManager(secretManager).setPortRangeConfig(portRangeConfig)
        .build();
    // Register the protocol under the protobuf RPC kind
    server.addProtocol(RPC.RpcKind.RPC_PROTOCOL_BUFFER, pbProtocol, blockingService);
    return server;
  }
}

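What the RM creates is the complete server-side chain, that is, the whole protocol stack. At the bottom is the IPC-layer Server, and above it RPC.Server. Above that, at the RPC layer, is an anonymous-class object that implements the com.google.protobuf.BlockingService interface and is created by the newReflectiveBlockingService() method of ApplicationClientProtocolService. On top of that sits the application-layer ApplicationClientProtocolPBServiceImpl, which wraps ClientRMService; through the returned server, incoming requests end up invoking ClientRMService's methods. In other words, the IPC layer hands each request to RPC.Server at the RPC layer, which recovers the parameters from the received RPC request message and calls BlockingService.callBlockingMethod(). That in turn calls ApplicationClientProtocolPBServiceImpl, which acts as the springboard and intermediary to the real service.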
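The layering just described can be mimicked with a small sketch that has no Hadoop or protobuf dependency. All class names are hypothetical stand-ins: AppService plays ClientRMService, PbServiceImpl plays the ...PBServiceImpl translator, and Dispatcher plays BlockingService. Strings stand in for protobuf messages, and dispatch is done by method name here purely for brevity.

```java
public class ServerStackDemo {
    /** Application layer: stands in for ClientRMService. */
    public static class AppService {
        public int submit(String appName) {
            return appName.length(); // dummy "application id"
        }
    }

    /** Translator layer: converts the wire form to application types and back. */
    public static class PbServiceImpl {
        private final AppService real;

        public PbServiceImpl(AppService real) {
            this.real = real;
        }

        public String submit(String wireRequest) {
            return Integer.toString(real.submit(wireRequest));
        }
    }

    /** RPC layer: stands in for com.google.protobuf.BlockingService. */
    public interface Dispatcher {
        String callBlockingMethod(String methodName, String wireRequest);
    }

    /** Returns an anonymous-class dispatcher bound to one translator. */
    public static Dispatcher newDispatcher(final PbServiceImpl impl) {
        return new Dispatcher() {
            @Override
            public String callBlockingMethod(String methodName, String wireRequest) {
                switch (methodName) {
                    case "submit":
                        return impl.submit(wireRequest);
                    default:
                        throw new IllegalArgumentException(
                            "unknown method: " + methodName);
                }
            }
        };
    }

    /** What the server layer would do once a request has been decoded. */
    public static String handle(String methodName, String wireRequest) {
        Dispatcher d = newDispatcher(new PbServiceImpl(new AppService()));
        return d.callBlockingMethod(methodName, wireRequest);
    }
}
```

Each layer knows only the one directly beneath it, which is why the same server and dispatcher machinery can be reused for any protocol.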

2. The client-side RPC protocol stack

hadoop-yarn-applications-distributedshell\src\main\java\org\apache\hadoop\yarn\applications\distributedshell\Client.java
Note the file path: this is the distributed shell application, which acts as a client of the whole system.

Client(String appMasterMainClass, Configuration conf) {
    this.conf = conf;
    this.appMasterMainClass = appMasterMainClass;
    yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    ......

hadoop-yarn-client\src\main\java\org\apache\hadoop\yarn\client\api\YarnClient.java

public abstract class YarnClient extends AbstractService {
  @Public
  public static YarnClient createYarnClient() {
    YarnClient client = new YarnClientImpl();
    return client;
  }
}

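What actually gets created is the concrete class that extends this abstract class, namely YarnClientImpl. For the RPC-layer client the core issue is the Proxy, because RPC requests sent to the server are passed through it. The Proxy is like a liaison, an agent of the server stationed on the client side, so a Proxy has to be created here.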

hadoop-yarn-client\src\main\java\org\apache\hadoop\yarn\client\api\impl\YarnClientImpl.java

public class YarnClientImpl extends YarnClient {
 @Override
  protected void serviceStart() throws Exception {
    try {
      rmClient = ClientRMProxy.createRMProxy(getConfig(),
          ApplicationClientProtocol.class);
      if (historyServiceEnabled) {
        historyClient.start();
      }
    } catch (IOException e) {
      throw new YarnRuntimeException(e);
    }
    super.serviceStart();
  }
}

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\client\ClientRMProxy.java

public class ClientRMProxy<T> extends RMProxy<T>  {
  ......
  public static <T> T createRMProxy(final Configuration configuration,
      final Class<T> protocol) throws IOException {
    return createRMProxy(configuration, protocol, INSTANCE);
  }
  ......
}

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\client\RMProxy.java

public class RMProxy<T> {
  // Create the proxy
  private static <T> T createRMProxy(final YarnConfiguration conf,
      final Class<T> protocol, RMProxy instance, RetryPolicy retryPolicy)
          throws IOException{
    if (HAUtil.isHAEnabled(conf)) { // RM high availability enabled: use a failover proxy provider
      RMFailoverProxyProvider<T> provider =
          instance.createRMFailoverProxyProvider(conf, protocol);
      return (T) RetryProxy.create(protocol, provider, retryPolicy);
    } else {
      // We only follow this branch
      InetSocketAddress rmAddress = instance.getRMAddress(conf, protocol);
      T proxy = RMProxy.<T>getProxy(conf, protocol, rmAddress);
      return (T) RetryProxy.create(protocol, proxy, retryPolicy);
    }
  }
}
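RetryProxy.create() wraps the real proxy in another proxy that applies a retry policy. The following is only a simplified sketch of that idea, assuming a fixed number of attempts with no delay and no failover; the Service and Flaky types and the withRetries helper are hypothetical.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class RetryProxyDemo {
    // Hypothetical protocol interface.
    public interface Service {
        String call() throws IOException;
    }

    // Test double: fails on the first two calls, then succeeds.
    public static class Flaky implements Service {
        private int calls = 0;

        @Override
        public String call() throws IOException {
            if (++calls < 3) {
                throw new IOException("transient failure " + calls);
            }
            return "ok after " + calls + " calls";
        }
    }

    // Wrap target in a proxy that retries each call up to maxAttempts times.
    @SuppressWarnings("unchecked")
    public static <T> T withRetries(Class<T> iface, final T target,
                                    final int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause(); // the exception thrown by the target
                }
            }
            throw last; // every attempt failed
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

Because the retry logic lives in an InvocationHandler, it applies to every method of the protocol interface without any per-method code.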

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\client\RMProxy.java

  @Private
  static <T> T getProxy(final Configuration conf,
      final Class<T> protocol, final InetSocketAddress rmAddress)
      throws IOException {
    return UserGroupInformation.getCurrentUser().doAs(
      new PrivilegedAction<T>() {
        @Override
        public T run() {
          // This calls HadoopYarnProtoRPC.getProxy()
          return (T) YarnRPC.create(conf).getProxy(protocol, rmAddress, conf);
        }
      });
  }

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\ipc\HadoopYarnProtoRPC.java

  // Client-side call
  @Override
  public Object getProxy(Class protocol, InetSocketAddress addr,
      Configuration conf) {
    return RpcFactoryProvider.getClientFactory(conf).getClient(protocol, 1,
        addr, conf);
  }

hadoop-yarn-common\src\main\java\org\apache\hadoop\yarn\factories\impl\pb\RpcClientFactoryPBImpl.java
Finally, RpcClientFactoryPBImpl returns the proxy.

public class RpcClientFactoryPBImpl implements RpcClientFactory {
	public Object getClient(Class<?> protocol, long clientVersion,
      InetSocketAddress addr, Configuration conf) {
    // Look up the constructor in the cache
    Constructor<?> constructor = cache.get(protocol);
    if (constructor == null) {
      Class<?> pbClazz = null;
      try {
        // In our scenario this is ApplicationClientProtocolPBClientImpl
        pbClazz = conf.getClassByName(getPBImplClassName(protocol));
      } catch (ClassNotFoundException e) {
        ......
      }
      try {
        constructor = pbClazz.getConstructor(Long.TYPE, InetSocketAddress.class, Configuration.class);
        constructor.setAccessible(true);
        cache.putIfAbsent(protocol, constructor);
      } catch (NoSuchMethodException e) {
        ......
      }
    }
    try {
      // Construct the target object reflectively; in our scenario it is ApplicationClientProtocolPBClientImpl
      Object retObject = constructor.newInstance(clientVersion, addr, conf);
      return retObject;
    } catch (InvocationTargetException e) {
      ......
    }
  }
}
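The factory above combines two ideas: deriving the implementation class from the protocol by a naming convention, and caching the reflected constructor so the lookup happens only once per protocol. A minimal sketch of both, where the "ClientImpl" suffix, the EchoProtocol types, and the String address are assumptions made for the illustration:

```java
import java.lang.reflect.Constructor;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ClientFactoryDemo {
    // Hypothetical protocol interface.
    public interface EchoProtocol {
        String echo(String s);
    }

    // Found by convention: protocol class name + "ClientImpl" (made-up suffix).
    public static class EchoProtocolClientImpl implements EchoProtocol {
        private final String addr;

        public EchoProtocolClientImpl(String addr) {
            this.addr = addr;
        }

        @Override
        public String echo(String s) {
            return addr + ":" + s;
        }
    }

    // One reflected constructor per protocol.
    private static final ConcurrentMap<Class<?>, Constructor<?>> CACHE =
        new ConcurrentHashMap<>();

    public static Object getClient(Class<?> protocol, String addr) {
        try {
            Constructor<?> ctor = CACHE.get(protocol);
            if (ctor == null) {
                // Derive the implementation class name from the protocol.
                Class<?> impl = Class.forName(protocol.getName() + "ClientImpl");
                ctor = impl.getConstructor(String.class);
                CACHE.putIfAbsent(protocol, ctor);
            }
            return ctor.newInstance(addr);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static int cachedProtocols() {
        return CACHE.size();
    }
}
```

Only the constructor is cached, not the client object, so every call still yields a fresh client bound to the address it was given.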

Taking ApplicationClientProtocol as the example, the object finally constructed is ApplicationClientProtocolPBClientImpl.

class ApplicationClientProtocolPBClientImpl implements ApplicationClientProtocol {
    public ApplicationClientProtocolPBClientImpl(long clientVersion,
      InetSocketAddress addr, Configuration conf) throws IOException {
    // Register the protocol and its engine
    RPC.setProtocolEngine(conf, ApplicationClientProtocolPB.class,
      ProtobufRpcEngine.class);
    // Create the proxy; from here we are back on the path covered in the previous part
    proxy = RPC.getProxy(ApplicationClientProtocolPB.class, clientVersion, addr, conf);
  }
}

hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\ipc\RPC.java

 public static <T> ProtocolProxy<T> getProtocolProxy(Class<T> protocol,
                                long clientVersion,
                                InetSocketAddress addr,
                                UserGroupInformation ticket,
                                Configuration conf,
                                SocketFactory factory,
                                int rpcTimeout,
                                RetryPolicy connectionRetryPolicy,
                                AtomicBoolean fallbackToSimpleAuth)
       throws IOException {
    if (UserGroupInformation.isSecurityEnabled()) {
      SaslRpcServer.init(conf);
    }
    return getProtocolEngine(protocol, conf).getProxy(protocol, clientVersion,
        addr, ticket, conf, factory, rpcTimeout, connectionRetryPolicy,
        fallbackToSimpleAuth);
  }

hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\ipc\RPC.java

  static synchronized RpcEngine getProtocolEngine(Class<?> protocol,
      Configuration conf) {
    RpcEngine engine = PROTOCOL_ENGINES.get(protocol);
    if (engine == null) {
      Class<?> impl = conf.getClass(ENGINE_PROP+"."+protocol.getName(),
                                    WritableRpcEngine.class);
      engine = (RpcEngine)ReflectionUtils.newInstance(impl, conf);
      PROTOCOL_ENGINES.put(protocol, engine);
    }
    return engine;
  }
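getProtocolEngine() explains why setProtocolEngine() is called before the proxy is created: the engine is looked up under a per-protocol configuration key, and a protocol without an entry gets the default engine. A dependency-free sketch of that lookup, with java.util.Properties standing in for Configuration and all Engine and Proto types invented for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class EngineRegistryDemo {
    public interface Engine {
        String kind();
    }

    public static class WritableEngine implements Engine {
        @Override public String kind() { return "writable"; }
    }

    public static class ProtobufEngine implements Engine {
        @Override public String kind() { return "protobuf"; }
    }

    // Hypothetical protocol interfaces.
    public interface ProtoA {}
    public interface ProtoB {}

    // Key prefix used by this sketch; the full key is prefix + "." + class name.
    static final String ENGINE_PROP = "rpc.engine";

    private static final Map<Class<?>, Engine> ENGINES = new HashMap<>();

    // Record which engine class a protocol should use.
    public static void setEngine(Properties conf, Class<?> protocol,
                                 Class<? extends Engine> engine) {
        conf.setProperty(ENGINE_PROP + "." + protocol.getName(), engine.getName());
    }

    // Return the cached engine, or create it from configuration on first use.
    public static synchronized Engine getEngine(Class<?> protocol, Properties conf) {
        Engine engine = ENGINES.get(protocol);
        if (engine == null) {
            String implName = conf.getProperty(
                ENGINE_PROP + "." + protocol.getName(),
                WritableEngine.class.getName()); // default when nothing is set
            try {
                engine = (Engine) Class.forName(implName)
                    .getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            ENGINES.put(protocol, engine);
        }
        return engine;
    }
}
```

In this sketch the engine is cached on first lookup, so changing the configuration for a protocol afterwards has no effect.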

hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\ipc\ProtobufRpcEngine.java
In the end we arrive back at ProtobufRpcEngine.

public class ProtobufRpcEngine implements RpcEngine {
  @Override
  @SuppressWarnings("unchecked")
  public <T> ProtocolProxy<T> getProxy(Class<T> protocol, long clientVersion,
      InetSocketAddress addr, UserGroupInformation ticket, Configuration conf,
      SocketFactory factory, int rpcTimeout, RetryPolicy connectionRetryPolicy,
      AtomicBoolean fallbackToSimpleAuth) throws IOException {
    // Invoker is an inner class; see the previous part
    final Invoker invoker = new Invoker(protocol, addr, ticket, conf, factory,
        rpcTimeout, connectionRetryPolicy, fallbackToSimpleAuth);
    return new ProtocolProxy<T>(protocol, (T) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[]{protocol}, invoker), false);
  }
}

3. Job submission from the client application

hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred\YARNRunner.java

@SuppressWarnings("unchecked")
public class YARNRunner implements ClientProtocol {
	@Override
  public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
  throws IOException, InterruptedException {
    
    addHistoryToken(ts);
    
    // Construct necessary information to start the MR AM
    ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

    // Submit to ResourceManager
    try {
      ApplicationId applicationId =
          resMgrDelegate.submitApplication(appContext);

      ApplicationReport appMaster = resMgrDelegate
          .getApplicationReport(applicationId);
      String diagnostics =
          (appMaster == null ?
              "application report is null" : appMaster.getDiagnostics());
      if (appMaster == null
          || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
          || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
        throw new IOException("Failed to run job : " +
            diagnostics);
      }
      return clientCache.getClient(jobId).getJobStatus(jobId);
    } catch (YarnException e) {
      throw new IOException(e);
    }
  }
}

hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred\ResourceMgrDelegate.java

public class ResourceMgrDelegate extends YarnClient {

  @Override
  public ApplicationId
      submitApplication(ApplicationSubmissionContext appContext)
          throws YarnException, IOException {
    return client.submitApplication(appContext);
  }
  
}

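In the end the call still goes through YarnClientImpl. The HDFS subsystem works the same way: it also has to build an RPC protocol stack on both the server side and the client side, with the same kind of RPC.Server and Client. The only difference is that the protocol used is not ApplicationClientProtocol but one of the HDFS protocols; the underlying mechanism is identical.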
