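Hadoop's RPC differs from the RPC found in other systems: its authors designed a dedicated RPC framework around the way Hadoop is used, and in my opinion that framework is somewhat complex. I will therefore analyze it in two parts, the Client side and the Server side. If you already understand the overall RPC flow well, you should be able to pick up Hadoop's RPC quickly. OK, let's get to the point.
All of the Hadoop RPC code lives in the org.apache.hadoop.ipc package. RPC communication has to follow a number of protocols, and the most basic one is the following: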
/**
 * Superclass of all protocols that use Hadoop RPC.
 * Subclasses of this interface are also supposed to have
 * a static final long versionID field.
 * Base interface of all Hadoop RPC protocols; returns the protocol version number.
 */
public interface VersionedProtocol {
  /**
   * Return protocol version corresponding to protocol interface.
   * @param protocol The classname of the protocol interface
   * @param clientVersion The version of the protocol that the client speaks
   * @return the version that the server will speak
   */
  public long getProtocolVersion(String protocol,
                                 long clientVersion) throws IOException;
}
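It is the base interface of all protocols, and it has many sub-interfaces, each covering communication between a different pair of components. The following is the inheritance diagram:
As the name suggests, the client and the server can communicate only when they agree on the same version number.
All client-side RPC operations are encapsulated in a single file, Client.java: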
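To make the version rule concrete, here is a minimal sketch of a version check. It does not use the real Hadoop classes: the nested VersionedProtocol is a simplified stand-in, and EchoProtocol, EchoServer and checkVersion are hypothetical names invented for this illustration. How Hadoop itself performs and reports the check is not shown here.

```java
import java.io.IOException;

// Sketch only: simplified stand-ins, not the real Hadoop types.
final class VersionCheckSketch {

    /** Stand-in for org.apache.hadoop.ipc.VersionedProtocol. */
    interface VersionedProtocol {
        long getProtocolVersion(String protocol, long clientVersion) throws IOException;
    }

    /** Hypothetical protocol; it carries its own versionID constant. */
    interface EchoProtocol extends VersionedProtocol {
        long versionID = 1L; // interface fields are implicitly public static final

        String echo(String message) throws IOException;
    }

    /** Hypothetical server-side implementation of the protocol. */
    static final class EchoServer implements EchoProtocol {
        @Override
        public long getProtocolVersion(String protocol, long clientVersion) {
            return EchoProtocol.versionID;
        }

        @Override
        public String echo(String message) {
            return message;
        }
    }

    /** Client-side check: refuse to continue when the two versions differ. */
    static void checkVersion(VersionedProtocol server, String protocolName,
                             long clientVersion) throws IOException {
        long serverVersion = server.getProtocolVersion(protocolName, clientVersion);
        if (serverVersion != clientVersion) {
            throw new IOException("Protocol " + protocolName
                    + " version mismatch: client=" + clientVersion
                    + ", server=" + serverVersion);
        }
    }

    public static void main(String[] args) throws IOException {
        EchoServer server = new EchoServer();
        checkVersion(server, EchoProtocol.class.getName(), EchoProtocol.versionID);
        System.out.println(server.echo("versions match"));
    }
}
```

Calling checkVersion with a client version other than 1 makes the sketch throw an IOException, which mirrors the idea that the two sides only talk when their version numbers agree.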
/** A client for an IPC service.  IPC calls take a single {@link Writable} as a
 * parameter, and return a {@link Writable} as their value.  A service runs on
 * a port and is defined by a parameter class and a value class.
 * The RPC client class.
 * @see Server
 */
public class Client {
  public static final Log LOG =
    LogFactory.getLog(Client.class);
  // connections from the client to servers
  private Hashtable<ConnectionId, Connection> connections =
    new Hashtable<ConnectionId, Connection>();
  // class of the values returned by calls
  private Class<? extends Writable> valueClass;   // class of call values
  // counter used to generate call ids
  private int counter;                            // counter for call ids
  // atomic flag indicating whether the client is still running
  private AtomicBoolean running = new AtomicBoolean(true); // if client runs
  final private Configuration conf;
  // socket factory used to create sockets
  private SocketFactory socketFactory;           // how to create sockets
  private int refCount = 1;
......
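The code clearly shows something resembling a connection pool, the connections table, which implies that connections can be reused. In this Hashtable each Connection is keyed by a ConnectionId, and that key is evidently not a simple numeric value such as a Long; it is a class of its own, ConnectionId.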
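As a rough illustration of why a composite key allows reuse, the sketch below caches connections in a Hashtable keyed by remote address, protocol and user. Everything in it is hypothetical: Key, Connection and getConnection are names made up for this example, and a plain String stands in for UserGroupInformation. It is not the Hadoop implementation.

```java
import java.net.InetSocketAddress;
import java.util.Hashtable;
import java.util.Map;
import java.util.Objects;

// Sketch only: a connection cache keyed by a composite identifier.
final class ConnectionCacheSketch {

    /** Hypothetical composite key: remote address, protocol and user name. */
    static final class Key {
        final InetSocketAddress address;
        final Class<?> protocol;
        final String user; // stand-in for UserGroupInformation

        Key(InetSocketAddress address, Class<?> protocol, String user) {
            this.address = address;
            this.protocol = protocol;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return Objects.equals(address, other.address)
                    && Objects.equals(protocol, other.protocol)
                    && Objects.equals(user, other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(address, protocol, user);
        }
    }

    /** Placeholder for a real connection; it only remembers its key. */
    static final class Connection {
        final Key key;

        Connection(Key key) {
            this.key = key;
        }
    }

    private final Map<Key, Connection> connections = new Hashtable<>();

    /** Returns the cached connection for the key, creating it on first use. */
    Connection getConnection(Key key) {
        synchronized (connections) {
            Connection connection = connections.get(key);
            if (connection == null) {
                connection = new Connection(key);
                connections.put(key, connection);
            }
            return connection;
        }
    }

    int size() {
        return connections.size();
    }

    public static void main(String[] args) {
        ConnectionCacheSketch cache = new ConnectionCacheSketch();
        InetSocketAddress addr = InetSocketAddress.createUnresolved("namenode.example", 8020);
        Connection first = cache.getConnection(new Key(addr, Runnable.class, "alice"));
        Connection second = cache.getConnection(new Key(addr, Runnable.class, "alice"));
        System.out.println("reused: " + (first == second));
    }
}
```

Because equals and hashCode cover all three fields, two requests share a connection only when address, protocol and user all match; changing any one of them produces a separate entry in the table.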
/**
 * This class holds the address and the user ticket. The client connections
 * to servers are uniquely identified by <remoteAddress, protocol, ticket>
 * A connection is uniquely identified by <remote address, protocol type, user group information>.
 */
static class ConnectionId {
  // remote socket address
  InetSocketAddress address;
  // user group information
  UserGroupInformation ticket;
  // protocol type
  Class<?> protocol;
  private static final int PRIME = 16777619;
  private int rpcTimeout;
  private String serverPrincipal;
  private int maxIdleTime; //connections will be culled if it was idle for
                           //maxIdleTime msecs
  private int maxRetries; //the max. no. of retries for socket connections
  private boolean tcpNoDelay; // if T then disable Nagle's Algorithm
  private int pingInterval; // how often sends ping to the server in m