Recently, out of curiosity, I spent some time studying YARN and took a quick look at its source code. I downloaded the source from the hadoop-common trunk and built it myself, which keeps me in sync with the community. If you are familiar with Maven, the build is straightforward.
1. Service
Reading the hadoop 3.0-SNAPSHOT source, you can see that every functional component of the system is abstracted as a service. Each service carries a state machine with four states: not initialized (NOTINITED), initialized (INITED), started (STARTED), and stopped (STOPPED).
public interface Service extends Closeable {
  /**
   * Service states
   */
  public enum STATE {
    /** Constructed but not initialized */
    NOTINITED(0, "NOTINITED"),
    /** Initialized but not started or stopped */
    INITED(1, "INITED"),
    /** started and not stopped */
    STARTED(2, "STARTED"),
    /** stopped. No further state transitions are permitted */
    STOPPED(3, "STOPPED");
    // rest omitted...
  }
  // rest omitted...
}
A service moves through these four states according to fixed rules. The transitions are irreversible and one-directional: the lifecycle only moves forward.
public class ServiceStateModel {
  /**
   * Map of all valid state transitions
   * [current] [proposed1, proposed2, ...]
   */
  private static final boolean[][] statemap =
    {
      //               uninited inited started stopped
      /* uninited */  {false,   true,  false,  true},
      /* inited   */  {false,   true,  true,   true},
      /* started  */  {false,   false, true,   true},
      /* stopped  */  {false,   false, false,  true},
    };
  // rest omitted...
}
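The transition matrix can be exercised in isolation. Below is a minimal, self-contained sketch (the class and method names are mine, not Hadoop's) that mirrors statemap and shows the one-way nature of the lifecycle:

```java
// Minimal sketch of the service state machine (hypothetical names,
// mirroring ServiceStateModel.statemap).
public class ServiceStateDemo {
  enum State { NOTINITED, INITED, STARTED, STOPPED }

  // STATEMAP[current][proposed] == true means the transition is allowed.
  private static final boolean[][] STATEMAP = {
      /* NOTINITED */ {false, true,  false, true},
      /* INITED    */ {false, true,  true,  true},
      /* STARTED   */ {false, false, true,  true},
      /* STOPPED   */ {false, false, false, true},
  };

  static boolean isValidTransition(State from, State to) {
    return STATEMAP[from.ordinal()][to.ordinal()];
  }

  public static void main(String[] args) {
    // The normal forward lifecycle is permitted...
    System.out.println(isValidTransition(State.NOTINITED, State.INITED)); // true
    // ...but going backwards is not: the machine is one-directional.
    System.out.println(isValidTransition(State.STOPPED, State.STARTED)); // false
  }
}
```

Note that a state may transition to itself (e.g. INITED to INITED), so "one-directional" means the machine can never move back to an earlier state, not that every transition changes state.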
2. Client
In the hadoop 3.0 server-side code, the notion of a job has been replaced by an application. From the client-side API, however, things remain essentially the same as the hadoop 1.0 MapReduce interface; only the implementations behind the interfaces have changed. Map/Reduce code is still written against the Job concept.
The client submits a job as follows:
/**
 * Submit the job to the cluster and return immediately.
 * @throws IOException
 */
public void submit()
    throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI();
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException,
        ClassNotFoundException {
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING;
  LOG.info("The url to track the job: " + getTrackingURL());
}
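Note that the actual submission runs inside ugi.doAs(...), i.e. with the submitting user's credentials. The wrapping pattern itself is plain JDK; here is a stripped-down sketch with a stand-in for UserGroupInformation (DoAsSketch and its doAs are hypothetical simplifications, not Hadoop code):

```java
import java.security.PrivilegedExceptionAction;

public class DoAsSketch {
  // Stand-in for UserGroupInformation.doAs: run the action on behalf of a
  // user. A real UGI switches the security context; this sketch only logs
  // the user name (a deliberate simplification).
  static <T> T doAs(String user, PrivilegedExceptionAction<T> action)
      throws Exception {
    System.out.println("running as " + user);
    return action.run();
  }

  public static void main(String[] args) throws Exception {
    // Mirrors the shape of Job.submit(): the wrapped action returns a status.
    String status = doAs("alice", new PrivilegedExceptionAction<String>() {
      public String run() {
        return "SUBMITTED"; // stand-in for submitter.submitJobInternal(...)
      }
    });
    System.out.println(status); // SUBMITTED
  }
}
```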
The connect() call in submit() builds the cluster information; in hadoop 1.0 terms this meant opening a connection to the jobtracker. In hadoop 3.0, however, the jobtracker no longer exists; its role is taken over by the resourcemanager. In practice the client creates a YARNRunner object to submit the job to the YARN cluster, and YARNRunner submits it to the resourcemanager through its delegate, ResourceMgrDelegate. The relevant code of YARNRunner.submitJob(...) follows.
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  addHistoryToken(ts);
  // Construct necessary information to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);
  // Submit to ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);
    ApplicationReport appMaster = resMgrDelegate
        .getApplicationReport(applicationId);
    String diagnostics =
        (appMaster == null ?
            "application report is null" : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " +
          diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
3. Inter-process Communication
Exchanging data between processes involves two pieces: serialization/deserialization and an RPC framework. Hadoop implements its own RPC framework. On the serialization side, hadoop 3.0-SNAPSHOT relies heavily on protobuf, a marked departure from hadoop 1.0's Writable-based scheme.
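The Writable contract is essentially a write/readFields pair over DataOutput/DataInput, with a fixed field order and no schema on the wire. A JDK-only round-trip sketch of that style (TaskRecord is a made-up type for illustration, and the real org.apache.hadoop.io.Writable interface is not used here):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableStyleDemo {
  // A record serialized Writable-style: fixed field order, no schema on
  // the wire, so reader and writer must agree on the layout.
  static class TaskRecord {
    int id;
    String name;

    void write(DataOutput out) throws IOException {
      out.writeInt(id);
      out.writeUTF(name);
    }

    void readFields(DataInput in) throws IOException {
      id = in.readInt();
      name = in.readUTF();
    }
  }

  public static void main(String[] args) throws IOException {
    TaskRecord r = new TaskRecord();
    r.id = 7;
    r.name = "map-7";

    // Serialize, then deserialize into a fresh object.
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    r.write(new DataOutputStream(bos));

    TaskRecord copy = new TaskRecord();
    copy.readFields(new DataInputStream(
        new ByteArrayInputStream(bos.toByteArray())));
    System.out.println(copy.id + " " + copy.name); // 7 map-7
  }
}
```

Protobuf, by contrast, generates the serialization code from a .proto schema and tags each field on the wire, which is what makes protocol evolution across versions easier.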
3.1. hadoop 1.x.x
In hadoop 1.x.x, org.apache.hadoop.ipc.RPC is the single entry point of the RPC framework. It exposes two important families of static methods: getServer and getProxy. getProxy creates a remote-access proxy (an Invoker) for the jobtracker, while getServer constructs a new Server that listens on the configured port and handles requests. The two sides exchange data through the Writable serialization/deserialization mechanism.
The getServer method:
/** Construct a server for a protocol implementation instance listening on a
 * port and address, with a secret manager. */
public static Server getServer(final Object instance, final String bindAddress,
    final int port, final int numHandlers, final boolean verbose,
    Configuration conf, SecretManager<? extends TokenIdentifier> secretManager)
    throws IOException {
  return new Server(instance, conf, bindAddress, port, numHandlers, verbose,
      secretManager);
}
The getProxy method:
/** Construct a client-side proxy object that implements the named protocol,
 * talking to a server at the named address. */
public static VersionedProtocol getProxy(
    Class<? extends VersionedProtocol> protocol,
    long clientVersion, InetSocketAddress addr, UserGroupInformation ticket,
    Configuration conf, SocketFactory factory, int rpcTimeout)
    throws IOException {
  if (UserGroupInformation.isSecurityEnabled()) {
    SaslRpcServer.init(conf);
  }
  VersionedProtocol proxy = null;
  String strAddr = addr.getHostName() + ":" + addr.getPort();
  String jtServers = conf.get("jobtracker.servers", "");
  if (jtServers.contains(strAddr)) {
    proxy = (VersionedProtocol) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[] { protocol },
        new RPCRetryAndSwitchInvoker(protocol, addr, ticket, conf, factory,
            rpcTimeout));
  } else {
    proxy = (VersionedProtocol) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[] { protocol },
        new Invoker(protocol, addr, ticket, conf, factory, rpcTimeout));
  }
  long serverVersion = proxy.getProtocolVersion(protocol.getName(),
      clientVersion);
  if (serverVersion == clientVersion) {
    return proxy;
  } else {
    throw new VersionMismatch(protocol.getName(), clientVersion,
        serverVersion);
  }
}
In hadoop 1.x.x, the JobTracker and TaskTracker implement the methods of protocol interfaces such as ClientProtocol and TaskUmbilicalProtocol; the business logic of each remote call lives in those methods.
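At the heart of getProxy is the JDK dynamic proxy: every method call on the proxy object is funneled into an InvocationHandler (Hadoop's Invoker), which serializes the call and ships it to the server. A minimal local sketch of that mechanism (EchoProtocol and LocalInvoker are invented for the demo; no network is involved):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class RpcProxyDemo {
  // Stand-in for an RPC protocol interface such as ClientProtocol.
  interface EchoProtocol {
    String echo(String msg);
  }

  // Minimal Invoker-style handler. A real Invoker serializes the method
  // name and arguments, sends them over the wire, and deserializes the
  // response; this sketch answers locally (a deliberate simplification).
  static class LocalInvoker implements InvocationHandler {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
      // Pretend the "server" handled the call and returned a result.
      return "server:" + args[0];
    }
  }

  // Mirrors the shape of RPC.getProxy: wrap the handler in a dynamic proxy
  // that implements the protocol interface.
  static EchoProtocol getProxy() {
    return (EchoProtocol) Proxy.newProxyInstance(
        EchoProtocol.class.getClassLoader(),
        new Class<?>[] { EchoProtocol.class },
        new LocalInvoker());
  }

  public static void main(String[] args) {
    EchoProtocol client = getProxy();
    System.out.println(client.echo("hi")); // server:hi -- via LocalInvoker
  }
}
```

The caller programs against the protocol interface and never sees the network plumbing, which is exactly the transparency RPC.getProxy provides.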
3.2. hadoop 3.x.x
The org.apache.hadoop.ipc.RPC class in hadoop 3.x.x offers the same two methods, getServer and getProxy, but its implementation is considerably more involved than in hadoop 1.x.x. The unified entry point of this RPC framework is org.apache.hadoop.yarn.factories.RecordFactory, which declares a single method, newRecordInstance, and is implemented by org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.
The framework can construct an org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl object or an org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl object.
- org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl creates the client side. Given a protocol interface name, its getClient and stopClient methods produce an object from the org.apache.hadoop.yarn.api.impl.pb.client package; these objects create the RPC proxies used for communication between nodes.
- org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl creates the server side. Given a protocol interface name, its getServer method constructs an object from the org.apache.hadoop.yarn.api.impl.pb.service package; the RPC handling logic is implemented in those classes.
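These PB factories locate their implementation classes reflectively, by naming convention. A simplified, JDK-only sketch of that style of lookup (the demo types are hypothetical; the real factories derive names along the lines of an ...impl.pb package plus a PBImpl suffix):

```java
public class FactoryDemo {
  // Stand-in protocol record interface; real ones live under
  // org.apache.hadoop.yarn.api.
  interface ProtoRecord {}

  // The "PBImpl" the factory should find. In YARN it would live in a
  // separate impl.pb package; here it is a nested class purely for the demo.
  public static class ProtoRecordPBImpl implements ProtoRecord {
    public ProtoRecordPBImpl() {}
  }

  // Mirrors the spirit of RecordFactoryPBImpl.newRecordInstance: derive the
  // implementation class name from the interface name, then instantiate it
  // via reflection.
  @SuppressWarnings("unchecked")
  static <T> T newRecordInstance(Class<T> clazz) throws Exception {
    String implName = clazz.getName() + "PBImpl"; // naming convention
    return (T) Class.forName(implName).getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    ProtoRecord r = newRecordInstance(ProtoRecord.class);
    System.out.println(r.getClass().getSimpleName()); // ProtoRecordPBImpl
  }
}
```

The benefit of this indirection is that protocol interfaces stay free of protobuf details; swapping the serialization mechanism only requires swapping the factory.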
Whether on the server or the client side, communication ultimately goes through an RpcEngine interface; org.apache.hadoop.ipc.WritableRpcEngine is one implementation of RpcEngine.
The implementation of org.apache.hadoop.ipc.WritableRpcEngine.getProxy(Class&lt;T&gt;, long, InetSocketAddress, UserGroupInformation, Configuration, SocketFactory, int, RetryPolicy) is as follows.
/** Construct a client-side proxy object that implements the named protocol,
 * talking to a server at the named address.
 * @param <T>*/
@Override
@SuppressWarnings("unchecked")
public <T> ProtocolProxy<T> getProxy(Class<T> protocol, long clientVersion,
    InetSocketAddress addr, UserGroupInformation ticket,
    Configuration conf, SocketFactory factory,
    int rpcTimeout, RetryPolicy connectionRetryPolicy)
    throws IOException {
  if (connectionRetryPolicy != null) {
    throw new UnsupportedOperationException(
        "Not supported: connectionRetryPolicy=" + connectionRetryPolicy);
  }
  T proxy = (T) Proxy.newProxyInstance(protocol.getClassLoader(),
      new Class[] { protocol }, new Invoker(protocol, addr, ticket, conf,
          factory, rpcTimeout));
  return new ProtocolProxy<T>(protocol, proxy, true);
}
The implementation of org.apache.hadoop.ipc.WritableRpcEngine.getServer(Class&lt;?&gt;, Object, String, int, int, int, int, boolean, Configuration, SecretManager&lt;? extends TokenIdentifier&gt;, String) is as follows.
/* Construct a server for a protocol implementation instance listening on a
 * port and address. */
@Override
public RPC.Server getServer(Class<?> protocolClass,
    Object protocolImpl, String bindAddress, int port,
    int numHandlers, int numReaders, int queueSizePerHandler,
    boolean verbose, Configuration conf,
    SecretManager<? extends TokenIdentifier> secretManager,
    String portRangeConfig)
    throws IOException {
  return new Server(protocolClass, protocolImpl, conf, bindAddress, port,
      numHandlers, numReaders, queueSizePerHandler, verbose, secretManager,
      portRangeConfig);
}
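Conceptually, getServer just binds an address/port and hands incoming requests to handler logic. A toy, JDK-only sketch of that server/client round trip follows (a real RPC Server adds reader threads, a call queue, and numHandlers handler threads, all omitted here; the class is invented for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class MiniRpcServerDemo {
  // Bind a port, serve exactly one request, and return the client's view
  // of the reply -- the smallest possible echo "RPC".
  static String roundTrip(String msg) throws Exception {
    try (ServerSocket server = new ServerSocket(0)) { // port 0: any free port
      Thread handler = new Thread(() -> {
        try (Socket s = server.accept();
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
          out.println("echo:" + in.readLine()); // the handler's "RPC" logic
        } catch (IOException ignored) {
        }
      });
      handler.start();

      // Client side: roughly what a proxy's Invoker does over the wire.
      try (Socket client = new Socket("localhost", server.getLocalPort());
           PrintWriter out = new PrintWriter(client.getOutputStream(), true);
           BufferedReader in = new BufferedReader(
               new InputStreamReader(client.getInputStream()))) {
        out.println(msg);
        String reply = in.readLine();
        handler.join();
        return reply;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(roundTrip("ping")); // echo:ping
  }
}
```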
The RpcEngine idea adopted in hadoop 3.0 is not entirely new within the hadoop ecosystem: hbase 0.94.1 had already taken this approach, back when hbase did not yet support hadoop 2.0 or later.