Hadoop源码分析(19)
1、 RPC解析
在文档(18)中解析了RPC服务端的reader线程的使用情况。reader线程的主要作用是从listener中接收的到的channel中读取数据并将其封装车一个call对象并存储到callQueue中。
这里继续分析服务端剩下的两个线程类handler和responder。先从handler开始,其run方法如下:
public void run() {
LOG.debug(Thread.currentThread().getName() + ": starting");
SERVER.set(Server.this);
ByteArrayOutputStream buf =
new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
while (running) {
TraceScope traceScope = null;
try {
final Call call = callQueue.take(); // pop the queue; maybe blocked here
if (LOG.isDebugEnabled()) {
LOG.debug(Thread.currentThread().getName() + ": " + call + " for RpcKind " + call.rpcKind);
}
if (!call.connection.channel.isOpen()) {
LOG.info(Thread.currentThread().getName() + ": skipped " + call);
continue;
}
String errorClass = null;
String error = null;
RpcStatusProto returnStatus = RpcStatusProto.SUCCESS;
RpcErrorCodeProto detailedErr = null;
Writable value = null;
CurCall.set(call);
if (call.traceSpan != null) {
traceScope = Trace.continueSpan(call.traceSpan);
}
try {
// Make the call as the user via Subject.doAs, thus associating
// the call with the Subject
if (call.connection.user == null) {
value = call(call.rpcKind, call.connection.protocolName, call.rpcRequest,
call.timestamp);
} else {
value =
call.connection.user.doAs
(new PrivilegedExceptionAction<Writable>() {
@Override
public Writable run() throws Exception {
// make the call
return call(call.rpcKind, call.connection.protocolName,
call.rpcRequest, call.timestamp);
}
}
);
}
} catch (Throwable e) {
if (e instanceof UndeclaredThrowableException) {
e = e.getCause();
}
String logMsg = Thread.currentThread().getName() + ", call " + call;
if (exceptionsHandler.isTerse(e.getClass())) {
// Don't log the whole stack trace. Way too noisy!
LOG.info(logMsg + ": " + e);
} else if (e instanceof RuntimeException || e instanceof Error) {
// These exception types indicate something is probably wrong
// on the server side, as opposed to just a normal exceptional
// result.
LOG.warn(logMsg, e);
} else {
LOG.info(logMsg, e);
}
if (e instanceof RpcServerException) {
RpcServerException rse = ((RpcServerException)e);
returnStatus = rse.getRpcStatusProto();
detailedErr = rse.getRpcErrorCodeProto();
} else {
returnStatus = RpcStatusProto.ERROR;
detailedErr = RpcErrorCodeProto.ERROR_APPLICATION;
}
errorClass = e.getClass().getName();
error = StringUtils.stringifyException(e);
// Remove redundant error class name from the beginning of the stack trace
String exceptionHdr = errorClass + ": ";
if (error.startsWith(exceptionHdr)) {
error = error.substring(exceptionHdr.length());
}
}
CurCall.set(null);
synchronized (call.connection.responseQueue) {
setupResponse(buf, call, returnStatus, detailedErr,
value, errorClass, error);
// Discard the large buf and reset it back to smaller size
// to free up heap.
if (buf.size() > maxRespSize) {
LOG.warn("Large response size " + buf.size() + " for call "
+ call.toString());
buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
}
call.sendResponse();
}
} catch (InterruptedException e) {
if (running) { // unexpected -- log it
LOG.info(Thread.currentThread().getName() + " unexpectedly interrupted", e);
if (Trace.isTracing()) {
traceScope.getSpan().addTimelineAnnotation("unexpectedly interrupted: " +
StringUtils.stringifyException(e));
}
}
} catch (Exception e) {
LOG.info(Thread.currentThread().getName() + " caught an exception", e);
if (Trace.isTracing()) {
traceScope.getSpan().addTimelineAnnotation("Exception: " +
StringUtils.stringifyException(e));
}
} finally {
if (traceScope != null) {
traceScope.close();
}
IOUtils.cleanup(LOG, traceScope);
}
}
LOG.debug(Thread.currentThread().getName() + ": exiting");
}
首先是第9行,这里调用callqueue的take方法从其中取出call对象。这里的call对象就是文档(18)中分析的reader所设置的call对象。然后就是一堆赋值检查操作,跳过不看,接下来重点是第28行到第46行,这里会根据是否有用户来执行不同方法,但本质都是执行call方法,这个方法便是真执行远程调用请求的方法。然后是一堆异常处理,跳过之后接下来的重点是第81行到第93行,这里首先会调用setupResponse和sendResponse方法将结果返回给客户端。
这里先来分析执行远程调用的call方法,其内容如下:
public Writable call(RPC.RpcKind rpcKind, String protocol,
Writable rpcRequest, long receiveTime) throws Exception {
return getRpcInvoker(rpcKind).call(this, protocol, rpcRequest,
receiveTime);
}
这里先调用了一个getRpcInvoker方法,然后再调用返回值的call方法。这里的getRpcInvoker方法内容如下:
public static RpcInvoker getRpcInvoker(RPC.RpcKind rpcKind) {
RpcKindMapValue val = rpcKindMap.get(rpcKind);
return (val == null) ? null : val.rpcInvoker;
}
这里会从rpcKindMap中获取对应的值然后再从这个值中取出对应的RPCInvoker。从客户端发送数据的代码中可以看到这两点rpckind的值为RPC.RpcKind.RPC_PROTOCOL_BUFFER。其定义如下:
在创建server的时候创建的ProtobufRpcEngine类的内部类Server的对象,而在ProtobufRpcEngine类中有一段静态代码内容如下:
这里调用ipc的Server类的registerProtocolEngine方法传入的kind与客户端发来的请求的kind相同。同时还有两个参数:RpcRequestWrapper和ProtoBufRpcInvoker。这个方法内容如下:
public static void registerProtocolEngine(RPC.RpcKind rpcKind,
Class<? extends Writable> rpcRequestWrapperClass,
RpcInvoker rpcInvoker) {
RpcKindMapValue old =
rpcKindMap.put(rpcKind, new RpcKindMapValue(rpcRequestWrapperClass, rpcInvoker));
if (old != null) {
rpcKindMap.put(rpcKind, old);
throw new IllegalArgumentException("ReRegistration of rpcKind: " +
rpcKind);
}
LOG.debug("rpcKind=" + rpcKind +
", rpcRequestWrapperClass=" + rpcRequestWrapperClass +
", rpcInvoker=" + rpcInvoker);
}
重点在第4行,这里会将数据填入rpcKindMap中,即上文分析的getRpcInvoker方法取值的map中。
从上面的代码可以知道getRpcInvoker方法最终返回的invoker为:ProtoBufRpcInvoker。然后执行这个invoker的call方法,其内容如下:
public Writable call(RPC.Server server, String protocol,
Writable writableRequest, long receiveTime) throws Exception {
RpcRequestWrapper request = (RpcRequestWrapper) writableRequest;
RequestHeaderProto rpcRequest = request.requestHeader;
String methodName = rpcRequest.getMethodName();
String protoName = rpcRequest.getDeclaringClassProtocolName();
long clientVersion = rpcRequest.getClientProtocolVersion();
if (server.verbose)
LOG.info("Call: protocol=" + protocol + ", method=" + methodName);
ProtoClassProtoImpl protocolImpl = getProtocolImpl(server, protoName,
clientVersion);
BlockingService service = (BlockingService) protocolImpl.protocolImpl;
MethodDescriptor methodDescriptor = service.getDescriptorForType()
.findMethodByName(methodName);
if (methodDescriptor == null) {
String msg = "Unknown method " + methodName + " called on " + protocol
+ " protocol.";
LOG.warn(msg);
throw new RpcNoSuchMethodException(msg);
}
Message prototype = service.getRequestPrototype(methodDescriptor);
Message param = prototype.newBuilderForType()
.mergeFrom(request.theRequestRead).build();
Message result;
long startTime = Time.now();
int qTime = (int) (startTime - receiveTime);
Exception exception = null;
try {
server.rpcDetailedMetrics.init(protocolImpl.protocolClass);
result = service.callBlockingMethod(methodDescriptor, null, param);
} catch (ServiceException e) {
exception = (Exception) e.getCause();
throw (Exception) e.getCause();
} catch (Exception e) {
exception = e;
throw e;
} finally {
int processingTime = (int) (Time.now() - startTime);
if (LOG.isDebugEnabled()) {
String msg = "Served: " + methodName + " queueTime= " + qTime +
" procesingTime= " + processingTime;
if (exception != null) {
msg += " exception= " + exception.getClass().getSimpleName();
}
LOG.debug(msg);
}
String detailedMetricsName = (exception == null) ?
methodName :
exception.getClass().getSimpleName();
server.rpcMetrics.addRpcQueueTime(qTime);
server.rpcMetrics.addRpcProcessingTime(processingTime);
server.rpcDetailedMetrics.addProcessingTime(detailedMetricsName,
processingTime);
if (server.isLogSlowRPC()) {
server.logSlowRpcCalls(methodName, processingTime);
}
}
return new RpcResponseWrapper(result);
}
首先从第3行到第24行主要是从request中获取数据,其中最重要的是第13行获取的service对象,这个对象是代表了真正执行方法的对象。这个对象就是在文档(16)中提到的接口的实现类对象。在文档(16)中详细解析了其存储的方式,细看第13行前的代码先拿到接口的名称,然后拿到接口的版本号,最后再调用getProtocolImpl方法。这些都是为了得到第13行的service对象。
然后下一个重点是第32行,这里调用了service的callBlockingMethod方法执行远程调用的方法。随后便是一些异常处理,异常处理过后便是第60行,将执行接封装成RpcResponseWrapper对象返回。
上文提到的service是在文档(16)中初始化的,初始化的代码如下:
这里创建一个translator对象,这个对象会传入一个this,这里的this是指JournalNodeRpcServer类的对象。然后再利用newReflectiveBlockingService方法创建service。这个方法内容如下:
public static com.google.protobuf.BlockingService
newReflectiveBlockingService(final BlockingInterface impl) {
return new com.google.protobuf.BlockingService() {
public final com.google.protobuf.Descriptors.ServiceDescriptor
getDescriptorForType() {
return getDescriptor();
}
public final com.google.protobuf.Message callBlockingMethod(
com.google.protobuf.Descriptors.MethodDescriptor method,
com.google.protobuf.RpcController controller,
com.google.protobuf.Message request)
throws com.google.protobuf.ServiceException {
if (method.getService() != getDescriptor()) {
throw new java.lang.IllegalArgumentException(
"Service.callBlockingMethod() given method descriptor for " +
"wrong service type.");
}
switch(method.getIndex()) {
case 0:
return impl.isFormatted(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.IsFormattedRequestProto)request);
case 1:
return impl.discardSegments(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DiscardSegmentsRequestProto)request);
case 2:
return impl.getJournalCTime(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalCTimeRequestProto)request);
case 3:
return impl.doPreUpgrade(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoPreUpgradeRequestProto)request);
case 4:
return impl.doUpgrade(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoUpgradeRequestProto)request);
case 5:
return impl.doFinalize(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoFinalizeRequestProto)request);
case 6:
return impl.canRollBack(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.CanRollBackRequestProto)request);
case 7:
return impl.doRollback(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoRollbackRequestProto)request);
case 8:
return impl.getJournalState(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalStateRequestProto)request);
case 9:
return impl.newEpoch(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.NewEpochRequestProto)request);
case 10:
return impl.format(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FormatRequestProto)request);
case 11:
return impl.journal(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.JournalRequestProto)request);
case 12:
return impl.heartbeat(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.HeartbeatRequestProto)request);
case 13:
return impl.startLogSegment(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.StartLogSegmentRequestProto)request);
case 14:
return impl.finalizeLogSegment(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FinalizeLogSegmentRequestProto)request);
case 15:
return impl.purgeLogs(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PurgeLogsRequestProto)request);
case 16:
return impl.getEditLogManifest(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetEditLogManifestRequestProto)request);
case 17:
return impl.prepareRecovery(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PrepareRecoveryRequestProto)request);
case 18:
return impl.acceptRecovery(controller, (org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.AcceptRecoveryRequestProto)request);
default:
throw new java.lang.AssertionError("Can't get here.");
}
}
public final com.google.protobuf.Message
getRequestPrototype(
com.google.protobuf.Descriptors.MethodDescriptor method) {
if (method.getService() != getDescriptor()) {
throw new java.lang.IllegalArgumentException(
"Service.getRequestPrototype() given method " +
"descriptor for wrong service type.");
}
switch(method.getIndex()) {
case 0:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.IsFormattedRequestProto.getDefaultInstance();
case 1:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DiscardSegmentsRequestProto.getDefaultInstance();
case 2:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalCTimeRequestProto.getDefaultInstance();
case 3:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoPreUpgradeRequestProto.getDefaultInstance();
case 4:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoUpgradeRequestProto.getDefaultInstance();
case 5:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoFinalizeRequestProto.getDefaultInstance();
case 6:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.CanRollBackRequestProto.getDefaultInstance();
case 7:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoRollbackRequestProto.getDefaultInstance();
case 8:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalStateRequestProto.getDefaultInstance();
case 9:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.NewEpochRequestProto.getDefaultInstance();
case 10:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FormatRequestProto.getDefaultInstance();
case 11:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.JournalRequestProto.getDefaultInstance();
case 12:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.HeartbeatRequestProto.getDefaultInstance();
case 13:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.StartLogSegmentRequestProto.getDefaultInstance();
case 14:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FinalizeLogSegmentRequestProto.getDefaultInstance();
case 15:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PurgeLogsRequestProto.getDefaultInstance();
case 16:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetEditLogManifestRequestProto.getDefaultInstance();
case 17:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PrepareRecoveryRequestProto.getDefaultInstance();
case 18:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.AcceptRecoveryRequestProto.getDefaultInstance();
default:
throw new java.lang.AssertionError("Can't get here.");
}
}
public final com.google.protobuf.Message
getResponsePrototype(
com.google.protobuf.Descriptors.MethodDescriptor method) {
if (method.getService() != getDescriptor()) {
throw new java.lang.IllegalArgumentException(
"Service.getResponsePrototype() given method " +
"descriptor for wrong service type.");
}
switch(method.getIndex()) {
case 0:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.IsFormattedResponseProto.getDefaultInstance();
case 1:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DiscardSegmentsResponseProto.getDefaultInstance();
case 2:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalCTimeResponseProto.getDefaultInstance();
case 3:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoPreUpgradeResponseProto.getDefaultInstance();
case 4:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoUpgradeResponseProto.getDefaultInstance();
case 5:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoFinalizeResponseProto.getDefaultInstance();
case 6:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.CanRollBackResponseProto.getDefaultInstance();
case 7:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.DoRollbackResponseProto.getDefaultInstance();
case 8:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetJournalStateResponseProto.getDefaultInstance();
case 9:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.NewEpochResponseProto.getDefaultInstance();
case 10:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FormatResponseProto.getDefaultInstance();
case 11:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.JournalResponseProto.getDefaultInstance();
case 12:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.HeartbeatResponseProto.getDefaultInstance();
case 13:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.StartLogSegmentResponseProto.getDefaultInstance();
case 14:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.FinalizeLogSegmentResponseProto.getDefaultInstance();
case 15:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PurgeLogsResponseProto.getDefaultInstance();
case 16:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.GetEditLogManifestResponseProto.getDefaultInstance();
case 17:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.PrepareRecoveryResponseProto.getDefaultInstance();
case 18:
return org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos.AcceptRecoveryResponseProto.getDefaultInstance();
default:
throw new java.lang.AssertionError("Can't get here.");
}
}
};
}
这段代码看起来很长,但实际就一句话:创建一个BlockingService类的匿名内部类。上文解析的callBlockingMethod方法在第9行,这个方法很简单,主要是根据方法的id找到需要执行的方法,然后执行对应的方法便可。
至此server端如何执行远程调用方法的操作便解析完了,但在上文解析handler的run方法的时候还提到了setupResponse和sendResponse两个方法。这两个方法是负责向客户端返回数据的,这里先分析setupResponse方法,其内容如下:
private static void setupResponse(ByteArrayOutputStream responseBuf,
Call call, RpcStatusProto status, RpcErrorCodeProto erCode,
Writable rv, String errorClass, String error)
throws IOException {
responseBuf.reset();
DataOutputStream out = new DataOutputStream(responseBuf);
RpcResponseHeaderProto.Builder headerBuilder =
RpcResponseHeaderProto.newBuilder();
headerBuilder.setClientId(ByteString.copyFrom(call.clientId));
headerBuilder.setCallId(call.callId);
headerBuilder.setRetryCount(call.retryCount);
headerBuilder.setStatus(status);
headerBuilder.setServerIpcVersionNum(CURRENT_VERSION);
if (status == RpcStatusProto.SUCCESS) {
RpcResponseHeaderProto header = headerBuilder.build();
final int headerLen = header.getSerializedSize();
int fullLength = CodedOutputStream.computeRawVarint32Size(headerLen) +
headerLen;
try {
if (rv instanceof ProtobufRpcEngine.RpcWrapper) {
ProtobufRpcEngine.RpcWrapper resWrapper =
(ProtobufRpcEngine.RpcWrapper) rv;
fullLength += resWrapper.getLength();
out.writeInt(fullLength);
header.writeDelimitedTo(out);
rv.write(out);
} else { // Have to serialize to buffer to get len
final DataOutputBuffer buf = new DataOutputBuffer();
rv.write(buf);
byte[] data = buf.getData();
fullLength += buf.getLength();
out.writeInt(fullLength);
header.writeDelimitedTo(out);
out.write(data, 0, buf.getLength());
}
} catch (Throwable t) {
LOG.warn("Error serializing call response for call " + call, t);
// Call back to same function - this is OK since the
// buffer is reset at the top, and since status is changed
// to ERROR it won't infinite loop.
setupResponse(responseBuf, call, RpcStatusProto.ERROR,
RpcErrorCodeProto.ERROR_SERIALIZING_RESPONSE,
null, t.getClass().getName(),
StringUtils.stringifyException(t));
return;
}
} else { // Rpc Failure
headerBuilder.setExceptionClassName(errorClass);
headerBuilder.setErrorMsg(error);
headerBuilder.setErrorDetail(erCode);
RpcResponseHeaderProto header = headerBuilder.build();
int headerLen = header.getSerializedSize();
final int fullLength =
CodedOutputStream.computeRawVarint32Size(headerLen) + headerLen;
out.writeInt(fullLength);
header.writeDelimitedTo(out);
}
call.setResponse(ByteBuffer.wrap(responseBuf.toByteArray()));
}
首先是第6行用responseBuf创建一个输出流,然后是第7行创建一个headerbuilder,并在其中设置call对象的信息。然后是第15行,根据方法执行是否成功执行不同的方法,如果执行成功则执行if语句内的内容。先使用之前的headerbuilder创建一个header,然后再根据rv的类型来执行对应的方法将header和rv通过输出流写到responseBuf中。最后第59行将buf中的数据设置到call对象的response中。
然后再分析sendResponse方法,这个方法内容如下:
public void sendResponse() throws IOException {
int count = responseWaitCount.decrementAndGet();
assert count >= 0 : "response has already been sent";
if (count == 0) {
connection.sendResponse(this);
}
}
这里重点在第5行,这里会调用connection的sendResponse方法来发送数据。该方法的内容如下:
private void sendResponse(Call call) throws IOException {
responder.doRespond(call);
}
这里也很简单,就是调用responder的doRespond方法进行发送。该方法内容如下:
void doRespond(Call call) throws IOException {
synchronized (call.connection.responseQueue) {
// must only wrap before adding to the responseQueue to prevent
// postponed responses from being encrypted and sent out of order.
if (call.connection.useWrap) {
ByteArrayOutputStream response = new ByteArrayOutputStream();
wrapWithSasl(response, call);
call.setResponse(ByteBuffer.wrap(response.toByteArray()));
}
call.connection.responseQueue.addLast(call);
if (call.connection.responseQueue.size() == 1) {
processResponse(call.connection.responseQueue, true);
}
}
}
这里重点在第10行会将call添加到connection的responseQueue中,然后判断这个队列中是否只有一个对象,若是则执行processResponse方法。processResponse方法内容如下:
private boolean processResponse(LinkedList<Call> responseQueue,
boolean inHandler) throws IOException {
boolean error = true;
boolean done = false; // there is more data for this channel.
int numElements = 0;
Call call = null;
try {
synchronized (responseQueue) {
//
// If there are no items for this channel, then we are done
//
numElements = responseQueue.size();
if (numElements == 0) {
error = false;
return true; // no more data for this channel.
}
//
// Extract the first call
//
call = responseQueue.removeFirst();
SocketChannel channel = call.connection.channel;
if (LOG.isDebugEnabled()) {
LOG.debug(Thread.currentThread().getName() + ": responding to " + call);
}
//
// Send as much data as we can in the non-blocking fashion
//
int numBytes = channelWrite(channel, call.rpcResponse);
if (numBytes < 0) {
return true;
}
if (!call.rpcResponse.hasRemaining()) {
//Clear out the response buffer so it can be collected
call.rpcResponse = null;
call.connection.decRpcCount();
if (numElements == 1) { // last call fully processes.
done = true; // no more data for this channel.
} else {
done = false; // more calls pending to be sent.
}
if (LOG.isDebugEnabled()) {
LOG.debug(Thread.currentThread().getName() + ": responding to " + call
+ " Wrote " + numBytes + " bytes.");
}
} else {
//
// If we were unable to write the entire response out, then
// insert in Selector queue.
//
call.connection.responseQueue.addFirst(call);
if (inHandler) {
// set the serve time when the response has to be sent later
call.timestamp = Time.now();
incPending();
try {
// Wakeup the thread blocked on select, only then can the call
// to channel.register() complete.
writeSelector.wakeup();
channel.register(writeSelector, SelectionKey.OP_WRITE, call);
} catch (ClosedChannelException e) {
//Its ok. channel might be closed else where.
done = true;
} finally {
decPending();
}
}
if (LOG.isDebugEnabled()) {
LOG.debug(Thread.currentThread().getName() + ": responding to " + call
+ " Wrote partial " + numBytes + " bytes.");
}
}
error = false; // everything went off well
}
} finally {
if (error && call != null) {
LOG.warn(Thread.currentThread().getName()+", call " + call + ": output error");
done = true; // error. no more data for this channel.
closeConnection(call.connection);
}
}
return done;
}
首先是第3行到第6行,创建了一些变量。然后是第12行到第16行检查传入队列的大小。然后是第20行取出队列中的第一个数据。然后第21行从这个call对象中拿到该对象的channel。然后是第28行调用channelWrite方法将call对象的response通过channel写回客户端。然后是第32行的if语句,用来判断call对象中的response是否已经写完。若写完则执行if中的语句:主要是设置一些标识。若没写完则执行第45行的else语句:首先是第50行会将call对象放回队列的头部,然后是第52行判断这个方法是否是有handler调用的,若是则会执行第61行将channel注册到writeSelector中注册的事件为写事件,待channel可写之后,便接着写call中的数据。注册完之后便是一堆异常处理,然后就是第83行直接返回了。
从上面的分析可以知道,handler在执行完远程调用的方法后会直接调用responder的方法返回数据,若数据太多则会等待responder的下一次执行。这样handler便分析完成了,接着继续分析responder线程,这个线程的run方法如下:
public void run() {
LOG.info(Thread.currentThread().getName() + ": starting");
SERVER.set(Server.this);
try {
doRunLoop();
} finally {
LOG.info("Stopping " + Thread.currentThread().getName());
try {
writeSelector.close();
} catch (IOException ioe) {
LOG.error("Couldn't close write selector in " + Thread.currentThread().getName(), ioe);
}
}
}
重点是第5行调用的doRunLoop方法,该方法内容如下:
private void doRunLoop() {
long lastPurgeTime = 0; // last check for old calls.
while (running) {
try {
waitPending(); // If a channel is being registered, wait.
writeSelector.select(PURGE_INTERVAL);
Iterator<SelectionKey> iter = writeSelector.selectedKeys().iterator();
while (iter.hasNext()) {
SelectionKey key = iter.next();
iter.remove();
try {
if (key.isWritable()) {
doAsyncWrite(key);
}
} catch (CancelledKeyException cke) {
// something else closed the connection, ex. reader or the
// listener doing an idle scan. ignore it and let them clean
// up
Call call = (Call)key.attachment();
if (call != null) {
LOG.info(Thread.currentThread().getName() +
": connection aborted from " + call.connection);
}
} catch (IOException e) {
LOG.info(Thread.currentThread().getName() + ": doAsyncWrite threw exception " + e);
}
}
long now = Time.now();
if (now < lastPurgeTime + PURGE_INTERVAL) {
continue;
}
lastPurgeTime = now;
//
// If there were some calls that have not been sent out for a
// long time, discard them.
//
if(LOG.isDebugEnabled()) {
LOG.debug("Checking for old call responses.");
}
ArrayList<Call> calls;
// get the list of channels from list of keys.
synchronized (writeSelector.keys()) {
calls = new ArrayList<Call>(writeSelector.keys().size());
iter = writeSelector.keys().iterator();
while (iter.hasNext()) {
SelectionKey key = iter.next();
Call call = (Call)key.attachment();
if (call != null && key.channel() == call.connection.channel) {
calls.add(call);
}
}
}
for(Call call : calls) {
doPurge(call, now);
}
} catch (OutOfMemoryError e) {
//
// we can run out of memory if we have too many threads
// log the event and sleep for a minute and give
// some thread(s) a chance to finish
//
LOG.warn("Out of Memory in server select", e);
try { Thread.sleep(60000); } catch (Exception ie) {}
} catch (Exception e) {
LOG.warn("Exception in Responder", e);
}
}
}
首先是第4行,这里是一个与其他线程类相同的while循环,然后是第7行的 writeSelector的select方法,这个方法在之前的文档提到过,这个selector上注册的都是写事件,所以这里只会返回并处理写事件。发生写事件后会调用第14行的doAsyncWrite方法来处理。然后是一系列异常处理,之后是第29行检查当前时间与lastPurgeTime,若时间超过指定时间则向下执行清除call。
上述处理写事件的doAsyncWrite方法内容如下:
private void doAsyncWrite(SelectionKey key) throws IOException {
Call call = (Call)key.attachment();
if (call == null) {
return;
}
if (key.channel() != call.connection.channel) {
throw new IOException("doAsyncWrite: bad channel");
}
synchronized(call.connection.responseQueue) {
if (processResponse(call.connection.responseQueue, false)) {
try {
key.interestOps(0);
} catch (CancelledKeyException e) {
/* The Listener/reader might have closed the socket.
* We don't explicitly cancel the key, so not sure if this will
* ever fire.
* This warning could be removed.
*/
LOG.warn("Exception while changing ops : " + e);
}
}
}
}
这里的重点在第11行,这里会继续调用processResponse方法将call数据返回给客户端。