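Starting with this post, we will read through the Thrift source code layer by layer.
As the previous post showed, the Processor layer is generated from the IDL. Its main job is to route data: it hands the arguments parsed by the protocol layer to the real service implementation, and passes the value the service returns back to the protocol layer for serialization.
To keep the contract consistent on both sides, the generated processor-layer code is included in both the client and the server: it gives the client a proxy class and gives the server an interface to implement.
Let's first look at the structure of the HelloService generated in the previous post:
By function, it can be roughly divided into three parts: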
- Interface part: Iface/AsyncIface/Client/AsyncClient
- Service invocation part: Processor
- Parameter part: sayHello_args/sayHello_result
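1. Interface part
The interface part contains four classes: Iface, Client, and their asynchronous versions. Iface is an interface for the server to implement; Client is a class for the client to call.
Let's look at Iface first:
    public interface Iface {
        public String sayHello(String word) throws org.apache.thrift.TException;
    }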
As you can see, it only declares the methods we defined in the IDL. To implement the service on the server side, we implement this interface:
    public class HelloServiceImpl implements HelloService.Iface {
        @Override
        public String sayHello(String word) throws TException {
            return String.format("%s,Hello!", word);
        }
    }
Now let's look at Client:
    public static class Client extends org.apache.thrift.TServiceClient implements Iface {
        ...
    }
Client also implements the Iface interface, and in addition it extends TServiceClient. Let's look at Client's constructors first:
    public Client(org.apache.thrift.protocol.TProtocol iprot, org.apache.thrift.protocol.TProtocol oprot) {
        super(iprot, oprot);
    }
    public Client(org.apache.thrift.protocol.TProtocol prot)
    {
        super(prot, prot);
    }
Both constructors simply call the parent-class constructor, so let's look at TServiceClient:
    public abstract class TServiceClient {
        protected TProtocol iprot_; // input protocol
        protected TProtocol oprot_; // output protocol
        protected int seqid_; // sequence id
        public TServiceClient(TProtocol prot) {
            this(prot, prot);
        }
        public TServiceClient(TProtocol iprot, TProtocol oprot) {
            this.iprot_ = iprot;
            this.oprot_ = oprot;
        }
        ...
    }
TProtocol is the class that represents Thrift's protocol layer.
Now let's look at how HelloService.Client implements sayHello:
    public String sayHello(String word) throws org.apache.thrift.TException
    {
        send_sayHello(word);
        return recv_sayHello();
    }
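It calls send_sayHello and then returns whatever recv_sayHello returns.
Let's look at send_sayHello first:
    public void send_sayHello(String word) throws org.apache.thrift.TException
    {
        sayHello_args args = new sayHello_args(); // wrapper object for the method's input parameters
        args.setWord(word); // copy the actual request argument into the wrapper
        sendBase("sayHello", args); // call the parent's sendBase to send the request
    }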
Here is TServiceClient's sendBase:
    protected void sendBase(String methodName, TBase args) throws TException {
        // Write the message header: a TMessage describes one RPC (method name, type 1 = CALL, next sequence id)
        this.oprot_.writeMessageBegin(new TMessage(methodName, (byte)1, ++this.seqid_));
        args.write(this.oprot_); // write the arguments through the protocol layer
        this.oprot_.writeMessageEnd(); // end of message
        this.oprot_.getTransport().flush(); // flush the transport so the buffered request is actually sent
    }
Next, recv_sayHello:
    public String recv_sayHello() throws org.apache.thrift.TException
    {
        sayHello_result result = new sayHello_result(); // wrapper object for the return value
        receiveBase(result, "sayHello"); // call the parent's receiveBase to read the response into result
        if (result.isSetSuccess()) {
            return result.success; // the success field is set, so return it
        }
        // otherwise no result was set, so throw
        throw new org.apache.thrift.TApplicationException(org.apache.thrift.TApplicationException.MISSING_RESULT, "sayHello failed: unknown result");
    }
Here is TServiceClient's receiveBase:
    protected void receiveBase(TBase result, String methodName) throws TException {
        TMessage msg = this.iprot_.readMessageBegin(); // read and deserialize the response header
        if (msg.type == 3) { // type 3 = EXCEPTION: the server reported an error
            TApplicationException x = TApplicationException.read(this.iprot_);
            this.iprot_.readMessageEnd();
            throw x;
        } else if (msg.seqid != this.seqid_) { // wrong sequence id: this response does not belong to the current request
            throw new TApplicationException(4, methodName + " failed: out of sequence response"); // 4 = BAD_SEQUENCE_ID
        } else {
            result.read(this.iprot_); // read the result from the protocol layer
            this.iprot_.readMessageEnd(); // end of message
        }
    }
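To make the send/receive bookkeeping easier to follow, here is a minimal, self-contained sketch of the same pattern. It does not use Thrift's classes: SketchMessage, LoopbackChannel and SketchClient are hypothetical names, and an in-memory queue stands in for the protocol, the transport and the server. Only the message type values 1, 2 and 3 are taken from the code above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for TMessage: method name, message type, sequence id, plus a text payload.
final class SketchMessage {
    static final byte CALL = 1;
    static final byte REPLY = 2;
    static final byte EXCEPTION = 3;

    final String name;
    final byte type;
    final int seqid;
    final String payload;

    SketchMessage(String name, byte type, int seqid, String payload) {
        this.name = name;
        this.type = type;
        this.seqid = seqid;
        this.payload = payload;
    }
}

// Hypothetical in-memory peer that replaces the protocol, the transport and the server.
final class LoopbackChannel {
    private final Deque<SketchMessage> replies = new ArrayDeque<>();

    void send(SketchMessage call) {
        if ("sayHello".equals(call.name)) {
            replies.add(new SketchMessage(call.name, SketchMessage.REPLY, call.seqid,
                    call.payload + ",Hello!"));
        } else {
            replies.add(new SketchMessage(call.name, SketchMessage.EXCEPTION, call.seqid,
                    "Invalid method name: '" + call.name + "'"));
        }
    }

    SketchMessage receive() {
        return replies.remove();
    }
}

final class SketchClient {
    private final LoopbackChannel channel;
    private int seqid;

    SketchClient(LoopbackChannel channel) {
        this.channel = channel;
    }

    String call(String methodName, String arg) {
        // Same shape as sendBase: a header carrying an incremented sequence id, then the argument.
        channel.send(new SketchMessage(methodName, SketchMessage.CALL, ++seqid, arg));
        // Same shape as receiveBase: check for an error reply first, then check the sequence id.
        SketchMessage msg = channel.receive();
        if (msg.type == SketchMessage.EXCEPTION) {
            throw new IllegalStateException(msg.payload);
        }
        if (msg.seqid != seqid) {
            throw new IllegalStateException(methodName + " failed: out of sequence response");
        }
        return msg.payload;
    }
}

final class ClientSketchDemo {
    public static void main(String[] args) {
        SketchClient client = new SketchClient(new LoopbackChannel());
        System.out.println(client.call("sayHello", "Tom")); // prints "Tom,Hello!"
    }
}
```

The sequence id check is what lets a client notice that a response on a shared connection belongs to a different request.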
2. Service invocation part
The service invocation part contains a single class, Processor, which is used by the server:
    public static class Processor<I extends Iface> extends org.apache.thrift.TBaseProcessor<I> implements org.apache.thrift.TProcessor {
        ...
    }
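Processor extends TBaseProcessor and implements the TProcessor interface. It does not implement TProcessor's process method itself, though; that implementation is left to its parent class.
Processor's constructors likewise call the parent-class constructor. Here are the fields and constructor of TBaseProcessor:
    private final I iface; // the Iface instance, i.e. the real implementation of our service
    private final Map<String, ProcessFunction<I, ? extends TBase>> processMap; // maps each service method name to the object that wraps that method
    protected TBaseProcessor(I iface, Map<String, ProcessFunction<I, ? extends TBase>> processFunctionMap) {
        this.iface = iface;
        this.processMap = processFunctionMap;
    }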
The ProcessFunction class in the code above wraps a call into the class that really implements Iface on the server; as processMap suggests, there is one ProcessFunction per service method. It is mainly responsible for talking to the protocol layer, much like Client does on the client side:
    public abstract class ProcessFunction<I, T extends TBase> {
        private final String methodName;
        public ProcessFunction(String methodName) {
            this.methodName = methodName;
        }
        public final void process(int seqid, TProtocol iprot, TProtocol oprot, I iface) throws TException {
            TBase args = this.getEmptyArgsInstance();
            try {
                args.read(iprot); // read the request arguments from the protocol layer
            } catch (TProtocolException var8) {
                iprot.readMessageEnd();
                TApplicationException x = new TApplicationException(7, var8.getMessage()); // 7 = PROTOCOL_ERROR
                oprot.writeMessageBegin(new TMessage(this.getMethodName(), (byte)3, seqid)); // type 3 = EXCEPTION
                x.write(oprot);
                oprot.writeMessageEnd();
                oprot.getTransport().flush();
                return;
            }
            iprot.readMessageEnd();
            TBase result = this.getResult(iface, args); // call the Iface implementation and get its result
            // The lines below mirror TServiceClient.sendBase: hand the result to the protocol layer and send it to the client
            oprot.writeMessageBegin(new TMessage(this.getMethodName(), (byte)2, seqid)); // type 2 = REPLY
            result.write(oprot);
            oprot.writeMessageEnd();
            oprot.getTransport().flush();
        }
        protected abstract TBase getResult(I var1, T var2) throws TException;
        protected abstract T getEmptyArgsInstance();
        public String getMethodName() {
            return this.methodName;
        }
    }
Finally, let's look at how TBaseProcessor implements process. When a request arrives on the socket, the server layer calls this method to hand the request to the processor layer:
    public boolean process(TProtocol in, TProtocol out) throws TException {
        TMessage msg = in.readMessageBegin(); // read the header of the client's request
        ProcessFunction fn = (ProcessFunction)this.processMap.get(msg.name); // look up the ProcessFunction for the requested method name
        if (fn == null) {
            TProtocolUtil.skip(in, (byte)12); // 12 = TType.STRUCT: skip the arguments of the unknown call
            in.readMessageEnd();
            TApplicationException x = new TApplicationException(1, "Invalid method name: '" + msg.name + "'"); // 1 = UNKNOWN_METHOD
            out.writeMessageBegin(new TMessage(msg.name, (byte)3, msg.seqid));
            x.write(out);
            out.writeMessageEnd();
            out.getTransport().flush();
            return true;
        } else {
            fn.process(msg.seqid, in, out, this.iface); // found: delegate to the ProcessFunction's process method
            return true;
        }
    }
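The dispatch step above boils down to a map lookup by method name. The following is a minimal sketch of that idea using only the JDK; Greeter and SketchProcessor are hypothetical names, strings replace the protocol objects, and the "REPLY:" and "EXCEPTION:" prefixes are only an illustration of the two outcomes, not Thrift's wire format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical service interface, standing in for the generated Iface.
interface Greeter {
    String sayHello(String word);
}

// Hypothetical dispatcher modeled on TBaseProcessor: method name -> handler.
final class SketchProcessor<I> {
    private final I iface;
    private final Map<String, BiFunction<I, String, String>> processMap = new HashMap<>();

    SketchProcessor(I iface) {
        this.iface = iface;
    }

    void register(String methodName, BiFunction<I, String, String> fn) {
        processMap.put(methodName, fn);
    }

    // Looks up the handler by name; an unknown name yields an error reply instead of a crash.
    String process(String methodName, String arg) {
        BiFunction<I, String, String> fn = processMap.get(methodName);
        if (fn == null) {
            return "EXCEPTION: Invalid method name: '" + methodName + "'";
        }
        return "REPLY: " + fn.apply(iface, arg);
    }
}

final class DispatchSketchDemo {
    public static void main(String[] args) {
        SketchProcessor<Greeter> processor =
                new SketchProcessor<Greeter>(word -> word + ",Hello!");
        processor.register("sayHello", Greeter::sayHello);
        System.out.println(processor.process("sayHello", "Tom"));
        System.out.println(processor.process("sayBye", "Tom"));
    }
}
```

Keeping the handlers in a map means the dispatcher itself never changes when methods are added to the service; only the map contents do.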
3. Parameter part
Finally, let's look at the parameter part. It contains two classes: the request argument class (sayHello_args) and the result class (sayHello_result). Both implement the TBase interface, so let's look at that interface first:
    package org.apache.thrift;
    import java.io.Serializable;
    import org.apache.thrift.protocol.TProtocol;
    public interface TBase<T extends TBase<?, ?>, F extends TFieldIdEnum> extends Comparable<T>, Serializable {
        void read(TProtocol var1) throws TException; // read data from the protocol layer
        void write(TProtocol var1) throws TException; // write data to the protocol layer
        F fieldForId(int var1);
        boolean isSet(F var1);
        Object getFieldValue(F var1);
        void setFieldValue(F var1, Object var2);
        TBase<T, F> deepCopy();
        void clear();
    }
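The two classes are implemented in a similar way, so we take sayHello_args as the example:
We can divide this class into two parts. The first is the class's fields and the operations on them; in this example that is our input parameter, the string word. Thrift wraps all fields in an enum named _Fields, which identifies each field by its id. This is why renaming a field in the IDL does not break parameter passing: Thrift identifies fields by the id defined in the IDL, not by name.
    public enum _Fields implements org.apache.thrift.TFieldIdEnum {
        WORD((short)1, "word");
        private static final Map<String, _Fields> byName = new HashMap<String, _Fields>();
        static {
            for (_Fields field : EnumSet.allOf(_Fields.class)) {
                byName.put(field.getFieldName(), field);
            }
        }
        ...
    }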
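As a small illustration of why id-based identification survives a rename, here is a sketch using only the JDK. SketchFields and FieldIdDemo are hypothetical names, and a plain Map from field id to value stands in for the data on the wire; the real generated enum and protocol classes are not used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical enum modeled on the generated _Fields: each constant pairs a numeric id with a local name.
enum SketchFields {
    WORD((short) 1, "word");

    private static final Map<Short, SketchFields> BY_ID = new HashMap<>();
    static {
        for (SketchFields field : values()) {
            BY_ID.put(field.id, field);
        }
    }

    private final short id;
    private final String fieldName;

    SketchFields(short id, String fieldName) {
        this.id = id;
        this.fieldName = fieldName;
    }

    String getFieldName() {
        return fieldName;
    }

    // Returns the matching constant, or null when this receiver does not know the id.
    static SketchFields findById(int id) {
        return BY_ID.get((short) id);
    }
}

final class FieldIdDemo {
    // Decodes a wire map (field id -> value) into the receiver's own field names.
    static Map<String, Object> decode(Map<Short, Object> wire) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<Short, Object> entry : wire.entrySet()) {
            SketchFields field = SketchFields.findById(entry.getKey());
            if (field != null) {
                out.put(field.getFieldName(), entry.getValue());
            }
            // Unknown ids are silently skipped.
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Short, Object> wire = new HashMap<>();
        wire.put((short) 1, "Tom");     // the sender transmits only the id, never the name
        wire.put((short) 9, "ignored"); // an id this receiver does not know
        System.out.println(decode(wire)); // prints {word=Tom}
    }
}
```

Because the sender's field name never appears in the wire map, whatever the sender calls field 1 locally has no effect on the receiver.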
The second part is how the fields interact with the protocol layer, which is delegated to a Scheme. In the figure above we can see two schemes, TupleScheme and StandardScheme, both of which implement the IScheme interface:
    package org.apache.thrift.scheme;
    import org.apache.thrift.TBase;
    import org.apache.thrift.TException;
    import org.apache.thrift.protocol.TProtocol;
    public interface IScheme<T extends TBase> {
        void read(TProtocol var1, T var2) throws TException;
        void write(TProtocol var1, T var2) throws TException;
    }
The interface provides only two methods, read and write. Let's look at how StandardScheme and TupleScheme implement them in turn:
StandardScheme:
    private static class sayHello_argsStandardScheme extends StandardScheme<sayHello_args> {
        public void read(org.apache.thrift.protocol.TProtocol iprot, sayHello_args struct) throws org.apache.thrift.TException {
            org.apache.thrift.protocol.TField schemeField;
            iprot.readStructBegin(); // tell the protocol that a struct begins
            while (true)
            {
                schemeField = iprot.readFieldBegin(); // read the next field header in a loop
                if (schemeField.type == org.apache.thrift.protocol.TType.STOP) {
                    break;
                }
                switch (schemeField.id) {
                    case 1: // WORD
                        if (schemeField.type == org.apache.thrift.protocol.TType.STRING) {
                            struct.word = iprot.readString(); // read the string value
                            struct.setWordIsSet(true);
                        } else {
                            org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type);
                        }
                        break;
                    default:
                        org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type);
                }
                iprot.readFieldEnd(); // end of this field
            }
            iprot.readStructEnd(); // end of the struct
            // check for required fields of primitive type, which can't be checked in the validate method
            struct.validate();
        }
        public void write(org.apache.thrift.protocol.TProtocol oprot, sayHello_args struct) throws org.apache.thrift.TException {
            struct.validate();
            oprot.writeStructBegin(STRUCT_DESC); // write the struct descriptor
            if (struct.word != null) {
                oprot.writeFieldBegin(WORD_FIELD_DESC); // write the field descriptor
                oprot.writeString(struct.word); // write the field value
                oprot.writeFieldEnd(); // end of this field
            }
            oprot.writeFieldStop(); // all fields have been written
            oprot.writeStructEnd(); // end of the struct
        }
    }
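We can see that StandardScheme serializes the fields in a structured way, while the low-level encoding is still left to the protocol layer. To describe that structure, the code above uses two constants, STRUCT_DESC and WORD_FIELD_DESC. They are a TStruct object and a TField object, describing the struct and the field respectively: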
TStruct:
    package org.apache.thrift.protocol;
    public final class TStruct {
        public final String name; // struct name
        public TStruct() {
            this("");
        }
        public TStruct(String n) {
            this.name = n;
        }
    }
TField:
    package org.apache.thrift.protocol;
    public class TField {
        public final String name; // field name
        public final byte type; // field type
        public final short id; // field id
        public TField() {
            this("", (byte)0, (short)0);
        }
        public TField(String n, byte t, short i) {
            this.name = n;
            this.type = t;
            this.id = i;
        }
        public String toString() {
            return "<TField name:'" + this.name + "' type:" + this.type + " field-id:" + this.id + ">";
        }
        public boolean equals(TField otherField) {
            return this.type == otherField.type && this.id == otherField.id;
        }
    }
The field types defined by Thrift (TType):
    package org.apache.thrift.protocol;
    public final class TType {
        public static final byte STOP = 0; // marks the end of a struct's fields
        public static final byte VOID = 1;
        public static final byte BOOL = 2;
        public static final byte BYTE = 3;
        public static final byte DOUBLE = 4;
        public static final byte I16 = 6;
        public static final byte I32 = 8;
        public static final byte I64 = 10;
        public static final byte STRING = 11;
        public static final byte STRUCT = 12;
        public static final byte MAP = 13;
        public static final byte SET = 14;
        public static final byte LIST = 15;
        public static final byte ENUM = 16;
        public TType() {
        }
    }
These classes belong to the protocol layer; they are shown here only to make StandardScheme easier to understand.
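To show why a field-by-field layout with a STOP marker lets a reader skip fields it does not know, here is a minimal sketch of the same read loop over a hand-rolled byte layout. It is not Thrift's encoder: TaggedStructSketch is a hypothetical class, the layout (type byte, 2-byte id, value, then a STOP byte) is a simplification chosen for the example, and only the type codes 0, 8 and 11 are taken from TType above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class TaggedStructSketch {
    static final byte STOP = 0;
    static final byte I32 = 8;
    static final byte STRING = 11;

    // Writes field 1 (string, only if non-null), field 2 (int), then the STOP marker.
    static byte[] write(String word, int extra) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            if (word != null) {
                out.writeByte(STRING);
                out.writeShort(1);
                byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);
                out.writeInt(utf8.length);
                out.write(utf8);
            }
            out.writeByte(I32);
            out.writeShort(2);
            out.writeInt(extra);
            out.writeByte(STOP);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads fields until STOP; keeps field 1 if it is a string and skips everything else.
    static String readWord(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String word = null;
            while (true) {
                byte type = in.readByte();
                if (type == STOP) {
                    return word;
                }
                short id = in.readShort();
                if (id == 1 && type == STRING) {
                    word = readString(in);
                } else {
                    skip(in, type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String readString(DataInputStream in) throws IOException {
        byte[] utf8 = new byte[in.readInt()];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // The type tag tells the reader how many bytes to consume for a field it ignores.
    private static void skip(DataInputStream in, byte type) throws IOException {
        switch (type) {
            case I32:
                in.readInt();
                break;
            case STRING:
                readString(in);
                break;
            default:
                throw new IOException("cannot skip type " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWord(write("Tom", 42))); // prints "Tom"
    }
}
```

The reader never needs to know what field 2 means; the type tag alone is enough to step over it, which is the role TProtocolUtil.skip plays in the generated code.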
TupleScheme:
    // This scheme depends on details of the Thrift protocols, which we will cover with the protocol layer
    private static class sayHello_argsTupleScheme extends TupleScheme<sayHello_args> {
        @Override
        public void write(org.apache.thrift.protocol.TProtocol prot, sayHello_args struct) throws org.apache.thrift.TException {
            // TTupleProtocol extends TCompactProtocol, the compact encoding among the protocols Thrift supports.
            // The cast also shows that TupleScheme only works when the protocol is a TTupleProtocol.
            TTupleProtocol oprot = (TTupleProtocol) prot;
            BitSet optionals = new BitSet();
            if (struct.isSetWord()) {
                optionals.set(0);
            }
            oprot.writeBitSet(optionals, 1);
            if (struct.isSetWord()) {
                oprot.writeString(struct.word);
            }
        }
        @Override
        public void read(org.apache.thrift.protocol.TProtocol prot, sayHello_args struct) throws org.apache.thrift.TException {
            TTupleProtocol iprot = (TTupleProtocol) prot;
            BitSet incoming = iprot.readBitSet(1);
            if (incoming.get(0)) {
                struct.word = iprot.readString();
                struct.setWordIsSet(true);
            }
        }
    }
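The idea behind the tuple layout is that a leading bit set records which fields are present, so no per-field type or id has to be written. Below is a minimal sketch of that idea using only the JDK. TupleLayoutSketch is a hypothetical class, and the byte layout is an assumption made for the example: one flag byte followed by DataOutputStream.writeUTF, rather than the encoding TTupleProtocol actually inherits from the compact protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.BitSet;

final class TupleLayoutSketch {
    // Writes one flag byte (bit 0 = "word is present"), then the word only if it is present.
    static byte[] write(String word) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            BitSet optionals = new BitSet();
            if (word != null) {
                optionals.set(0);
            }
            byte[] bits = optionals.toByteArray(); // empty array when no bit is set
            out.writeByte(bits.length == 0 ? 0 : bits[0]);
            if (word != null) {
                out.writeUTF(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the flag byte first; the flags decide which values follow.
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            BitSet incoming = BitSet.valueOf(new byte[] { in.readByte() });
            return incoming.get(0) ? in.readUTF() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = write("Tom");
        System.out.println(read(encoded) + " in " + encoded.length + " bytes");
    }
}
```

The trade-off is that reader and writer must agree on the exact field order and set of fields, since nothing on the wire identifies a field beyond its position.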
Back to the read/write methods implemented by sayHello_args: both delegate the work to the underlying scheme. Which scheme is used to process the fields is decided by the protocol that is passed in, through its getScheme() method:
    public void read(org.apache.thrift.protocol.TProtocol iprot) throws org.apache.thrift.TException {
        schemes.get(iprot.getScheme()).getScheme().read(iprot, this);
    }
    public void write(org.apache.thrift.protocol.TProtocol oprot) throws org.apache.thrift.TException {
        schemes.get(oprot.getScheme()).getScheme().write(oprot, this);
    }
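The lookup in these two methods can be pictured as a map from a scheme class to a factory, with the protocol supplying the key. Here is a minimal sketch of that selection using only the JDK. All names in it (SketchScheme, StandardSketchScheme, TupleSketchScheme, SketchProtocol, SketchArgs) are hypothetical, and the string output only marks which scheme ran; it is not a real encoding.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical base type standing in for IScheme.
abstract class SketchScheme {
    abstract String write(String word);
}

final class StandardSketchScheme extends SketchScheme {
    @Override
    String write(String word) {
        return "standard:" + word;
    }
}

final class TupleSketchScheme extends SketchScheme {
    @Override
    String write(String word) {
        return "tuple:" + word;
    }
}

// Hypothetical protocol: its only job here is to name the scheme it wants.
interface SketchProtocol {
    Class<? extends SketchScheme> getScheme();
}

// Hypothetical struct that, like sayHello_args, holds a scheme-class -> factory map.
final class SketchArgs {
    private static final Map<Class<? extends SketchScheme>, Supplier<SketchScheme>> SCHEMES =
            new HashMap<>();
    static {
        SCHEMES.put(StandardSketchScheme.class, StandardSketchScheme::new);
        SCHEMES.put(TupleSketchScheme.class, TupleSketchScheme::new);
    }

    String word;

    // The struct never inspects the protocol itself; it only asks which scheme to use.
    String write(SketchProtocol prot) {
        return SCHEMES.get(prot.getScheme()).get().write(word);
    }
}

final class SchemeSelectionDemo {
    public static void main(String[] args) {
        SketchArgs sketchArgs = new SketchArgs();
        sketchArgs.word = "Tom";
        System.out.println(sketchArgs.write(() -> StandardSketchScheme.class));
        System.out.println(sketchArgs.write(() -> TupleSketchScheme.class));
    }
}
```

With this arrangement, supporting another layout means registering one more factory, while the struct's read and write methods stay unchanged.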
This concludes the analysis of the processor layer.
4. Summary
To summarize the flow of the processor layer: