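SOFABolt Source Code Analysis: Protocol and Message Encoding/Decoding
1. Preface
The previous article gave an overview of how a SOFABolt server uses a custom UserProcessor to handle client messages, which should give a rough picture of how SOFABolt works. Any message sent over the network also raises several problems that must be solved: TCP sticky packets and packet splitting, message types, version control, and per-request control parameters such as the timeout.
A few of these are discussed below: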
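1.1 Sticky packets and packet splitting
Messages are usually sent over TCP, and TCP is a byte-stream protocol that does not preserve message boundaries. If we send three packets to the server, they may arrive merged into a single chunk; this is the so-called sticky-packet problem. The merging happens because batching small writes improves transmission efficiency considerably. It does, however, burden the receiver, which has to split the stream back into individual messages so that data is not misread or garbled; this is packet splitting. The usual solution is a custom protocol that carries the length of each packet, so that the receiver can split the stream at exactly the right places. Moreover, NIO and Netty both receive and process data through buffers, which makes sticky and split packets all the more noticeable.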
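The length-based splitting described above can be sketched in plain Java. This is a minimal illustration and not SOFABolt code: the class name LengthPrefixedFraming and the 4-byte length prefix are assumptions made for the example, and java.nio.ByteBuffer stands in for Netty's ByteBuf.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of SOFABolt): each frame is a 4-byte length followed by the payload. */
final class LengthPrefixedFraming {

    /** Prepends a 4-byte big-endian length to the UTF-8 payload. */
    static byte[] encode(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + payload.length).putInt(payload.length).put(payload).array();
    }

    /**
     * Extracts every complete frame from a buffer that is in read mode.
     * An incomplete trailing frame is left unread so the caller can wait for more bytes.
     */
    static List<String> decode(ByteBuffer in) {
        List<String> messages = new ArrayList<>();
        while (in.remaining() >= 4) {
            in.mark();
            int length = in.getInt();
            if (in.remaining() < length) {
                in.reset(); // half packet: rewind to the start of this frame and stop
                break;
            }
            byte[] payload = new byte[length];
            in.get(payload);
            messages.add(new String(payload, StandardCharsets.UTF_8));
        }
        return messages;
    }

    public static void main(String[] args) {
        // Three messages merged into one buffer, as TCP may deliver them.
        ByteBuffer stream = ByteBuffer.allocate(64);
        stream.put(encode("a")).put(encode("bc")).put(encode("def"));
        stream.flip();
        System.out.println(decode(stream)); // prints [a, bc, def]
    }
}
```

Because every frame announces its own length, the receiver never has to guess where one message ends and the next begins, no matter how TCP groups the bytes.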
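1.2 Message types
SOFABolt defines three message types: RESPONSE, REQUEST, and REQUEST_ONEWAY. They are easy to understand: a response, a request, and a fire-and-forget request for which the sender does not expect a reply.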
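1.3 Version control
As a service evolves, the existing protocol may no longer meet its needs, so the protocol gets extended. The old protocol must remain usable, though, which calls for version control: messages of different protocol versions need different handling, and the strategy pattern is a natural fit here.
The remaining issues are not covered in detail. In short, a low-level communication framework can do a lot at the protocol level. This is not unique to SOFABolt: remoting, the communication module of RocketMQ, also defines its own protocol. In my opinion RocketMQ's protocol is somewhat more complex than SOFABolt's; interested readers can compare the two.
Now to the main topic: how SOFABolt defines its own protocol.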
2. The protocol: Protocol
SOFABolt currently defines two protocols, RpcProtocol and RpcProtocolV2. V2 adds a protocol version field, and this article analyzes V2.
2.1 RpcProtocolV2
Let's see what this class contains:
RpcProtocolV2
/**
 * Request command protocol for v2
 * 0     1     2           4           6           8          10     11     12          14         16
 * +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+
 * |proto| ver1|type | cmdcode   |ver2 |   requestId           |codec|switch|   timeout             |
 * +-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+
 * |classLen   |headerLen  |contentLen             |           ...                                  |
 * +-----------+-----------+-----------+-----------+                                                +
 * |               className + header  + content  bytes                                             |
 * +                                                                                                +
 * |                               ... ...                                  | CRC32(optional)       |
 * +------------------------------------------------------------------------------------------------+
 *
 * proto: code for protocol (distinguishes V1 from V2)
 * ver1: version for protocol
 * type: request/response/request oneway (message type)
 * cmdcode: code for remoting command (kind of command: request, response, or heartbeat)
 * ver2: version for remoting command
 * requestId: id of request
 * codec: code for codec (serializer type)
 * switch: function switch for protocol (built-in protocol feature switches, such as the CRC check)
 * classLen: length of the class name
 * headerLen: length of header
 * contentLen: length of content
 * CRC32: CRC32 of the frame (Exists when ver1 > 1) (cyclic redundancy check)
 *
 * Response command protocol for v2
 * 0     1     2     3     4           6           8          10     11     12          14         16
 * +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+
 * |proto| ver1| type| cmdcode   |ver2 |   requestId           |codec|switch|respstatus |  classLen |
 * +-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+
 * |headerLen  | contentLen            |                      ...                                   |
 * +-----------------------------------+                                                            +
 * |               className + header  + content  bytes                                             |
 * +                                                                                                +
 * |                               ... ...                                  | CRC32(optional)       |
 * +------------------------------------------------------------------------------------------------+
 * respstatus: response status
 *
 * @author jiangping
 * @version $Id: RpcProtocolV2.java, v 0.1 2017-05-27 PM7:04:04 tao Exp $
 */
public class RpcProtocolV2 implements Protocol {
    /* because the design defect, the version is neglected in RpcProtocol, so we design RpcProtocolV2 and add protocol version. */
    public static final byte PROTOCOL_CODE       = (byte) 2;
    /** version 1, is the same with RpcProtocol */
    public static final byte PROTOCOL_VERSION_1  = (byte) 1;
    /** version 2, is the protocol version for RpcProtocolV2 */
    public static final byte PROTOCOL_VERSION_2  = (byte) 2;

    /**
     * in contrast to protocol v1,
     * one more byte is used as protocol version,
     * and another one is used as protocol switch
     */
    private static final int REQUEST_HEADER_LEN  = 22 + 2;
    private static final int RESPONSE_HEADER_LEN = 20 + 2;

    private CommandEncoder   encoder;
    private CommandDecoder   decoder;
    private HeartbeatTrigger heartbeatTrigger;
    private CommandHandler   commandHandler;
    private CommandFactory   commandFactory;

    public RpcProtocolV2() {
        this.encoder = new RpcCommandEncoderV2();
        this.decoder = new RpcCommandDecoderV2();
        this.commandFactory = new RpcCommandFactory();
        this.heartbeatTrigger = new RpcHeartbeatTrigger(this.commandFactory);
        this.commandHandler = new RpcCommandHandler(this.commandFactory);
    }

    // getters and setters omitted
}
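The static constant REQUEST_HEADER_LEN says that a request header occupies 24 bytes, and RESPONSE_HEADER_LEN says that a response header occupies 22 bytes; the comment above explains what each part of the request and response headers means. The constructor initializes the encoder and decoder for messages (RpcCommand), the factory that creates them, the heartbeat trigger, and the command handler.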
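The two header lengths can be checked by adding up the field widths shown in the layout comment. The sketch below is only an arithmetic check; the class HeaderLengthCheck and its member names are made up for this illustration and do not exist in SOFABolt.

```java
/** Hypothetical check (not SOFABolt code): sums the field widths taken from the layout comment. */
final class HeaderLengthCheck {
    // Widths in bytes of the fields shared by requests and responses, in wire order:
    // proto, ver1, type, cmdcode, ver2, requestId, codec, switch, classLen, headerLen, contentLen
    static final int[] COMMON_FIELDS = {1, 1, 1, 2, 1, 4, 1, 1, 2, 2, 4};
    static final int TIMEOUT = 4;     // present in requests only
    static final int RESP_STATUS = 2; // present in responses only

    static int sum(int[] widths) {
        int total = 0;
        for (int width : widths) {
            total += width;
        }
        return total;
    }

    static int requestHeaderLength() {
        return sum(COMMON_FIELDS) + TIMEOUT;
    }

    static int responseHeaderLength() {
        return sum(COMMON_FIELDS) + RESP_STATUS;
    }

    public static void main(String[] args) {
        System.out.println(requestHeaderLength());  // prints 24
        System.out.println(responseHeaderLength()); // prints 22
    }
}
```

The shared fields add up to 20 bytes; the 4-byte timeout brings a request to 24 and the 2-byte response status brings a response to 22, which matches `22 + 2` and `20 + 2` in the class.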
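3. Message carrier: RpcCommand
RpcCommand is another important concept in SOFABolt. Think of it as the carrier of a message: it is what actually travels over the network. Its class hierarchy is worth a look.
Here are the member fields of the RpcCommand class: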
public abstract class RpcCommand implements RemotingCommand {
    /** For serialization */
    private static final long serialVersionUID = -3570261012462596503L;

    /**
     * Code which stands for the command.
     */
    private CommandCode       cmdCode;
    /* command version */
    private byte              version          = 0x1;
    private byte              type;
    /**
     * Serializer, see the Configs.SERIALIZER_DEFAULT for the default serializer.
     * Notice: this can not be changed after initialized at runtime.
     */
    private byte              serializer       = ConfigManager.serializer;
    /**
     * protocol switches
     */
    private ProtocolSwitch    protocolSwitch   = new ProtocolSwitch();
    private int               id;
    /** The length of clazz */
    private short             clazzLength      = 0;
    private short             headerLength     = 0;
    private int               contentLength    = 0;
    /** The class of content */
    private byte[]            clazz;
    /** Header is used for transparent transmission. */
    private byte[]            header;
    /** The bytes format of the content of the command. */
    private byte[]            content;
    /** invoke context of each rpc command. */
    private InvokeContext     invokeContext;
}
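Two classes extend RpcCommand: RequestCommand and ResponseCommand. They merely initialize the member fields of RpcCommand for request and response messages respectively, which gives them different behavior and is somewhat reminiscent of the strategy pattern. For example, the constructor of ResponseCommand sets the type field of RpcCommand to RESPONSE.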
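4. Message codec: CommandDecoder/CommandEncoder
Anyone with a Netty background will guess that encoding and decoding happen on the outbound and inbound paths, and that is indeed the case in SOFABolt. The way the encoding and decoding is invoked is a little indirect, though. The ChannelHandler first reads the proto and version parts of the protocol header and uses them to fetch the matching Protocol from ProtocolManager. As the first article showed, the Protocol instances in ProtocolManager are registered before the server starts. Then, as described in section 2.1, the encoder or decoder held by that Protocol does the actual work, which is again the strategy pattern.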
RpcProtocolManager#initProtocols()
public static void initProtocols() {
    ProtocolManager.registerProtocol(new RpcProtocol(), RpcProtocol.PROTOCOL_CODE);
    ProtocolManager.registerProtocol(new RpcProtocolV2(), RpcProtocolV2.PROTOCOL_CODE);
}
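So why not encode and decode directly in the ChannelHandler?
Again for extensibility: to support a different encoding, you only need to implement CommandEncoder and CommandDecoder, without writing a new ChannelHandler.
Let's take a brief look at RpcCommandEncoderV2#encode():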
public void encode(ChannelHandlerContext ctx, Serializable msg, ByteBuf out) throws Exception {
    try {
        if (msg instanceof RpcCommand) {
            /*
             * proto: magic code for protocol
             * ver: version for protocol
             * type: request/response/request oneway
             * cmdcode: code for remoting command
             * ver2: version for remoting command
             * requestId: id of request
             * codec: code for codec
             * switch: function switch
             * (req)timeout: request timeout.
             * (resp)respStatus: response status
             * classLen: length of request or response class name
             * headerLen: length of header
             * contentLen: length of content
             * className
             * header
             * content
             * crc (optional)
             */
            int index = out.writerIndex();
            RpcCommand cmd = (RpcCommand) msg;
            out.writeByte(RpcProtocolV2.PROTOCOL_CODE);
            Attribute<Byte> version = ctx.channel().attr(Connection.VERSION);
            byte ver = RpcProtocolV2.PROTOCOL_VERSION_1;
            if (version != null && version.get() != null) {
                ver = version.get();
            }
            out.writeByte(ver);
            out.writeByte(cmd.getType());
            out.writeShort(((RpcCommand) msg).getCmdCode().value());
            out.writeByte(cmd.getVersion());
            out.writeInt(cmd.getId());
            out.writeByte(cmd.getSerializer());
            out.writeByte(cmd.getProtocolSwitch().toByte());
            if (cmd instanceof RequestCommand) {
                //timeout
                out.writeInt(((RequestCommand) cmd).getTimeout());
            }
            if (cmd instanceof ResponseCommand) {
                //response status
                ResponseCommand response = (ResponseCommand) cmd;
                out.writeShort(response.getResponseStatus().getValue());
            }
            out.writeShort(cmd.getClazzLength());
            out.writeShort(cmd.getHeaderLength());
            out.writeInt(cmd.getContentLength());
            if (cmd.getClazzLength() > 0) {
                out.writeBytes(cmd.getClazz());
            }
            if (cmd.getHeaderLength() > 0) {
                out.writeBytes(cmd.getHeader());
            }
            if (cmd.getContentLength() > 0) {
                out.writeBytes(cmd.getContent());
            }
            if (ver == RpcProtocolV2.PROTOCOL_VERSION_2
                && cmd.getProtocolSwitch().isOn(ProtocolSwitch.CRC_SWITCH_INDEX)) {
                // compute the crc32 and write to out
                byte[] frame = new byte[out.readableBytes()];
                out.getBytes(index, frame);
                out.writeInt(CrcUtil.crc32(frame));
            }
        } else {
            String warnMsg = "msg type [" + msg.getClass() + "] is not subclass of RpcCommand";
            logger.warn(warnMsg);
        }
    } catch (Exception e) {
        logger.error("Exception caught!", e);
        throw e;
    }
}
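All this method does is write the message into the ByteBuf according to the layout defined by the protocol header. As you would expect, the decoder does the reverse and reads the message according to the same layout.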
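To make the "reverse" direction concrete, here is a minimal sketch that reads a V2 request header back in the same field order the encoder writes it. It is not SOFABolt's RpcCommandDecoderV2: the class RequestHeaderSketch is hypothetical, java.nio.ByteBuffer is used in place of Netty's ByteBuf (both default to big-endian), the sample field values are arbitrary, and the optional CRC32 check is left out.

```java
import java.nio.ByteBuffer;

/** Hypothetical, simplified reader for the V2 request header; not the real decoder. */
final class RequestHeaderSketch {
    static final int REQUEST_HEADER_LEN = 24;

    byte proto, ver1, type, ver2, codec, protocolSwitch;
    short cmdCode, classLen, headerLen;
    int requestId, timeout, contentLen;

    /**
     * Reads one request header from a buffer in read mode.
     * Returns null, with the position unchanged, when the header or the
     * className/header/content bytes behind it have not fully arrived.
     * On success the position is left at the first byte of className.
     */
    static RequestHeaderSketch decode(ByteBuffer in) {
        if (in.remaining() < REQUEST_HEADER_LEN) {
            return null;
        }
        in.mark();
        RequestHeaderSketch h = new RequestHeaderSketch();
        h.proto = in.get();
        h.ver1 = in.get();
        h.type = in.get();
        h.cmdCode = in.getShort();
        h.ver2 = in.get();
        h.requestId = in.getInt();
        h.codec = in.get();
        h.protocolSwitch = in.get();
        h.timeout = in.getInt();
        h.classLen = in.getShort();
        h.headerLen = in.getShort();
        h.contentLen = in.getInt();
        long bodyLen = (long) h.classLen + h.headerLen + h.contentLen;
        if (in.remaining() < bodyLen) {
            in.reset(); // half packet: rewind and wait for the rest of the frame
            return null;
        }
        return h;
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(64);
        frame.put((byte) 2).put((byte) 1).put((byte) 1)          // proto, ver1, type (arbitrary sample values)
            .putShort((short) 1).put((byte) 1)                   // cmdcode, ver2
            .putInt(42).put((byte) 1).put((byte) 0)              // requestId, codec, switch
            .putInt(3000)                                        // timeout
            .putShort((short) 3).putShort((short) 0).putInt(2)   // classLen, headerLen, contentLen
            .put(new byte[] {'a', 'b', 'c'})                     // className bytes
            .put(new byte[] {'h', 'i'});                         // content bytes
        frame.flip();
        RequestHeaderSketch h = decode(frame);
        System.out.println("requestId=" + h.requestId + ", timeout=" + h.timeout);
    }
}
```

Checking that the whole frame has arrived before consuming anything is what lets a decoder cope with the half packets discussed in section 1.1.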
5. Summary
To wrap up, let's go back to server initialization:
this.bootstrap.childHandler(new ChannelInitializer<SocketChannel>() {
    @Override
    protected void initChannel(SocketChannel channel) {
        ChannelPipeline pipeline = channel.pipeline();
        // other code omitted
        pipeline.addLast("decoder", codec.newDecoder());
        pipeline.addLast("encoder", codec.newEncoder());
        // other code omitted
        pipeline.addLast("connectionEventHandler", connectionEventHandler);
        pipeline.addLast("handler", rpcHandler);
        createConnection(channel);
    }
});
A decoder and an encoder are added to the ChannelPipeline here. What are they?
5.1 RpcCodec
RpcCodec
public class RpcCodec implements Codec {
    @Override
    public ChannelHandler newEncoder() {
        // the constructor argument specifies the default protocol code
        return new ProtocolCodeBasedEncoder(ProtocolCode.fromBytes(RpcProtocolV2.PROTOCOL_CODE));
    }

    @Override
    public ChannelHandler newDecoder() {
        return new RpcProtocolDecoder(RpcProtocolManager.DEFAULT_PROTOCOL_CODE_LENGTH);
    }
}
Take ProtocolCodeBasedEncoder as an example:
5.2 ProtocolCodeBasedEncoder
public class ProtocolCodeBasedEncoder extends MessageToByteEncoder<Serializable> {
    /** default protocol code */
    protected ProtocolCode defaultProtocolCode;

    public ProtocolCodeBasedEncoder(ProtocolCode defaultProtocolCode) {
        super();
        this.defaultProtocolCode = defaultProtocolCode;
    }

    @Override
    protected void encode(ChannelHandlerContext ctx, Serializable msg, ByteBuf out)
                                                                                   throws Exception {
        Attribute<ProtocolCode> att = ctx.channel().attr(Connection.PROTOCOL);
        ProtocolCode protocolCode;
        if (att == null || att.get() == null) {
            protocolCode = this.defaultProtocolCode;
        } else {
            protocolCode = att.get();
        }
        // look up the protocol
        Protocol protocol = ProtocolManager.getProtocol(protocolCode);
        // get its encoder and encode the message
        protocol.getEncoder().encode(ctx, msg, out);
    }
}
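The lookup-then-delegate step in this encoder is the strategy pattern mentioned earlier. The sketch below reproduces only that idea without Netty; ProtocolDispatchSketch, its Encoder interface, and the string-based "encoding" are all assumptions made for illustration and are not SOFABolt APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of per-protocol encoder lookup; not SOFABolt code. */
final class ProtocolDispatchSketch {

    /** Minimal stand-in for CommandEncoder; the real one writes into a Netty ByteBuf. */
    interface Encoder {
        String encode(String msg);
    }

    // Registry keyed by protocol code, playing the role of ProtocolManager.
    private static final Map<Byte, Encoder> PROTOCOLS = new ConcurrentHashMap<>();

    static void register(byte protocolCode, Encoder encoder) {
        PROTOCOLS.put(protocolCode, encoder);
    }

    /** Uses the code bound to the connection when present, otherwise the default code. */
    static String encode(Byte connectionCode, byte defaultCode, String msg) {
        byte code = connectionCode != null ? connectionCode : defaultCode;
        Encoder encoder = PROTOCOLS.get(code);
        if (encoder == null) {
            throw new IllegalStateException("unknown protocol code " + code);
        }
        return encoder.encode(msg);
    }

    public static void main(String[] args) {
        register((byte) 1, msg -> "v1:" + msg);
        register((byte) 2, msg -> "v2:" + msg);
        System.out.println(encode(null, (byte) 2, "hello"));     // no code on the connection: default is used
        System.out.println(encode((byte) 1, (byte) 2, "hello")); // connection pinned to protocol 1
    }
}
```

Supporting a new protocol then means registering one more encoder under a new code; the dispatching handler itself stays untouched.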
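5.3 Flowchart
The flow is: the encoder placed in the ChannelPipeline (ProtocolCodeBasedEncoder) reads the protocol code from the channel attribute Connection.PROTOCOL, falls back to the default code when none is set, looks up the matching Protocol in ProtocolManager, and delegates to that Protocol's CommandEncoder. That should make everything clear, so I won't explain further.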