thrift的TCompactProtocol和TMemoryBuffer类分析

前言

架构

Apache Thrift API CS架构
Thrift包含一套完整的栈来创建客户端和服务端程序。[7]顶层部分是由Thrift定义生成的代码。而服务则由这个文件客户端和处理器代码生成。在生成的代码里会创建不同于内建类型的数据结构,并将其作为结果发送。协议和传输层是运行时库的一部分。有了Thrift,就可以定义一个服务或改变通讯和传输协议,而无需重新编译代码。除了客户端部分之外,Thrift还包括服务器基础设施来集成协议和传输,如阻塞、非阻塞及多线程服务器。栈中作为I/O基础的部分对于不同的语言则有不同的实现。

Thrift支持众多通讯协议:[7]

  1. TBinaryProtocol – 一种简单的二进制格式,简单,但没有为空间效率而优化。比文本协议处理起来更快,但更难于调试。

  2. TCompactProtocol – 更紧凑的二进制格式,处理起来通常同样高效。

  3. TDebugProtocol – 一种人类可读的文本格式,用来协助调试。

  4. TDenseProtocol – 与TCompactProtocol类似,将传输数据的元信息剥离。

  5. TJSONProtocol – 使用JSON对数据编码。

  6. TSimpleJSONProtocol – 一种只写协议,它不能被Thrift解析,因为它使用JSON时丢弃了元数据。适合用脚本语言来解析。[8]
    支持的传输协议有:

  7. TFileTransport – 该传输协议会写文件。

  8. TFramedTransport – 当使用一个非阻塞服务器时,要求使用这个传输协议。它按帧来发送数据,其中每一帧的开头是长度信息。

  9. TMemoryTransport – 使用存储器映射输入输出。(Java的实现使用了一个简单的ByteArrayOutputStream。)

  10. TSocket – 使用阻塞的套接字I/O来传输。

  11. TZlibTransport – 用zlib执行压缩。用于连接另一个传输协议。
    Thrift还提供众多的服务器,包括:

  12. TNonblockingServer – 一个多线程服务器,它使用非阻塞I/O(Java的实现使用了NIO通道)。TFramedTransport必须跟这个服务器配套使用。

  13. TSimpleServer – 一个单线程服务器,它使用标准的阻塞I/O。测试时很有用。

  14. TThreadPoolServer – 一个多线程服务器,它使用标准的阻塞I/O。

优点
Thrift一些已经明确的优点包括:[来源请求]

跟一些替代选择,比如SOAP相比,跨语言序列化的代价更低,因为它使用二进制格式。
它有一个又瘦又干净的库,没有编码框架,没有XML配置文件。
绑定感觉很自然。例如,Java使用java.util.ArrayList;C++使用std::vectorstd::string。
应用层通讯格式与序列化层通讯格式是完全分离的。它们都可以独立修改。
预定义的序列化格式包括:二进制格式、对HTTP友好的格式,以及紧凑的二进制格式。
兼作跨语言文件序列化。
协议使用软版本号机制软件版本管理[需要解释]。Thrift不要求一个中心化的和显式的版本号机制,例如主版本号/次版本号。松耦合的团队可以轻松地控制RPC调用的演进。
没有构建依赖也不含非标准化的软件。不存在不兼容的软件许可证混用的情况。
创建一个Thrift服务
Thrift由C++编写,但可以为众多语言创建代码。要创建一个Thrift服务,必须写一些Thrift文件来描述它,为目标语言生成代码,并且写一些代码来启动服务器及从客户端调用它。下面就是一个这样的描述文件的代码示例:

enum PhoneType {
 HOME,
 WORK,
 MOBILE,
 OTHER
}

struct Phone {
 1: i32 id,
 2: string number,
 3: PhoneType type
}

正文

今天我们主要分析TCompactProtocol和TMemoryBuffer

一, TMemoryBuffer类分析

1. resetBuffer方法

申请内存 调用initCommon方法调用std::malloc

class TMemoryBuffer : public TVirtualTransport<TMemoryBuffer, TBufferBase> {
private:
  // Common initialization done by all constructors.
  void initCommon(uint8_t* buf, uint32_t size, bool owner, uint32_t wPos) {
    if (buf == NULL && size != 0) {
      assert(owner);
      buf = (uint8_t*)std::malloc(size);
      if (buf == NULL) {
        throw std::bad_alloc();
      }
    }

    buffer_ = buf;
    bufferSize_ = size;

    rBase_ = buffer_;
    rBound_ = buffer_ + wPos;
    // TODO(dreiss): Investigate NULL-ing this if !owner.
    wBase_ = buffer_ + wPos;
    wBound_ = buffer_ + bufferSize_;

    owner_ = owner;

    // rBound_ is really an artifact.  In principle, it should always be
    // equal to wBase_.  We update it in a few places (computeRead, etc.).
  }

public:
  static const uint32_t defaultSize = 1024;

  /**
   * This enum specifies how a TMemoryBuffer should treat
   * memory passed to it via constructors or resetBuffer.
   *
   * OBSERVE:
   *   TMemoryBuffer will simply store a pointer to the memory.
   *   It is the callers responsibility to ensure that the pointer
   *   remains valid for the lifetime of the TMemoryBuffer,
   *   and that it is properly cleaned up.
   *   Note that no data can be written to observed buffers.
   *
   * COPY:
   *   TMemoryBuffer will make an internal copy of the buffer.
   *   The caller has no responsibilities.
   *
   * TAKE_OWNERSHIP:
   *   TMemoryBuffer will become the "owner" of the buffer,
   *   and will be responsible for freeing it.
   *   The membory must have been allocated with malloc.
   */
  enum MemoryPolicy { OBSERVE = 1, COPY = 2, TAKE_OWNERSHIP = 3 };

  /**
   * Construct a TMemoryBuffer with a default-sized buffer,
   * owned by the TMemoryBuffer object.
   */
  TMemoryBuffer() { initCommon(NULL, defaultSize, true, 0); }

  /**
   * Construct a TMemoryBuffer with a buffer of a specified size,
   * owned by the TMemoryBuffer object.
   *
   * @param sz  The initial size of the buffer.
   */
  TMemoryBuffer(uint32_t sz) { initCommon(NULL, sz, true, 0); }

  /**
   * Construct a TMemoryBuffer with buf as its initial contents.
   *
   * @param buf    The initial contents of the buffer.
   *               Note that, while buf is a non-const pointer,
   *               TMemoryBuffer will not write to it if policy == OBSERVE,
   *               so it is safe to const_cast<uint8_t*>(whatever).
   * @param sz     The size of @c buf.
   * @param policy See @link MemoryPolicy @endlink .
   */
  TMemoryBuffer(uint8_t* buf, uint32_t sz, MemoryPolicy policy = OBSERVE) {
    if (buf == NULL && sz != 0) {
      throw TTransportException(TTransportException::BAD_ARGS,
                                "TMemoryBuffer given null buffer with non-zero size.");
    }

    switch (policy) {
    case OBSERVE:
    case TAKE_OWNERSHIP:
      initCommon(buf, sz, policy == TAKE_OWNERSHIP, sz);
      break;
    case COPY:
      initCommon(NULL, sz, true, 0);
      this->write(buf, sz);
      break;
    default:
      throw TTransportException(TTransportException::BAD_ARGS,
                                "Invalid MemoryPolicy for TMemoryBuffer");
    }
  }

  ~TMemoryBuffer() {
    if (owner_) {
      std::free(buffer_);
    }
  }

  bool isOpen() { return true; }

  bool peek() { return (rBase_ < wBase_); }

  void open() {}

  void close() {}

  // TODO(dreiss): Make bufPtr const.
  void getBuffer(uint8_t** bufPtr, uint32_t* sz) {
    *bufPtr = rBase_;
    *sz = static_cast<uint32_t>(wBase_ - rBase_);
  }

  std::string getBufferAsString() {
    if (buffer_ == NULL) {
      return "";
    }
    uint8_t* buf;
    uint32_t sz;
    getBuffer(&buf, &sz);
    return std::string((char*)buf, (std::string::size_type)sz);
  }

  void appendBufferToString(std::string& str) {
    if (buffer_ == NULL) {
      return;
    }
    uint8_t* buf;
    uint32_t sz;
    getBuffer(&buf, &sz);
    str.append((char*)buf, sz);
  }

  void resetBuffer() {
    rBase_ = buffer_;
    rBound_ = buffer_;
    wBase_ = buffer_;
    // It isn't safe to write into a buffer we don't own.
    if (!owner_) {
      wBound_ = wBase_;
      bufferSize_ = 0;
    }
  }

  /// See constructor documentation.
  void resetBuffer(uint8_t* buf, uint32_t sz, MemoryPolicy policy = OBSERVE) {
    // Use a variant of the copy-and-swap trick for assignment operators.
    // This is sub-optimal in terms of performance for two reasons:
    //   1/ The constructing and swapping of the (small) values
    //      in the temporary object takes some time, and is not necessary.
    //   2/ If policy == COPY, we allocate the new buffer before
    //      freeing the old one, precluding the possibility of
    //      reusing that memory.
    // I doubt that either of these problems could be optimized away,
    // but the second is probably no a common case, and the first is minor.
    // I don't expect resetBuffer to be a common operation, so I'm willing to
    // bite the performance bullet to make the method this simple.

    // Construct the new buffer.
    TMemoryBuffer new_buffer(buf, sz, policy);
    // Move it into ourself.
    this->swap(new_buffer);
    // Our old self gets destroyed.
  }

  /// See constructor documentation.
  void resetBuffer(uint32_t sz) {
    // Construct the new buffer.
    TMemoryBuffer new_buffer(sz);
    // Move it into ourself.
    this->swap(new_buffer);
    // Our old self gets destroyed.
  }

  std::string readAsString(uint32_t len) {
    std::string str;
    (void)readAppendToString(str, len);
    return str;
  }

  uint32_t readAppendToString(std::string& str, uint32_t len);

  // return number of bytes read
  uint32_t readEnd() {
    // This cast should be safe, because buffer_'s size is a uint32_t
    uint32_t bytes = static_cast<uint32_t>(rBase_ - buffer_);
    if (rBase_ == wBase_) {
      resetBuffer();
    }
    return bytes;
  }

  // Return number of bytes written
  uint32_t writeEnd() {
    // This cast should be safe, because buffer_'s size is a uint32_t
    return static_cast<uint32_t>(wBase_ - buffer_);
  }

  uint32_t available_read() const {
    // Remember, wBase_ is the real rBound_.
    return static_cast<uint32_t>(wBase_ - rBase_);
  }

  uint32_t available_write() const { return static_cast<uint32_t>(wBound_ - wBase_); }

  // Returns a pointer to where the client can write data to append to
  // the TMemoryBuffer, and ensures the buffer is big enough to accommodate a
  // write of the provided length.  The returned pointer is very convenient for
  // passing to read(), recv(), or similar. You must call wroteBytes() as soon
  // as data is written or the buffer will not be aware that data has changed.
  uint8_t* getWritePtr(uint32_t len) {
    ensureCanWrite(len);
    return wBase_;
  }

  // Informs the buffer that the client has written 'len' bytes into storage
  // that had been provided by getWritePtr().
  void wroteBytes(uint32_t len);

  /*
   * TVirtualTransport provides a default implementation of readAll().
   * We want to use the TBufferBase version instead.
   */
  uint32_t readAll(uint8_t* buf, uint32_t len) { return TBufferBase::readAll(buf, len); }

protected:
  void swap(TMemoryBuffer& that) {
    using std::swap;
    swap(buffer_, that.buffer_);
    swap(bufferSize_, that.bufferSize_);

    swap(rBase_, that.rBase_);
    swap(rBound_, that.rBound_);
    swap(wBase_, that.wBase_);
    swap(wBound_, that.wBound_);

    swap(owner_, that.owner_);
  }

  // Make sure there's at least 'len' bytes available for writing.
  void ensureCanWrite(uint32_t len);

  // Compute the position and available data for reading.
  void computeRead(uint32_t len, uint8_t** out_start, uint32_t* out_give);

  uint32_t readSlow(uint8_t* buf, uint32_t len);

  void writeSlow(const uint8_t* buf, uint32_t len);

  const uint8_t* borrowSlow(uint8_t* buf, uint32_t* len);

  // Data buffer
  uint8_t* buffer_;

  // Allocated buffer size
  uint32_t bufferSize_;

  // Is this object the owner of the buffer?
  bool owner_;

  // Don't forget to update constrctors, initCommon, and swap if
  // you add new members.
};
}

二, TCompactProtocol类 分析

template <class Transport_>
class TCompactProtocolT : public TVirtualProtocol<TCompactProtocolT<Transport_> > {

protected:
  static const int8_t PROTOCOL_ID = (int8_t)0x82u;
  static const int8_t VERSION_N = 1;
  static const int8_t VERSION_MASK = 0x1f;       // 0001 1111
  static const int8_t TYPE_MASK = (int8_t)0xE0u; // 1110 0000
  static const int8_t TYPE_BITS = 0x07;          // 0000 0111
  static const int32_t TYPE_SHIFT_AMOUNT = 5;

  Transport_* trans_;

  /**
   * (Writing) If we encounter a boolean field begin, save the TField here
   * so it can have the value incorporated.
   */
  struct {
    const char* name;
    TType fieldType;
    int16_t fieldId;
  } booleanField_;

  /**
   * (Reading) If we read a field header, and it's a boolean field, save
   * the boolean value here so that readBool can use it.
   */
  struct {
    bool hasBoolValue;
    bool boolValue;
  } boolValue_;

  /**
   * Used to keep track of the last field for the current and previous structs,
   * so we can do the delta stuff.
   */

  std::stack<int16_t> lastField_;
  int16_t lastFieldId_;

public:
  TCompactProtocolT(boost::shared_ptr<Transport_> trans)
    : TVirtualProtocol<TCompactProtocolT<Transport_> >(trans),
      trans_(trans.get()),
      lastFieldId_(0),
      string_limit_(0),
      string_buf_(NULL),
      string_buf_size_(0),
      container_limit_(0) {
    booleanField_.name = NULL;
    boolValue_.hasBoolValue = false;
  }

  TCompactProtocolT(boost::shared_ptr<Transport_> trans,
                    int32_t string_limit,
                    int32_t container_limit)
    : TVirtualProtocol<TCompactProtocolT<Transport_> >(trans),
      trans_(trans.get()),
      lastFieldId_(0),
      string_limit_(string_limit),
      string_buf_(NULL),
      string_buf_size_(0),
      container_limit_(container_limit) {
    booleanField_.name = NULL;
    boolValue_.hasBoolValue = false;
  }

  ~TCompactProtocolT() { free(string_buf_); }

  /**
   * Writing functions
   */

  virtual uint32_t writeMessageBegin(const std::string& name,
                                     const TMessageType messageType,
                                     const int32_t seqid);

  uint32_t writeStructBegin(const char* name);

  uint32_t writeStructEnd();

  uint32_t writeFieldBegin(const char* name, const TType fieldType, const int16_t fieldId);

  uint32_t writeFieldStop();

  uint32_t writeListBegin(const TType elemType, const uint32_t size);

  uint32_t writeSetBegin(const TType elemType, const uint32_t size);

  virtual uint32_t writeMapBegin(const TType keyType, const TType valType, const uint32_t size);

  uint32_t writeBool(const bool value);

  uint32_t writeByte(const int8_t byte);

  uint32_t writeI16(const int16_t i16);

  uint32_t writeI32(const int32_t i32);

  uint32_t writeI64(const int64_t i64);

  uint32_t writeDouble(const double dub);

  uint32_t writeString(const std::string& str);

  uint32_t writeBinary(const std::string& str);

  /**
  * These methods are called by structs, but don't actually have any wired
  * output or purpose
  */
  virtual uint32_t writeMessageEnd() { return 0; }
  uint32_t writeMapEnd() { return 0; }
  uint32_t writeListEnd() { return 0; }
  uint32_t writeSetEnd() { return 0; }
  uint32_t writeFieldEnd() { return 0; }

protected:
  int32_t writeFieldBeginInternal(const char* name,
                                  const TType fieldType,
                                  const int16_t fieldId,
                                  int8_t typeOverride);
  uint32_t writeCollectionBegin(const TType elemType, int32_t size);
  uint32_t writeVarint32(uint32_t n);
  uint32_t writeVarint64(uint64_t n);
  uint64_t i64ToZigzag(const int64_t l);
  uint32_t i32ToZigzag(const int32_t n);
  inline int8_t getCompactType(const TType ttype);

public:
  uint32_t readMessageBegin(std::string& name, TMessageType& messageType, int32_t& seqid);

  uint32_t readStructBegin(std::string& name);

  uint32_t readStructEnd();

  uint32_t readFieldBegin(std::string& name, TType& fieldType, int16_t& fieldId);

  uint32_t readMapBegin(TType& keyType, TType& valType, uint32_t& size);

  uint32_t readListBegin(TType& elemType, uint32_t& size);

  uint32_t readSetBegin(TType& elemType, uint32_t& size);

  uint32_t readBool(bool& value);
  // Provide the default readBool() implementation for std::vector<bool>
  using TVirtualProtocol<TCompactProtocolT<Transport_> >::readBool;

  uint32_t readByte(int8_t& byte);

  uint32_t readI16(int16_t& i16);

  uint32_t readI32(int32_t& i32);

  uint32_t readI64(int64_t& i64);

  uint32_t readDouble(double& dub);

  uint32_t readString(std::string& str);

  uint32_t readBinary(std::string& str);

  /*
   *These methods are here for the struct to call, but don't have any wire
   * encoding.
   */
  uint32_t readMessageEnd() { return 0; }
  uint32_t readFieldEnd() { return 0; }
  uint32_t readMapEnd() { return 0; }
  uint32_t readListEnd() { return 0; }
  uint32_t readSetEnd() { return 0; }

protected:
  uint32_t readVarint32(int32_t& i32);
  uint32_t readVarint64(int64_t& i64);
  int32_t zigzagToI32(uint32_t n);
  int64_t zigzagToI64(uint64_t n);
  TType getTType(int8_t type);

  // Buffer for reading strings, save for the lifetime of the protocol to
  // avoid memory churn allocating memory on every string read
  int32_t string_limit_;
  uint8_t* string_buf_;
  int32_t string_buf_size_;
  int32_t container_limit_;
};

typedef TCompactProtocolT<TTransport> TCompactProtocol;

结语

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值