A Concise Introduction to Google Protocol Buffers (PB)

鲁峰2012

Preface: this article uses C++ on Windows to show how to install and use Protocol Buffers (hereafter PB). It can also serve as a reference for other languages and platforms.

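I. What Is PB

PB is an open-source data interchange format developed by Google. It is particularly well suited to exchanging objects and data structures between RPC endpoints. Comparable technologies include XML, JSON, and Thrift.

II. Why Use PB

Compared with XML, its main counterpart, PB's chief advantages are smaller data size (3 to 10 times smaller than XML) and faster parsing (20 to 100 times faster than XML). It is also simpler to use. PB's main disadvantage is readability: it produces binary data, which is far harder to read than XML's plain text, and editing that data requires code, whereas XML can be edited directly.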

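III. Installation

1. Download PB

Download the latest version of PB from https://code.google.com/p/protobuf/downloads/list. (Google Code has since been retired; the project is now hosted at https://github.com/protocolbuffers/protobuf.) Binaries are provided, but it is better to download the source code and build it yourself. The steps below use a Windows build as the example; Linux users can adapt them. (The Windows download package is protobuf-2.5.0.zip.)

2. Build

The vsprojects folder contains the solution protobuf.sln. It was created with VS2008; for an older version of VS, use the shell script convert2008to2005.sh in the same directory to downgrade it. Open the solution in VS and build it. The build produces one main executable and three .lib libraries: protoc.exe, libprotobuf.lib, libprotobuf-lite.lib, and libprotoc.lib. protoc.exe compiles .proto files; the three libraries are used when building the serialization and deserialization code (covered later).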

IV. Usage

1. Write the object description file (.proto file)

To use PB, you first write an object description file, as in the following example:

person.proto

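 // Package name. After compilation it becomes the C++ namespace.
 package mrPerson;

 // Message name. In this example the message is named Person.
 message Person
 {
   // Each field has a modifier, a type, a name, and an ID.
   optional string name = 1;
   optional int32 age = 2;
 }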

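The above is a simple object description file, and it reads a little like pseudocode. Each field has a modifier, a type, a name, and an ID. There are three modifiers: required, optional, and repeated, meaning that the field is mandatory, optional, or repeatable. The official documentation defines them as follows: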

Each field must be annotated with one of the following modifiers:

· required: a value for the field must be provided, otherwise the message will be considered "uninitialized". If libprotobuf is compiled in debug mode, serializing an uninitialized message will cause an assertion failure. In optimized builds, the check is skipped and the message will be written anyway. However, parsing an uninitialized message will always fail (by returning false from the parse method). Other than this, a required field behaves exactly like an optional field.

· optional: the field may or may not be set. If an optional field value isn't set, a default value is used. For simple types, you can specify your own default value, as we've done for the phone number type in the example. Otherwise, a system default is used: zero for numeric types, the empty string for strings, false for bools. For embedded messages, the default value is always the "default instance" or "prototype" of the message, which has none of its fields set. Calling the accessor to get the value of an optional (or required) field which has not been explicitly set always returns that field's default value.

· repeated: the field may be repeated any number of times (including zero). The order of the repeated values will be preserved in the protocol buffer. Think of repeated fields as dynamically sized arrays.

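In short: a required field must be given a value, otherwise parsing returns false; an optional field that has not been set yields its default value; a repeated field is an array. Note: Google's official PB documentation specifically points out that, based on experience inside Google, a good practice is to use only optional and repeated and to avoid required. This gives the best compatibility when the message format changes later.

The field types are listed below (the most commonly used are double, int64, int32, string, and bytes):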

.proto Type	Notes	C++ Type	Java Type	Python Type
double		double	double	float
float		float	float	float
int32	Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead.	int32	int	int
int64	Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead.	int64	long	int/long
uint32	Uses variable-length encoding.	uint32	int	int/long
uint64	Uses variable-length encoding.	uint64	long	int/long
sint32	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.	int32	int	int
sint64	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s.	int64	long	int/long
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2²⁸.	uint32	int	int
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2⁵⁶.	uint64	long	int/long
sfixed32	Always four bytes.	int32	int	int
sfixed64	Always eight bytes.	int64	long	int/long
bool		bool	boolean	boolean
string	A string must always contain UTF-8 encoded or 7-bit ASCII text.	string	String	str/unicode
bytes	May contain any arbitrary sequence of bytes.	string	ByteString	str
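
The notes in the table about variable-length encoding are easier to judge with the byte counts in front of you. The following is a minimal sketch of the base-128 varint and ZigZag schemes described in the protobuf encoding documentation. It is not the library's own code, and the helper names (EncodeVarint, EncodeInt32, ZigZag32, EncodeSint32) are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Base-128 varint: 7 payload bits per byte, least significant group first.
// The high bit is set on every byte except the last one.
std::string EncodeVarint(std::uint64_t value) {
  std::string out;
  while (value >= 0x80) {
    out.push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<char>(value));
  return out;
}

// An int32 field is sign-extended to 64 bits before it is varint-encoded,
// which is why a negative int32 always occupies ten bytes.
std::string EncodeInt32(std::int32_t value) {
  return EncodeVarint(
      static_cast<std::uint64_t>(static_cast<std::int64_t>(value)));
}

// ZigZag interleaves negative and positive values so that small magnitudes
// map to small unsigned numbers: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
std::uint32_t ZigZag32(std::int32_t value) {
  const std::uint32_t sign_mask = value < 0 ? 0xFFFFFFFFu : 0u;
  return (static_cast<std::uint32_t>(value) << 1) ^ sign_mask;
}

// A sint32 field is ZigZag-mapped first and then varint-encoded.
std::string EncodeSint32(std::int32_t value) {
  return EncodeVarint(ZigZag32(value));
}
```

With these helpers, EncodeInt32(-1) is ten bytes long while EncodeSint32(-1) is a single byte, which is the reason the table recommends sint32 for fields that are often negative. Likewise, a varint needs five bytes once the value reaches 2²⁸, so fixed32 (always four bytes) becomes the smaller choice for values that large.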

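The field ID exists mainly for compatibility. In the serialized data a field is identified by its ID, not by its name, so when a .proto file is extended, giving each new field an ID that has never been used keeps old and new versions able to read each other's data.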
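
To see what the field ID does on the wire, the sketch below hand-encodes the Person message from person.proto (name = 1, age = 2) following the protobuf encoding documentation. It is an illustration only; real programs should use the generated classes. The helper names (EncodeVarint, EncodeKey, EncodePerson) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// The two wire types this sketch needs (values from the encoding docs).
enum WireType : std::uint32_t { kVarint = 0, kLengthDelimited = 2 };

// Base-128 varint: 7 payload bits per byte, high bit marks "more bytes follow".
std::string EncodeVarint(std::uint64_t value) {
  std::string out;
  while (value >= 0x80) {
    out.push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<char>(value));
  return out;
}

// Every field starts with a key: varint of (field_id << 3) | wire_type.
// The field name never appears in the serialized data.
std::string EncodeKey(std::uint32_t field_id, WireType type) {
  return EncodeVarint((static_cast<std::uint64_t>(field_id) << 3) | type);
}

// Hand-encode Person { name = 1 (string), age = 2 (int32) }.
std::string EncodePerson(const std::string& name, std::int32_t age) {
  std::string out;
  out += EncodeKey(1, kLengthDelimited);  // key for "name"
  out += EncodeVarint(name.size());       // string length
  out += name;                            // string bytes
  out += EncodeKey(2, kVarint);           // key for "age"
  out += EncodeVarint(
      static_cast<std::uint64_t>(static_cast<std::int64_t>(age)));
  return out;
}
```

EncodePerson("a", 17) yields the five bytes 0A 01 61 10 11. Because only the IDs 1 and 2 are written, renaming a field leaves the bytes unchanged, while reusing an ID for a different meaning silently changes how existing data is interpreted. IDs 1 to 15 fit in a one-byte key; ID 16 and above need at least two bytes.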

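2. Compile the object description file into code

Use the command line "protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/person.proto" to compile the .proto file. Note: $SRC_DIR is the directory that contains the .proto file; $DST_DIR is the output directory for the generated .cc and .h files; "." means the current directory. When compilation finishes, two files, "person.pb.h" and "person.pb.cc", are generated under $DST_DIR.

3. Serialize and deserialize objects with PB

The code speaks for itself: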

 #include <iostream>
 #include <fstream>
 #include <string>
 #include "person.pb.h"
 using namespace std;
 using namespace mrPerson;

 // Link the protobuf runtime (MSVC only). libprotoc.lib and
 // libprotobuf-lite.lib are not needed by this program.
 #pragma comment(lib, "libprotobuf.lib")

 // Main function: builds a Person, round-trips it through a file and then
 // through an in-memory string, and prints the result of each round trip.
 int main(int argc, char* argv[]) {
   // Verify that the version of the library that we linked against is
   // compatible with the version of the headers we compiled against.
   GOOGLE_PROTOBUF_VERIFY_VERSION;

   // Define a Person object
   Person person;
   // Set its fields
   person.set_name("huangzhidan");
   person.set_age(17);

   // +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   // Serialize and deserialize using a file as the carrier
   // Write the object to a file
   fstream output("test_person.dat", ios::out | ios::trunc | ios::binary);
   if (!person.SerializeToOstream(&output))
   {
     cerr << "Failed to write person." << endl;
     return -1;
   }
   // Close (and therefore flush) the file before reading it back
   output.close();

   // Parse the file back into an object
   Person person_out;
   fstream input("test_person.dat", ios::in | ios::binary);
   if (!person_out.ParseFromIstream(&input))
   {
     cerr << "Failed to read person." << endl;
     return -1;
   }

   // Verify
   string name = person_out.name();
   int age = person_out.age();
   cout << "From file: " << name << ", " << age << endl;

   // +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   // Serialize and deserialize using an in-memory byte string as the carrier
   string streamData;
   // SerializeToString turns the object into a byte string, which can then
   // be sent over a socket
   if (!person.SerializeToString(&streamData))
   {
     cerr << "Failed to serialize person." << endl;
     return -1;
   }

   Person person_stream;
   if (!person_stream.ParseFromString(streamData))
   {
     cerr << "Failed to parse person." << endl;
     return -1;
   }

   // Verify
   name = person_stream.name();
   age = person_stream.age();
   cout << "From string: " << name << ", " << age << endl;

   // Optional: Delete all global objects allocated by libprotobuf.
   google::protobuf::ShutdownProtobufLibrary();

   return 0;
 }

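The code above shows how to serialize and deserialize an object in PB using a file and an in-memory byte string as the carrier (configure the library and header paths yourself). As you can see, carrying objects with PB is fairly easy, and the code is simple to write.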
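
One practical point when the byte string is sent over a socket: a serialized PB message does not record its own length, so the receiver needs another way to tell where one message ends and the next begins. A common approach is to prefix each message with its length. The sketch below assumes a 4-byte big-endian prefix and payloads smaller than 4 GiB; Frame and TryUnframe are hypothetical helpers, not part of the protobuf API, and the payload strings stand in for the output of SerializeToString.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Prepend a 4-byte big-endian length to a serialized payload.
std::string Frame(const std::string& payload) {
  const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
  std::string out;
  out.push_back(static_cast<char>((n >> 24) & 0xFF));
  out.push_back(static_cast<char>((n >> 16) & 0xFF));
  out.push_back(static_cast<char>((n >> 8) & 0xFF));
  out.push_back(static_cast<char>(n & 0xFF));
  out += payload;
  return out;
}

// Try to take one complete frame from the front of *buffer.
// On success, stores the payload, removes the frame, and returns true.
// Returns false (leaving *buffer untouched) if more bytes are needed.
bool TryUnframe(std::string* buffer, std::string* payload) {
  if (buffer->size() < 4) return false;
  const unsigned char* p =
      reinterpret_cast<const unsigned char*>(buffer->data());
  const std::uint32_t n = (static_cast<std::uint32_t>(p[0]) << 24) |
                          (static_cast<std::uint32_t>(p[1]) << 16) |
                          (static_cast<std::uint32_t>(p[2]) << 8) |
                          static_cast<std::uint32_t>(p[3]);
  if (buffer->size() - 4 < n) return false;
  payload->assign(*buffer, 4, n);
  buffer->erase(0, 4 + static_cast<std::size_t>(n));
  return true;
}
```

On the sending side this would wrap streamData before it is written to the socket; on the receiving side, bytes read from the socket are appended to a buffer and TryUnframe is called until it returns false, passing each extracted payload to ParseFromString.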

The following example uses PB to carry a more complex, nested object:

(1) addressbook.proto

 // See README.txt for information and build instructions.

 package tutorial;

 option java_package = "com.example.tutorial";
 option java_outer_classname = "AddressBookProtos";

 message Person {
   required string name = 1;
   required int32 id = 2;        // Unique ID number for this person.
   optional string email = 3;

   enum PhoneType {
     MOBILE = 0;
     HOME = 1;
     WORK = 2;
   }

   message PhoneNumber {
     required string number = 1;
     optional PhoneType type = 2 [default = HOME];
   }

   repeated PhoneNumber phone = 4;
 }

 // Our address book file is just one of these.
 message AddressBook {
   repeated Person person = 1;
 }

(2) Compile it into addressbook.pb.h and addressbook.pb.cc as described above

(3) Writing A Message

 #include <iostream>  
 #include <fstream>  
 #include <string>  
 #include "addressbook.pb.h"  
 using namespace std;  
    
 // This function fills in a Person message based on user input.
 void PromptForAddress(tutorial::Person* person) {
  cout << "Enter person ID number: ";  
  int id;  
  cin >> id;  
  person->set_id(id);  
  cin.ignore(256, '\n');  
    
  cout << "Enter name: ";  
  getline(cin, *person->mutable_name());  
    
  cout << "Enter email address (blank for none): ";  
  string email;  
  getline(cin, email);  
  if (!email.empty()) {  
    person->set_email(email);  
  }  
    
  while (true) {  
    cout << "Enter a phone number (or leave blank to finish):";  
    string number;  
    getline(cin, number);  
    if (number.empty()) {  
      break;  
    }  
    
    tutorial::Person::PhoneNumber* phone_number = person->add_phone();  
    phone_number->set_number(number);  
    
    cout << "Is this a mobile, home, or work phone? ";  
    string type;  
    getline(cin, type);  
    if (type == "mobile") {  
      phone_number->set_type(tutorial::Person::MOBILE);  
    } else if (type == "home") {  
      phone_number->set_type(tutorial::Person::HOME);  
    } else if (type == "work") {  
      phone_number->set_type(tutorial::Person::WORK);  
    } else {  
      cout << "Unknown phone type. Using default." << endl;  
    }  
  }  
 }  
    
 // Main function:  Reads the entire address book from a file,  
 //  adds one person based on user input, then writes it back out to the same  
 //  file.  
 int main(int argc, char* argv[]) {  
  // Verify that the version of the library that we linked against is  
  // compatible with the version of the headers we compiled against.  
  GOOGLE_PROTOBUF_VERIFY_VERSION;  
    
  if (argc != 2) {  
     cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;  
  }  
    
  tutorial::AddressBook address_book;  
    
  {  
    // Read the existing address book.  
    fstream input(argv[1], ios::in | ios::binary);  
    if (!input) {  
      cout << argv[1] << ": File not found.  Creating a new file." << endl;  
    } else if (!address_book.ParseFromIstream(&input)) {  
      cerr << "Failed to parse address book." << endl;  
      return -1;  
    }  
  }  
    
  // Add an address.  
  PromptForAddress(address_book.add_person());  
    
  {  
    // Write the new address book back to disk.  
    fstream output(argv[1], ios::out | ios::trunc | ios::binary);  
    if (!address_book.SerializeToOstream(&output)) {  
      cerr << "Failed to write address book." << endl;  
      return -1;  
    }  
  }  
    
  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();  
    
  return 0;  
 }  

(4) Reading A Message

 #include <iostream>  
 #include <fstream>  
 #include <string>  
 #include "addressbook.pb.h"  
 using namespace std;  
    
 // Iterates through all people in the AddressBook and prints info about them.
 void ListPeople(const tutorial::AddressBook& address_book) {
  for (int i = 0; i < address_book.person_size(); i++) {  
    const tutorial::Person& person = address_book.person(i);  
    
     cout << "Person ID: " << person.id() << endl;
     cout << "  Name: " << person.name() << endl;
     if (person.has_email()) {
       cout << "  E-mail address: " << person.email() << endl;
    }  
    
    for (int j = 0; j < person.phone_size(); j++) {  
      const tutorial::Person::PhoneNumber& phone_number = person.phone(j);  
    
      switch (phone_number.type()) {  
        case tutorial::Person::MOBILE:  
           cout << "  Mobile phone #: ";  
           break;  
        case tutorial::Person::HOME:  
           cout << "  Home phone #: ";  
           break;  
        case tutorial::Person::WORK:  
           cout << "  Work phone #: ";  
           break;  
      }  
      cout << phone_number.number() << endl;  
    }  
  }  
 }  
    
 // Main function:  Reads the entire address book from a file and prints all
 //  the information inside.  
 int main(int argc, char* argv[]) {  
  // Verify that the version of the library that we linked against is  
  // compatible with the version of the headers we compiled against.  
  GOOGLE_PROTOBUF_VERIFY_VERSION;  
    
  if (argc != 2) {  
     cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;  
  }  
    
  tutorial::AddressBook address_book;  
    
  {  
    // Read the existing address book.  
    fstream input(argv[1], ios::in | ios::binary);  
    if (!address_book.ParseFromIstream(&input)) {  
      cerr << "Failed to parse address book." << endl;  
      return -1;  
    }  
  }  
    
  ListPeople(address_book);  
    
  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();  
    
  return 0;  
 }  

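Note how the enum type and the nested message are used in the examples above.

The above covers only the basics of PB. It also has many advanced uses, as the official tutorial explains: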

Advanced Usage

Protocol buffers have uses that go beyond simple accessors and serialization. Be sure to explore the C++ API reference to see what else you can do with them.

One key feature provided by protocol message classes is reflection. You can iterate over the fields of a message and manipulate their values without writing your code against any specific message type. One very useful way to use reflection is for converting protocol messages to and from other encodings, such as XML or JSON. A more advanced use of reflection might be to find differences between two messages of the same type, or to develop a sort of "regular expressions for protocol messages" in which you can write expressions that match certain message contents. If you use your imagination, it's possible to apply Protocol Buffers to a much wider range of problems than you might initially expect!

Reflection is provided by the Message::Reflection interface.