Protocol Buffers (proto2) and the C++ API

Preface

This post is based on two official articles, Protocol Buffers - Language Guide (proto2) and Protocol Buffer Basics: C++, and uses examples from TensorRT as additional reference.

Protocol Buffer

Protocol buffers (protobuf for short) are, like XML, a format for storing structured data. Compared with XML, they produce smaller files and are faster to parse.

Protobuf syntax

To define the data structures to be serialized, create a .proto file. In a .proto file, each message describes one data structure. Below is addressbook.proto:

syntax = "proto2";

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}

package

See Packages: to avoid name clashes, we can add a package specifier to a .proto file. In C++ the package becomes a namespace.

scalar value types

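See Scalar Value Types: bool, float and double map to the C++ types of the same name, and int32, int64 and uint32 map to the corresponding fixed-width integer types (for example, the generated accessors shown later use int32_t). string maps to std::string, and bytes is also a std::string in C++.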

message

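A message defines a data structure to be serialized.

Inside it we can declare multiple fields, giving each field a name and a type.

In the example above, we defined a message named Person that contains four fields: name, id, email and phones.

Consider the phones field in Person: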

repeated PhoneNumber phones = 4;

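Here PhoneNumber is the field type and phones is the field name. 4 is the "field number", which identifies the field in the binary-encoded message.

Note that field numbers start at 1.

We can also see that another message, PhoneNumber, is defined inside the Person message. This shows that nested definitions are allowed.

The type of phones is PhoneNumber, which shows that a field's type can be not only a scalar value type but also a user-defined message.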
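
As a rough illustration of how a field number identifies a field on the wire: in the protobuf encoding, each field is preceded by a key computed as (field_number << 3) | wire_type. The sketch below is self-contained and does not use the protobuf library; MakeTag and the WireType enum are names chosen here for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Two of the wire types defined by the protobuf encoding.
enum WireType : std::uint32_t {
  kVarint = 0,           // int32, int64, uint32, bool, enum
  kLengthDelimited = 2   // string, bytes, embedded messages
};

// The key written before a field's value:
// the field number shifted left by 3 bits, OR-ed with the wire type.
constexpr std::uint32_t MakeTag(std::uint32_t field_number, WireType type) {
  return (field_number << 3) | static_cast<std::uint32_t>(type);
}
```

Under this rule, phones = 4 (an embedded message, hence length-delimited) gets the key 0x22, and id = 2 (a varint) gets the key 0x10.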

required, optional, repeated

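required, optional and repeated are field modifiers; see Specifying Field Rules.

  • required: a value must be provided for the field.
  • optional: a value may be omitted. If none is provided, the default value is used: 0 for numeric types, the empty string for strings, false for bools, and the first listed value for enums. According to Optional Fields And Default Values, the default of an optional field can also be set explicitly, with syntax such as:
optional PhoneType type = 2 [default = HOME];
  • repeated: the field may appear zero or more times, and the order of the values is preserved. It can be thought of as a dynamically sized array.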

enum

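See Enumerations: a protobuf enum is conceptually the same as a C++ enum, except that every enumerator must be given an explicit numeric value. In this example the values start at 0.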

Compiling

Once we have a .proto file, we can compile it with the protoc compiler. The general form is:

protoc -I=$SRC_DIR --cpp_out=$DST_DIR --java_out=$DST_DIR --python_out=$DST_DIR $SRC_DIR/file.proto

Since our target language is C++, we only need to specify cpp_out:

protoc --cpp_out=`pwd` addressbook.proto

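This command generates two files, addressbook.pb.h and addressbook.pb.cc, which were 877 and 1387 lines long respectively in the author's run (the exact length depends on the protoc version). Each message defined in the .proto file becomes a class after compilation. The .pb.h file declares our classes (the messages in the .proto file), and the .pb.cc file contains their implementation.

Let us see which classes are defined in addressbook.pb.h: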

class Person_PhoneNumber :
    public ::google::protobuf::Message /* @@protoc_insertion_point(class_definition:tutorial.Person.PhoneNumber) */ {
    //...
}
class Person :
    public ::google::protobuf::Message /* @@protoc_insertion_point(class_definition:tutorial.Person) */ {
    //...
}
class AddressBook :
    public ::google::protobuf::Message /* @@protoc_insertion_point(class_definition:tutorial.AddressBook) */ {
    //...
}

For each of these three classes, we can find the corresponding message in addressbook.proto.

MessageLite/Message class

See class MessageLite. The header file and namespace of the MessageLite class are:

class MessageLite
#include <google/protobuf/message_lite.h>
namespace google::protobuf

Its description reads:

Interface to light weight protocol messages.

This interface is implemented by all protocol message objects. 
Non-lite messages additionally implement the Message interface, 
which is a subclass of MessageLite.

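MessageLite has a subclass named Message. Every protocol message object implements the MessageLite interface; non-lite messages additionally implement Message, while lite messages implement only MessageLite.

The three classes defined in addressbook.pb.h (Person_PhoneNumber, Person and AddressBook) bear this out: each of them derives from ::google::protobuf::Message.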

C++ protocol buffer API

The following is excerpted from the Person class in addressbook.pb.h:

// name
inline bool has_name() const;
inline void clear_name();
inline const ::std::string& name() const;
inline void set_name(const ::std::string& value);
inline void set_name(const char* value);
inline ::std::string* mutable_name();

// id
inline bool has_id() const;
inline void clear_id();
inline int32_t id() const;
inline void set_id(int32_t value);

// email
inline bool has_email() const;
inline void clear_email();
inline const ::std::string& email() const;
inline void set_email(const ::std::string& value);
inline void set_email(const char* value);
inline ::std::string* mutable_email();

// phones
inline int phones_size() const;
inline void clear_phones();
inline const ::google::protobuf::RepeatedPtrField< ::tutorial::Person_PhoneNumber >& phones() const;
inline ::google::protobuf::RepeatedPtrField< ::tutorial::Person_PhoneNumber >* mutable_phones();
inline const ::tutorial::Person_PhoneNumber& phones(int index) const;
inline ::tutorial::Person_PhoneNumber* mutable_phones(int index);
inline ::tutorial::Person_PhoneNumber* add_phones();

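These are the accessors that the compiler automatically generates for each field of the Person class. By function, they fall into the following groups:

  • getter: has the same name as the field and returns its value.
  • setter: set_ followed by the field name; sets the field.
  • has_: checks whether the field has been set; only for required or optional fields.
  • clear_: resets the field to its initial state.
  • mutable_: similar to the getter, but returns a pointer to the field so that it can be modified in place. It is generated for string, message and repeated fields, not for singular numeric, bool or enum fields (in the excerpt above, id has no mutable_id).
  • _size: only for repeated fields; returns the number of elements.
  • add_: only for repeated fields; appends a new element. For a repeated message field such as phones, it returns a pointer to the new element, whose fields then need to be set.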
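
To make the accessor groups concrete, here is a hand-written sketch that mimics their observable behavior. It is not protoc output: PersonSketch is a hypothetical class, phones is simplified to a list of strings instead of PhoneNumber messages, and std::deque is used only so that pointers returned by add_phones() stay valid when more elements are added.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>

// Hypothetical stand-in for a generated message class.
class PersonSketch {
 public:
  // Singular numeric field: has_, clear_, getter, setter (no mutable_).
  bool has_id() const { return has_id_; }
  void clear_id() { id_ = 0; has_id_ = false; }
  std::int32_t id() const { return id_; }
  void set_id(std::int32_t value) { id_ = value; has_id_ = true; }

  // Singular string field: the same four, plus mutable_.
  bool has_name() const { return has_name_; }
  void clear_name() { name_.clear(); has_name_ = false; }
  const std::string& name() const { return name_; }
  void set_name(const std::string& value) { name_ = value; has_name_ = true; }
  // Asking for a mutable pointer marks the field as set.
  std::string* mutable_name() { has_name_ = true; return &name_; }

  // Repeated field: _size, clear_, indexed getter, add_.
  int phones_size() const { return static_cast<int>(phones_.size()); }
  void clear_phones() { phones_.clear(); }
  const std::string& phones(int index) const { return phones_[index]; }
  // Appends an empty element and returns a pointer to it for filling in.
  std::string* add_phones() { phones_.emplace_back(); return &phones_.back(); }

 private:
  std::int32_t id_ = 0;
  bool has_id_ = false;
  std::string name_;
  bool has_name_ = false;
  std::deque<std::string> phones_;
};
```

A caller would write, for example, p.set_id(123), then *p.mutable_name() = "vivaldi", then *p.add_phones() = "123456", mirroring the calls used with the real generated Person class later in this post.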

enum

We can use:

Person::PhoneType

to refer to the PhoneType enum type in the generated code.

And use:

Person::MOBILE
Person::HOME
Person::WORK

to refer to the enum values in the generated code.

nested class

In addressbook.proto, the PhoneNumber message is defined inside Person.

We can refer to the PhoneNumber class in the generated code in either of two ways:

Person::PhoneNumber
Person_PhoneNumber
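
Generated headers typically make both spellings work by defining the nested type at namespace scope under the underscore name and adding an alias inside the enclosing class; the nested enum is handled the same way. The following is a minimal sketch of that arrangement, not actual protoc output: the namespace sketch is a placeholder for tutorial, the class bodies are left empty, and the exact generated code varies between protoc versions.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {  // placeholder for the real namespace, tutorial

// Nested names are flattened with an underscore at namespace scope.
enum Person_PhoneType {
  Person_PhoneType_MOBILE = 0,
  Person_PhoneType_HOME = 1,
  Person_PhoneType_WORK = 2
};

class Person_PhoneNumber {};

class Person {
 public:
  // Aliases that make the nested spelling Person::... available.
  typedef Person_PhoneNumber PhoneNumber;
  typedef Person_PhoneType PhoneType;
  static const PhoneType MOBILE = Person_PhoneType_MOBILE;
  static const PhoneType HOME = Person_PhoneType_HOME;
  static const PhoneType WORK = Person_PhoneType_WORK;
};

}  // namespace sketch

// Both spellings name the same class.
static_assert(std::is_same<sketch::Person::PhoneNumber,
                           sketch::Person_PhoneNumber>::value,
              "Person::PhoneNumber aliases Person_PhoneNumber");
```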

Standard Message Methods

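See Protocol Buffer Basics: C++ - Standard Message Methods, which covers methods shared by every message class, such as IsInitialized(), DebugString(), CopyFrom() and Clear().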

Parsing and Serialization

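See Protocol Buffer Basics: C++ - Parsing and Serialization, which covers SerializeToString(), ParseFromString(), SerializeToOstream() and ParseFromIstream(). The last two are used in the programs that follow.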

Writing a message

Create a file named addressbook_write.cpp with the following code:

#include <iostream>
#include <fstream>
#include <string>
#include "addressbook.pb.h"
using namespace std;

// This function fills in a Person message based on user input.
void PromptForAddress(tutorial::Person* person) {
  cout << "Enter person ID number: ";
  int id;
  cin >> id;
  person->set_id(id);
  cin.ignore(256, '\n');

  cout << "Enter name: ";
  getline(cin, *person->mutable_name());

  cout << "Enter email address (blank for none): ";
  string email;
  getline(cin, email);
  if (!email.empty()) {
    person->set_email(email);
  }

  while (true) {
    cout << "Enter a phone number (or leave blank to finish): ";
    string number;
    getline(cin, number);
    if (number.empty()) {
      break;
    }

    tutorial::Person::PhoneNumber* phone_number = person->add_phones();
    phone_number->set_number(number);

    cout << "Is this a mobile, home, or work phone? ";
    string type;
    getline(cin, type);
    if (type == "mobile") {
      phone_number->set_type(tutorial::Person::MOBILE);
    } else if (type == "home") {
      phone_number->set_type(tutorial::Person::HOME);
    } else if (type == "work") {
      phone_number->set_type(tutorial::Person::WORK);
    } else {
      cout << "Unknown phone type.  Using default." << endl;
    }
  }
}

// Main function:  Reads the entire address book from a file,
//   adds one person based on user input, then writes it back out to the same
//   file.
int main(int argc, char* argv[]) {
  // Verify that the version of the library that we linked against is
  // compatible with the version of the headers we compiled against.
  GOOGLE_PROTOBUF_VERIFY_VERSION;

  if (argc != 2) {
    cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;
  }

  tutorial::AddressBook address_book;

  {
    // Read the existing address book.
    fstream input(argv[1], ios::in | ios::binary);
    if (!input) {
      cout << argv[1] << ": File not found.  Creating a new file." << endl;
    } else if (!address_book.ParseFromIstream(&input)) {
      cerr << "Failed to parse address book." << endl;
      return -1;
    }
  }

  // Add an address.
  PromptForAddress(address_book.add_people());

  {
    // Write the new address book back to disk.
    fstream output(argv[1], ios::out | ios::trunc | ios::binary);
    if (!address_book.SerializeToOstream(&output)) {
      cerr << "Failed to write address book." << endl;
      return -1;
    }
  }

  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();

  return 0;
}

Compile:

g++ addressbook_write.cpp addressbook.pb.cc -lprotobuf -pthread -std=c++11 -o writeproto

Compilation produces an executable named writeproto. Run it with:

./writeproto addressbook.bin

The author created a record for vivaldi:

addressbook.bin: File not found.  Creating a new file.
Enter person ID number: 123
Enter name: vivaldi
Enter email address (blank for none): vivaldi@gmail.com
Enter a phone number (or leave blank to finish): 123456
Is this a mobile, home, or work phone? mobile
Enter a phone number (or leave blank to finish): 123457
Is this a mobile, home, or work phone? work
Enter a phone number (or leave blank to finish): 123458
Is this a mobile, home, or work phone? home
Enter a phone number (or leave blank to finish): 123459
Is this a mobile, home, or work phone? home
Enter a phone number (or leave blank to finish):

When the program finishes, it leaves behind a binary file named addressbook.bin.

Reading a message

Create a file named addressbook_read.cpp with the following code:

#include <iostream>
#include <fstream>
#include <string>
#include "addressbook.pb.h"
using namespace std;

// Iterates through all people in the AddressBook and prints info about them.
void ListPeople(const tutorial::AddressBook& address_book) {
  for (int i = 0; i < address_book.people_size(); i++) {
    const tutorial::Person& person = address_book.people(i);

    cout << "Person ID: " << person.id() << endl;
    cout << "  Name: " << person.name() << endl;
    if (person.has_email()) {
      cout << "  E-mail address: " << person.email() << endl;
    }

    for (int j = 0; j < person.phones_size(); j++) {
      const tutorial::Person::PhoneNumber& phone_number = person.phones(j);

      switch (phone_number.type()) {
        case tutorial::Person::MOBILE:
          cout << "  Mobile phone #: ";
          break;
        case tutorial::Person::HOME:
          cout << "  Home phone #: ";
          break;
        case tutorial::Person::WORK:
          cout << "  Work phone #: ";
          break;
      }
      cout << phone_number.number() << endl;
    }
  }
}

// Main function:  Reads the entire address book from a file and prints all
//   the information inside.
int main(int argc, char* argv[]) {
  // Verify that the version of the library that we linked against is
  // compatible with the version of the headers we compiled against.
  GOOGLE_PROTOBUF_VERIFY_VERSION;

  if (argc != 2) {
    cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;
  }

  tutorial::AddressBook address_book;

  {
    // Read the existing address book.
    fstream input(argv[1], ios::in | ios::binary);
    if (!address_book.ParseFromIstream(&input)) {
      cerr << "Failed to parse address book." << endl;
      return -1;
    }
  }

  ListPeople(address_book);

  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();

  return 0;
}

Compile:

g++ addressbook_read.cpp addressbook.pb.cc -lprotobuf -pthread -std=c++11 -o readproto

Run:

./readproto addressbook.bin

Output:

Person ID: 123
  Name: vivaldi
  E-mail address: vivaldi@gmail.com
  Mobile phone #: 123456
  Work phone #: 123457
  Home phone #: 123458
  Home phone #: 123459

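Examples from TensorRT

If we compile TensorRT/parsers/caffe/proto/trtcaffe.proto from TensorRT, it generates two files, trtcaffe.pb.cc and trtcaffe.pb.h, each more than thirty thousand lines long.

Below are a few examples from TensorRT, grouped by proto language feature.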

package

TensorRT/include/NvCaffeParser.h uses:

trtcaffe::LayerParameter

to refer to the LayerParameter message defined in trtcaffe.proto:

package trtcaffe;
message LayerParameter {
    //...
}

getter

The function getBlobsSize in TensorRT/parsers/caffe/caffeWeightFactory/caffeWeightFactory.cpp uses:

//const trtcaffe::NetParameter& mMsg;
/**/mMsg.layer(i)/**/;

to access element i of the mMsg.layer array. From trtcaffe.proto:

message NetParameter {
  //...
  repeated LayerParameter layer = 100;
  //...
}

The layer field carries the repeated modifier, so it can be treated as an array.

enum

The function sizeOfCaffeType in TensorRT/parsers/caffe/caffeWeightFactory/caffeWeightFactory.cpp uses:

//trtcaffe::Type type;
/**/type == trtcaffe::FLOAT/**/;
/**/type == trtcaffe::FLOAT16/**/;

to refer to the Type enum type and its values. See trtcaffe.proto:

enum Type {
  DOUBLE = 0;
  FLOAT = 1;
  FLOAT16 = 2;
  INT = 3;  // math not supported
  UINT = 4;  // math not supported
}

has_

TensorRT/parsers/caffe/caffeParser/opParsers/parseInnerProduct.cpp uses:

ILayer* parseInnerProduct(INetworkDefinition& network, const trtcaffe::LayerParameter& msg, CaffeWeightFactory& weightFactory, BlobNameToTensor& tensors)
{
    const trtcaffe::InnerProductParameter& p = msg.inner_product_param();
    
    /**/!p.has_bias_term() || p.bias_term() /**/;
    
    //...
}

to call the getter bias_term() and the has_ function has_bias_term(), both of which protoc generated automatically for the InnerProductParameter class.

From trtcaffe.proto:

message InnerProductParameter {
  //...
  optional bool bias_term = 2 [default = true]; // whether to have bias terms
  //...
}
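
Because bias_term is declared with [default = true], its getter returns true whenever the field has not been set, so the condition in parseInnerProduct treats a missing field the same way as an explicit true. Here is a minimal hand-written sketch of that behavior. InnerProductSketch and UsesBias are hypothetical names used only for illustration; they are neither protoc output nor TensorRT code.

```cpp
#include <cassert>

// Hypothetical stand-in for a message with the field
//   optional bool bias_term = 2 [default = true];
class InnerProductSketch {
 public:
  bool has_bias_term() const { return has_bias_term_; }
  // Returns the declared default (true) while the field is unset.
  bool bias_term() const { return bias_term_; }
  void set_bias_term(bool value) { bias_term_ = value; has_bias_term_ = true; }
  // Restores the declared default and marks the field as unset.
  void clear_bias_term() { bias_term_ = true; has_bias_term_ = false; }

 private:
  bool bias_term_ = true;       // the declared default value
  bool has_bias_term_ = false;  // whether a value was explicitly set
};

// The same condition as in the parseInnerProduct excerpt.
inline bool UsesBias(const InnerProductSketch& p) {
  return !p.has_bias_term() || p.bias_term();
}
```

Only an explicit set_bias_term(false) makes UsesBias return false; an unset field and an explicit true both yield true.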

_size

The function checkBlobs in TensorRT/parsers/caffe/caffeParser/opParsers/opParsers.h uses:

//const trtcaffe::LayerParameter& msg
msg.bottom_size()
msg.top_size()

to get the sizes of the bottom and top arrays. From trtcaffe.proto:

message LayerParameter {
  optional string name = 1; // the layer name
  optional string type = 2; // the layer type
  repeated string bottom = 3; // the name of each bottom blob
  repeated string top = 4; // the name of each top blob
  
  //...
}

References

Protocol Buffers - Language Guide(proto2)

Protocol Buffer Basics: C++

class MessageLite
