protobuf: Techniques

Techniques

This topic describes some commonly-used design patterns for dealing with Protocol Buffers.

You can also send design and usage questions to the Protocol Buffers discussion group.

Streaming Multiple Messages

If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read that many bytes into a separate buffer, then parse from that buffer. (If you want to avoid copying bytes to a separate buffer, check out the CodedInputStream class, available in both C++ and Java, which can be told to limit reads to a certain number of bytes.)
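The size-prefix approach described above can be sketched in Python as follows. This is a minimal illustration, not a library API: the helper names (`write_varint`, `read_varint`, `write_delimited`, `read_delimited`) are hypothetical, the choice of a varint for the size prefix is an assumption (any agreed-upon size encoding would work), and the payloads are opaque byte strings standing in for serialized messages.

```python
import io


def write_varint(stream, value):
    """Write a non-negative integer as a base-128 varint (assumed prefix encoding)."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            stream.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            stream.write(bytes([byte]))
            return


def read_varint(stream):
    """Read a varint; return None if the stream ends cleanly before a prefix."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a length prefix")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the size of the payload, then the payload itself."""
    write_varint(stream, len(payload))
    stream.write(payload)


def read_delimited(stream):
    """Yield each payload in turn: read the size, then exactly that many bytes."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a message")
        yield payload


# Stand-ins for serialized messages; real code would serialize and parse here.
messages = [b"first", b"", b"x" * 300]
buffer = io.BytesIO()
for message in messages:
    write_delimited(buffer, message)

buffer.seek(0)
decoded = list(read_delimited(buffer))
```

In real use, each payload yielded by `read_delimited` would be handed to the parser for the corresponding message type.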

Large Data Sets

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

That said, Protocol Buffers are great for handling individual messages within a large data set. Usually, large data sets are a collection of small pieces, where each small piece is structured data. Even though Protocol Buffers cannot handle the entire set at once, using Protocol Buffers to encode each piece greatly simplifies your problem: now all you need is to handle a set of byte strings rather than a set of structures.

Protocol Buffers do not include any built-in support for large data sets because different situations call for different solutions. Sometimes a simple list of records will do while other times you want something more like a database. Each solution should be developed as a separate library, so that only those who need it need pay the costs.
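As one possible shape for the "simple list of records" case, the sketch below stores each record as an independent byte string and keeps a list of offsets so that any record can be fetched without reading the whole set. Everything here is an assumption made for illustration: the helper names `append_record` and `read_record`, the fixed 4-byte little-endian length prefix, and the in-memory `io.BytesIO` standing in for a file. The payloads stand in for individually serialized messages.

```python
import io
import struct


def append_record(stream, payload):
    """Append one record (4-byte length prefix + bytes); return its start offset."""
    offset = stream.seek(0, io.SEEK_END)
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)
    return offset


def read_record(stream, offset):
    """Read back the single record that starts at the given offset."""
    stream.seek(offset)
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("stream ended inside a length prefix")
    (size,) = struct.unpack("<I", header)
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside a record")
    return payload


# Each record is a small, independently encoded piece of the data set.
records = [b"alpha", b"beta", b"gamma"]
store = io.BytesIO()
index = [append_record(store, record) for record in records]

# Random access: fetch only the third record, without touching the others.
third = read_record(store, index[2])
```

A data set that needs lookups by key, updates, or concurrent access would call for something closer to a database, which is the other end of the range the text describes.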

Self-describing Messages

Protocol Buffers do not contain descriptions of their own types. Thus, given only a raw message without the corresponding .proto file defining its type, it is difficult to extract any useful data.

However, the contents of a .proto file can themselves be represented using protocol buffers. The file src/google/protobuf/descriptor.proto in the source code package defines the message types involved. protoc can output a FileDescriptorSet, which represents a set of .proto files, using the --descriptor_set_out option. With this, you can define a self-describing protocol message like so:

syntax = "proto3";

import "google/protobuf/any.proto";
import "google/protobuf/descriptor.proto";

message SelfDescribingMessage {
  // Set of FileDescriptorProtos which describe the type and its dependencies.
  google.protobuf.FileDescriptorSet descriptor_set = 1;

  // The message and its type, encoded as an Any message.
  google.protobuf.Any message = 2;
}

By using classes like DynamicMessage (available in C++ and Java), you can then write tools which can manipulate SelfDescribingMessages.

All that said, this functionality is not included in the Protocol Buffer library because we have never had a use for it inside Google.

This technique requires support for dynamic messages using descriptors. Check that your platforms support this feature before using self-describing messages.
