protobuf: Techniques

Latest recommended article published on 2023-12-07 20:51:19

꧁白杨树下꧂

Original link: https://protobuf.dev/programming-guides/techniques/

Techniques

This topic describes some commonly-used design patterns for dealing with Protocol Buffers.

You can also send design and usage questions to the Protocol Buffers discussion group.

Streaming Multiple Messages

If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read the bytes into a separate buffer, then parse from that buffer. (If you want to avoid copying bytes to a separate buffer, check out the CodedInputStream class (in both C++ and Java) which can be told to limit reads to a certain number of bytes.)
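The size-prefix approach described above can be sketched with the Python standard library alone. This is a minimal illustration, not the library's own implementation: the helper names (`encode_varint`, `read_varint`, `write_delimited`, `read_delimited`) are hypothetical, the choice of a varint for the size prefix is an assumption (a fixed-width integer would work just as well), and the payloads are plain byte strings standing in for whatever `SerializeToString()` would return for a real message.

```python
import io
from typing import BinaryIO, Iterator, Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("size must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint; return None if the stream ends before it starts."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None  # clean end of stream between messages
            raise EOFError("stream ended inside a size prefix")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write the payload size first, then the payload itself."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each payload: read the size, then exactly that many bytes."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a message")
        yield payload


# Assumption: these byte strings stand in for serialized messages.
buffer = io.BytesIO()
for serialized in (b"\x08\x96\x01", b"", b"\x12\x07testing"):
    write_delimited(buffer, serialized)
buffer.seek(0)
messages = list(read_delimited(buffer))
```

Each element of `messages` would then be handed to the parser for the expected message type. Note that an empty message is a valid payload, which is why the reader distinguishes a zero size from end of stream. On sockets or pipes a single `read` may return fewer bytes than requested, so a production reader would loop until the full size has arrived.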

Large Data Sets

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

That said, Protocol Buffers are great for handling individual messages within a large data set. Usually, large data sets are a collection of small pieces, where each small piece is structured data. Even though Protocol Buffers cannot handle the entire set at once, using Protocol Buffers to encode each piece greatly simplifies your problem: now all you need is to handle a set of byte strings rather than a set of structures.

Protocol Buffers do not include any built-in support for large data sets because different situations call for different solutions. Sometimes a simple list of records will do while other times you want something more like a database. Each solution should be developed as a separate library, so that only those who need it need pay the costs.
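As one illustration of the "more like a database" end of that spectrum, the sketch below keeps each serialized piece as an opaque byte string in SQLite, using only the Python standard library. Everything here is an assumption made for the example: the `records` table, the string keys, and the helper names `open_store`, `put`, and `get` are hypothetical, and the payloads are stand-ins for the output of `SerializeToString()`.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store that maps a key to one serialized message."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "key TEXT PRIMARY KEY, payload BLOB NOT NULL)"
    )
    return conn


def put(conn: sqlite3.Connection, key: str, payload: bytes) -> None:
    """Insert or overwrite the byte string stored under key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO records (key, payload) VALUES (?, ?)",
            (key, payload),
        )


def get(conn: sqlite3.Connection, key: str) -> Optional[bytes]:
    """Return the stored byte string, or None if the key is absent."""
    row = conn.execute(
        "SELECT payload FROM records WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else bytes(row[0])


# Assumption: these byte strings stand in for serialized messages.
store = open_store()
put(store, "user/1", b"\x08\x96\x01")
put(store, "user/2", b"\x12\x07testing")
first = get(store, "user/1")
```

The store never looks inside a payload; it only moves byte strings around, and the caller parses each one into its message type after retrieval. When the data is only ever read in order, a simple list of size-prefixed records in a flat file may be the cheaper choice.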

Self-describing Messages

Protocol Buffers do not contain descriptions of their own types. Thus, given only a raw message without the corresponding .proto file defining its type, it is difficult to extract any useful data.
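To make concrete how little a raw message reveals, here is a minimal sketch of a schema-less scan over the wire format, again in standard-library Python. The helper names `decode_varint_at` and `scan_fields` are hypothetical, and the scan handles only wire types 0, 1, 2, and 5 (the deprecated group types are rejected). All it can recover is a field number, a wire type, and an uninterpreted value.

```python
from typing import List, Tuple, Union

Field = Tuple[int, int, Union[int, bytes]]


def decode_varint_at(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def scan_fields(data: bytes) -> List[Field]:
    """List (field_number, wire_type, raw_value) for each top-level field."""
    fields: List[Field] = []
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint_at(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        value: Union[int, bytes]
        if wire_type == 0:  # varint
            value, pos = decode_varint_at(data, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            size, pos = decode_varint_at(data, pos)
            value, pos = data[pos:pos + size], pos + size
        elif wire_type == 5:  # fixed 32-bit
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(data):
            raise ValueError("truncated field")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 as varint 150, then field 2 as the 7 bytes "testing".
raw = bytes.fromhex("089601120774657374696e67")
scanned = scan_fields(raw)
```

The scan cannot tell whether the varint in field 1 is an `int32`, a `bool`, an enum, or a zigzag-encoded `sint32`, nor whether the bytes in field 2 are a string, a `bytes` value, a nested message, or a packed repeated field. Field names are not present in the encoding at all. That missing information is what the `.proto` file supplies.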

However, the contents of a .proto file can itself be represented using protocol buffers. The file src/google/protobuf/descriptor.proto in the source code package defines the message types involved. protoc can output a FileDescriptorSet—which represents a set of .proto files—using the --descriptor_set_out option. With this, you can define a self-describing protocol message like so:

syntax = "proto3";

import "google/protobuf/any.proto";
import "google/protobuf/descriptor.proto";

message SelfDescribingMessage {
  // Set of FileDescriptorProtos which describe the type and its dependencies.
  google.protobuf.FileDescriptorSet descriptor_set = 1;

  // The message and its type, encoded as an Any message.
  google.protobuf.Any message = 2;
}

By using classes like DynamicMessage (available in C++ and Java), you can then write tools which can manipulate SelfDescribingMessages.

All that said, this functionality is not included in the Protocol Buffer library because we have never had a use for it inside Google.

This technique requires support for dynamic messages using descriptors. Check that your platforms support this feature before using self-describing messages.
