API Best Practices

A future-proof API is surprisingly hard to get right. The suggestions in this document make trade-offs to favor long-term, bug-free evolution.

Updated for proto3. Patches welcome!

This doc is a complement to Proto Best Practices. It’s not a prescription for Java/C++/Go and other APIs.

If you see a proto straying from these guidelines in a code review, point the author to this topic and help spread the word.

Note

These guidelines are just that and many have documented exceptions. For example, if you’re writing a performance-critical backend, you might want to sacrifice flexibility or safety for speed. This topic will help you better understand the trade-offs and make a decision that works for your situation.

Precisely, Concisely Document Most Fields and Messages

Chances are good your proto will be inherited and used by people who don’t know what you were thinking when you wrote or modified it. Document each field in terms that will be useful to a new team-member or client with little knowledge of your system.

Some concrete examples:

// Bad: Option to enable Foo
// Good: Configuration controlling the behavior of the Foo feature.
message FeatureFooConfig {
  // Bad: Sets whether the feature is enabled
  // Good: Required field indicating whether the Foo feature
  // is enabled for account_id.  Must be false if account_id's
  // FOO_OPTIN Gaia bit is not set.
  optional bool enabled;
}

// Bad: Foo object.
// Good: Client-facing representation of a Foo (what/foo) exposed in APIs.
message Foo {
  // Bad: Title of the foo.
  // Good: Indicates the user-supplied title of this Foo, with no
  // normalization or escaping.
  // An example title: "Picture of my cat in a box <3 <3 !!!"
  optional string title [(max_length) = 512];
}

// Bad: Foo config.
// Less-Bad: If the most useful comment is re-stating the name, better to omit
// the comment.
FooConfig foo_config = 3;

Document the constraints, expectations and interpretation of each field in as few words as possible.

You can use custom proto annotations. See Custom Options to define cross-language constants like max_length in the example above. Supported in proto2 and proto3.

Over time, documentation of an interface can get longer and longer. The length takes away from the clarity. When the documentation is genuinely unclear, fix it, but look at it holistically and aim for brevity.

Use Different Messages for Wire and Storage

If a top-level proto you expose to clients is the same one you store on disk, you’re headed for trouble. More and more binaries will depend on your API over time, making it harder to change. You’ll want the freedom to change your storage format without impacting your clients. Layer your code so that modules deal either with client protos, storage protos, or translation.

Why? You might want to swap your underlying storage system. You might want to normalize—or denormalize—data differently. You might realize that parts of your client-exposed proto make sense to store in RAM while other parts make sense to go on disk.

When it comes to protos nested one or more levels within a top-level request or response, the case for separating storage and wire protos isn’t as strong, and depends on how closely you’re willing to couple your clients to those protos.

There’s a cost in maintaining the translation layer, but it quickly pays off once you have clients and have to do your first storage changes.

You might be tempted to share protos and diverge “when you need to.” With a perceived high cost to diverge and no clear place to put internal fields, your API will accrue fields clients either don’t understand or begin to depend on without your knowledge.

By starting with separate proto files, your team will know where to add internal fields without polluting your API. In the early days, the wire proto can be tag-for-tag identical with an automatic translation layer (think: byte copying or proto reflection). Proto annotations can also power an automatic translation layer.
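
As a rough sketch (the message and field names here are hypothetical, not taken from any real API), the wire and storage protos might start out tag-for-tag compatible, with internal details living only in the storage proto:

// Wire proto exposed to clients, for example in foo_service.proto.
message Foo {
  optional string title = 1;
  optional string owner_id = 2;
}

// Storage proto written to disk, for example in foo_storage.proto.
// It starts out tag-for-tag compatible with the wire Foo so a simple
// translation layer (byte copy or proto reflection) can map between them.
message StoredFoo {
  optional string title = 1;
  optional string owner_id = 2;

  // Internal-only fields accumulate here without polluting the API.
  optional int64 shard_id = 100;
  optional int64 last_compaction_timestamp_ms = 101;
}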

The following are exceptions to the rule:

  • If the proto field is one of a common type, such as google.type or google.protobuf, then using that type both as storage and API is acceptable.

  • If your service is extremely performance-sensitive, it may be worth trading flexibility for execution speed. If your service doesn’t have millions of QPS with millisecond latency, you’re probably not the exception.

  • If all of the following are true:

    • your service is the storage system
    • your system doesn’t make decisions based on your clients’ structured data
    • your system simply stores, loads, and perhaps provides queries at your client’s request

    Note that if you are implementing something like a logging system or a proto-based wrapper around a generic storage system, then you probably want to aim to have your clients’ messages transit into your storage backend as opaquely as possible so that you don’t create a dependency nexus. Consider using extensions or Encode Opaque Data in Strings by Web-safe Encoding Binary Proto Serialization.

For Mutations, Support Partial Updates or Append-Only Updates, Not Full Replaces

Don’t make an UpdateFooRequest that only takes a Foo.

If a client doesn’t preserve unknown fields, they will not have the newest fields of GetFooResponse, leading to data loss on a round-trip. Some systems don’t preserve unknown fields. Proto2 and proto3 implementations do preserve unknown fields unless the application drops them explicitly. In general, public APIs should drop unknown fields on the server side to prevent security attacks via unknown fields. For example, garbage unknown fields may cause a server to fail when it starts to use them as new fields in the future.

Absent documentation, handling of optional fields is ambiguous. Will UpdateFoo clear the field? That leaves you open to data loss when the client doesn’t know about the field. Does it not touch the field? Then how can clients clear it? Neither is good.

Fix #1: Use an Update Field-mask

Have your client pass which fields it wants to modify and include only those fields in the update request. Your server leaves other fields alone and updates only those specified by the mask. In general, the structure of your mask should mirror the structure of the response proto; that is, if Foo contains Bar, FooMask contains BarMask.
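
A minimal sketch of such a request, using the well-known google.protobuf.FieldMask type (the Foo-related names are illustrative, reusing the Foo message from the examples above):

import "google/protobuf/field_mask.proto";

message UpdateFooRequest {
  // The new values. Only the fields named in update_mask are applied;
  // everything else on the stored Foo is left untouched.
  optional Foo foo = 1;

  // Paths of the fields to update, for example "title" or "bar.baz".
  optional google.protobuf.FieldMask update_mask = 2;
}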

Fix #2: Expose More Narrow Mutations That Change Individual Pieces

For example, instead of UpdateEmployeeRequest, you might have: PromoteEmployeeRequest, SetEmployeePayRequest, TransferEmployeeRequest, etc.
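
One of those narrow mutations might look like the following sketch (all names are hypothetical):

// Narrow mutations like this are easier to validate, authorize, and audit
// than a general UpdateEmployee that accepts arbitrary field changes.
message PromoteEmployeeRequest {
  optional string employee_id = 1;
  optional int32 new_level = 2;
  optional string justification = 3;
}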

Custom update methods are easier to monitor, audit, and secure than a very flexible update method. They’re also easier to implement and call. A large number of them can increase the cognitive load of an API.

Don’t Include Primitive Types in a Top-level Request or Response Proto

Many of the pitfalls described elsewhere in this doc are solved with this rule. For example:

Telling clients that a repeated field is unset in storage versus not-populated in this particular call can be done by wrapping the repeated field in a message.

Common request options that are shared between requests naturally fall out of following this rule. Read and write field masks fall out of this.

Your top-level proto should almost always be a container for other messages that can grow independently.

Even when you only need a single primitive type today, having it wrapped in a message gives you a clear path to expand that type and share the type among other methods that return similar values. For example:

message MultiplicationResponse {
  // Bad: What if you later want to return complex numbers and have an
  // AdditionResponse that returns the same multi-field type?
  optional double result;


  // Good: Other methods can share this type and it can grow as your
  // service adds new features (units, confidence intervals, etc.).
  optional NumericResult result;
}

message NumericResult {
  optional double real_value;
  optional double complex_value;
  optional UnitType units;
}

One exception to top-level primitives: Opaque strings (or bytes) that encode a proto but are only built and parsed on the server. Continuation tokens, version info tokens and IDs can all be returned as strings if the string is actually an encoding of a structured proto.

Never Use Booleans for Something That Has Two States Now, but Might Have More Later

If you are using boolean for a field, make sure that the field is indeed describing just two possible states (for all time, not just now and the near future). Often, the flexibility of an enum, int, or message turns out to be worth it.

For example, in returning a stream of posts a developer may need to indicate whether a post should be rendered in two-columns or not based on the current mocks from UX. Even though a boolean is all that’s needed today, nothing prevents UX from introducing two-row posts, three-column posts or four-square posts in a future version.

message GooglePlusPost {
  // Bad: Whether to render this post across two columns.
  optional bool big_post;

  // Good: Rendering hints for clients displaying this post.
  // Clients should use this to decide how prominently to render this
  // post. If absent, assume a default rendering.
  optional LayoutConfig layout_config;
}

message Photo {
  // Bad: True if it's a GIF.
  optional bool gif;

  // Good: File format of the referenced photo (for example, GIF, WebP, PNG).
  optional PhotoType type;
}

Be cautious about adding states to an enum that conflates concepts.

If a state introduces a new dimension to the enum or implies multiple application behaviors, you almost certainly want another field.

Rarely Use an Integer Field for an ID

It’s tempting to use an int64 as an identifier for an object. Opt instead for a string.

This lets you change your ID space if you need to and reduces the chance of collisions. 2^64 isn’t as big as it used to be.

You can also encode a structured identifier as a string which encourages clients to treat it as an opaque blob. You still must have a proto backing the string, but you can serialize the proto to a string field (encoded as web-safe Base64) which removes any of the internal details from the client-exposed API. In this case follow the guidelines below.

message GetFooRequest {
  // Which Foo to fetch.
  optional string foo_id;
}

// Serialized and websafe-base64-encoded into the GetFooRequest.foo_id field.
message InternalFooRef {
  // Only one of these two is set. Foos that have already been
  // migrated use the spanner_foo_id and Foos still living in
  // Caribou Storage Server have a classic_foo_id.
  optional bytes spanner_foo_id;
  optional int64 classic_foo_id;
}

If you start off with your own serialization scheme to represent your IDs as strings, things can get weird quickly. That’s why it’s often best to start with an internal proto that backs your string field.

Don’t Encode Data in a String That You Expect a Client to Construct or Parse

It’s less efficient over the wire, more work for the consumer of the proto, and confusing for someone reading your documentation. Your clients also have to wonder about the encoding: Are lists comma-separated? Did I escape this untrusted data correctly? Are numbers base-10? Better to have clients send an actual message or primitive type. It’s more compact over the wire and clearer for your clients.

This gets especially bad when your service acquires clients in several languages. Now each will have to choose the right parser or builder—or worse—write one.
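
For instance, rather than packing a list of tags into one comma-separated string, a repeated field keeps the structure explicit (the field names below are illustrative):

message SetFooTagsRequest {
  optional string foo_id = 1;

  // Bad: a comma-separated list such as "red,large,sale" that every client
  // must build and every server must parse and un-escape.
  // optional string tags_csv = 2;

  // Good: structured data; nothing to escape or parse on either side.
  repeated string tags = 3;
}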

More generally, choose the right primitive type. See the Scalar Value Types table in the Protocol Buffer Language Guide.

Returning HTML in a Front-End Proto

With a JavaScript client, it’s tempting to return HTML or JSON in a field of your API. This is a slippery slope towards tying your API to a specific UI. Here are three concrete dangers:

  • A “scrappy” non-web client will end up parsing your HTML or JSON to get the data they want, leading to fragility if you change formats and vulnerabilities if their parsing is bad.
  • Your web-client is now vulnerable to an XSS exploit if that HTML is ever returned unsanitized.
  • The tags and classes you’re returning expect a particular style-sheet and DOM structure. From release to release, that structure will change, and you risk a version-skew problem where the JavaScript client is older than the server and the HTML the server returns no longer renders properly on old clients. For projects that release often, this is not an edge case.

Other than the initial page load, it’s usually better to return data and use client-side templating to construct HTML on the client.

Encode Opaque Data in Strings by Web-Safe Encoding Binary Proto Serialization

If you do encode opaque data in a client-visible field (continuation tokens, serialized IDs, version infos, and so on), document that clients should treat it as an opaque blob. Always use binary proto serialization, never text-format or something of your own devising for these fields. When you need to expand the data encoded in an opaque field, you’ll find yourself reinventing protocol buffer serialization if you’re not already using it.

Define an internal proto to hold the fields that will go in the opaque field (even if you only need one field), serialize this internal proto to bytes, then web-safe base-64 encode the result into your string field.

One rare exception to using proto serialization: Very occasionally, the compactness wins from a carefully constructed alternative format are worth it.

Don’t Include Fields that Your Clients Can’t Possibly Have a Use for

The API you expose to clients should only be for describing how to interact with your system. Including anything else in it adds cognitive overhead to someone trying to understand it.

Returning debug data in response protos used to be a common practice, but we have a better way. RPC response extensions (also called “side channels”) let you describe your client interface with one proto and your debugging surface with another.

Similarly, returning experiment names in response protos used to be a logging convenience–the unwritten contract was the client would send those experiments back on subsequent actions. The accepted way of accomplishing the same is to do log joining in the analysis pipeline.

One exception:

If you need continuous, real-time analytics and are on a small machine budget, running log joins might be prohibitive. In cases where cost is a deciding factor, denormalizing log data ahead of time can be a win. If you need log data round-tripped to you, send it to clients as an opaque blob and document the request and response fields.

Caution: If you need to return or round-trip hidden data on every request, you’re hiding the true cost of using your service and that’s not good either.

Rarely Define a Pagination API Without a Continuation Token

message FooQuery {
  // Bad: If the data changes between the first query and second, each of
  // these strategies can cause you to miss results. In an eventually
  // consistent world (that is, storage backed by Bigtable), it's not uncommon
  // to have old data appear after the new data. Also, the offset- and
  // page-based approaches all assume a sort-order, taking away some
  // flexibility.
  optional int64 max_timestamp_ms;
  optional int32 result_offset;
  optional int32 page_number;
  optional int32 page_size;

  // Good: You've got flexibility! Return this in a FooQueryResponse and
  // have clients pass it back on the next query.
  optional string next_page_token;
}

The best practice for a pagination API is to use an opaque continuation token (called next_page_token) backed by an internal proto that you serialize and then WebSafeBase64Escape (C++) or BaseEncoding.base64Url().encode (Java). That internal proto could include many fields. The important thing is it buys you flexibility and–if you choose–it can buy your clients stability in the results.

Do not forget to validate the fields of this proto as untrustworthy inputs (see note in Encode opaque data in strings).

message InternalPaginationToken {
  // Track which IDs have been seen so far. This gives perfect recall at the
  // expense of a larger continuation token--especially as the user pages
  // back.
  repeated FooRef seen_ids;

  // Similar to the seen_ids strategy, but puts the seen_ids in a Bloom filter
  // to save bytes and sacrifice some precision.
  optional bytes bloom_filter;

  // A reasonable first cut and it may work for longer. Having it embedded in
  // a continuation token lets you change it later without affecting clients.
  optional int64 max_timestamp_ms;
}

Group Related Fields into a New Message. Nest Only Fields with High Cohesion

message Foo {
  // Bad: The price and currency of this Foo.
  optional int price;
  optional CurrencyType currency;

  // Better: Encapsulates the price and currency of this Foo.
  optional CurrencyAmount price;
}

Only fields with high cohesion should be nested. If the fields are genuinely related, you’ll often want to pass them around together inside a server. That’s easier if they’re defined together in a message. Think:

CurrencyAmount calculateLocalTax(CurrencyAmount price, Location where)

If your CL introduces one field, but that field might have related fields later, preemptively put it in its own message to avoid this:

message Foo {
  // DEPRECATED! Use currency_amount.
  optional int price [deprecated = true];

  // The price and currency of this Foo.
  optional google.type.Money currency_amount;
}

The problem with a nested message is that while CurrencyAmount might be a popular candidate for reuse in other places of your API, Foo.CurrencyAmount might not. In the worst case, Foo.CurrencyAmount is reused, but Foo-specific fields leak into it.

While loose coupling is generally accepted as a best practice when developing systems, that practice may not always apply when designing .proto files. There may be cases in which tightly coupling two units of information (by nesting one unit inside of the other) may make sense. For example, if you are creating a set of fields that appear fairly generic right now but which you anticipate adding specialized fields into at a later time, nesting the message would dissuade others from referencing that message from elsewhere in this or other .proto files.

message Photo {
  // Bad: It's likely PhotoMetadata will be reused outside the scope of Photo,
  // so it's probably a good idea not to nest it and make it easier to access.
  message PhotoMetadata {
    optional int32 width = 1;
    optional int32 height = 2;
  }
  optional PhotoMetadata metadata = 1;
}

message FooConfiguration {
  // Good: Reusing FooConfiguration.Rule outside the scope of FooConfiguration
  // tightly-couples it with likely unrelated components, nesting it dissuades
  // from doing that.
  message Rule {
    optional float multiplier = 1;
  }
  repeated Rule rules = 1;
}

Include a Field Read Mask in Read Requests

// Recommended: use google.protobuf.FieldMask

// Alternative one:
message FooReadMask {
  optional bool return_field1;
  optional bool return_field2;
}

// Alternative two:
message BarReadMask {
  // Tag numbers of the fields in Bar to return.
  repeated int32 fields_to_return;
}

If you use the recommended google.protobuf.FieldMask, you can use the FieldMaskUtil (Java/C++) libraries to automatically filter a proto.

Read masks set clear expectations on the client side, give them control of how much data they want back and allow the backend to only fetch data the client needs.

The acceptable alternative is to always populate every field; that is, treat the request as if there were an implicit read mask with all fields set to true. This can get costly as your proto grows.

The worst failure mode is to have an implicit (undeclared) read mask that varies depending on which method populated the message. This anti-pattern leads to apparent data loss on clients that build a local cache from response protos.

Include a Version Field to Allow for Consistent Reads

When a client does a write followed by a read of the same object, they expect to get back what they wrote–even when the expectation isn’t reasonable for the underlying storage system.

Your server will read the local value and if the local version_info is less than the expected version_info, it will read from remote replicas to find the latest value. Typically version_info is a proto encoded as a string that includes the datacenter the mutation went to and the timestamp at which it was committed.

Even systems backed by consistent storage often want a token to trigger the more expensive read-consistent path rather than incurring the cost on every read.
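
A sketch of how such a token can flow (the field and message names are hypothetical; version_info stays opaque to the client):

message WriteFooResponse {
  // Opaque token describing how fresh a replica must be to reflect this
  // write. Clients echo it back on subsequent reads.
  optional string version_info = 1;
}

message GetFooRequest {
  optional string foo_id = 1;

  // If set, the server answers only from a replica at least this fresh,
  // falling back to the more expensive consistent-read path when needed.
  optional string version_info = 2;
}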

Use Consistent Request Options for RPCs that Return the Same Data Type

An example failure pattern is the request options for a service in which each RPC returns the same data type, but has separate request options for specifying things like maximum comments, embeds supported types list, and so on.

The cost of approaching this ad hoc is increased complexity on the client from figuring out how to fill out each request and increased complexity on the server transforming the N request options into a common internal one. A not-small number of real-life bugs are traceable to this example.

Instead, create a single, separate message to hold request options and include that in each of the top-level request messages. Here’s a better-practices example:

message FooRequestOptions {
  // Field-level read mask of which fields to return. Only fields that
  // were requested will be returned in the response. Clients should only
  // ask for fields they need to help the backend optimize requests.
  optional FooReadMask read_mask;

  // Up to this many comments will be returned on each Foo in the response.
  // Comments that are marked as spam don't count towards the maximum
  // comments. By default, no comments are returned.
  optional int max_comments_to_return;

  // Foos that include embeds that are not on this supported types list will
  // have the embeds down-converted to an embed specified in this list. If no
  // supported types list is specified, no embeds will be returned. If an embed
  // can't be down-converted to one of the supplied supported types, no embed
  // will be returned. Clients are strongly encouraged to always include at
  // least the THING_V2 embed type from EmbedTypes.proto.
  repeated EmbedType embed_supported_types_list;
}

message GetFooRequest {
  // What Foo to read. If the viewer doesn't have access to the Foo or the
  // Foo has been deleted, the response will be empty but will succeed.
  optional string foo_id;

  // Clients are required to include this field. Server returns
  // INVALID_ARGUMENT if FooRequestOptions is left empty.
  optional FooRequestOptions params;
}

message ListFooRequest {
  // Which Foos to return. Searches have 100% recall, but more clauses
  // impact performance.
  optional FooQuery query;

  // Clients are required to include this field. The server returns
  // INVALID_ARGUMENT if FooRequestOptions is left empty.
  optional FooRequestOptions params;
}

Batch/multi-phase Requests

Where possible, make mutations atomic. Even more important, make mutations idempotent. A full retry of a partial failure shouldn’t corrupt/duplicate data.

Occasionally, you’ll need a single RPC that encapsulates multiple operations for performance reasons. What to do on a partial failure? If some succeeded and some failed, it’s best to let clients know.

Consider setting the RPC as failed and return details of both the successes and failures in an RPC status proto.
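
One way to report per-item outcomes is sketched below, using google.rpc.Status for each entry (the batch message names are hypothetical; Foo is the resource message from the examples above):

import "google/rpc/status.proto";

message BatchCreateFoosResponse {
  message CreateFooResult {
    // Echoes the client-assigned key of the corresponding request item.
    optional string request_key = 1;

    // OK for items that succeeded; a meaningful error code otherwise.
    optional google.rpc.Status status = 2;

    // Populated only when status is OK.
    optional Foo foo = 3;
  }

  repeated CreateFooResult results = 1;
}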

In general, you want clients who are unaware of your handling of partial failures to still behave correctly and clients who are aware to get extra value.

Create Methods that Return or Manipulate Small Bits of Data and Expect Clients to Compose UIs from Batching Multiple Such Requests

The ability to query many narrowly specified bits of data in a single round-trip allows a wider range of UX options without server changes by letting the client compose what they need.

This is most relevant for front-end and middle-tier servers.

Many services expose their own batching API.

Make a One-off RPC when the Alternative is Serial Round-trips on Mobile or Web

In cases where a web or mobile client needs to make two queries with a data dependency between them, the current best practice is to create a new RPC that protects the client from the round trip.

In the case of mobile, it’s almost always worth saving your client the cost of an extra round-trip by bundling the two service methods together in one new one. For server-to-server calls, the case may not be as clear; it depends on how performance-sensitive your service is and how much cognitive overhead the new method introduces.

Make Repeated Fields Messages, Not Scalars or Enums

A common evolution is that a single repeated field needs to become multiple related repeated fields. If you start with a repeated primitive your options are limited–you either create parallel repeated fields, or define a new repeated field with a new message that holds the values and migrate clients to it.

If you start with a repeated message, evolution becomes trivial.

// Describes a type of enhancement applied to a photo
enum EnhancementType {
  ENHANCEMENT_TYPE_UNSPECIFIED;
  RED_EYE_REDUCTION;
  SKIN_SOFTENING;
}

message PhotoEnhancement {
  optional EnhancementType type;
}

message PhotoEnhancementReply {
  // Good: PhotoEnhancement can grow to describe enhancements that require
  // more fields than just an enum.
  repeated PhotoEnhancement enhancements;

  // Bad: If we ever want to return parameters associated with the
  // enhancement, we'd have to introduce a parallel array (terrible) or
  // deprecate this field and introduce a repeated message.
  repeated EnhancementType enhancement_types;
}

Imagine the following feature request: “We need to know which enhancements were performed by the user and which enhancements were automatically applied by the system.”

If the enhancement field in PhotoEnhancementReply were a scalar or enum, this would be much harder to support.

This applies equally to maps. It is much easier to add additional fields to a map value if it’s already a message rather than having to migrate from map<string, string> to map<string, MyProto>.

One exception:

Latency-critical applications will find parallel arrays of primitive types are faster to construct and delete than a single array of messages; they can also be smaller over the wire if you use [packed=true] (eliding field tags). Allocating a fixed number of arrays is less work than allocating N messages. Bonus: in Proto3, packing is automatic; you don’t need to explicitly specify it.

Use Proto Maps

Prior to the introduction of maps in proto3, services would sometimes expose data as pairs using an ad-hoc KVPair message with scalar fields. Eventually clients would need a deeper structure and would end up devising keys or values that need to be parsed in some way. See Don’t encode data in a string.

So, using a (extensible) message type for the value is an immediate improvement over the naive design.
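
A small sketch of that difference (the settings-related names are made up for illustration):

// Bad: an ad-hoc pair forces clients to invent string encodings once the
// value needs any structure beyond a single scalar.
message KVPair {
  optional string key = 1;
  optional string value = 2;
}

// Good: a map whose value is a message can grow new fields later.
message UserSettings {
  map<string, SettingValue> settings = 1;
}

message SettingValue {
  optional string value = 1;
  optional int64 last_modified_timestamp_ms = 2;
}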

Maps were back-ported to proto2 in all languages, so using map<scalar, message> is better than inventing your own KVPair for the same purpose.[1]

If you want to represent arbitrary data whose structure you don’t know ahead of time, use google.protobuf.Any.

Prefer Idempotency

Somewhere in the stack above you, a client may have retry logic. If the retry is a mutation, the user could be in for a surprise. Duplicate comments, build requests, edits, and so on aren’t good for anyone.

A simple way to avoid duplicate writes is to allow clients to specify a client-created request ID that your server dedupes on (for example, hash of content or UUID).
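
For example (the comment-related names are illustrative):

message CreateCommentRequest {
  // Client-generated identifier for this request, for example a UUID.
  // Retries reuse the same request_id, so the server can dedupe and return
  // the original result instead of creating a second comment.
  optional string request_id = 1;

  optional string comment_text = 2;
}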

Be Mindful of Your Service Name, and Make it Globally Unique

A service name (that is, the part after the service keyword in your .proto file) is used in surprisingly many places, not just to generate the service class name. This makes this name more important than one might think.

What’s tricky is that these tools make the implicit assumption that your service name is unique across a network. Worse, the service name they use is the unqualified service name (for example, MyService), not the qualified service name (for example, my_package.MyService).

For this reason, it makes sense to take steps to prevent naming conflicts on your service name, even if it is defined inside a specific package. For example, a service named Watcher is likely to cause problems; something like MyProjectWatcher would be better.
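
A sketch of the naming difference (the service and message names are hypothetical):

message WatchRequest {}
message WatchResponse {}

// Bad: "Watcher" is likely to collide with another team's service,
// because some tooling assumes the unqualified name is network-unique.
// service Watcher { ... }

// Good: a project-scoped name stays unique even for such tooling.
service MyProjectWatcher {
  rpc Watch(WatchRequest) returns (WatchResponse);
}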

Ensure Every RPC Specifies and Enforces a (Permissive) Deadline

By default, an RPC does not have a timeout. Since a request may tie up backend resources that are only released on completion, setting a default deadline that allows all well-behaving requests to finish is a good defensive practice. Not enforcing one has in the past caused severe problems for major services. RPC clients should still set a deadline on outgoing RPCs and will typically do so by default when they use standard frameworks. A deadline may and typically will be overwritten by a shorter deadline attached to a request.

Setting the deadline option clearly communicates the RPC deadline to your clients, and is respected and enforced by standard frameworks:

rpc Foo(FooRequest) returns (FooResponse) {
  option deadline = x; // there is no globally good default
}

Choosing a deadline value will especially impact how your system acts under load. For existing services, it is critical to evaluate existing client behavior before enforcing new deadlines to avoid breaking clients (consult SRE). In some cases, it may not be possible to enforce a shorter deadline after the fact.

Bound Request and Response Sizes

Request and response sizes should be bounded. We recommend a bound in the ballpark of 8 MiB, and 2 GiB is a hard limit at which many proto implementations break. Many storage systems have a limit on message sizes.

Also, unbounded messages

  • bloat both client and server,
  • cause high and unpredictable latency,
  • decrease resiliency by relying on a long-lived connection between a single client and a single server.

Here are a few ways to bound all messages in an API:

  • Define RPCs that return bounded messages, where each RPC call is logically independent from the others.
  • Define RPCs that operate on a single object, instead of on an unbounded, client-specified list of objects.
  • Avoid encoding unbounded data in string, byte, or repeated fields.
  • Define a long-running operation. Store the result in a storage system designed for scalable, concurrent reads.
  • Use a pagination API (see Rarely define a pagination API without a continuation token).
  • Use streaming RPCs.

If you are working on a UI, see also Create methods that return or manipulate small bits of data.

Propagate Status Codes Carefully

RPC services should take care at RPC boundaries to interrogate errors, and return meaningful status errors to their callers.

Let’s examine a toy example to illustrate the point:

Consider a client that calls ProductService.GetProducts, which takes no arguments. As part of GetProducts, ProductService might get all the products, and call LocaleService.LocaliseNutritionFacts for each product.

digraph toy_example {
  node [style=filled]
  client [label="Client"];
  product [label="ProductService"];
  locale [label="LocaleService"];
  client -> product [label="GetProducts"]
  product -> locale [label="LocaliseNutritionFacts"]
}

If ProductService is incorrectly implemented, it might send the wrong arguments to LocaleService, resulting in an INVALID_ARGUMENT.

If ProductService carelessly returns errors to its callers, the client will receive INVALID_ARGUMENT, since status codes propagate across RPC boundaries. But, the client didn’t pass any arguments to ProductService.GetProducts. So, the error is worse than useless: it will cause a great deal of confusion!

Instead, ProductService should interrogate errors it receives at the RPC boundary; that is, in the ProductService RPC handler it implements. It should return meaningful errors to users: if it received invalid arguments from the caller, it should return INVALID_ARGUMENT. If something downstream received invalid arguments, it should convert the INVALID_ARGUMENT to INTERNAL before returning the error to the caller.

Carelessly propagating status errors leads to confusion, which can be very expensive to debug. Worse, it can lead to an invisible outage where every service forwards a client error without causing any alerts to happen.

The general rule is: at RPC boundaries, take care to interrogate errors, and return meaningful status errors to callers, with appropriate status codes. To convey meaning, each RPC method should document what error codes it returns in which circumstances. The implementation of each method should conform to the documented API contract.
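
One way to document that contract is directly on the RPC in the .proto file, as in this sketch (the service shown is the hypothetical one from the example above):

service ProductService {
  // Returns all products visible to the viewer.
  //
  // Error codes:
  //   INVALID_ARGUMENT: the request itself was malformed.
  //   INTERNAL: a downstream dependency (such as LocaleService) failed or
  //     rejected our call; a downstream INVALID_ARGUMENT is converted to
  //     INTERNAL here rather than leaked to the caller.
  rpc GetProducts(GetProductsRequest) returns (GetProductsResponse);
}

message GetProductsRequest {}

message Product {
  optional string id = 1;
  optional string localized_nutrition_facts = 2;
}

message GetProductsResponse {
  repeated Product products = 1;
}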

Appendix

Returning Repeated Fields

When a repeated field is empty, the client can’t tell if the field just wasn’t populated by the server or if the backing data for the field is genuinely empty. In other words, there’s no hasFoo method for repeated fields.

Wrapping a repeated field in a message is an easy way to get a hasFoo method.

message FooList {
  repeated Foo foos;
}

The more holistic way to solve it is with a field read mask. If the field was requested, an empty list means there’s no data. If the field wasn’t requested the client should ignore the field in the response.

Updating Repeated Fields

The worst way to update a repeated field is to force the client to supply a replacement list. The dangers of forcing the client to supply the entire array are manifold. Clients that don’t preserve unknown fields cause data loss. Concurrent writes cause data loss. Even if those problems don’t apply, your clients will need to carefully read your documentation to know how the field is interpreted on the server side. Does an empty field mean the server won’t update it, or that the server will clear it?

Fix #1: Use a repeated update mask that permits the client to replace, delete, or insert elements into the array without supplying the entire array on a write.

Fix #2: Create separate append, replace, and delete arrays in the request proto.

Fix #3: Allow only appending or clearing. You can do this by wrapping the repeated field in a message. A present, but empty, message means clear; otherwise, any repeated elements mean append.
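
A sketch of Fix #3, where presence of the wrapper message distinguishes “clear” from “leave unchanged” (names are illustrative):

message TagList {
  repeated string tags = 1;
}

message UpdateFooTagsRequest {
  optional string foo_id = 1;

  // Unset: leave the stored tags alone.
  // Present but empty: clear all tags.
  // Present with elements: append those tags to the stored list.
  optional TagList tags = 2;
}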

Order Independence in Repeated Fields

Try to avoid order dependence in general. It’s an extra layer of fragility. An especially bad type of order dependence is parallel arrays. Parallel arrays make it more difficult for clients to interpret the results and make it unnatural to pass the two related fields around inside your own service.

message BatchEquationSolverResponse {
  // Bad: Solved values are returned in the order of the equations given in
  // the request.
  repeated double solved_values;
  // (Usually) Bad: Parallel array for solved_values.
  repeated double solved_complex_values;
}

// Good: A separate message that can grow to include more fields and be
// shared among other methods. No order dependence between request and
// response, no order dependence between multiple repeated fields.
message BatchEquationSolverResponse {
  // Deprecated, this will continue to be populated in responses until Q2
  // 2014, after which clients must move to using the solutions field below.
  repeated double solved_values [deprecated = true];

  // Good: Each equation in the request has a unique identifier that's
  // included in the EquationSolution below so that the solutions can be
  // correlated with the equations themselves. Equations are solved in
  // parallel and as the solutions are made they are added to this array.
  repeated EquationSolution solutions;
}

Leaking Features Because Your Proto is in a Mobile Build

Android and iOS runtimes both support reflection. To do that, the unfiltered names of fields and messages are embedded in the application binary (APK, IPA) as strings.

message Foo {
  // This will leak existence of Google Teleport project on Android and iOS
  optional FeatureStatus google_teleport_enabled;
}

Several mitigation strategies:

  • ProGuard obfuscation on Android. As of Q3 2014, iOS has no obfuscation option: once you have the IPA on a desktop, piping it through strings will reveal field names of included protos. iOS Chrome tear-down
  • Curate precisely which fields are sent to mobile clients.
  • If plugging the leak isn’t feasible on an acceptable timescale, get buy-in from the feature owner to risk it.

Never use this as an excuse to obfuscate the meaning of a field with a code-name. Either plug the leak or get buy-in to risk it.

Performance Optimizations

You can trade type safety or clarity for performance wins in some cases. For example, a proto with hundreds of fields–particularly message-type fields–is going to be slower to parse than one with fewer fields. A very deeply-nested message can be slow to deserialize just from the memory management. A handful of techniques teams have used to speed deserialization:

  • Create a parallel, trimmed proto that mirrors the larger proto but has only some of the tags declared. Use this for parsing when you don’t need all the fields. Add tests to enforce that tag numbers continue to match as the trimmed proto accumulates numbering “holes.” (See the sketch after this list.)
  • Annotate the fields as “lazily parsed” with [lazy=true].
  • Declare a field as bytes and document its type. Clients who care to parse the field can do so manually. The danger with this approach is there’s nothing preventing someone from putting a message of the wrong type in the bytes field. You should never do this with a proto that’s written to any logs, as it prevents the proto from being vetted for PII or scrubbed for policy or privacy reasons.
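
A sketch of the trimmed-proto technique from the first bullet (both messages here are hypothetical):

// Full proto, owned by the team that writes it.
message Photo {
  optional string id = 1;
  optional bytes image_data = 2;
  optional PhotoMetadata metadata = 3;
}

// Trimmed mirror for hot paths that only need the metadata. Tag numbers
// must stay in sync with Photo; fields not declared here are treated as
// unknown fields and are not decoded into submessage objects.
message PhotoDigest {
  optional string id = 1;
  // Tag 2 (image_data) intentionally omitted.
  optional PhotoMetadata metadata = 3;
}

message PhotoMetadata {
  optional int32 width = 1;
  optional int32 height = 2;
}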

[1] A gotcha with protos that contain map<k,v> fields: don’t use them as reduce keys in a MapReduce. The wire format and iteration order of proto3 map items are unspecified, which leads to inconsistent map shards.
