Differences Between Protocol Buffers, Avro, Thrift, and MessagePack

Perhaps one of the first inescapable observations that a new Google developer (Noogler) makes once they dive into the code is that Protocol Buffers (PB) is the "language of data" at Google. Put simply, Protocol Buffers are used for serialization, RPC, and just about everything in between.

Initially developed in the early 2000s as an optimized server request/response protocol (hence the name), they have become the de facto data persistence format and RPC protocol. Later, following a major (v2) rewrite in 2008, Protocol Buffers was open sourced by Google and now, through a number of third-party extensions, can be used across dozens of languages, including Ruby, of course.

But Protocol Buffers for everything? Well, it appears to work for Google, but more importantly I think this is a great example of where understanding the historical context in which each format was developed is just as instrumental as comparing features and benchmarking speed.

Protocol Buffers vs. Thrift

Let's take a step back and compare Protocol Buffers to the "competitors", of which there are plenty. Between PB, Thrift, Avro and MessagePack, which is the best? The truth of the matter is that they are all very good and each has its own strong points. Hence, the answer is as much a matter of personal choice as it is of understanding the historical context for each and correctly identifying your own individual requirements.

When Protocol Buffers was first being developed (early 2000s), the preferred language at Google was C++ (nowadays, Java is on par). Hence it should not be surprising that PB is strongly typed, has a separate schema file, and also requires a compilation step to output the language-specific boilerplate to read and serialize messages. To achieve this, Google defined its own interface definition language (IDL) for specifying the proto files, and limited PB's design scope to efficient serialization of common types and attributes found in Java, C++ and Python. As a result, PB was designed to be layered over an (existing) RPC mechanism.
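To make "efficient serialization" a little more concrete, here is a minimal sketch of how a single integer field ends up on the wire in the Protocol Buffers binary encoding. It is hand-rolled for illustration and is not the code that the PB compiler generates. The `Test` message and its field `a` are hypothetical, and the helper names `encode_varint` and `encode_int_field` are my own; only non-negative varint fields are handled.

```python
# Hypothetical schema, as it might appear in a .proto file:
#
#   message Test {
#     required int32 a = 1;
#   }

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: key = (field_number << 3) | wire type 0."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


# Field a = 150 becomes three bytes: one for the key, two for the value.
encoded = encode_int_field(1, 150)
print(encoded.hex())  # 089601
```

Note that the field name never appears in the output: the reader needs the schema to know that field number 1 is `a`, which is exactly why the separate schema file and compile step exist.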

By comparison, Thrift, which was open sourced by Facebook in 2007, looks and feels very similar to Protocol Buffers; in all likelihood, there was some design influence from PB there. However, unlike PB, Thrift makes RPC a first-class citizen: Thrift provides a variety of transport options (network, file, memory), and also tries to target many more languages.

Which is the "better" of the two? Both have been production tested at scale, so it really depends on your own situation. If you are primarily interested in binary serialization, or if you already have an RPC mechanism, then Protocol Buffers is a great place to start. Conversely, if you don't yet have an RPC mechanism and are looking for one, then Thrift may be a good choice. (Word of warning: historically, Thrift has not been consistent in its feature support and performance across all languages, so do some research.)

Protocol Buffers vs. Avro, MessagePack

While Thrift and PB differ primarily in their scope, Avro and MessagePack should really be compared in light of more recent trends: the rising popularity of dynamic languages, and of JSON over XML. As almost every web developer knows, JSON is now ubiquitous, and it is easy to parse, generate, and read, which explains its popularity. JSON also requires no schema, provides no type checking, and is a text-based (typically UTF-8) format. In other words, it is easy to work with, but not very efficient when put on the wire.

MessagePack is effectively JSON, but with an efficient binary encoding. Like JSON, it has no type checking or schemas, which, depending on your application, can be either a pro or a con. But if you are already streaming JSON via an API or using it for storage, then MessagePack can be a drop-in replacement.
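To illustrate the size difference, the sketch below encodes the same small document as compact JSON and as MessagePack. To stay dependency-free it uses a hand-written `pack` function (my own name, not a library API) that covers only a small subset of the MessagePack format: nil, booleans, small unsigned integers, short strings, and small maps. A real application would use an existing MessagePack library instead.

```python
import json
import struct


def pack(obj) -> bytes:
    """Encode a small subset of MessagePack types (illustration only)."""
    if obj is None:
        return b"\xc0"                               # nil
    if obj is True:
        return b"\xc3"                               # true
    if obj is False:
        return b"\xc2"                               # false
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])                      # positive fixint
        if 0 <= obj <= 0xFF:
            return b"\xcc" + bytes([obj])            # uint 8
        if 0 <= obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)  # uint 16
        raise ValueError("integer out of range for this sketch")
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("only short strings (fixstr) are supported")
        return bytes([0xA0 | len(raw)]) + raw        # fixstr
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("only small maps (fixmap) are supported")
        out = bytearray([0x80 | len(obj)])           # fixmap
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return bytes(out)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


doc = {"compact": True, "schema": 0}
as_json = json.dumps(doc, separators=(",", ":")).encode("utf-8")
as_msgpack = pack(doc)
print(len(as_json), len(as_msgpack))  # 27 18
```

Unlike the Protocol Buffers encoding, the keys are still carried in every message, which is what lets MessagePack stay schema-free.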

Avro, on the other hand, is somewhat of a hybrid. In its scope and functionality it is close to PB and Thrift, but it was designed with dynamic languages in mind. Unlike PB and Thrift, the Avro schema travels with the data (for example, in the header of an Avro data file), which eliminates the need for the extra compile stage. Additionally, the schema itself is just a JSON blob, so no custom parser is required! By enforcing a schema, Avro allows us to do data projections (read individual fields out of each record), perform type checking, and enforce the overall message structure.
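As a rough illustration of what "the schema is just a JSON blob" buys you, the sketch below loads a record schema written in Avro's JSON schema syntax with the standard `json` module, then uses it for a toy type check and a projection. The `User` schema, the `validate` and `project` helpers, and the `PRIMITIVES` table are all hypothetical; this is not the Avro library and it performs no binary encoding.

```python
import json

# Hypothetical record schema, written in Avro's JSON schema syntax.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name",  "type": "string"},
    {"name": "age",   "type": "int"},
    {"name": "email", "type": "string"}
  ]
}
"""

# A few Avro primitive type names mapped to Python types (subset only).
PRIMITIVES = {"string": str, "int": int, "long": int,
              "boolean": bool, "double": float}


def validate(record: dict, schema: dict) -> None:
    """Raise if the record's field names or types do not match the schema."""
    expected = {f["name"]: f["type"] for f in schema["fields"]}
    if set(record) != set(expected):
        raise ValueError("record fields do not match the schema")
    for name, type_name in expected.items():
        py_type = PRIMITIVES[type_name]
        value = record[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and py_type is not bool:
            raise TypeError(f"field {name!r} expects {type_name}")
        if not isinstance(value, py_type):
            raise TypeError(f"field {name!r} expects {type_name}")


def project(record: dict, schema: dict, wanted: list) -> dict:
    """Return only the requested fields, checked against the schema."""
    known = {f["name"] for f in schema["fields"]}
    unknown = set(wanted) - known
    if unknown:
        raise KeyError(f"fields not in schema: {sorted(unknown)}")
    return {name: record[name] for name in wanted}


schema = json.loads(SCHEMA_JSON)  # no custom parser needed
user = {"name": "Ada", "age": 36, "email": "ada@example.com"}
validate(user, schema)
print(project(user, schema, ["name", "age"]))  # {'name': 'Ada', 'age': 36}
```

Because the schema is ordinary data, a dynamic language can read it at runtime and act on it directly, with no code generation step.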

"The Best" Serialization Format

Reflecting on the use of Protocol Buffers at Google and all of the above competitors, it is clear that there is no one definitive, "best" option. Rather, each solution makes perfect sense in the context in which it was developed, and hence the same logic should be applied to your own situation.

If you are looking for a battle-tested, strongly typed serialization format, then Protocol Buffers is a great choice. If you also need a variety of built-in RPC mechanisms, then Thrift is worth investigating. If you are already exchanging or working with JSON, then MessagePack is almost a drop-in optimization. And finally, if you like the strongly typed aspects but want the flexibility of easy interoperability with dynamic languages, then Avro may be your best bet at this point in time.

Ilya Grigorik is a web performance engineer and developer advocate at Google, where his focus is on making the web fast and driving adoption of performance best practices.
