Cross-Language Communication in Distributed Systems: Protocol Buffers and gRPC

划碎、时光

Published 2024-09-16 22:36:39

Article link: https://blog.csdn.net/weixin_49599385/article/details/141337004

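Introduction

In modern software development, communication across languages and across processes is a central system-design challenge. Exchanging data efficiently between programs written in different languages (such as C++, Python and JavaScript), and making remote procedure calls (RPC) between them stable and reliable, is essential for building distributed systems and microservice architectures. Protocol Buffers (protobuf) and gRPC address exactly these problems. This article looks at how the two tools simplify data exchange and service invocation, and how to use them together for cross-language communication.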

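Protocol Buffers and gRPC

Protocol Buffers is a data serialization format developed by Google. Messages are described in a schema and encoded in a binary format that is typically more compact than the equivalent JSON or XML text and cheaper to parse.
gRPC is a high-performance remote procedure call framework built on HTTP/2 that uses protobuf as its default serialization format. It has implementations in many languages and supports flow control, bidirectional streaming, and many concurrent calls multiplexed over one connection, which makes it a good fit for efficient cross-platform service calls.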
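To make the compactness claim concrete, here is a minimal sketch that hand-encodes one record in the protobuf wire format and compares it with compact JSON. It uses only the standard library. The record layout (an integer `id` as field 1 and a string `name` as field 2) and the sample values are assumptions chosen for illustration; real code would use classes generated by protoc instead of `encode_person`.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_person(person_id: int, name: str) -> bytes:
    """Hand-encode {id = 1 (varint), name = 2 (length-delimited string)}."""
    name_bytes = name.encode("utf-8")
    return (
        b"\x08" + encode_varint(person_id)                         # tag for field 1, wire type 0
        + b"\x12" + encode_varint(len(name_bytes)) + name_bytes    # tag for field 2, wire type 2
    )


record = {"id": 123, "name": "Alice"}
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
as_proto = encode_person(record["id"], record["name"])

print(f"JSON:     {len(as_json)} bytes")   # 25 bytes
print(f"protobuf: {len(as_proto)} bytes")  # 9 bytes
```

The saving comes mostly from not repeating field names in every message: the schema maps names to small numbers, and only the numbers travel on the wire.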

The Protocol Buffers Language

Since protobuf is gRPC's default serialization format, using gRPC requires a working knowledge of how .proto files are written. A .proto file is usually made up of the following key parts:

syntax

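Specifies which version of the protobuf language the file uses. proto3 is newer than proto2 and is the version used throughout this article.

syntax = "proto3"; // use proto3 syntax
// The syntax statement must be the first non-empty, non-comment line of the file. If it is omitted, protoc assumes proto2. proto3 simplifies the language compared with proto2, for example by removing required fields.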

package

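Defines a namespace for the .proto file to prevent naming conflicts.

syntax = "proto3";

package mypackage; // declare the namespace

message Person {
  int32 id = 1;
  string name = 2;
}
// The package statement gives the file a namespace. Every message, service and enum defined in this file belongs to mypackage, so Person here does not clash with a Person defined in another package.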

import

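Pulls in another .proto file so that the messages and other types it defines can be used in the current file.

syntax = "proto3";

import "other.proto"; // import another .proto file

message Person {
  int32 id = 1;
  string name = 2;
}
// The import statement makes the definitions in other.proto available in this file, which is how definitions are shared and reused across files.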

message

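A message is protobuf's basic data structure and describes the data to be transmitted. A message definition consists of fields, each with a data type, a name, and a field number that is unique within the message. Field numbers are mandatory: they, not the field names, identify fields during serialization and deserialization, so a number should not be changed once the message is in use.

syntax = "proto3";

message Person {
  int32 id = 1;        // ID field
  string name = 2;    // Name field
  string email = 3;   // Email field
}
// In this example, Person is a message type with three fields: id, name and email. id is an integer, while name and email are strings. Each field has a unique number (1, 2, 3) that identifies it in the serialized data.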
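The role of field numbers is easier to see on the wire. The sketch below is a minimal decoder, written with the standard library only, that walks a serialized buffer and reports each field's number and raw value. It handles just the two wire types that the Person example needs (0 for varints, 2 for length-delimited data), and the sample payload is hand-built for illustration.

```python
def decode_varint(buf: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result, pos
        shift += 7


def decode_fields(buf: bytes):
    """Return a list of (field_number, raw_value) pairs found in buf."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint (int32, bool, enum, ...)
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:          # length-delimited (string, bytes, nested message)
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        fields.append((field_number, value))
    return fields


# Person { id: 150, name: "Bob", email: "bob@x.org" }, encoded by hand.
payload = b"\x08\x96\x01" + b"\x12\x03Bob" + b"\x1a\x09bob@x.org"
print(decode_fields(payload))
# [(1, 150), (2, b'Bob'), (3, b'bob@x.org')]
```

Note that the names id, name and email never appear in the payload. A receiver needs the .proto definition to map numbers 1, 2 and 3 back to names and types.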

enum

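An enum defines a set of named constant values. Enums are usually used as field types in messages to restrict a field to a fixed set of values.

syntax = "proto3";

enum Status {
  ACTIVE = 0;
  INACTIVE = 1;
  PENDING = 2;
}
// In this example, Status is an enum with three possible values, ACTIVE, INACTIVE and PENDING, mapped to 0, 1 and 2. In proto3 the first value must be 0, and it is the default for a field that has not been set.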

service

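A service defines a set of RPC methods that clients call and the server executes. Every method has one request message type and one response message type.

A single .proto file may define several services, each with its own RPC methods.

syntax = "proto3";

service PersonService {
  rpc GetPerson (PersonRequest) returns (Person);
}

message PersonRequest {
  int32 id = 1;
}
// In this example, PersonService is a service with one RPC method, GetPerson. The method takes a PersonRequest message as its request and returns a Person message (as defined in the message section above) as its response.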
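For intuition about what a generated stub does with such a definition, the following sketch shows two pieces of the gRPC-over-HTTP/2 protocol using only the standard library: how a method is addressed (the HTTP/2 `:path` is `/<fully qualified service>/<method>`) and how each message is framed (a 1-byte compression flag and a 4-byte big-endian length, followed by the serialized message). The helper names are hypothetical, and real applications would leave all of this to the gRPC library.

```python
import struct


def method_path(package: str, service: str, method: str) -> str:
    """Build the HTTP/2 :path that identifies an RPC method."""
    full_service = f"{package}.{service}" if package else service
    return f"/{full_service}/{method}"


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the 5-byte gRPC message header."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload


def unframe_message(frame: bytes):
    """Split one framed message into (compressed_flag, payload)."""
    compressed, length = struct.unpack(">BI", frame[:5])
    return bool(compressed), frame[5:5 + length]


# The PersonService snippet declares no package, so the service name is used as-is.
print(method_path("", "PersonService", "GetPerson"))  # /PersonService/GetPerson

# PersonRequest { id: 42 } serializes to the two bytes 08 2a.
frame = frame_message(b"\x08\x2a")
print(frame.hex())             # 0000000002082a
print(unframe_message(frame))  # (False, b'\x08*')
```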

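Field Rules

Field rules describe whether and how often a field may appear. proto2 has three labels: optional, required and repeated. proto3 removes required: a field without a label is singular and may simply be left unset, in which case it reads as its default value. oneof is not a label but a grouping construct, and it is available in both proto2 and proto3.

optional: the field may be omitted. In proto3 (protoc 3.15 and later) the keyword additionally makes it possible to tell an unset field from one set to its default value.
required: the field must be provided. proto2 only; rejected by protoc in a proto3 file.
repeated: the field may appear any number of times, that is, it is a list.
oneof: the member fields are mutually exclusive; at most one of them is set at any time.

syntax = "proto3";

message Person {
  int32 id = 1;               // singular field; "required" is not allowed in proto3
  optional string name = 2;   // needs protoc 3.15+ (3.12 to 3.14 need --experimental_allow_proto3_optional)
  repeated string emails = 3;
  oneof details {
    string email = 4;
    string phone = 5;
  }
}
// In this example, id is a plain singular field, name is optional with presence tracking, emails holds any number of values (a list), and details contains at most one of email or phone. Field names must be unique within a message, including oneof members, which is why the list is called emails rather than email.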
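The mutual exclusion of a oneof can be pictured with a small stand-in class. This is a toy model written for illustration, not the protobuf API: the class `Details` and its methods are hypothetical, and generated code exposes the same idea through its own accessors (in Python, for instance, a message's `WhichOneof` method reports which member is set).

```python
class Details:
    """Toy stand-in for: oneof details { string email = 4; string phone = 5; }"""

    _MEMBERS = ("email", "phone")

    def __init__(self):
        self._which = None   # name of the member currently set, if any
        self._value = ""

    def set(self, member: str, value: str) -> None:
        if member not in self._MEMBERS:
            raise KeyError(f"{member!r} is not a member of this oneof")
        # Setting one member implicitly clears whichever member was set before.
        self._which, self._value = member, value

    def which(self):
        return self._which

    def get(self, member: str) -> str:
        # An unset member reads as the default value of its type (empty string here).
        return self._value if member == self._which else ""


details = Details()
details.set("email", "a@example.com")
details.set("phone", "555-0100")      # replaces the email

print(details.which())                # phone
print(repr(details.get("email")))     # ''
print(details.get("phone"))           # 555-0100
```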

option

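Options customize protobuf's behavior by attaching settings or flags to files, messages, fields and other elements. Besides the built-in options, you can define your own.

syntax = "proto3";

import "google/protobuf/descriptor.proto"; // needed to extend the option messages

// Declare the custom option first. 50001 is an arbitrary number from the
// 50000-99999 range that protobuf reserves for in-house use.
extend google.protobuf.MessageOptions {
  string my_option = 50001;
}

message Person {
  option (my_option) = "example"; // apply the custom option
  int32 id = 1;
}
// option sets a built-in or custom option. A custom option such as (my_option) has to be declared before it can be used, which is what the extend block does. Custom options are typically used to attach metadata that tools or application code read back from the descriptor.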

map

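Defines a collection of key-value pairs, similar to a dictionary, in which every key is unique.

syntax = "proto3";

message Person {
  map<string, string> attributes = 1; // string-to-string map
}
// Here, attributes is a map field that stores any number of key-value pairs whose keys and values are both strings. This is handy when extra information has to be stored dynamically.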
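On the wire, a map field is encoded as if it were a repeated nested message whose entries carry the key as field 1 and the value as field 2. The following standard-library sketch builds that encoding by hand for a string-to-string map; the helper names and the sample entry are assumptions made for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode tag (wire type 2) + length + payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_string_map(field_number: int, mapping: dict) -> bytes:
    """Encode map<string, string> as one length-delimited entry per pair."""
    out = bytearray()
    for key, value in mapping.items():
        entry = (
            length_delimited(1, key.encode("utf-8"))      # entry.key   = 1
            + length_delimited(2, value.encode("utf-8"))  # entry.value = 2
        )
        out += length_delimited(field_number, entry)
    return bytes(out)


encoded = encode_string_map(1, {"lang": "zh"})
print(encoded.hex())  # 0a0a0a046c616e6712027a68
```

Because each pair is an independent entry, the order of entries on the wire is not something a receiver should rely on.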

extend

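Adds extra fields to an existing message type from outside its definition. This is a proto2 feature; in proto3, extend is only permitted for declaring custom options.

General extensions were left out of proto3 because they make versioning harder and invite conflicts between definitions written by different parties. protobuf has no inheritance mechanism, so when a proto3 message needs to carry additional data, the usual approaches are to add new fields with new field numbers (which stays compatible with old readers), to embed another message, or to use google.protobuf.Any for arbitrary payloads.

syntax = "proto2";

message Person {
  optional string name = 1;
  extensions 100 to 199; // field numbers that other definitions may extend
}

extend Person {
  optional string nickname = 100; // add an extra field to Person
}
// extend adds a new field to an already defined message type. Here Person gains a nickname field. The target message must reserve the number range with an extensions statement, otherwise protoc rejects the extension.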

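Nested Types

Messages and enums can be nested inside other messages, which allows more complex structures to be built.

message Company {
  message Address {
    string street = 1;
    string city = 2;
  }

  string name = 1;
  Address headquarters = 2;
}
// In this example, the Company message contains a nested Address message, which describes the address of the company headquarters. Field numbers are scoped per message, so Address and Company can both use 1 and 2.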

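Scalar Types

protobuf provides the following scalar types for fields:

int32: signed 32-bit integer, variable-length encoding (inefficient for negative values).
int64: signed 64-bit integer, variable-length encoding (inefficient for negative values).
uint32: unsigned 32-bit integer, variable-length encoding.
uint64: unsigned 64-bit integer, variable-length encoding.
sint32: signed 32-bit integer with ZigZag encoding, which keeps small negative values short.
sint64: signed 64-bit integer with ZigZag encoding.
fixed32: unsigned 32-bit integer, always 4 bytes.
fixed64: unsigned 64-bit integer, always 8 bytes.
sfixed32: signed 32-bit integer, always 4 bytes.
sfixed64: signed 64-bit integer, always 8 bytes.
float: single-precision floating point.
double: double-precision floating point.
bool: boolean (true or false).
string: UTF-8 encoded text.
bytes: arbitrary sequence of bytes.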
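The difference between int32 and sint32 only shows up for negative numbers. The sketch below, using the standard library only, implements the varint and ZigZag encodings and compares the encoded sizes of a few sample values. The function names are chosen for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def zigzag32(n: int) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def encode_int32(n: int) -> bytes:
    # Negative int32 values are sign-extended to 64 bits before varint encoding.
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(n: int) -> bytes:
    return encode_varint(zigzag32(n))


for n in (1, -1, 300, -300):
    print(f"{n:>5}: int32 = {len(encode_int32(n)):>2} bytes, "
          f"sint32 = {len(encode_sint32(n))} bytes")
#     1: int32 =  1 bytes, sint32 = 1 bytes
#    -1: int32 = 10 bytes, sint32 = 1 bytes
#   300: int32 =  2 bytes, sint32 = 2 bytes
#  -300: int32 = 10 bytes, sint32 = 2 bytes
```

As a rule of thumb, sint32 and sint64 suit fields that are often negative, while the fixed-width types suit values that are usually large.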

Using gRPC

The following simple example shows how to set up gRPC communication between Python and C++.

Defining example.proto

The data structures and the RPC service are defined in a .proto file.

syntax = "proto3";

package example;

// An enum
enum Status {
  UNKNOWN = 0;
  ACTIVE = 1;
  INACTIVE = 2;
}

// The request message
message Request {
  int32 id = 1;
  string name = 2;
  repeated string tags = 3;
  oneof detail {
    string email = 4;
    string phone = 5;
  }
  // A map field
  map<string, string> metadata = 6;
  // An enum field
  Status status = 7;
}

// The response message
message Response {
  string message = 1;
}

// The service
service ExampleService {
  rpc GetExampleInfo (Request) returns (Response);
}

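Generating Code

Use the Protocol Buffers compiler, protoc, to generate the client and server code. The generated code differs from language to language, but it generally consists of the message classes plus the client stubs and server base classes for the service.

# Install grpcio-tools:
pip install grpcio-tools

# Generate the Python code
python -m grpc_tools.protoc --python_out=. --grpc_python_out=. -I. example.proto
# --python_out=.: directory for the generated Python message classes (. is the current directory).
# --grpc_python_out=.: directory for the generated Python gRPC service code (. is the current directory).
# -I.: import search path for .proto files (. is the current directory).

# This produces the following files:
# example_pb2.py: Python classes for the messages and enums defined in the .proto file.
# example_pb2_grpc.py: the client stub and the servicer base class for the services and RPC methods defined in the .proto file.

# Install the Protocol Buffers compiler and the gRPC C++ library:
sudo apt install -y protobuf-compiler
sudo apt-get install libgrpc++-dev
# On Debian/Ubuntu, grpc_cpp_plugin is shipped in a separate package:
sudo apt-get install -y protobuf-compiler-grpc

# Generate the C++ code
protoc --cpp_out=. --grpc_out=. --plugin=protoc-gen-grpc=$(which grpc_cpp_plugin) example.proto
# This produces example.pb.h and example.pb.cc (messages) and example.grpc.pb.h and example.grpc.pb.cc (service).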
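When the Python generation step has to run as part of a build script, the command line above can be assembled programmatically. The sketch below only builds the argument list and the expected output file names; it does not invoke protoc. The helper names `protoc_command` and `expected_outputs` are hypothetical, and the sketch assumes the .proto file's own directory is used as the import path, as in the command above.

```python
import sys
from pathlib import Path


def protoc_command(proto_file: str, out_dir: str = ".") -> list:
    """Build the argv for `python -m grpc_tools.protoc` for one .proto file."""
    proto_path = Path(proto_file)
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_path.parent}",          # import path: the file's own directory
        f"--python_out={out_dir}",         # message classes
        f"--grpc_python_out={out_dir}",    # stub and servicer classes
        str(proto_path),
    ]


def expected_outputs(proto_file: str, out_dir: str = ".") -> list:
    """File names protoc derives from the .proto file name."""
    stem = Path(proto_file).stem
    return [Path(out_dir) / f"{stem}_pb2.py", Path(out_dir) / f"{stem}_pb2_grpc.py"]


command = protoc_command("example.proto")
print(" ".join(command[1:]))
# -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. example.proto
print([p.name for p in expected_outputs("example.proto")])
# ['example_pb2.py', 'example_pb2_grpc.py']
```

With grpcio-tools installed, the list can be passed to `subprocess.run(command, check=True)` to perform the actual generation.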

Implementing the Server and Client

Python server (server.py)

The Python server implements the service logic: it handles each request and returns a response.

import grpc
from concurrent import futures
import example_pb2
import example_pb2_grpc

class ExampleServiceServicer(example_pb2_grpc.ExampleServiceServicer):
    def GetExampleInfo(self, request, context):
        metadata = ', '.join([f"{k}: {v}" for k, v in request.metadata.items()])
        status = example_pb2.Status.Name(request.status)
        response_message = (
            f"Received ID: {request.id}, Name: {request.name}, Tags: {', '.join(request.tags)}, "
            f"Metadata: {metadata}, Status: {status}"
        )
        if request.HasField('email'):
            response_message += f", Email: {request.email}"
        if request.HasField('phone'):
            response_message += f", Phone: {request.phone}"
        return example_pb2.Response(message=response_message)

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    example_pb2_grpc.add_ExampleServiceServicer_to_server(ExampleServiceServicer(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()

if __name__ == '__main__':
    serve()
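One design option, sketched here under assumptions, is to keep the message formatting in a plain function that does not depend on gRPC at all, so it can be unit-tested without starting a server. The function name `describe_request` is hypothetical, and `SimpleNamespace` merely stands in for a generated `Request` object in the test.

```python
from types import SimpleNamespace


def describe_request(request, status_name: str, contact=None) -> str:
    """Format a request the same way the servicer above does.

    `contact` is the name of the oneof member that is set ("email", "phone" or None).
    """
    metadata = ", ".join(f"{k}: {v}" for k, v in request.metadata.items())
    message = (
        f"Received ID: {request.id}, Name: {request.name}, "
        f"Tags: {', '.join(request.tags)}, "
        f"Metadata: {metadata}, Status: {status_name}"
    )
    if contact == "email":
        message += f", Email: {request.email}"
    elif contact == "phone":
        message += f", Phone: {request.phone}"
    return message


# A stand-in object with the same attributes as the generated Request class.
fake_request = SimpleNamespace(
    id=123, name="Test", tags=["tag1", "tag2"],
    metadata={"key1": "value1"}, email="test@example.com", phone="",
)
print(describe_request(fake_request, "ACTIVE", contact="email"))
# Received ID: 123, Name: Test, Tags: tag1, tag2, Metadata: key1: value1, Status: ACTIVE, Email: test@example.com
```

Inside `GetExampleInfo`, the call would then look like `describe_request(request, example_pb2.Status.Name(request.status), request.WhichOneof("detail"))`.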

Python client (client.py)

The Python client builds a request, sends it to the server and prints the response.

import grpc
import example_pb2
import example_pb2_grpc

def run():
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = example_pb2_grpc.ExampleServiceStub(channel)
        request = example_pb2.Request(
            id=123,
            name="Test",
            tags=["tag1", "tag2"],
            email="test@example.com",
            metadata={"key1": "value1", "key2": "value2"},
            status=example_pb2.ACTIVE
        )
        response = stub.GetExampleInfo(request)
        print(f"Response: {response.message}")

if __name__ == '__main__':
    run()

C++ server (server.cpp)

The C++ server implements the same service logic: it handles each request and returns a response.

#include <memory>
#include <string>

#include <grpcpp/grpcpp.h>
#include "example.grpc.pb.h"

class ExampleServiceImpl final : public example::ExampleService::Service {
public:
    grpc::Status GetExampleInfo(grpc::ServerContext* context, const example::Request* request, example::Response* response) override {
        // Join the map entries as "key: value" pairs.
        std::string metadata;
        for (const auto& pair : request->metadata()) {
            if (!metadata.empty()) metadata += ", ";
            // Map entries expose first and second as data members, not member functions.
            metadata += pair.first + ": " + pair.second;
        }
        std::string status = example::Status_Name(request->status());
        std::string message = "Received ID: " + std::to_string(request->id()) +
                              ", Name: " + request->name() +
                              ", Tags: " + [&]() {
                                  std::string tags;
                                  for (const auto& tag : request->tags()) {
                                      if (!tags.empty()) tags += ", ";
                                      tags += tag;
                                  }
                                  return tags;
                              }() +
                              ", Metadata: " + metadata +
                              ", Status: " + status +
                              (request->has_email() ? ", Email: " + request->email() : "") +
                              (request->has_phone() ? ", Phone: " + request->phone() : "");
        response->set_message(message);
        return grpc::Status::OK;
    }
};

void RunServer() {
    std::string server_address("0.0.0.0:50051");
    ExampleServiceImpl service;

    grpc::ServerBuilder builder;
    builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
    builder.RegisterService(&service);

    std::unique_ptr<grpc::Server> server(builder.BuildAndStart());
    server->Wait();
}

int main(int argc, char** argv) {
    RunServer();
    return 0;
}

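C++ client (client.cpp)

The C++ client builds a request, sends it to the server and prints the response.

#include <iostream>
#include <memory>
#include <string>

#include <grpcpp/grpcpp.h>
#include "example.grpc.pb.h"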

void RunClient() {
    auto channel = grpc::CreateChannel("localhost:50051", grpc::InsecureChannelCredentials());
    auto stub = example::ExampleService::NewStub(channel);

    example::Request request;
    request.set_id(123);
    request.set_name("Test");
    request.add_tags("tag1");
    request.add_tags("tag2");
    request.set_email("test@example.com");
    (*request.mutable_metadata())["key1"] = "value1";
    (*request.mutable_metadata())["key2"] = "value2";
    request.set_status(example::ACTIVE);

    example::Response response;
    grpc::ClientContext context;

    grpc::Status status = stub->GetExampleInfo(&context, request, &response);

    if (status.ok()) {
        std::cout << "Response: " << response.message() << std::endl;
    } else {
        // Report why the call failed instead of a bare message.
        std::cerr << "RPC failed: " << status.error_message() << std::endl;
    }
}

int main(int argc, char** argv) {
    RunClient();
    return 0;
}

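Summary

Protocol Buffers and gRPC together provide an efficient way to make remote procedure calls across languages. protobuf contributes a compact, schema-driven serialization format and gRPC contributes HTTP/2-based transport, and the combination greatly simplifies service calls and data transfer in distributed systems. With these tools we can build faster and more reliable microservice architectures that are also easier to maintain.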