[RPC] Mercury RPC

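Contents

Preface

Overview

Initializing Mercury

A simple RPC Hello World example

Server code

Client code

Passing context

Client

Server

RPC arguments and return values

Input/output structures

Client code

Server code

RDMA transfers

Input/output structures

Client code

Server code

Serializing complex data structures

Building the Mercury library

Understanding the RPC and ULT model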


Preface

Code repository:

https://github.com/mercury-hpc/mercury

User manual:

https://mercury-hpc.github.io/user/overview/

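Overview

Mercury consists of three main layers, plus an optional fourth:

Network abstraction layer: provides a high-performance communication interface on top of lower-level network fabrics.

RPC layer: provides users with the components needed to send and receive RPC metadata (small messages). This includes serialization and deserialization of function arguments.

Bulk layer: provides the components needed to handle large arguments, which means transferring large amounts of data through RMA.

(Optional) High-level RPC layer: aims to provide a convenient API built on top of the lower layers, with macros for generating RPC stubs as well as serialization and deserialization functions.

[Figure: summary of the three main layers]

By definition, an RPC call is initiated by one process, called the origin, and forwarded to another process, called the target, which executes the call.

Each side, origin and target, uses an RPC processor to serialize and deserialize the arguments sent through the interface. When a function is called with relatively small arguments, the network abstraction layer uses its short-message mechanism; when the arguments contain large data, it uses the remote memory access (RMA) mechanism. Note that when bulk data is small enough, Mercury automatically embeds it alongside the metadata, provided it fits.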
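The choice between the short-message path and the RMA path can be pictured as a size check. The sketch below is only an illustration of that decision: the names (transfer_path_t, choose_path) and the eager_limit parameter are hypothetical and are not Mercury identifiers, and Mercury's real policy may take more factors into account.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: these are not Mercury identifiers. */
typedef enum {
    PATH_EAGER_ONLY,     /* everything travels in one short message */
    PATH_EAGER_PLUS_RMA  /* metadata in a short message, bulk data via RMA */
} transfer_path_t;

/*
 * Decide how a call would travel, given the size of the serialized
 * arguments, the size of the bulk data, and the largest payload that
 * fits in one short message (eager_limit, an assumed parameter).
 */
transfer_path_t choose_path(size_t metadata_size, size_t bulk_size, size_t eager_limit)
{
    assert(metadata_size <= eager_limit); /* metadata is assumed to be small */
    if (bulk_size <= eager_limit - metadata_size)
        return PATH_EAGER_ONLY;
    return PATH_EAGER_PLUS_RMA;
}
```

With an assumed limit of 4096 bytes and 100 bytes of metadata, bulk data of up to 3996 bytes would be embedded, and anything larger would go through RMA.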

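Initializing Mercury

https://mochi.readthedocs.io/en/latest/mercury/01_init.html

In this tutorial, you will learn how to initialize Mercury as a client and as a server.

Initializing as a client

The following code shows how to initialize Mercury as a client. We first call HG_Init to create an hg_class instance, then call HG_Context_create to create a context. Just before terminating, we call HG_Context_destroy and HG_Finalize to destroy the context and finalize Mercury, respectively.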

client.c

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

static hg_class_t*     hg_class     = NULL; /* Pointer to the Mercury class */
static hg_context_t*   hg_context   = NULL; /* Pointer to the Mercury context */

int main(int argc, char** argv)
{
    hg_return_t ret;
    /*
     * Initialize an hg_class.
     * Here we only specify the protocol since this is a client
     * (no need for an address and a port). HG_FALSE indicates that
     * the client will not listen for incoming requests.
     */
    hg_class = HG_Init("tcp", HG_FALSE);
    assert(hg_class != NULL);

    /* Creates a context for the hg_class. */
    hg_context = HG_Context_create(hg_class);
    assert(hg_context != NULL);

    /* Destroy the context. */
    ret = HG_Context_destroy(hg_context);
    assert(ret == HG_SUCCESS);

    /* Finalize the hg_class. */
    hg_return_t err = HG_Finalize(hg_class);
    assert(err == HG_SUCCESS);
    return 0;
}

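Initializing as a server

The following code shows how to initialize Mercury as a server. As on the client, we call HG_Init and HG_Context_create, but this time we pass HG_TRUE to HG_Init to indicate that this process will listen for incoming requests.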

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

static hg_class_t*     hg_class   = NULL; /* the mercury class */
static hg_context_t*   hg_context = NULL; /* the mercury context */

int main(int argc, char** argv)
{
    hg_return_t ret;

    /* Initialize Mercury and get an hg_class handle.
     * "tcp" is the protocol to use; no address or port is specified here.
     * HG_TRUE specifies that Mercury will listen for incoming requests
     * (HG_TRUE on servers, HG_FALSE on clients).
     */
    hg_class = HG_Init("tcp", HG_TRUE);
    assert(hg_class != NULL);

    /* Get the address of the server */
    char hostname[128];
    hg_size_t hostname_size = sizeof(hostname); /* in/out: buffer size */
    hg_addr_t self_addr;
    HG_Addr_self(hg_class, &self_addr);
    HG_Addr_to_string(hg_class, hostname, &hostname_size, self_addr);
    printf("Server running at address %s\n",hostname);
    HG_Addr_free(hg_class, self_addr);

    /* Creates a Mercury context from the Mercury class. */
    hg_context = HG_Context_create(hg_class);
    assert(hg_context != NULL);

    /* Progress loop */
    do
    {
        /* count will keep track of how many RPCs were treated in a given
         * call to HG_Trigger.
         */
        unsigned int count;
        do {
            /* Executes callbacks.
             * 0 = no timeout, the function just returns if there is nothing to process.
             * 1 = the max number of callbacks to execute before returning.
             * After the call, count will hold the number of callbacks executed.
             */
            ret = HG_Trigger(hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count);
        /* Exit the loop if no event has been processed. */

        /* Make progress on receiving/sending data.
         * 100 is the timeout in milliseconds, for which to wait for network events. */
        HG_Progress(hg_context, 100);
    } while(1); /* another condition should be put here for the loop to terminate */

    /* Destroys the Mercury context. */
    ret = HG_Context_destroy(hg_context);
    assert(ret == HG_SUCCESS);

    /* Finalize Mercury. */
    ret = HG_Finalize(hg_class);
    assert(ret == HG_SUCCESS);

    return 0;
}

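This code also illustrates the typical Mercury progress loop. The loop alternates between HG_Progress, which makes progress on network events (sending and receiving data), and HG_Trigger, which invokes the registered callbacks for the events that completed during HG_Progress.

HG_Progress: on the receiving side, it behaves somewhat like poll.

HG_Trigger: takes entries out of context->completion_queue and invokes their callbacks.

HG_Forward: the RPC function and its arguments are wrapped in a handle, and HG_Forward sends that handle to the peer to request the RPC.

Note: since this server does not provide any RPC yet, it runs until you kill it.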
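The split between HG_Progress and HG_Trigger can be modeled with a small queue: progress appends completed events, and trigger pops them and runs their callbacks. The sketch below is a toy model under that assumption, using only the C standard library. Every name in it (toy_context_t, toy_progress_complete, toy_trigger) is hypothetical, and it says nothing about how Mercury implements its completion queue internally.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 16

typedef void (*toy_callback_t)(void *arg);

typedef struct {
    toy_callback_t callback;
    void          *arg;
} toy_entry_t;

/* A fixed-size FIFO standing in for the context's completion queue. */
typedef struct {
    toy_entry_t entries[QUEUE_CAPACITY];
    size_t      head;  /* index of the next entry to pop */
    size_t      count; /* number of queued entries */
} toy_context_t;

/* "Progress": an event completed, so queue its callback. Returns 0 on success. */
int toy_progress_complete(toy_context_t *ctx, toy_callback_t cb, void *arg)
{
    assert(ctx != NULL && cb != NULL);
    if (ctx->count == QUEUE_CAPACITY)
        return -1; /* queue full */
    size_t tail = (ctx->head + ctx->count) % QUEUE_CAPACITY;
    ctx->entries[tail].callback = cb;
    ctx->entries[tail].arg = arg;
    ctx->count++;
    return 0;
}

/* "Trigger": run at most max_count queued callbacks and report how many ran. */
void toy_trigger(toy_context_t *ctx, unsigned int max_count, unsigned int *actual_count)
{
    unsigned int executed = 0;
    while (executed < max_count && ctx->count > 0) {
        toy_entry_t entry = ctx->entries[ctx->head];
        ctx->head = (ctx->head + 1) % QUEUE_CAPACITY;
        ctx->count--;
        entry.callback(entry.arg);
        executed++;
    }
    *actual_count = executed;
}

/* Example callback: increments the integer it is given. */
void toy_count_event(void *arg)
{
    *(int *)arg += 1;
}
```

Calling toy_trigger with max_count set to 1 in a loop until it reports zero callbacks mirrors the inner do/while loop of the server listing.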

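A simple RPC Hello World example

https://mochi.readthedocs.io/en/latest/mercury/02_hello.html

In this tutorial, we register an RPC that simply prints "Hello World" on the server's standard output.

Server code

The following code shows how to register an RPC on the server.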

server.c

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

static hg_class_t*     hg_class   = NULL; /* the mercury class */
static hg_context_t*   hg_context = NULL; /* the mercury context */

/* after serving this number of rpcs, the server will shut down. */
static const int TOTAL_RPCS = 10;
/* number of RPCS already received. */
static int num_rpcs = 0;

/* 
 * hello_world function to expose as an RPC.
 * This function just prints "Hello World"
 * and increment the num_rpcs variable.
 *
 * All Mercury RPCs must have a signature
 *   hg_return_t f(hg_handle_t h)
 */
hg_return_t hello_world(hg_handle_t h);

/*
 * main function.
 */
int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 2) {
        printf("Usage: %s <protocol>\n", argv[0]);
        exit(0);
    }

    hg_class = HG_Init(argv[1], HG_TRUE);
    assert(hg_class != NULL);

    char hostname[128];
    hg_size_t hostname_size = sizeof(hostname); /* in/out: buffer size */
    hg_addr_t self_addr;
    HG_Addr_self(hg_class, &self_addr);
    HG_Addr_to_string(hg_class, hostname, &hostname_size, self_addr);
    printf("Server running at address %s\n",hostname);
    HG_Addr_free(hg_class, self_addr);

    hg_context = HG_Context_create(hg_class);
    assert(hg_context != NULL);

    /* Register the RPC by its name ("hello").
     * The two NULL arguments correspond to the functions used to
     * serialize/deserialize the input and output parameters
     * (hello_world doesn't have parameters and doesn't return anything, hence NULL).
     */
    hg_id_t rpc_id = HG_Register_name(hg_class, "hello", NULL, NULL, hello_world);

    /* We call this function to tell Mercury that hello_world will not
     * send any response back to the client.
     */
    HG_Registered_disable_response(hg_class, rpc_id, HG_TRUE);

    do
    {
        unsigned int count;
        do {
            ret = HG_Trigger(hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count);
        HG_Progress(hg_context, 100);
    } while(num_rpcs < TOTAL_RPCS);
    /* Exit the loop if we have reached the given number of RPCs. */

    ret = HG_Context_destroy(hg_context);
    assert(ret == HG_SUCCESS);

    ret = HG_Finalize(hg_class);
    assert(ret == HG_SUCCESS);

    return 0;
}

/* Implementation of the hello_world RPC. */
hg_return_t hello_world(hg_handle_t h)
{
    hg_return_t ret;

    printf("Hello World!\n");
    num_rpcs += 1;
    /* We are not going to use the handle anymore, so we should destroy it. */
    ret = HG_Destroy(h);
    assert(ret == HG_SUCCESS);
    return HG_SUCCESS;
}

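To be registered as an RPC, a function must take an hg_handle_t as its argument and return a value of type hg_return_t (typically HG_SUCCESS if the handler executed correctly). This function (hello_world in our case) is registered as an RPC with HG_Register_name, which returns the identifier of the RPC. We also call HG_Registered_disable_response to indicate that this RPC does not send any response back to the client. In the definition of hello_world, we simply print "Hello World" on the standard output, then call HG_Destroy to destroy the RPC handle passed to the function.

Client code

The corresponding client code follows.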

client.c

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

static hg_class_t*     hg_class   = NULL; /* Pointer to the Mercury class */
static hg_context_t*   hg_context = NULL; /* Pointer to the Mercury context */
static hg_id_t         hello_rpc_id;      /* ID of the RPC */
static int completed = 0;                 /* Variable indicating if the call has completed */

/*
 * This callback will be called after looking up for the server's address.
 * This is the function that will also send the RPC to the servers, then
 * set the completed variable to 1.
 */
hg_return_t lookup_callback(const struct hg_cb_info *callback_info);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 3) {
        printf("Usage: %s <protocol> <server_address>\n",argv[0]);
        printf("Example: %s tcp ofi+tcp://1.2.3.4:1234\n",argv[0]);
        exit(0);
    }

    char* protocol = argv[1];
    char* server_address = argv[2];

    hg_class = HG_Init(protocol, HG_FALSE);
    assert(hg_class != NULL);

    hg_context = HG_Context_create(hg_class);
    assert(hg_context != NULL);

    /* Register a RPC function.
     * The first two NULL correspond to what would be pointers to
     * serialization/deserialization functions for input and output datatypes
     * (not used in this example).
     * The third NULL is the pointer to the function (which is on the server,
     * so NULL here on the client).
     */
    hello_rpc_id = HG_Register_name(hg_class, "hello", NULL, NULL, NULL);

    /* Indicate Mercury that we shouldn't expect a response from the server
     * when calling this RPC.
     */
    HG_Registered_disable_response(hg_class, hello_rpc_id, HG_TRUE);

    /* Look up the address of the server. This is asynchronous, and the
     * result will be handled by lookup_callback once we start the progress loop.
     * The 3rd argument (NULL) is a pointer to user data to pass to
     * lookup_callback (we don't use any here).
     * The 4th argument is the address of the server.
     * The 5th argument is a pointer to a variable of type hg_op_id_t, which
     * identifies the operation. This identifier is useful if we want to be
     * able to cancel the operation with HG_Cancel. Here we don't use it, so
     * we pass HG_OP_ID_IGNORE.
     */
    ret = HG_Addr_lookup(hg_context, lookup_callback, NULL, server_address, HG_OP_ID_IGNORE);

    /* Main event loop: we do some progress until completed becomes TRUE. */
    while(!completed)
    {
        unsigned int count;
        do {
            ret = HG_Trigger(hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count && !completed);
        HG_Progress(hg_context, 100);
    }

    ret = HG_Context_destroy(hg_context);
    assert(ret == HG_SUCCESS);

    /* Finalize the hg_class. */
    hg_return_t err = HG_Finalize(hg_class);
    assert(err == HG_SUCCESS);
    return 0;
}

/*
 * This function is called when the address lookup operation has completed.
 */
hg_return_t lookup_callback(const struct hg_cb_info *callback_info)
{
    hg_return_t ret;

    /* First, check that the lookup went fine. */
    assert(callback_info->ret == 0);

    /* Get the address of the server. */
    hg_addr_t addr = callback_info->info.lookup.addr;

    /* Create a call to the hello_world RPC. */
    hg_handle_t handle;
    ret = HG_Create(hg_context, addr, hello_rpc_id, &handle);
    assert(ret == HG_SUCCESS);

    /* Send the RPC. The first NULL correspond to the callback
     * function to call when receiving the response from the server
     * (we don't expect a response, hence NULL here).
     * The second NULL is a pointer to user-specified data that will
     * be passed to the response callback.
     * The third NULL is a pointer to the RPC's argument (we don't
     * use any here).
     */
    ret = HG_Forward(handle, NULL, NULL, NULL);
    assert(ret == HG_SUCCESS);

    /* Free the handle */
    ret = HG_Destroy(handle);
    assert(ret == HG_SUCCESS);

    /* Set completed to 1 so we terminate the loop. */
    completed = 1;
    return HG_SUCCESS;
}

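The server registers the name hello and associates it with the RPC function hello_world; the client registers the name hello as well.

Just as on the server, we use HG_Register_name to register the RPC, this time passing NULL instead of a function pointer as the last argument. We also call HG_Registered_disable_response to indicate that the server will not send a response.

HG_Addr_lookup is used to look up the server's address. It takes a callback as its second argument. This callback must be a function that takes a const struct hg_cb_info* and returns a value of type hg_return_t. It is called when the address lookup completes.

Next, we enter a progress loop similar to the server's, because we are waiting for HG_Addr_lookup to complete. The provided callback is executed from inside HG_Trigger. In lookup_callback, we obtain the server's address from callback_info->info.lookup.addr. This address can be used to create an RPC instance with HG_Create and forward it with HG_Forward.

Since we do not expect any response, we can call HG_Destroy right away to destroy the RPC handle we just forwarded. We set completed to 1 to exit the progress loop in main.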
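Registering the same name on both sides only works if client and server end up with the same identifier for it. One way to picture this is a registry that derives the id from the name with a hash. The sketch below is a toy model of that idea: all names are hypothetical, and the FNV-1a hash is chosen purely for illustration and is not claimed to be the function Mercury uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define MAX_RPCS 8

typedef int (*toy_handler_t)(void);

typedef struct {
    uint32_t      id;
    toy_handler_t handler; /* NULL on the client side */
} toy_rpc_t;

typedef struct {
    toy_rpc_t rpcs[MAX_RPCS];
    int       count;
} toy_registry_t;

/* 32-bit FNV-1a hash of a string (chosen for illustration only). */
uint32_t toy_hash_name(const char *name)
{
    uint32_t hash = 2166136261u;
    for (const char *p = name; *p != '\0'; p++) {
        hash ^= (unsigned char)*p;
        hash *= 16777619u;
    }
    return hash;
}

/* Register a name with an optional handler and return the derived id. */
uint32_t toy_register_name(toy_registry_t *reg, const char *name, toy_handler_t handler)
{
    assert(reg->count < MAX_RPCS);
    uint32_t id = toy_hash_name(name);
    reg->rpcs[reg->count].id = id;
    reg->rpcs[reg->count].handler = handler;
    reg->count++;
    return id;
}

/* Look up the handler for an incoming id; NULL if unknown or not set. */
toy_handler_t toy_find_handler(const toy_registry_t *reg, uint32_t id)
{
    for (int i = 0; i < reg->count; i++)
        if (reg->rpcs[i].id == id)
            return reg->rpcs[i].handler;
    return NULL;
}

/* Example handler used on the "server" side. */
int toy_hello(void)
{
    return 42;
}
```

In this model the client sends only the id over the wire, and the server uses it to find the handler it registered under the same name.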

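Passing context

The previous tutorial used global static variables to make resources such as hg_context and hg_class accessible from callbacks. Global state of this kind is generally discouraged, so we will use local variables instead.

Client

On the client, we encapsulate the context in a client_data_t structure. By passing a pointer to this structure as the third argument of HG_Addr_lookup, we can recover it in the callback as callback_info->arg. This lets us pass hg_class, hg_context, and hello_rpc_id from main to lookup_callback.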

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

typedef struct {
    hg_class_t*   hg_class;
    hg_context_t* hg_context;
    hg_id_t       hello_rpc_id;
    int           completed;
} client_data_t;

hg_return_t lookup_callback(const struct hg_cb_info *callback_info);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 3) {
        printf("Usage: %s <protocol> <server_address>\n",argv[0]);
        printf("Example: %s tcp ofi+tcp://1.2.3.4:1234\n",argv[0]);
        exit(0);
    }

    client_data_t client_data = {
        .hg_class     = NULL,
        .hg_context   = NULL,
        .hello_rpc_id = 0,
        .completed    = 0
    };

    char* protocol = argv[1];
    char* server_address = argv[2];

    client_data.hg_class = HG_Init(protocol, HG_FALSE);
    assert(client_data.hg_class != NULL);

    client_data.hg_context = HG_Context_create(client_data.hg_class);
    assert(client_data.hg_context != NULL);

    client_data.hello_rpc_id = HG_Register_name(client_data.hg_class, "hello", NULL, NULL, NULL);

    HG_Registered_disable_response(client_data.hg_class, client_data.hello_rpc_id, HG_TRUE);

    /* We pass a pointer to the client's data as 3rd argument */
    ret = HG_Addr_lookup(client_data.hg_context, lookup_callback, &client_data, server_address, HG_OP_ID_IGNORE);

    while(!client_data.completed)
    {
        unsigned int count;
        do {
            ret = HG_Trigger(client_data.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count && !client_data.completed);
        HG_Progress(client_data.hg_context, 100);
    }

    ret = HG_Context_destroy(client_data.hg_context);
    assert(ret == HG_SUCCESS);

    hg_return_t err = HG_Finalize(client_data.hg_class);
    assert(err == HG_SUCCESS);
    return 0;
}

hg_return_t lookup_callback(const struct hg_cb_info *callback_info)
{
    hg_return_t ret;
    assert(callback_info->ret == 0);

    /* Get the client's data */
    client_data_t* client_data = (client_data_t*)(callback_info->arg);

    hg_addr_t addr = callback_info->info.lookup.addr;

    hg_handle_t handle;
    ret = HG_Create(client_data->hg_context, addr, client_data->hello_rpc_id, &handle);
    assert(ret == HG_SUCCESS);

    ret = HG_Forward(handle, NULL, NULL, NULL);
    assert(ret == HG_SUCCESS);

    ret = HG_Destroy(handle);
    assert(ret == HG_SUCCESS);

    client_data->completed = 1;
    return HG_SUCCESS;
}

Server

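On the server, we encapsulate our information in a server_data_t structure. We use HG_Register_data to attach a pointer to this structure to the RPC handler. The fourth argument, NULL here, is the function called to free the pointer when the RPC handler is deregistered; since our structure lives on the stack, we do not need to provide one. In the hello_world handler, we use HG_Get_info and HG_Registered_data to recover the pointer to the server_data_t structure.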

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>

typedef struct {
    hg_class_t*   hg_class;
    hg_context_t* hg_context;
    int           max_rpcs;
    int           num_rpcs;
} server_data_t;

hg_return_t hello_world(hg_handle_t h);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 2) {
        printf("Usage: %s <protocol>\n", argv[0]);
        exit(0);
    }

    server_data_t server_data = {
        .hg_class = NULL,
        .hg_context = NULL,
        .max_rpcs = 4,
        .num_rpcs = 0
    };

    server_data.hg_class = HG_Init(argv[1], HG_TRUE);
    assert(server_data.hg_class != NULL);

    char hostname[128];
    hg_size_t hostname_size = sizeof(hostname); /* in/out: buffer size */
    hg_addr_t self_addr;
    HG_Addr_self(server_data.hg_class, &self_addr);
    HG_Addr_to_string(server_data.hg_class, hostname, &hostname_size, self_addr);
    printf("Server running at address %s\n",hostname);
    HG_Addr_free(server_data.hg_class, self_addr);

    server_data.hg_context = HG_Context_create(server_data.hg_class);
    assert(server_data.hg_context != NULL);

    hg_id_t rpc_id = HG_Register_name(server_data.hg_class, "hello", NULL, NULL, hello_world);

    /* Register data with the RPC handler */
    HG_Register_data(server_data.hg_class, rpc_id, &server_data, NULL);

    HG_Registered_disable_response(server_data.hg_class, rpc_id, HG_TRUE);

    do
    {
        unsigned int count;
        do {
            ret = HG_Trigger(server_data.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count);
        HG_Progress(server_data.hg_context, 100);
    } while(server_data.num_rpcs < server_data.max_rpcs);

    ret = HG_Context_destroy(server_data.hg_context);
    assert(ret == HG_SUCCESS);

    ret = HG_Finalize(server_data.hg_class);
    assert(ret == HG_SUCCESS);

    return 0;
}

/* Implementation of the hello_world RPC. */
hg_return_t hello_world(hg_handle_t h)
{
    hg_return_t ret;

    /* Get the hg_class_t instance from the handle */
    const struct hg_info *info = HG_Get_info(h);
    hg_class_t* hg_class = info->hg_class;
    hg_id_t     rpc_id   = info->id;

    /* Get the data attached to the RPC handle */
    server_data_t* server_data = (server_data_t*)HG_Registered_data(hg_class, rpc_id);

    printf("Hello World!\n");
    server_data->num_rpcs += 1;

    ret = HG_Destroy(h);
    assert(ret == HG_SUCCESS);
    return HG_SUCCESS;
}
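The server above passes NULL as the free callback because its data lives on the stack. When the attached data is heap-allocated instead, a free callback becomes useful. The sketch below models that case with a toy registry slot. It is not Mercury code: toy_rpc_slot_t, toy_register_data, toy_registered_data, and toy_deregister are hypothetical names, and releasing previously attached data on re-registration is a choice made by this sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_free_fn_t)(void *data);

/* One registered RPC with its attached user data (hypothetical type). */
typedef struct {
    void         *data;
    toy_free_fn_t free_fn; /* may be NULL when the data needs no cleanup */
} toy_rpc_slot_t;

/* Attach data to the slot, releasing any previously attached data first. */
void toy_register_data(toy_rpc_slot_t *slot, void *data, toy_free_fn_t free_fn)
{
    assert(slot != NULL);
    if (slot->data != NULL && slot->free_fn != NULL)
        slot->free_fn(slot->data);
    slot->data = data;
    slot->free_fn = free_fn;
}

/* Retrieve the attached data, as a handler would. */
void *toy_registered_data(const toy_rpc_slot_t *slot)
{
    return slot->data;
}

/* Called when the RPC is deregistered: run the free callback if any. */
void toy_deregister(toy_rpc_slot_t *slot)
{
    if (slot->data != NULL && slot->free_fn != NULL)
        slot->free_fn(slot->data);
    slot->data = NULL;
    slot->free_fn = NULL;
}

typedef struct {
    int max_rpcs;
    int num_rpcs;
} toy_server_data_t;

/* Heap-allocate the server data so that it needs a free callback. */
toy_server_data_t *toy_server_data_new(int max_rpcs)
{
    toy_server_data_t *d = calloc(1, sizeof(*d));
    if (d != NULL)
        d->max_rpcs = max_rpcs;
    return d;
}
```

Passing the standard free function as the callback is enough here because toy_server_data_t owns no other resources.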

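RPC arguments and return values

In the previous tutorials, we did not pass any data to the RPC handler or return any from it. In this tutorial, we look at how to send data to an RPC as arguments and how to return data from it. Our example is an RPC that computes the sum of two numbers sent by the client.

Input/output structures

First, we need to declare the types of the RPC's arguments and return value. This is done with the Mercury macros from the mercury_macros.h header, as shown below.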

types.h

#ifndef PARAM_H
#define PARAM_H

#include <mercury.h>
#include <mercury_macros.h>

MERCURY_GEN_PROC(sum_in_t,
        ((int32_t)(x))\
        ((int32_t)(y)))

MERCURY_GEN_PROC(sum_out_t, ((int32_t)(ret)))

#endif
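MERCURY_GEN_PROC declares the structure together with a proc function that processes each field in order, and the same function serves for both encoding and decoding. The sketch below imitates that pattern with the C standard library only. The toy_proc_t type and the toy_* functions are hypothetical stand-ins rather than the code the macro generates, and the sketch copies bytes in host order, so it ignores byte-order differences between machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum { TOY_ENCODE, TOY_DECODE } toy_op_t;

/* A cursor over a byte buffer plus the direction of the operation. */
typedef struct {
    toy_op_t       op;
    unsigned char *buf;
    size_t         size;   /* total buffer size */
    size_t         offset; /* bytes consumed so far */
} toy_proc_t;

/* Encode or decode one int32_t depending on proc->op. Returns 0 on success. */
int toy_proc_int32(toy_proc_t *proc, int32_t *value)
{
    assert(proc->offset <= proc->size);
    if (proc->size - proc->offset < sizeof(*value))
        return -1; /* not enough room left in the buffer */
    if (proc->op == TOY_ENCODE)
        memcpy(proc->buf + proc->offset, value, sizeof(*value));
    else
        memcpy(value, proc->buf + proc->offset, sizeof(*value));
    proc->offset += sizeof(*value);
    return 0;
}

typedef struct {
    int32_t x;
    int32_t y;
} toy_sum_in_t;

/* One function handles both directions by processing each field in order. */
int toy_proc_sum_in(toy_proc_t *proc, void *data)
{
    toy_sum_in_t *in = data;
    if (toy_proc_int32(proc, &in->x) != 0)
        return -1;
    return toy_proc_int32(proc, &in->y);
}
```

Because encoding and decoding share one function, the field order can never drift apart between the sender and the receiver.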

Client code

The following code looks up the server's address, then sends an RPC to the server.

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>
#include "types.h"

typedef struct {
    hg_class_t*   hg_class;
    hg_context_t* hg_context;
    hg_id_t       sum_rpc_id;
    int           completed;
} client_state_t;

hg_return_t lookup_callback(const struct hg_cb_info *callback_info);
hg_return_t sum_completed(const struct hg_cb_info *info);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 3) {
        printf("Usage: %s <protocol> <server_address>\n",argv[0]);
        printf("Example: %s tcp tcp://1.2.3.4:1234\n",argv[0]);
        exit(0);
    }
    char* protocol = argv[1];
    char* server_address = argv[2];

    client_state_t state;
    state.completed = 0;

    state.hg_class = HG_Init(protocol, HG_FALSE);
    assert(state.hg_class != NULL);

    state.hg_context = HG_Context_create(state.hg_class);
    assert(state.hg_context != NULL);

    state.sum_rpc_id = MERCURY_REGISTER(state.hg_class, "sum", sum_in_t, sum_out_t, NULL);

    ret = HG_Addr_lookup(state.hg_context, lookup_callback, &state, server_address, HG_OP_ID_IGNORE);

    while(!state.completed)
    {
        unsigned int count;
        do {
            ret = HG_Trigger(state.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count && !state.completed);
        HG_Progress(state.hg_context, 100);
    }

    ret = HG_Context_destroy(state.hg_context);
    assert(ret == HG_SUCCESS);

    hg_return_t err = HG_Finalize(state.hg_class);
    assert(err == HG_SUCCESS);
    return 0;
}


hg_return_t lookup_callback(const struct hg_cb_info *callback_info)
{
    hg_return_t ret;

    /* We get the pointer to the client_state_t here. */
    client_state_t* state = (client_state_t*)(callback_info->arg);

    assert(callback_info->ret == 0);
    hg_addr_t addr = callback_info->info.lookup.addr;

    hg_handle_t handle;
    ret = HG_Create(state->hg_context, addr, state->sum_rpc_id, &handle);
    assert(ret == HG_SUCCESS);

    sum_in_t in;
    in.x = 42;
    in.y = 23;

    ret = HG_Forward(handle, sum_completed, state, &in);
    assert(ret == HG_SUCCESS);

    ret = HG_Addr_free(state->hg_class, addr);
    assert(ret == HG_SUCCESS);

    return HG_SUCCESS;
}

hg_return_t sum_completed(const struct hg_cb_info *info)
{
    hg_return_t ret;

    client_state_t* state = (client_state_t*)(info->arg);

    sum_out_t out;
    assert(info->ret == HG_SUCCESS);

    ret = HG_Get_output(info->info.forward.handle, &out);
    assert(ret == HG_SUCCESS);

    printf("Got response: %d\n", out.ret);

    ret = HG_Free_output(info->info.forward.handle, &out);
    assert(ret == HG_SUCCESS);

    ret = HG_Destroy(info->info.forward.handle);
    assert(ret == HG_SUCCESS);

    state->completed = 1;

    return HG_SUCCESS;
}

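The main difference from the previous tutorial is that we pass HG_Forward a pointer to a sum_in_t structure along with a completion callback, sum_completed, which is called when the server responds. In this callback, HG_Get_output retrieves the output data sent by the server. We must call HG_Free_output to release the output once we are done with it. Note also that HG_Destroy is now called in the completion callback rather than right after HG_Forward.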

Server code 

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mercury.h>
#include "types.h"

typedef struct {
    hg_class_t*     hg_class;
    hg_context_t*   hg_context;
    int             num_rpcs;
} server_state;

static const int TOTAL_RPCS = 10;

hg_return_t sum(hg_handle_t h);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 2) {
        printf("Usage: %s <server address>\n", argv[0]);
        exit(0);
    }

    const char* server_address = argv[1];

    server_state state; // Instance of the server's state
    state.num_rpcs = 0;

    state.hg_class = HG_Init(server_address, HG_TRUE);
    assert(state.hg_class != NULL);

    char hostname[128];
    hg_size_t hostname_size = sizeof(hostname); /* in/out: buffer size */
    hg_addr_t self_addr;
    HG_Addr_self(state.hg_class, &self_addr);
    HG_Addr_to_string(state.hg_class, hostname, &hostname_size, self_addr);
    printf("Server running at address %s\n",hostname);
    HG_Addr_free(state.hg_class, self_addr);

    state.hg_context = HG_Context_create(state.hg_class);
    assert(state.hg_context != NULL);

    hg_id_t rpc_id = MERCURY_REGISTER(state.hg_class, "sum", sum_in_t, sum_out_t, sum);

    /* Attach the local server_state to the RPC so we can get a pointer to it when
     * the RPC is invoked. */
    ret = HG_Register_data(state.hg_class, rpc_id, &state, NULL);

    do
    {
        unsigned int count;
        do {
            ret = HG_Trigger(state.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count);

        HG_Progress(state.hg_context, 100);
    } while(state.num_rpcs < TOTAL_RPCS);

    ret = HG_Context_destroy(state.hg_context);
    assert(ret == HG_SUCCESS);

    ret = HG_Finalize(state.hg_class);
    assert(ret == HG_SUCCESS);

    return 0;
}

hg_return_t sum(hg_handle_t handle)
{
    hg_return_t ret;
    sum_in_t in;
    sum_out_t out;

    const struct hg_info* info = HG_Get_info(handle);
    server_state* state = HG_Registered_data(info->hg_class, info->id);

    ret = HG_Get_input(handle, &in);
    assert(ret == HG_SUCCESS);

    out.ret = in.x + in.y;
    printf("%d + %d = %d\n",in.x,in.y,in.x+in.y);
    state->num_rpcs += 1;

    ret = HG_Respond(handle,NULL,NULL,&out);
    assert(ret == HG_SUCCESS);

    ret = HG_Free_input(handle, &in);
    assert(ret == HG_SUCCESS);
    ret = HG_Destroy(handle);
    assert(ret == HG_SUCCESS);

    return HG_SUCCESS;
}

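On the server, we use HG_Get_input to deserialize the input data into a sum_in_t structure. HG_Respond takes a pointer to a sum_out_t object and sends it back to the client as the response. Once we are done with the input data, we release it with HG_Free_input.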

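RDMA transfers

Mercury can use RDMA to transfer large amounts of data. In this tutorial, we demonstrate this feature by transferring the contents of a file from a client to a server.

Input/output structures

As in the previous example, we need to define the structures used for the RPC's input and output. They are as follows.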

types.h

#ifndef PARAM_H
#define PARAM_H

#include <mercury.h>
#include <mercury_bulk.h>
#include <mercury_proc_string.h>
#include <mercury_macros.h>

MERCURY_GEN_PROC(save_in_t,
    ((hg_string_t)(filename))\
    ((hg_size_t)(size))\
    ((hg_bulk_t)(bulk_handle)))

MERCURY_GEN_PROC(save_out_t, ((int32_t)(ret)))

#endif

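The client sends the file name (hg_string_t), the file size (hg_size_t), and a bulk handle that represents the memory region exposed by the client and containing the file's contents. The server simply responds with an integer indicating whether the operation succeeded.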

Client code

The client code follows.

#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <mercury.h>
#include "types.h"

typedef struct {
    hg_class_t*   hg_class;
    hg_context_t* hg_context;
    hg_id_t       save_rpc_id;
    int           completed;
} client_state;

typedef struct {
    client_state*   state;
    hg_bulk_t       bulk_handle;
    void*           buffer;
    size_t          size;
    char*           filename;
} save_operation;

hg_return_t lookup_callback(const struct hg_cb_info *callback_info);
hg_return_t save_completed(const struct hg_cb_info *info);

int main(int argc, char** argv)
{
    if(argc != 4) {
        fprintf(stderr,"Usage: %s <protocol> <server address> <filename>\n", argv[0]);
        exit(0);
    }

    hg_return_t ret;

    const char* protocol = argv[1];

    /* Local instance of the client_state. */
    client_state state;
    state.completed = 0;
    // Initialize an hg_class.
    state.hg_class = HG_Init(protocol, HG_FALSE);
    assert(state.hg_class != NULL);

    // Creates a context for the hg_class.
    state.hg_context = HG_Context_create(state.hg_class);
    assert(state.hg_context != NULL);

    // Register a RPC function
    state.save_rpc_id = MERCURY_REGISTER(state.hg_class, "save", save_in_t, save_out_t, NULL);

    // Create the save_operation structure
    save_operation save_op;
    save_op.state = &state;
    save_op.filename = argv[3];
    if(access(save_op.filename, F_OK) == -1) {
        fprintf(stderr,"File %s doesn't exist or cannot be accessed.\n",save_op.filename);
        exit(-1);
    } 

    char* server_address = argv[2];
    ret = HG_Addr_lookup(state.hg_context, lookup_callback, &save_op, server_address, HG_OP_ID_IGNORE);

    // Main event loop
    while(!state.completed)
    {
        unsigned int count;
        do {
            ret = HG_Trigger(state.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count && !state.completed);
        HG_Progress(state.hg_context, 100);
    }

    // Destroy the context
    ret = HG_Context_destroy(state.hg_context);
    assert(ret == HG_SUCCESS);

    // Finalize the hg_class.
    hg_return_t err = HG_Finalize(state.hg_class);
    assert(err == HG_SUCCESS);
    return 0;
}


hg_return_t lookup_callback(const struct hg_cb_info *callback_info)
{
    hg_return_t ret;

    assert(callback_info->ret == 0);

    /* We get the pointer to the client_state here. */
    save_operation* save_op = (save_operation*)(callback_info->arg);
    client_state* state = save_op->state;

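    /* Read the whole file into a buffer (binary mode, with basic checks). */
    FILE* file = fopen(save_op->filename,"rb");
    assert(file != NULL);
    fseek(file, 0L, SEEK_END);
    long file_size = ftell(file);
    assert(file_size >= 0);
    save_op->size = (size_t)file_size;
    fseek(file, 0L, SEEK_SET);
    /* Allocate at least one byte so that an empty file yields a valid pointer. */
    save_op->buffer = calloc(1, save_op->size ? save_op->size : 1);
    assert(save_op->buffer != NULL);
    size_t bytes_read = fread(save_op->buffer,1,save_op->size,file);
    assert(bytes_read == save_op->size);
    fclose(file);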

    hg_addr_t addr = callback_info->info.lookup.addr;
    hg_handle_t handle;
    ret = HG_Create(state->hg_context, addr, state->save_rpc_id, &handle);
    assert(ret == HG_SUCCESS);

    save_in_t in;
    in.filename = save_op->filename;
    in.size     = save_op->size; 

    ret = HG_Bulk_create(state->hg_class, 1, (void**) &(save_op->buffer), &(save_op->size),
            HG_BULK_READ_ONLY, &(save_op->bulk_handle));
    assert(ret == HG_SUCCESS);
    in.bulk_handle = save_op->bulk_handle;

    /* The state pointer is passed along as user argument. */
    ret = HG_Forward(handle, save_completed, save_op, &in);
    assert(ret == HG_SUCCESS);

    /* Free the address. */
    ret = HG_Addr_free(state->hg_class, addr);
    assert(ret == HG_SUCCESS);

    return HG_SUCCESS;
}

hg_return_t save_completed(const struct hg_cb_info *info)
{
    hg_return_t ret;

    /* Get the state pointer from the user-provided arguments. */
    save_operation* save_op = (save_operation*)(info->arg);
    client_state* state = (client_state*)(save_op->state);

    save_out_t out;
    assert(info->ret == HG_SUCCESS);

    ret = HG_Get_output(info->info.forward.handle, &out);
    assert(ret == HG_SUCCESS);

    printf("Got response: %d\n", out.ret);

    ret = HG_Bulk_free(save_op->bulk_handle);
    assert(ret == HG_SUCCESS);

    /* The buffer is no longer exposed, so it can be released. */
    free(save_op->buffer);

    ret = HG_Free_output(info->info.forward.handle, &out);
    assert(ret == HG_SUCCESS);

    ret = HG_Destroy(info->info.forward.handle);
    assert(ret == HG_SUCCESS);

    state->completed = 1;

    return HG_SUCCESS;
}

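We define a save_operation structure to hold information about the operation in progress. A pointer to this structure is passed to the callbacks as the user-provided argument. In the lookup callback, we open the file and read its contents into a buffer. We then use HG_Bulk_create to expose the buffer for RDMA operations. This gives us an hg_bulk_t object that can be sent to the server through the RPC. Once the RPC completes and the response is received, HG_Bulk_free releases the hg_bulk_t object.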
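Reading the file into a buffer can also be factored into a helper that reports failures through its return value instead of assertions. The sketch below shows one way to do this with the C standard library. The name read_whole_file and its signature are hypothetical and do not come from the tutorial.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read the whole file at path into a newly allocated buffer.
 * On success returns 0 and sets *out_buf and *out_size; the caller
 * frees *out_buf. On failure returns -1 and leaves the outputs untouched.
 */
int read_whole_file(const char *path, void **out_buf, size_t *out_size)
{
    assert(path != NULL && out_buf != NULL && out_size != NULL);

    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;

    long length = -1;
    if (fseek(file, 0L, SEEK_END) == 0)
        length = ftell(file);
    if (length < 0 || fseek(file, 0L, SEEK_SET) != 0) {
        fclose(file);
        return -1;
    }

    /* Allocate at least one byte so that an empty file still yields a valid pointer. */
    void *buf = malloc(length > 0 ? (size_t)length : 1);
    if (buf == NULL) {
        fclose(file);
        return -1;
    }

    size_t bytes_read = fread(buf, 1, (size_t)length, file);
    fclose(file);
    if (bytes_read != (size_t)length) {
        free(buf);
        return -1;
    }

    *out_buf = buf;
    *out_size = bytes_read;
    return 0;
}
```

A caller would pass the resulting buffer and size to HG_Bulk_create and free the buffer once the bulk handle has been released.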

Server code

The following code corresponds to the server.

server.c

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strdup */
#include <mercury.h>
#include "types.h"

/* This structure will encapsulate data about the server. */
typedef struct {
    hg_class_t*     hg_class;
    hg_context_t*   hg_context;
} server_state;

typedef struct {
    char*       filename;
    hg_size_t   size;
    void*       buffer;
    hg_bulk_t   bulk_handle;
    hg_handle_t handle;
} rpc_state;

static hg_return_t save_bulk_completed(const struct hg_cb_info *info);
static hg_return_t save(hg_handle_t h);

int main(int argc, char** argv)
{
    hg_return_t ret;

    if(argc != 2) {
        printf("Usage: %s <server address>\n", argv[0]);
        exit(0);
    }

    const char* server_address = argv[1];

    server_state state; // Instance of the server's state

    state.hg_class = HG_Init(server_address, HG_TRUE);
    assert(state.hg_class != NULL);

    /* Get the address of the server */
    char hostname[128];
    hg_size_t hostname_size = sizeof(hostname); /* in/out: buffer size */
    hg_addr_t self_addr;
    HG_Addr_self(state.hg_class,&self_addr);
    HG_Addr_to_string(state.hg_class, hostname, &hostname_size, self_addr);
    printf("Server running at address %s\n",hostname);
    HG_Addr_free(state.hg_class, self_addr);

    state.hg_context = HG_Context_create(state.hg_class);
    assert(state.hg_context != NULL);

    hg_id_t rpc_id = MERCURY_REGISTER(state.hg_class, "save", save_in_t, save_out_t, save);

    /* Attach the local server_state to the RPC so we can get a pointer to it when
     * the RPC is invoked. */
    ret = HG_Register_data(state.hg_class, rpc_id, &state, NULL);

    do
    {
        unsigned int count;
        do {
            ret = HG_Trigger(state.hg_context, 0, 1, &count);
        } while((ret == HG_SUCCESS) && count);

        HG_Progress(state.hg_context, 100);
    } while(1);

    ret = HG_Context_destroy(state.hg_context);
    assert(ret == HG_SUCCESS);

    ret = HG_Finalize(state.hg_class);
    assert(ret == HG_SUCCESS);

    return 0;
}

hg_return_t save(hg_handle_t handle)
{
    hg_return_t ret;
    save_in_t in;

    // Get the server_state attached to the RPC.
    const struct hg_info* info = HG_Get_info(handle);
    server_state* stt = HG_Registered_data(info->hg_class, info->id);

    ret = HG_Get_input(handle, &in);
    assert(ret == HG_SUCCESS);

    rpc_state* my_rpc_state = (rpc_state*)calloc(1,sizeof(rpc_state));
    my_rpc_state->handle = handle;
    my_rpc_state->filename = strdup(in.filename);
    my_rpc_state->size = in.size;
    my_rpc_state->buffer = calloc(1,in.size);

    ret = HG_Bulk_create(stt->hg_class, 1, &(my_rpc_state->buffer),
            &(my_rpc_state->size), HG_BULK_WRITE_ONLY, &(my_rpc_state->bulk_handle));
    assert(ret == HG_SUCCESS);

    /* initiate bulk transfer from client to server */
    ret = HG_Bulk_transfer(stt->hg_context, save_bulk_completed,
            my_rpc_state, HG_BULK_PULL, info->addr, in.bulk_handle, 0,
            my_rpc_state->bulk_handle, 0, my_rpc_state->size, HG_OP_ID_IGNORE);
    assert(ret == HG_SUCCESS);

    ret = HG_Free_input(handle, &in);
    assert(ret == HG_SUCCESS);
    return HG_SUCCESS;
}

hg_return_t save_bulk_completed(const struct hg_cb_info *info)
{
    assert(info->ret == 0);

    rpc_state* my_rpc_state = info->arg;
    hg_return_t ret;

    FILE* f = fopen(my_rpc_state->filename,"w+");
    assert(f != NULL); /* fopen can fail, e.g. on a bad path or missing permissions */
    fwrite(my_rpc_state->buffer, 1, my_rpc_state->size, f);
    fclose(f);

    printf("Writing file %s\n", my_rpc_state->filename);

    save_out_t out;
    out.ret = 0;

    ret = HG_Respond(my_rpc_state->handle, NULL, NULL, &out);
    assert(ret == HG_SUCCESS);
    (void)ret;

    HG_Bulk_free(my_rpc_state->bulk_handle);
    HG_Destroy(my_rpc_state->handle);
    free(my_rpc_state->filename);
    free(my_rpc_state->buffer);
    free(my_rpc_state);

    return HG_SUCCESS;
}

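On the server, the rpc_state structure keeps track of information about an operation in progress. In particular, it holds the hg_handle_t object of the ongoing RPC and the hg_bulk_t object of the local buffer exposed to receive the data.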

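When an RPC arrives, we enter the save callback. This function allocates a local buffer to receive the data and exposes it with HG_Bulk_create. We then issue the RDMA operation with HG_Bulk_transfer, specifying an operation of type HG_BULK_PULL and passing save_bulk_completed as the callback to invoke once the RDMA operation has completed.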

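Note that HG_Bulk_transfer returns immediately, before the RDMA operation has completed. The save callback then returns, the Mercury progress loop keeps running, and save_bulk_completed is eventually called when the RDMA operation completes.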

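Note also that we do not respond to the client in the save callback but in the save_bulk_completed callback, so the save callback does not destroy the RPC's hg_handle_t object. The handle is kept in rpc_state and destroyed in save_bulk_completed.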
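The deferred-response pattern described above can be sketched without Mercury at all. The following is a minimal simulation, not Mercury code: toy_save_handler, toy_transfer_completed, toy_progress and the pending queue are hypothetical names invented for this illustration. The handler only allocates per-RPC state and queues the operation; the completion callback, run later by the toy progress loop, is the one that "responds" and frees the state.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Mercury concepts; none of these are Mercury APIs. */
typedef void (*completion_cb_t)(void *arg);

typedef struct {
    completion_cb_t cb;   /* callback to run when the transfer completes */
    void           *arg;  /* per-RPC state handed back to the callback */
} pending_op_t;

#define MAX_PENDING 8
static pending_op_t pending[MAX_PENDING];
static int          n_pending   = 0;
static int          n_responses = 0;

/* Per-RPC state, playing the role of rpc_state in the tutorial. */
typedef struct {
    int   rpc_id;
    char *buffer;
} toy_rpc_state;

/* Completion callback: respond, then release what the handler allocated. */
static void toy_transfer_completed(void *arg)
{
    toy_rpc_state *st = arg;
    assert(st != NULL);
    printf("rpc %d: transfer done, responding\n", st->rpc_id);
    n_responses++;
    free(st->buffer);
    free(st);
}

/* Handler: allocate state, queue the transfer, return WITHOUT responding. */
static int toy_save_handler(int rpc_id, size_t size)
{
    if (n_pending == MAX_PENDING) return -1;
    toy_rpc_state *st = calloc(1, sizeof(*st));
    if (st == NULL) return -1;
    st->rpc_id = rpc_id;
    st->buffer = calloc(1, size);
    if (st->buffer == NULL) { free(st); return -1; }
    pending[n_pending].cb  = toy_transfer_completed;
    pending[n_pending].arg = st;
    n_pending++;
    return 0;
}

/* One pass of a toy progress loop: complete every queued operation. */
static int toy_progress(void)
{
    int completed = 0;
    for (int i = 0; i < n_pending; i++) {
        pending[i].cb(pending[i].arg);
        completed++;
    }
    n_pending = 0;
    return completed;
}
```

Calling toy_save_handler twice and then toy_progress once runs both completion callbacks. No response is counted until the progress pass, which is the same ordering as save followed by save_bulk_completed.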

Ah, callbacks... (now you can see how much easier Margo and Thallium make things).

Serializing complex data structures

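Let's come back to serializing and deserializing data structures. In the previous tutorials, we only used structures that can be defined with Mercury's MERCURY_GEN_PROC macro. Things get more complicated when a structure contains pointers. Suppose we have a type int_list_t that is a pointer to a linked list of integers.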

typedef struct int_list {
   int32_t          value;
   struct int_list* next;
} *int_list_t;

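We need to define a function hg_return_t hg_proc_int_list_t(hg_proc_t proc, void *data). More generally, for any custom type X that we want to send or receive and that was not created with the Mercury macros, we need a function of the form hg_return_t hg_proc_X(hg_proc_t proc, void *data). In our case, this function looks as follows.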

types.h
#ifndef __TYPES_H
#define __TYPES_H

#include <stdlib.h>  /* calloc, free */
#include <mercury.h>

typedef struct int_list {
    int32_t          value;
    struct int_list* next;
} *int_list_t;

static inline hg_return_t hg_proc_int_list_t(hg_proc_t proc, void* data)
{
    hg_return_t ret = HG_SUCCESS; /* defined even if the switch matches no case */
    int_list_t* list = (int_list_t*)data;

    hg_size_t length = 0;
    int_list_t tmp   = NULL;
    int_list_t prev  = NULL;

    switch(hg_proc_get_op(proc)) {

        case HG_ENCODE:
            tmp = *list;
            // find out the length of the list
            while(tmp != NULL) {
                tmp = tmp->next;
                length += 1;
            }
            // write the length
            ret = hg_proc_hg_size_t(proc, &length);
            if(ret != HG_SUCCESS)
                break;
            // write the list
            tmp = *list;
            while(tmp != NULL) {
                ret = hg_proc_int32_t(proc, &tmp->value);
                if(ret != HG_SUCCESS)
                    break;
                tmp = tmp->next;
            }
            break;

        case HG_DECODE:
            // find out the length of the list
            ret = hg_proc_hg_size_t(proc, &length);
            if(ret != HG_SUCCESS)
                break;
            // loop and create list elements
            *list = NULL;
            while(length > 0) {
                tmp = (int_list_t)calloc(1, sizeof(*tmp));
                if(*list == NULL) {
                    *list = tmp;
                }
                if(prev != NULL) {
                    prev->next = tmp;
                }
                ret = hg_proc_int32_t(proc, &tmp->value);
                if(ret != HG_SUCCESS)
                    break;
                prev = tmp;
                length -= 1;
            }
            break;

        case HG_FREE:
            tmp = *list;
            while(tmp != NULL) {
                prev = tmp;
                tmp  = prev->next;
                free(prev);
            }
            ret = HG_SUCCESS;
    }
    return ret;
}

#endif

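Every proc function must contain three parts, selected by a switch statement. The HG_ENCODE part is used when the proc handle serializes an existing object into a buffer. The HG_DECODE part is used when the proc handle creates a new object from the contents of its buffer. The HG_FREE part is used when the object is freed, for example when HG_Free_input or HG_Free_output is called. Note that the type we are handling here is int_list_t, so the void* data argument is actually a pointer to an int_list_t, which is itself a pointer to a structure. We use the hg_proc_int32_t and hg_proc_hg_size_t functions to serialize and deserialize int32_t and hg_size_t values, respectively. Mercury defines such functions for most basic data types. To serialize or deserialize raw memory, you can use hg_proc_raw(hg_proc_t proc, void* data, hg_size_t size), which copies size bytes of the memory pointed to by data.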
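To make the encode/decode/free structure concrete without requiring a Mercury installation, here is a small self-contained sketch. It is not Mercury code: toy_proc_t, toy_proc_raw, toy_proc_string and toy_round_trip are hypothetical names, and the fixed 256-byte buffer and host-endian length prefix are simplifying assumptions. toy_proc_raw plays a role loosely similar to hg_proc_raw, and toy_proc_string shows the same three-way switch for a length-prefixed, heap-allocated string.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TOY_ENCODE, TOY_DECODE, TOY_FREE } toy_op_t;

/* Hypothetical proc handle: an operation mode plus a fixed-size buffer. */
typedef struct {
    toy_op_t      op;
    unsigned char buf[256];
    size_t        pos;   /* read/write cursor into buf */
} toy_proc_t;

/* Copy `size` raw bytes into or out of the buffer, depending on the mode. */
static int toy_proc_raw(toy_proc_t *p, void *data, size_t size)
{
    if (p->op == TOY_FREE) return 0;                 /* nothing to copy */
    if (size > sizeof(p->buf) - p->pos) return -1;   /* would overflow */
    if (p->op == TOY_ENCODE) memcpy(p->buf + p->pos, data, size);
    else                     memcpy(data, p->buf + p->pos, size);
    p->pos += size;
    return 0;
}

typedef struct {
    uint32_t len;   /* number of characters, excluding the terminator */
    char    *text;  /* heap-allocated, NUL-terminated */
} toy_string_t;

/* Same three-part shape as hg_proc_int_list_t: encode, decode, free. */
static int toy_proc_string(toy_proc_t *p, toy_string_t *s)
{
    switch (p->op) {
    case TOY_ENCODE:  /* write the length, then the characters */
        if (toy_proc_raw(p, &s->len, sizeof(s->len)) != 0) return -1;
        return toy_proc_raw(p, s->text, s->len);
    case TOY_DECODE:  /* read the length, allocate, then read the characters */
        if (toy_proc_raw(p, &s->len, sizeof(s->len)) != 0) return -1;
        s->text = calloc((size_t)s->len + 1, 1);
        if (s->text == NULL) return -1;
        if (toy_proc_raw(p, s->text, s->len) != 0) {
            free(s->text);
            s->text = NULL;
            return -1;
        }
        return 0;
    case TOY_FREE:    /* release what TOY_DECODE (or the caller) allocated */
        free(s->text);
        s->text = NULL;
        s->len  = 0;
        return 0;
    }
    return -1;
}

/* Encode `in`, rewind, decode into a fresh object; returns 1 if they match. */
static int toy_round_trip(const char *in)
{
    toy_proc_t   p    = { .op = TOY_ENCODE };
    toy_string_t src  = { (uint32_t)strlen(in), NULL };
    toy_string_t dst  = { 0, NULL };
    int          same = 0;

    src.text = calloc((size_t)src.len + 1, 1);
    if (src.text == NULL) return 0;
    memcpy(src.text, in, src.len);

    if (toy_proc_string(&p, &src) == 0) {
        p.op  = TOY_DECODE;
        p.pos = 0;
        if (toy_proc_string(&p, &dst) == 0)
            same = (dst.len == src.len) && (strcmp(dst.text, src.text) == 0);
    }
    p.op = TOY_FREE;
    toy_proc_string(&p, &src);
    toy_proc_string(&p, &dst);
    return same;
}
```

As in the int_list_t example, the length is written first so that the decode side knows how much memory to allocate before it reads the payload.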

Building the Mercury library

mercury-hpc/mercury: Mercury is a C library for implementing RPC, optimized for HPC. (github.com)

Some of the instructions from that repository:

git clone --recursive https://github.com/mercury-hpc/mercury.git

Once the download finishes, cd into the source directory and run:

cd mercury-X
mkdir build
cd build
ccmake ..

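A text UI then appears. Press c to have CMake generate its cache; ccmake then reads the cache and displays the list below. In other words, ccmake lists all of the project's CMake configuration options so that you can see them at a glance: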

BUILD_SHARED_LIBS                ON (or OFF if the library you link
                                 against requires static libraries)
BUILD_TESTING                    ON/OFF
Boost_INCLUDE_DIR                /path/to/include/directory
CMAKE_INSTALL_PREFIX             /path/to/install/directory
MERCURY_ENABLE_DEBUG             ON/OFF
MERCURY_TESTING_ENABLE_PARALLEL  ON/OFF
MERCURY_USE_BOOST_PP             ON
MERCURY_USE_CHECKSUMS            ON/OFF
MERCURY_USE_SYSTEM_BOOST         ON/OFF
MERCURY_USE_SYSTEM_MCHECKSUM     ON/OFF
MERCURY_USE_XDR                  OFF
NA_USE_BMI                       ON/OFF
NA_USE_MPI                       ON/OFF
NA_USE_OFI                       ON/OFF
NA_USE_PSM                       ON/OFF
NA_USE_PSM2                      ON/OFF
NA_USE_SM                        ON/OFF
NA_USE_UCX                       ON/OFF

Press Enter to toggle an option, and press t to show more options, such as dependency library paths and header paths.

Then press g to generate the Makefile; after that you can build with make and install with make install.

Understanding the RPC and ULT model

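Notes:

  • Margo is a C library that helps develop distributed services based on RPC and RDMA (it sits on top of the Mercury layer). https://mochi.readthedocs.io/en/latest/margo.html
  • Mochi is a software-defined storage project (A Software Defined Storage Approach to Exascale Storage Services).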

When developing Mochi services with Margo or Thallium, it helps to keep in mind how an RPC is turned into a user-level thread (ULT) when it reaches the server.

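The figure below summarizes what happens when a client sends an RPC to a server and the server-side RPC handler includes some RDMA operations. The figure shows only one execution stream on the client, assuming Margo (or Thallium) was initialized without a Mercury progress thread. The case with a Mercury progress thread on the client is similar, since the progress thread simply handles network activity on behalf of the caller thread.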

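The figure shows a client using margo_iforward, which sends an RPC to the server in a non-blocking manner. The margo_forward case can be seen as the same scenario with margo_wait called immediately after margo_iforward. In Thallium, the equivalent code would use the async member function of a callable_remote_procedure object.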

Understanding the RPC and ULT model — Mochi documentation

Explanation

https://mochi.readthedocs.io/en/latest/general/03_rpc_model.html

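margo_forward and margo_iforward first call the serialization function (provided by the user when registering the RPC with MARGO_REGISTER) to serialize the RPC arguments into an input buffer. Mercury then sends a request containing this buffer to the server.

On the server, the Mercury progress loop (which may run on a dedicated execution stream) eventually sees the request and invokes the corresponding callback (yellow in the figure). This callback is generated automatically by DEFINE_MARGO_RPC_HANDLER in user code. The callback (1) looks up the Argobots pool in which the RPC should execute and (2) creates a ULT in that pool. This ULT will run the user's RPC handler.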

RPC handler ULTs are posted to a pool that may be served by execution streams (ES) other than the one running the Mercury progress loop. For example, calling margo_init(..., 1, 8) creates 8 ES along with a shared pool into which RPC handler ULTs are placed. Whenever one of these ES is idle, it pulls a ULT from the pool and executes it.
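The shared-pool idea can be illustrated with a single-threaded simulation. This sketch does not use Argobots or Margo: the FIFO pool, on_request, run_execution_streams and toy_demo are hypothetical names, real ES run in parallel rather than taking turns, and NUM_ES is set to 4 only to keep the output short. The point is simply that the request callback only enqueues work, and whichever ES is free next pulls it from the common pool.

```c
#include <assert.h>
#include <stdio.h>

#define POOL_CAP 16
#define NUM_ES   4

typedef void (*ult_fn_t)(int rpc_id);

typedef struct {
    ult_fn_t fn;      /* handler the ULT will run */
    int      rpc_id;  /* which request it serves */
} toy_ult_t;

/* Shared FIFO pool of ready ULTs (a ring buffer). */
typedef struct {
    toy_ult_t items[POOL_CAP];
    int       head, tail, count;
} toy_pool_t;

static int pool_push(toy_pool_t *p, ult_fn_t fn, int rpc_id)
{
    if (p->count == POOL_CAP) return -1;
    p->items[p->tail] = (toy_ult_t){ fn, rpc_id };
    p->tail = (p->tail + 1) % POOL_CAP;
    p->count++;
    return 0;
}

static int pool_pop(toy_pool_t *p, toy_ult_t *out)
{
    if (p->count == 0) return -1;
    *out = p->items[p->head];
    p->head = (p->head + 1) % POOL_CAP;
    p->count--;
    return 0;
}

static int handled[NUM_ES];  /* how many ULTs each simulated ES has run */
static int current_es;       /* index of the ES currently "running" */

static void toy_rpc_handler(int rpc_id)
{
    printf("ES %d runs handler for RPC %d\n", current_es, rpc_id);
    handled[current_es]++;
}

/* Role of the generated callback in this sketch: create a ULT in the pool. */
static int on_request(toy_pool_t *pool, int rpc_id)
{
    return pool_push(pool, toy_rpc_handler, rpc_id);
}

/* Each ES in turn pulls one ULT until the pool is empty; returns ULTs run. */
static int run_execution_streams(toy_pool_t *pool)
{
    int       total = 0;
    toy_ult_t ult;
    while (pool->count > 0) {
        for (current_es = 0; current_es < NUM_ES; current_es++) {
            if (pool_pop(pool, &ult) != 0) break;
            ult.fn(ult.rpc_id);
            total++;
        }
    }
    return total;
}

/* Post n_requests RPCs, then drain the pool; returns ULTs run or -1 if full. */
static int toy_demo(int n_requests)
{
    toy_pool_t pool = {0};
    for (int i = 0; i < n_requests; i++)
        if (on_request(&pool, i) != 0) return -1;
    return run_execution_streams(&pool);
}
```

With 8 requests and 4 simulated ES, each ES ends up running two handlers. In a real deployment the distribution depends on which ES happens to be idle.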

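Typically, an RPC handler starts by calling margo_get_input to deserialize the RPC's arguments. This invokes the user-provided serialization function to deserialize the contents of Mercury's buffer into the user's input data structure.

When a bulk transfer (RDMA) is performed with margo_bulk_transfer, the ULT lets the Mercury progress loop carry out the transfer. Meanwhile the ULT yields, so that the ES it runs on can execute other ULTs (for example, other RPC requests).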

The Mercury progress loop eventually completes the RDMA operation and notifies the calling ULT. The calling ULT is marked as ready and will eventually resume.
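The yield-and-resume behaviour can be sketched as an explicit state machine. This is an illustration only: real Argobots ULTs keep their own stack, so a handler is written as straight-line code and the yield happens inside the Margo call. The names toy_ult_t, ult_run, progress, es_run_ready and toy_schedule are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { ULT_READY, ULT_BLOCKED, ULT_DONE } ult_state_t;

typedef struct {
    int         id;
    int         step;   /* 0: not started, 1: waiting on transfer, 2: finished */
    ult_state_t state;
} toy_ult_t;

/* Run a ULT until it yields (first call) or finishes (second call). */
static void ult_run(toy_ult_t *u)
{
    if (u->step == 0) {
        printf("ULT %d: get input, start bulk transfer, yield\n", u->id);
        u->step  = 1;
        u->state = ULT_BLOCKED;   /* waits for the progress loop */
    } else {
        printf("ULT %d: transfer done, respond\n", u->id);
        u->step  = 2;
        u->state = ULT_DONE;
    }
}

/* Toy progress loop: complete outstanding transfers, mark their ULTs ready. */
static int progress(toy_ult_t *ults, int n)
{
    int woken = 0;
    for (int i = 0; i < n; i++) {
        if (ults[i].state == ULT_BLOCKED) {
            ults[i].state = ULT_READY;
            woken++;
        }
    }
    return woken;
}

/* One ES pass: run every ULT that is currently ready; returns how many ran. */
static int es_run_ready(toy_ult_t *ults, int n)
{
    int ran = 0;
    for (int i = 0; i < n; i++) {
        if (ults[i].state == ULT_READY) {
            ult_run(&ults[i]);
            ran++;
        }
    }
    return ran;
}

/* Schedule n ULTs (1..8) to completion; returns the number of ES passes. */
static int toy_schedule(int n)
{
    toy_ult_t ults[8];
    int       passes = 0, done = 0;

    if (n < 1 || n > 8) return -1;
    for (int i = 0; i < n; i++)
        ults[i] = (toy_ult_t){ i, 0, ULT_READY };

    while (done < n) {
        es_run_ready(ults, n);   /* blocked ULTs do not hold up the others */
        passes++;
        progress(ults, n);       /* transfers complete, ULTs become ready */
        done = 0;
        for (int i = 0; i < n; i++)
            done += (ults[i].state == ULT_DONE);
    }
    return passes;
}
```

In the first pass every ULT starts its transfer and yields, so the single simulated ES moves on to the next one instead of waiting. After the progress step marks them ready, the second pass lets each ULT resume and respond.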

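When the RPC handler calls margo_respond to send a response to the client, it first calls the user-provided serialization function to encode the response into Mercury's buffer. It then yields while waiting for the Mercury progress loop to send the response, which lets the ES potentially execute other RPCs in the meantime.

Once Mercury has sent the response, the Mercury progress loop notifies the RPC handler ULT, which eventually resumes and completes.

Finally, margo_wait completes on the client. The client can then call margo_get_output on the RPC handle to deserialize the RPC's output with the user-provided deserialization function.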

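Note

This model remains valid whether or not the Mercury progress loop runs on a separate ULT. Indeed, you may notice that, from the point of view of a single RPC operation, the two ES shown on the server here could be merged into one. The advantage of dedicating an ES to Mercury progress is that multiple concurrent RPC handlers can rely on it with minimal interference. If the Mercury progress loop ran in the same ES as the RPC handlers, calling margo_respond in a handler could yield to another (possibly long-running) RPC handler rather than to the progress loop, delaying the completion of the first RPC handler.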

ULTs in DAOS https://download.csdn.net/download/bandaoyu/86400579

https://www.researchgate.net/publication/341844608_DAOS_A_Scale-Out_High_Performance_Storage_Stack_for_Storage_Class_Memory

OFI Intel® MPI Library 2019 Over Libfabric*  https://www.intel.cn/content/www/cn/zh/developer/articles/technical/mpi-library-2019-over-libfabric.html
