[verbs] ibv_post_send | IBV_SEND_SOLICITED | RDMA

Original: https://www.rdmamojo.com/2013/01/26/ibv_post_send/

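ibv_post_send() posts a linked list of Work Requests (WRs) to the Send Queue of a Queue Pair (QP). It walks the entries of the linked list one by one, checks that each entry is valid, generates a hardware-specific Send Request from it, and adds that request to the tail of the QP's Send Queue, all without any context switch. The RDMA device processes the request later, asynchronously. If one of the WRs fails, either because the Send Queue is full or because one of the WR's attributes is bad, ibv_post_send() stops immediately and returns a pointer to that WR. The QP handles the Work Requests in its Send Queue according to the following rules: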

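  • If the QP is in the RESET, INIT or RTR state, an error should be returned immediately. However, some low-level drivers may not follow this rule (they skip the extra check on the data path for better performance), and Send Requests posted in one or all of these states may be silently ignored.
  • If the QP is in the RTS state, Send Requests can be posted and they will be processed.
  • If the QP is in the SQE or ERROR state, Send Requests can be posted and they will be completed with an error.
  • If the QP is in the SQD state, Send Requests can be posted, but they will not be processed.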
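
The posting behaviour and the state rules above can be sketched as a small toy model. This is not libibverbs code: every name here (demo_qp, demo_wr, demo_post_send, the DEMO_* states) is hypothetical, the model follows the strict variant of the RESET/INIT/RTR rule, and the errno-style return values (EINVAL for a bad WR or state, ENOMEM for a full queue) are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the libibverbs types. */
enum demo_qp_state {
    DEMO_RESET, DEMO_INIT, DEMO_RTR, DEMO_RTS, DEMO_SQD, DEMO_SQE, DEMO_ERR
};

struct demo_wr {
    struct demo_wr *next;   /* next WR in the chain, NULL for the last one */
    int valid;              /* nonzero if the WR's attributes are acceptable */
};

struct demo_qp {
    enum demo_qp_state state;
    size_t free_slots;      /* remaining room in the Send Queue */
};

/* Toy model of posting a WR chain. Returns 0 on success. On failure it
 * returns an errno-style value (an assumption of this sketch) and sets
 * *bad_wr to the first WR that could not be posted. */
int demo_post_send(struct demo_qp *qp, struct demo_wr *wr,
                   struct demo_wr **bad_wr)
{
    /* Strict variant: posting in RESET, INIT or RTR fails immediately. */
    if (qp->state == DEMO_RESET || qp->state == DEMO_INIT ||
        qp->state == DEMO_RTR) {
        *bad_wr = wr;
        return EINVAL;
    }

    /* RTS, SQD, SQE and ERROR all accept posts; what happens to the
     * requests afterwards differs, but that is outside this model. */
    for (; wr != NULL; wr = wr->next) {
        if (!wr->valid) {             /* bad attribute: stop here */
            *bad_wr = wr;
            return EINVAL;
        }
        if (qp->free_slots == 0) {    /* Send Queue full: stop here */
            *bad_wr = wr;
            return ENOMEM;
        }
        qp->free_slots--;             /* WR appended to the queue tail */
    }
    return 0;
}
```

The point of the model is that WRs ahead of the failing one stay posted: with a chain a -> b -> c where b is invalid, a is queued, bad_wr points at b, and neither b nor c is posted.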

The struct ibv_send_wr describes a Work Request to the Send Queue of the QP, i.e. a Send Request (SR).

struct ibv_send_wr {
	uint64_t		wr_id;
	struct ibv_send_wr     *next;
	struct ibv_sge	       *sg_list;
	int			num_sge;
	enum ibv_wr_opcode	opcode;
	int			send_flags;
	uint32_t		imm_data;
	union {
		struct {
			uint64_t	remote_addr;
			uint32_t	rkey;
		} rdma;
		struct {
			uint64_t	remote_addr;
			uint64_t	compare_add;
			uint64_t	swap;
			uint32_t	rkey;
		} atomic;
		struct {
			struct ibv_ah  *ah;
			uint32_t	remote_qpn;
			uint32_t	remote_qkey;
		} ud;
	} wr;
};

Here is the full description of struct ibv_send_wr:

(For more background, see 【RDMA】技术详解(三):理解RDMA Scatter Gather List|聚散表, on the CSDN blog "bandaoyu的note".)

wr_id

A 64-bit value associated with this WR. If a Work Completion is generated when this Work Request ends, it will contain this value.

The verbs API does not seem to prescribe what this value holds; it is left to the user. That is why some code uses it to carry arbitrary context, for example a connection pointer: wr.wr_id = (uintptr_t)conn;

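
The pointer-in-wr_id trick mentioned above can be sketched in plain C. The struct demo_conn and both helper names are hypothetical; only the cast through uintptr_t reflects the pattern quoted above.

```c
#include <stdint.h>

/* Hypothetical per-connection context that the application owns. */
struct demo_conn {
    int id;
};

/* Pack a context pointer into a 64-bit wr_id value. */
uint64_t demo_conn_to_wr_id(struct demo_conn *conn)
{
    return (uint64_t)(uintptr_t)conn;
}

/* Recover the pointer from the wr_id reported back in a Work Completion. */
struct demo_conn *demo_wr_id_to_conn(uint64_t wr_id)
{
    return (struct demo_conn *)(uintptr_t)wr_id;
}
```

Going through uintptr_t keeps the conversion well defined on platforms where pointers are narrower than 64 bits. The object the pointer refers to has to stay alive until the matching completion has been polled.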
next

Pointer to the next WR in the linked list. NULL indicates that this is the last WR.

sg_list

Scatter/Gather array. It specifies the buffers that will be read from, or the buffers that data will be written to, depending on the opcode used. The entries in the list can specify memory blocks registered in different Memory Regions. The message size is the sum of the lengths of all the memory buffers in the scatter/gather list.

num_sge

Size of the sg_list array. This number must be less than or equal to the number of scatter/gather entries that the Queue Pair was created to support in the Send Queue (qp_init_attr.cap.max_send_sge). A size of 0 indicates that the message size is 0.

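
As a minimal sketch of the two rules above (message size is the sum of the entry lengths, and num_sge is bounded by max_send_sge), the following uses a local stand-in struct rather than the real header. The demo_sge type and both function names are hypothetical; the three fields are meant to mirror an address, a length and a local key per entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for one scatter/gather entry (not the libibverbs type). */
struct demo_sge {
    uint64_t addr;     /* start address of the buffer */
    uint32_t length;   /* buffer length in bytes */
    uint32_t lkey;     /* local key of the Memory Region */
};

/* Message size is the sum of all entry lengths; num_sge == 0 gives 0. */
uint64_t demo_message_size(const struct demo_sge *sg_list, int num_sge)
{
    uint64_t total = 0;
    for (int i = 0; i < num_sge; i++)
        total += sg_list[i].length;
    return total;
}

/* num_sge must not exceed the limit the QP was created with. */
int demo_num_sge_ok(int num_sge, uint32_t max_send_sge)
{
    return num_sge >= 0 && (uint32_t)num_sge <= max_send_sge;
}
```

The total is accumulated in 64 bits so that several large entries cannot overflow the sum before it is compared against the per-opcode message size limit.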
opcode

The operation that this WR will perform. This value controls the way data is sent, the direction of the data flow, and which attributes of the WR are used. The value can be one of the following enumerated values:

  • IBV_WR_SEND - The content of the local memory buffers specified in sg_list is sent to the remote QP. The sender does not know where the data will be written on the remote node. A Receive Request is consumed from the head of the remote QP's Receive Queue (that Receive Request records where incoming data is to be written), and the sent data is written to the memory buffers specified in it. The message size can be [0, 2^31] for RC and UC QPs and [0, path MTU] for UD QPs

  • IBV_WR_SEND_WITH_IMM - Same as IBV_WR_SEND, and immediate data will be sent in the message. This value will be available in the Work Completion that will be generated for the consumed Receive Request in the remote QP
  • IBV_WR_RDMA_WRITE - The content of the local memory buffers specified in sg_list is sent to the remote side and written, by the remote RDMA device, to a contiguous range of memory in the remote QP's virtual address space. This does not necessarily mean that the remote memory is physically contiguous. No Receive Request is consumed in the remote QP. The message size can be [0, 2^31]

  • IBV_WR_RDMA_WRITE_WITH_IMM - Same as IBV_WR_RDMA_WRITE, but a Receive Request will be consumed from the head of the remote QP's Receive Queue and immediate data will be sent in the message. This value will be available in the Work Completion that will be generated for the consumed Receive Request in the remote QP
  • IBV_WR_RDMA_READ - Data is read from a contiguous range of memory in the remote QP's virtual address space and written to the local memory buffers specified in sg_list. No Receive Request will be consumed in the remote QP. The message size can be [0, 2^31]
  • IBV_WR_ATOMIC_FETCH_AND_ADD - A 64-bit value in the remote QP's virtual address space is read, wr.atomic.compare_add is added to it, and the result is written back to the same memory address, all atomically. No Receive Request will be consumed in the remote QP. The original data, from before the add operation, is written to the local memory buffers specified in sg_list
  • IBV_WR_ATOMIC_CMP_AN
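
Two details of the opcodes described above are easy to lose in the prose: which operations consume a Receive Request at the remote QP, and what fetch-and-add hands back. The sketch below models both locally. The DEMO_* enum and the function names are hypothetical stand-ins covering only the opcodes fully described above, and the fetch-and-add part uses a C11 atomic on local memory purely as an analogy for the remote operation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the opcodes fully described above. */
enum demo_opcode {
    DEMO_SEND,
    DEMO_SEND_WITH_IMM,
    DEMO_RDMA_WRITE,
    DEMO_RDMA_WRITE_WITH_IMM,
    DEMO_RDMA_READ,
    DEMO_ATOMIC_FETCH_AND_ADD
};

/* True if the opcode consumes a Receive Request at the remote QP. */
bool demo_consumes_recv_request(enum demo_opcode op)
{
    switch (op) {
    case DEMO_SEND:
    case DEMO_SEND_WITH_IMM:
    case DEMO_RDMA_WRITE_WITH_IMM:
        return true;
    default:
        return false;
    }
}

/* Local analogy for fetch-and-add: the target word becomes
 * old + compare_add, and the value from before the add is returned
 * (the part that would land in the buffers named by sg_list). */
uint64_t demo_fetch_and_add(_Atomic uint64_t *target, uint64_t compare_add)
{
    return atomic_fetch_add(target, compare_add);
}
```

Seen this way, the immediate-data variant of RDMA Write is the only write-style operation that needs a Receive Request posted on the remote side.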