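DPDK Study Notes 3 - How rte_eth_rx_burst Receives Packets

1. Preparing the debug environment

1.1 Build preparation
This includes: building DPDK / configuring hugepages / inserting the UIO kernel module / binding the Ethernet device.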
1.2 Starting test-pmd and connecting gdb
There are two ways to connect gdb: start test-pmd in interactive mode and then attach gdb to the running process, or launch test-pmd from within gdb.
1.2.1 Attaching gdb to a running test-pmd
1.2.2 Launching test-pmd from within gdb

2. How eth_em_recv_pkts receives packets

2.1 rte_eth_rx_burst
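The gdb bt (backtrace) output shows that rte_eth_rx_burst calls eth_em_recv_pkts. eth_em_recv_pkts is the driver-specific receive function, here that of the e1000 em driver; with another NIC it would be a different driver's function. rte_eth_rx_burst is the generic receive interface, implemented as in the code below: it uses the port id to look up the dev in the global array rte_eth_devices (the rx queues are what is needed here), then calls the dev's rx_pkt_burst (that is, eth_em_recv_pkts).

During initialization, the dev information is stored in the global array rte_eth_devices, indexed by port id; the driver-specific receive function is hooked onto the dev's rx_pkt_burst function pointer; and the rx queue information for that port id is stored as well. (The initialization process may be covered in a separate post.)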

static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
		 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{ 
	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
	uint16_t nb_rx;

#ifdef RTE_LIBRTE_ETHDEV_DEBUG
	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
	RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);

	if (queue_id >= dev->data->nb_rx_queues) {
		RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
		return 0;
	}
#endif
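	/* rx_pkts is the output: an array of mbuf pointers. nb_pkts is the
	 * burst size, the maximum number of packets read in one call. */
	nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
				     rx_pkts, nb_pkts);

	/* RX callback handling (RTE_ETHDEV_RXTX_CALLBACKS) omitted here */
	return nb_rx;
}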
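
The dispatch described above (global device table indexed by port id, driver receive function behind a function pointer) can be modeled in a few lines of standalone C. This is only a sketch and not DPDK code: toy_eth_dev, toy_devices, toy_em_recv_pkts, toy_dev_init and toy_rx_burst are hypothetical names that mimic rte_eth_dev, rte_eth_devices, eth_em_recv_pkts, the init path and rte_eth_rx_burst, and packets are plain ints rather than mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PORTS  4
#define TOY_MAX_QUEUES 2

/* Driver-specific receive function: fills pkts[] and returns the count. */
typedef uint16_t (*toy_rx_burst_t)(void *rxq, int *pkts, uint16_t nb_pkts);

/* Hypothetical stand-in for struct rte_eth_dev. */
struct toy_eth_dev {
	toy_rx_burst_t rx_pkt_burst;       /* hooked up by the driver at init */
	void *rx_queues[TOY_MAX_QUEUES];   /* per-queue driver state */
	uint16_t nb_rx_queues;
};

/* Global device table indexed by port id, like rte_eth_devices[]. */
static struct toy_eth_dev toy_devices[TOY_MAX_PORTS];

/* Toy per-queue state: how many packets are waiting and the next packet id. */
struct toy_em_queue {
	uint16_t pending;
	int next_id;
};

/* Toy "driver" receive function, playing the role of eth_em_recv_pkts. */
static uint16_t
toy_em_recv_pkts(void *rxq, int *pkts, uint16_t nb_pkts)
{
	struct toy_em_queue *q = rxq;
	uint16_t n = 0;

	while (n < nb_pkts && q->pending > 0) {
		pkts[n++] = q->next_id++;
		q->pending--;
	}
	return n;
}

/* Init time: store the queue and hook the driver function onto the dev. */
static void
toy_dev_init(uint16_t port_id, struct toy_em_queue *q)
{
	struct toy_eth_dev *dev;

	assert(port_id < TOY_MAX_PORTS);
	dev = &toy_devices[port_id];
	dev->rx_pkt_burst = toy_em_recv_pkts;
	dev->rx_queues[0] = q;
	dev->nb_rx_queues = 1;
}

/* Generic entry point: look up the dev by port id, then dispatch. */
static uint16_t
toy_rx_burst(uint16_t port_id, uint16_t queue_id, int *pkts, uint16_t nb_pkts)
{
	struct toy_eth_dev *dev;

	if (port_id >= TOY_MAX_PORTS)
		return 0;
	dev = &toy_devices[port_id];
	if (dev->rx_pkt_burst == NULL || queue_id >= dev->nb_rx_queues)
		return 0;
	return dev->rx_pkt_burst(dev->rx_queues[queue_id], pkts, nb_pkts);
}
```

Unlike this sketch, the real rte_eth_rx_burst only validates the port and queue when RTE_LIBRTE_ETHDEV_DEBUG is defined, which keeps the fast path down to one indexed load and one indirect call.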

2.2 eth_em_recv_pkts

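A few concepts first. em_rx_queue describes the whole receive queue. Its two key members are the rx ring and the sw ring; there is also rx tail, which marks the position in the ring of the next packet not yet read.

The elements of the rx ring are receive descriptors, e1000_rx_desc. The most important field of a descriptor is buffer_addr: the NIC DMAs the received packet data into the memory at this address.

The elements of the sw ring are em_rx_entry, which in effect is just an rte_mbuf (each entry holds a pointer to one). In memory an rte_mbuf is laid out as "packet metadata (the contents of struct rte_mbuf, currently 128 bytes) + a 128-byte empty headroom available for prepending encapsulation headers + packet data". buf_addr in the rte_mbuf points to the start of the headroom; data_off is the offset from buf_addr to the packet data (initially 128, and it changes as the packet is modified); buf_iova is the IO address of the same location as buf_addr, so the buffer addr stored in the rx ring is buf_iova + headroom.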
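
The address arithmetic of this layout can be checked with a small standalone model. It is a sketch under stated assumptions: toy_mbuf is a hypothetical, trimmed-down header (not the real struct rte_mbuf), the header is taken to occupy 128 bytes with no private area after it, the headroom is the 128 bytes mentioned above, and the IO address of the object is simply passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MBUF_STRUCT_SIZE 128u   /* size of the mbuf header in this post */
#define TOY_HEADROOM         128u   /* empty space in front of the packet */
#define TOY_DATAROOM         2048u  /* assumed room for packet data */
#define TOY_OBJ_SIZE (TOY_MBUF_STRUCT_SIZE + TOY_HEADROOM + TOY_DATAROOM)

/* Hypothetical, trimmed-down mbuf header; only the fields discussed here. */
struct toy_mbuf {
	void *buf_addr;      /* virtual address of the buffer (start of headroom) */
	uint64_t buf_iova;   /* IO address of the same location */
	uint16_t data_off;   /* offset from buf_addr to the packet data */
};

/* Lay out one object: [header | headroom | packet data]. */
static struct toy_mbuf *
toy_mbuf_init(void *obj, uint64_t obj_iova)
{
	struct toy_mbuf *m = obj;

	assert(sizeof(*m) <= TOY_MBUF_STRUCT_SIZE);
	m->buf_addr = (uint8_t *)obj + TOY_MBUF_STRUCT_SIZE;
	m->buf_iova = obj_iova + TOY_MBUF_STRUCT_SIZE;
	m->data_off = TOY_HEADROOM;
	return m;
}

/* Virtual address of the packet data: buf_addr + data_off. */
static void *
toy_mbuf_data(const struct toy_mbuf *m)
{
	return (uint8_t *)m->buf_addr + m->data_off;
}

/* Value a driver would program into the rx descriptor's buffer_addr. */
static uint64_t
toy_rx_desc_buffer_addr(const struct toy_mbuf *m)
{
	return m->buf_iova + TOY_HEADROOM;
}
```

With an object at IO address 0x100000, buf_iova comes out as 0x100080 and the descriptor's buffer addr as 0x100100, that is, the NIC writes the packet just past the headroom.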

2.2.1

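The local variables defined in eth_em_recv_pkts are shown below. rxq points to the description of the whole receive queue. rx_ring always points to the start of the descriptor ring and is indexed by rx tail; rxdp points to one particular e1000_rx_desc descriptor in the rx ring; rxd is an actual descriptor rather than a pointer, presumably a copy of *rxdp. sw_ring always points to the start of the em_rx_entry array and is likewise indexed by rx tail; rxe points to one particular entry in the sw ring; rxm is the rte_mbuf held in that entry, and it is the one returned to the caller.

nmb is the "new mbuf", a freshly allocated one. Once rxm has been taken out of the ring, nmb is hooked in its place and the corresponding rx ring and sw ring slots are updated, ready for the next receive.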

uint16_t
eth_em_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
		uint16_t nb_pkts)
{
	volatile struct e1000_rx_desc *rx_ring;
	volatile struct e1000_rx_desc *rxdp;
	struct em_rx_queue *rxq;
	struct em_rx_entry *sw_ring;
	struct em_rx_entry *rxe;
	struct rte_mbuf *rxm;
	struct rte_mbuf *nmb;
	struct e1000_rx_desc rxd;
	uint64_t dma_addr;
	uint16_t pkt_len;
	uint16_t rx_id;
	uint16_t nb_rx;
	uint16_t nb_hold;
	uint8_t status;

	rxq = rx_queue;

	nb_rx = 0;
	nb_hold = 0;
	rx_id = rxq->rx_tail;
	rx_ring = rxq->rx_ring;
	sw_ring = rxq->sw_ring;

2.2.2 gdb

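In the gdb session, the current rx_id is 203; the receive descriptor (rx descriptor) is read from rx ring/rxd and the mbuf from sw ring/rxe. The output shows that the buffer addr in the rx desc = 128 + mbuf->buf_iova, and that buf_addr in the mbuf = address of the mbuf + sizeof(mbuf); sizeof(mbuf) happens to be 128 bytes as well.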

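Here rxm is the mbuf to be taken out and nmb is the newly allocated mbuf. rxm occupies rx_id 203; once rxm has been taken out, nmb replaces it in sw ring[203] and the buffer addr of rx ring[203] is updated to match, so that the slot can receive a new packet.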
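
The take-out-and-replenish step can be sketched as a toy receive loop. Everything here is a simplified assumption rather than the driver's code: the toy_ types and functions are hypothetical, the ring has only 8 slots, a "descriptor done" status bit stands in for the NIC's completion signal, and details of the real driver such as nb_hold accounting and the tail register write are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_RING_SIZE 8u
#define TOY_HEADROOM  128u
#define TOY_STAT_DD   0x01u  /* toy "descriptor done" bit, set by the NIC */

struct toy_mbuf     { uint64_t buf_iova; uint16_t pkt_len; };
struct toy_rx_desc  { uint64_t buffer_addr; uint16_t length; uint8_t status; };
struct toy_rx_entry { struct toy_mbuf *mbuf; };

struct toy_rx_queue {
	struct toy_rx_desc  rx_ring[TOY_RING_SIZE]; /* what the NIC sees */
	struct toy_rx_entry sw_ring[TOY_RING_SIZE]; /* what software sees */
	uint16_t rx_tail;                           /* next slot to read */
};

/* Allocate an mbuf with a made-up IO address (stand-in for the mempool). */
static struct toy_mbuf *
toy_mbuf_alloc(void)
{
	static uint64_t next_iova = 0x1000;
	struct toy_mbuf *m = calloc(1, sizeof(*m));

	if (m != NULL) {
		m->buf_iova = next_iova;
		next_iova += 0x1000;
	}
	return m;
}

/* Fill every slot with an mbuf and point each descriptor at its buffer. */
static int
toy_rxq_init(struct toy_rx_queue *q)
{
	for (unsigned int i = 0; i < TOY_RING_SIZE; i++) {
		struct toy_mbuf *m = toy_mbuf_alloc();

		if (m == NULL)
			return -1;
		q->sw_ring[i].mbuf = m;
		q->rx_ring[i].buffer_addr = m->buf_iova + TOY_HEADROOM;
		q->rx_ring[i].length = 0;
		q->rx_ring[i].status = 0;
	}
	q->rx_tail = 0;
	return 0;
}

/* Pretend to be the NIC: a packet of len bytes has landed in slot idx. */
static void
toy_nic_deliver(struct toy_rx_queue *q, uint16_t idx, uint16_t len)
{
	q->rx_ring[idx].length = len;
	q->rx_ring[idx].status = TOY_STAT_DD;
}

static uint16_t
toy_recv_pkts(struct toy_rx_queue *rxq, struct toy_mbuf **rx_pkts,
	      uint16_t nb_pkts)
{
	uint16_t rx_id = rxq->rx_tail;
	uint16_t nb_rx = 0;

	while (nb_rx < nb_pkts) {
		struct toy_rx_desc *rxdp = &rxq->rx_ring[rx_id];
		struct toy_rx_entry *rxe = &rxq->sw_ring[rx_id];
		struct toy_mbuf *rxm, *nmb;

		if (!(rxdp->status & TOY_STAT_DD))
			break;              /* no completed descriptor here */
		nmb = toy_mbuf_alloc();
		if (nmb == NULL)
			break;              /* cannot replenish, stop early */

		rxm = rxe->mbuf;            /* mbuf that holds the packet */
		rxm->pkt_len = rxdp->length;
		rxe->mbuf = nmb;            /* sw ring slot now owns nmb */
		rxdp->buffer_addr = nmb->buf_iova + TOY_HEADROOM;
		rxdp->status = 0;           /* hand the slot back to the NIC */

		rx_pkts[nb_rx++] = rxm;
		rx_id = (uint16_t)((rx_id + 1u) % TOY_RING_SIZE);
	}
	rxq->rx_tail = rx_id;
	return nb_rx;
}
```

The new mbuf is allocated before the old one is detached, so a failed allocation leaves the slot untouched and the packet can be picked up by a later call.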

3. mbuf, rx queue, rx ring, rx descriptor, sw ring, rx entry, etc.

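The overall receive path is as follows. The DMA physical memory, the mempool holding the mbufs, and so on are all set up during initialization. The DMA controller writes packets one by one into the IO virtual memory named by the receive descriptors in the rx ring; the actual memory behind it should be the mbufs themselves. The receive function uses the rx tail variable to keep reading descriptors from the rx ring and mbufs from the sw ring, allocates new mbufs to put into the sw ring, updates the buffer addr in the rx ring, and returns the mbufs it read to the application.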

/**
 *
 * Retrieve a burst of input packets from a receive queue of an Ethernet
 * device. The retrieved packets are stored in *rte_mbuf* structures whose
 * pointers are supplied in the *rx_pkts* array.
 *
 * The rte_eth_rx_burst() function loops, parsing the RX ring of the
 * receive queue, up to *nb_pkts* packets, and for each completed RX
 * descriptor in the ring, it performs the following operations:
 *
 * - Initialize the *rte_mbuf* data structure associated with the
 *   RX descriptor according to the information provided by the NIC into
 *   that RX descriptor.
 *
 * - Store the *rte_mbuf* data structure into the next entry of the
 *   *rx_pkts* array.
 *
 * - Replenish the RX descriptor with a new *rte_mbuf* buffer
 *   allocated from the memory pool associated with the receive queue at
 *   initialization time.
 *
 * When retrieving an input packet that was scattered by the controller
 * into multiple receive descriptors, the rte_eth_rx_burst() function
 * appends the associated *rte_mbuf* buffers to the first buffer of the
 * packet.
 *
 * The rte_eth_rx_burst() function returns the number of packets
 * actually retrieved, which is the number of *rte_mbuf* data structures
 * effectively supplied into the *rx_pkts* array.
 * A return value equal to *nb_pkts* indicates that the RX queue contained
 * at least *rx_pkts* packets, and this is likely to signify that other
 * received packets remain in the input queue. Applications implementing
 * a "retrieve as much received packets as possible" policy can check this
 * specific case and keep invoking the rte_eth_rx_burst() function until
 * a value less than *nb_pkts* is returned.
 *
 * This receive method has the following advantages:
 *
 * - It allows a run-to-completion network stack engine to retrieve and
 *   to immediately process received packets in a fast burst-oriented
 *   approach, avoiding the overhead of unnecessary intermediate packet
 *   queue/dequeue operations.
 */
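
The "retrieve as much received packets as possible" policy from the API comment can be sketched as a loop that keeps calling the burst function until it returns fewer packets than requested. This is a standalone illustration: toy_burst_stub is a hypothetical stand-in for rte_eth_rx_burst() that only counts packets, and the burst size of 32 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BURST 32u   /* assumed burst size */

/* Toy queue: only tracks how many packets are waiting. */
struct toy_queue {
	uint32_t pending;
};

/* Hypothetical stand-in for rte_eth_rx_burst(): takes up to nb_pkts. */
static uint16_t
toy_burst_stub(struct toy_queue *q, uint16_t nb_pkts)
{
	uint16_t n = (q->pending < nb_pkts) ? (uint16_t)q->pending : nb_pkts;

	q->pending -= n;
	return n;
}

/* Drain the queue; reports the total received and the number of calls. */
static uint32_t
toy_drain(struct toy_queue *q, uint32_t *calls)
{
	uint32_t total = 0;
	uint16_t n;

	*calls = 0;
	do {
		n = toy_burst_stub(q, TOY_BURST);
		(*calls)++;
		total += n;     /* a real application would process packets here */
	} while (n == TOY_BURST); /* a full burst hints that more may remain */
	return total;
}
```

A full burst is only a hint that more packets may be waiting, so a queue holding an exact multiple of the burst size costs one extra call that returns 0.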