Table of Contents
Check whether rionet_start_xmit() is executed when the problem occurs
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is working normally
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is blocked/hung
Conclusion: when the problem occurs, rionet_start_xmit() is not executed.
Capture the return value of sch_direct_xmit() to check whether it is normal
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is blocked/hung
Analyze the relationship between rnet->tx_cnt, rnet->ack_slot, and slot
Modify the code to enforce rnet->tx_cnt < RIONET_TX_RING_SIZE
For more RapidIO analysis notes, see:
Summary of RapidIO (TSI721) notes for the Linux kernel - CSDN Blog
Hardware and Software
A customer designed a board that uses a Loongson 3A-series CPU and a TSI721 RapidIO chip, running kernel 5.4.
The rapidio and tsi721 driver code comes from: Releases · RapidIO/kernel-rapidio · GitHub
For testing, two identical boards are used: one as the RapidIO sender and one as the RapidIO receiver.
Test Method and Symptoms
The usual iperf3 command is used for testing.
When the problem occurs, the iperf3 sender shows the following:
When the above happens, throughput drops to 0, the other board can no longer be pinged, and only a system reboot restores the network.
Debugging
Code flow of data transmission
The rough flow of sending data through rionet (the RapidIO network driver) is as follows:
Check whether rionet_start_xmit() is executed when the problem occurs
Add kprobe probe points
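Based on the kprobe hits shown in the traces below, the call chain is roughly as follows (a sketch; the intermediate qdisc-layer functions can vary with kernel version and configuration):

```
dev_queue_xmit()                          /* network stack entry */
  -> __qdisc_run() / sch_direct_xmit()    /* qdisc layer */
    -> dev_hard_start_xmit()
      -> rionet_start_xmit()              /* [rionet] */
        -> rionet_queue_tx_msg()          /* [rionet] */
          -> tsi721_add_outb_message()    /* [tsi721_mport] */
```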
cat /dev/null > /sys/kernel/debug/tracing/trace
echo 0 > /sys/kernel/debug/tracing/events/kprobes/enable
echo 'p:probe0 rionet_start_xmit' > /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe1 rionet_queue_tx_msg' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe2 tsi721_add_outb_message' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe3 dev_hard_start_xmit' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe4 dev_direct_xmit' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe5 generic_xdp_tx' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe6 dev_nit_active' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe7 dev_queue_xmit_nit' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_xmit/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_start_xmit/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_queue/enable
echo 1 > /sys/kernel/debug/tracing/tracing_on
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is working normally
ping-2974 [001] .... 827.996612: net_dev_queue: dev=eth0 skbaddr=00000000f3e665ee len=98
ping-2974 [001] .... 827.996614: probe3: (dev_hard_start_xmit+0x0/0x2a0)
ping-2974 [001] .... 827.996614: net_dev_start_xmit: dev=eth0 queue_mapping=0 skbaddr=00000000f3e665ee vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 len=98 data_len=0 network_offset=14 transport_offset_valid=1 transport_offset=34 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0x0
ping-2974 [001] .... 827.996615: probe0: (rionet_start_xmit+0x0/0x310 [rionet])
ping-2974 [001] d... 827.996616: probe1: (rionet_queue_tx_msg+0x0/0x110 [rionet])
ping-2974 [001] d... 827.996616: probe2: (tsi721_add_outb_message+0x0/0x268 [tsi721_mport])
ping-2974 [001] .... 827.996618: net_dev_xmit: dev=eth0 skbaddr=00000000f3e665ee len=98 rc=0
ping-2974 [001] .... 829.020604: net_dev_queue: dev=eth0 skbaddr=00000000eeeb1fe0 len=98
ping-2974 [001] .... 829.020606: probe3: (dev_hard_start_xmit+0x0/0x2a0)
ping-2974 [001] .... 829.020606: net_dev_start_xmit: dev=eth0 queue_mapping=0 skbaddr=00000000eeeb1fe0 vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 len=98 data_len=0 network_offset=14 transport_offset_valid=1 transport_offset=34 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0x0
ping-2974 [001] .... 829.020606: probe0: (rionet_start_xmit+0x0/0x310 [rionet])
ping-2974 [001] d... 829.020607: probe1: (rionet_queue_tx_msg+0x0/0x110 [rionet])
ping-2974 [001] d... 829.020607: probe2: (tsi721_add_outb_message+0x0/0x268 [tsi721_mport])
ping-2974 [001] .... 829.020608: net_dev_xmit: dev=eth0 skbaddr=00000000eeeb1fe0 len=98 rc=0
ping-2974 [001] .... 830.044605: net_dev_queue: dev=eth0 skbaddr=00000000132a02e2 len=98
ping-2974 [001] .... 830.044606: probe3: (dev_hard_start_xmit+0x0/0x2a0)
ping-2974 [001] .... 830.044606: net_dev_start_xmit: dev=eth0 queue_mapping=0 skbaddr=00000000132a02e2 vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 len=98 data_len=0 network_offset=14 transport_offset_valid=1 transport_offset=34 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0x0
ping-2974 [001] .... 830.044607: probe0: (rionet_start_xmit+0x0/0x310 [rionet])
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is blocked/hung
ping-2549 [003] .... 211.804623: net_dev_queue: dev=eth0 skbaddr=00000000a2bcacc1 len=98
ping-2549 [003] .... 212.828621: net_dev_queue: dev=eth0 skbaddr=00000000fbfcdcc9 len=98
ping-2549 [003] .... 213.852611: net_dev_queue: dev=eth0 skbaddr=0000000081af0540 len=98
ping-2549 [003] .... 214.876611: net_dev_queue: dev=eth0 skbaddr=00000000d879ce7e len=98
ping-2558 [003] .... 216.006098: net_dev_queue: dev=lo skbaddr=000000008c11e0a2 len=79
ping-2558 [003] .... 216.006104: probe3: (dev_hard_start_xmit+0x0/0x2a0)
Conclusion: when the problem occurs, rionet_start_xmit() is not executed.
Capture the return value of sch_direct_xmit() to check whether it is normal
Add kprobe probe points
cat /dev/null > /sys/kernel/debug/tracing/trace
echo 0 > /sys/kernel/debug/tracing/events/kprobes/enable
echo 'p:probe0 rionet_start_xmit' > /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe33 sch_fq_codel' >> /sys/kernel/debug/tracing/kprobe_events
echo 'r:probe37 __qdisc_run' >> /sys/kernel/debug/tracing/kprobe_events
echo 'r:probe39 sch_direct_xmit $retval'>> /sys/kernel/debug/tracing/kprobe_events
echo 'p:probe42 rionet_outb_msg_event' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_xmit/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_start_xmit/enable
echo 1 > /sys/kernel/debug/tracing/events/net/net_dev_queue/enable
echo 1 > /sys/kernel/debug/tracing/tracing_on
Output of cat /sys/kernel/debug/tracing/trace_pipe when the rionet network is blocked/hung
Conclusion
Capturing the return value of sch_direct_xmit() with a kprobe shows that, when the network hangs, the return value is 0 (false). According to the comment on sch_direct_xmit(), a return value of 0 (false) means "Driver returned NETDEV_TX_BUSY".
rionet.c defines the macro RIONET_TX_RING_SIZE, which caps the number of packets the network stack may hand down into the tsi721 buffer (rnet->tx_cnt). When this cap is reached (rnet->tx_cnt == RIONET_TX_RING_SIZE), rionet_queue_tx_msg() calls netif_stop_queue() to stop further transmission.
In dev_hard_start_xmit(), once the queue is found stopped (netif_tx_queue_stopped()), the status NETDEV_TX_BUSY is set.
After packets have been sent out by the tsi721, an interrupt fires, and the handler rionet_outb_msg_event() subtracts the number of completed packets from rnet->tx_cnt. If rnet->tx_cnt is then below RIONET_TX_RING_SIZE, netif_wake_queue() is called to resume transmission.
When the problem occurs, the loop condition is not satisfied (rnet->ack_slot != slot is false), so rnet->tx_cnt never changes.
Analyze the relationship between rnet->tx_cnt, rnet->ack_slot, and slot
Modify and enable the debug messages built into the rionet code
The debug code built into rionet looks like this:
static void rionet_outb_msg_event(struct rio_mport *mport, void *dev_id, int mbox, int slot)
{
……
	if (netif_msg_intr(rnet))
		printk(KERN_INFO "%s: outbound message event, mbox %d slot %d\n",
			DRV_NAME, mbox, slot);
……
}
The code above is modified as follows:
static void rionet_outb_msg_event(struct rio_mport *mport, void *dev_id, int mbox, int slot)
{
……
if (netif_msg_intr(rnet))
printk(KERN_INFO
"%s: outbound message event, tx_cnt %d, mbox %d, ack_slot %d, slot %d\n",
DRV_NAME, rnet->tx_cnt, mbox, rnet->ack_slot, slot);
……
}
Checking netif_msg_intr(), the corresponding message level bit is 0x200:
NETIF_MSG_INTR = 0x0200
#define netif_msg_intr(p) ((p)->msg_enable & NETIF_MSG_INTR)
Check the default message level:
# ethtool eth0
Settings for eth0:
Current message level: 0x000000c5 (197)
drv link rx_err tx_err
Link detected: yes
Modify the message level to enable the debug message we modified above:
# ethtool -s eth0 msglvl 0x2c5    # 0xc5 + 0x200
The dmesg output obtained:
[ 241.668251] rionet: inbound message event, mbox 0 slot -1
[ 241.668390] rionet: inbound message event, mbox 0 slot -1
[ 241.668551] rionet: inbound message event, mbox 0 slot -1
[ 241.668691] rionet: outbound message event, tx_cnt 148, mbox 0, ack_slot 98, slot 246
[ 241.668707] rionet: inbound message event, mbox 0 slot -1
[ 241.668719] rionet: inbound message event, mbox 0 slot -1
[ 241.668727] rionet: inbound message event, mbox 0 slot -1
……
[ 241.669031] rionet: inbound message event, mbox 0 slot -1
[ 241.669235] rionet: inbound message event, mbox 0 slot -1
[ 241.669391] rionet: inbound message event, mbox 0 slot -1
[ 241.669481] rionet: outbound message event, tx_cnt 148, mbox 0, ack_slot 246, slot 138
[ 241.669497] rionet: inbound message event, mbox 0 slot -1
[ 241.669506] rionet: inbound message event, mbox 0 slot -1
[ 241.669524] rionet: inbound message event, mbox 0 slot -1
……
[ 241.671096] rionet: inbound message event, mbox 0 slot -1
[ 241.671107] rionet: inbound message event, mbox 0 slot -1
[ 241.671122] rionet: inbound message event, mbox 0 slot -1
[ 241.670243] rionet: outbound message event, tx_cnt 222, mbox 0, ack_slot 138, slot 104
[ 241.671096] rionet: inbound message event, mbox 0 slot -1
[ 241.671107] rionet: inbound message event, mbox 0 slot -1
[ 241.671122] rionet: inbound message event, mbox 0 slot -1
……
[ 241.671096] rionet: inbound message event, mbox 0 slot -1
[ 241.671107] rionet: inbound message event, mbox 0 slot -1
[ 241.671122] rionet: inbound message event, mbox 0 slot -1
[ 241.671085] rionet: outbound message event, tx_cnt 74, mbox 0, ack_slot 104, slot 178
[ 241.671096] rionet: inbound message event, mbox 0 slot -1
[ 241.671107] rionet: inbound message event, mbox 0 slot -1
[ 241.671122] rionet: inbound message event, mbox 0 slot -1
……
[ 241.671241] rionet: inbound message event, mbox 0 slot -1
[ 241.671516] rionet: inbound message event, mbox 0 slot -1
[ 241.671690] rionet: inbound message event, mbox 0 slot -1
[ 241.671851] rionet: inbound message event, mbox 0 slot -1
[ 241.671872] ----- /home/kylin/Workspace/gaohua/driver_rapidio/kernel-rapidio-master/rionet.c,215
[ 241.672008] rionet: inbound message event, mbox 0 slot -1
[ 241.672041] rionet: outbound message event, tx_cnt 256, mbox 0, ack_slot 178, slot 178
[ 241.672043] ----- rionet_outb_msg_event rnet->tx_cnt = 256, rnet->ack_slot = 178, slot = 178
[ 241.672049] rionet: inbound message event, mbox 0 slot -1
[ 241.672060] rionet: inbound message event, mbox 0 slot -1
[ 241.672065] rionet: inbound message event, mbox 0 slot -1
Conclusion
Analyzing the dmesg output shows that rnet->tx_cnt is 256 every time the problem occurs (RIONET_TX_RING_SIZE is currently 256). The suspicion is therefore that the network blocks and hangs when the number of packets handed down by the network stack reaches RIONET_TX_RING_SIZE.
This suspicion is verified below.
Modify the code to enforce rnet->tx_cnt < RIONET_TX_RING_SIZE
Modify the code
File: rionet.c
Function: rionet_queue_tx_msg()
Change
if (++rnet->tx_cnt == RIONET_TX_RING_SIZE)
netif_stop_queue(ndev);
to
if (++rnet->tx_cnt == (RIONET_TX_RING_SIZE - 1))
netif_stop_queue(ndev);
Verification
With the modified code, transmission works normally and the network no longer blocks or hangs.