Enable/Disable Device:
For a device, transmission can be enabled/disabled (via the __LINK_STATE_XOFF bit), but reception cannot be toggled on its own (the whole device can, however, be disabled to stop reception).
Notifying the kernel of received frames: NAPI vs. netif_rx:
- netif_rx (the older interface):
Multiple frames can be processed within a single interrupt.
(3c59x.c)
vortex_interrupt() -> vortex_rx() -> netif_rx() -> enqueue_to_backlog() -> ____napi_schedule() (schedules NAPI for the backlog device).
enqueue_to_backlog(): when the queue is already non-empty there is no need to call ____napi_schedule(), because the pending NET_RX_SOFTIRQ will keep running until the queue is drained.
The backlog device's poll() function is initialized to process_backlog() in net_dev_init().
- NAPI:
(drivers/net/tg3.c) Idea: a hybrid of interrupts and polling.
tg3_interrupt() -> napi_schedule() -> ____napi_schedule()
- ____napi_schedule
Links the dev into the current CPU's poll_list;
__raise_softirq_irqoff(NET_RX_SOFTIRQ); -> net_rx_action()
net_rx_action and NAPI:
* net_rx_action is the handler for NET_RX_SOFTIRQ.
* A dev with frames to receive disables its own interrupt and is linked into poll_list.
* net_rx_action calls dev->poll on each device until all packets are processed, or its budget/time slice is exhausted.
* Hard interrupts are enabled while net_rx_action runs, so new frames may keep arriving and being linked in.
process_backlog():
* Hard interrupts are enabled while this function runs, so it can be preempted. Moreover, it drains the ingress queue shared by multiple devices, so accesses to softnet_data must be synchronized:
napi->weight = weight_p;
local_irq_disable();
while (work < quota) {
struct sk_buff *skb;
unsigned int qlen;
......
}
NAPI drivers, by contrast, need no such locking: the device's own interrupt is disabled while its poll() runs, and each device has its own private queue.
* For each skb, the function calls __netif_receive_skb():
rcu_read_lock();
local_irq_enable();
__netif_receive_skb(skb);
rcu_read_unlock();
Why call rcu_read_lock() before entering __netif_receive_skb()? The kernel bug-fix commit explains:
commit 52135f132988284a8091940362c923218c409f57
Author: Julian Anastasov <ja@ssi.bg>
Date:   Thu Jul 9 09:59:10 2015 +0300

    net: call rcu_read_lock early in process_backlog

    [ Upstream commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 ]

    Incoming packet should be either in backlog queue or
    in RCU read-side section. Otherwise, the final sequence of
    flush_backlog() and synchronize_net() may miss packets
    that can run without device reference:

    CPU 1                  CPU 2
                           skb->dev: no reference
                           process_backlog:__skb_dequeue
                           process_backlog:local_irq_enable
    on_each_cpu for
    flush_backlog => IPI(hardirq): flush_backlog
      - packet not found in backlog
                           CPU delayed ...
    synchronize_net
    - no ongoing RCU
    read-side sections
    netdev_run_todo,
    rcu_barrier: no
    ongoing callbacks
                           __netif_receive_skb_core:rcu_read_lock
                           - too late
    free dev
                           process packet for freed dev
In other words: at any moment an skb must either still be in the backlog queue (its state before dequeue; interrupts are disabled during the dequeue) or already be covered by rcu_read_lock(). Otherwise the race above can end in "process packet for freed dev".
(For RCU basics, see: http://blog.csdn.net/xabc3000/article/details/15335131)