Implementation of the NAPI Approach

Design rationale for packet reception in the Linux kernel protocol stack:


The NAPI interface and the legacy interface have the following in common:

  (1) In both cases, packet processing is deferred to softirq context.

  (2) Both keep a queue of received packets: the NAPI queue is managed by the NIC, while the legacy queue is managed by the kernel.


Every NAPI device has a poll function that the softirq calls to process packets by polling. We can therefore create a virtual NAPI device whose poll function drains the legacy interface's packet queue, so that NAPI and the legacy interface can share a single RX softirq handler. The softirq handler simply invokes the poll function of each scheduled NAPI instance and does not care whether that instance is virtual or backed by real hardware; the polling details are left to each NAPI's own poll function. A simplified sketch of such a softirq handler follows.
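For reference, here is a simplified sketch of what the RX softirq handler looks like (modeled on the 2.6-era net_rx_action; the time limit and some locking details are trimmed):

static void net_rx_action(struct softirq_action *h)
{
    struct list_head *list = &__get_cpu_var(softnet_data).poll_list;
    int budget = netdev_budget;

    local_irq_disable();
    while (!list_empty(list)) {
        struct napi_struct *n;
        int work, weight;

        local_irq_enable();

        /* take the first NAPI instance that has packets pending */
        n = list_entry(list->next, struct napi_struct, poll_list);
        weight = n->weight;

        /* call its poll function -- either a driver's poll or the
           virtual backlog's process_backlog */
        work = 0;
        if (test_bit(NAPI_STATE_SCHED, &n->state))
            work = n->poll(n, weight);

        budget -= work;
        local_irq_disable();

        /* the quota was used up, so the NAPI may still have packets:
           rotate it to the tail of the list and leave it scheduled */
        if (work == weight)
            list_move_tail(&n->poll_list, list);

        if (budget <= 0)
            break;
    }
    local_irq_enable();
}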


As described above, each CPU needs the following elements:

1. A packet input queue, used by the legacy interface.

2. A virtual NAPI device.

3. A list of NAPI instances that currently have packets to process.


The concrete implementation in the Linux kernel:


The structure is defined as follows:


struct softnet_data
{
    struct sk_buff_head input_pkt_queue; // input queue of the legacy interface
    struct list_head    poll_list;       // NAPI instances that have packets to process
    struct napi_struct  backlog;         // the virtual NAPI device, "backlog"
};


A per-CPU variable softnet_data is declared:

DECLARE_PER_CPU(struct softnet_data, softnet_data);



softnet_data is initialized as follows:


static int __init net_dev_init(void)
{
    int i, rc = -ENOMEM;
    /*
     *  Initialise the packet receive queues.
     */
    for_each_possible_cpu(i)
    {
        struct softnet_data *queue;
        queue = &per_cpu(softnet_data, i);
        /* initialize the legacy input queue */
        skb_queue_head_init(&queue->input_pkt_queue);
        queue->completion_queue = NULL;
        /* initialize the poll_list head */
        INIT_LIST_HEAD(&queue->poll_list);
        /* set the backlog poll function to process_backlog, which
           drains the input queue used by the legacy interface */
        queue->backlog.poll = process_backlog;
        /* upper bound on packets the backlog poll function may
           handle in one pass */
        queue->backlog.weight = weight_p;
        queue->backlog.gro_list = NULL;
        queue->backlog.gro_count = 0;
    }
    /* ... the rest of the function is omitted here ... */
}
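For reference, a simplified sketch of what process_backlog does (modeled on the 2.6-era implementation; GRO handling and the jiffies-based time limit are left out): it dequeues packets from the legacy input_pkt_queue and feeds them to the stack, completing the backlog NAPI once the queue is empty.

static int process_backlog(struct napi_struct *napi, int quota)
{
    struct softnet_data *queue = &__get_cpu_var(softnet_data);
    int work = 0;

    do {
        struct sk_buff *skb;

        /* input_pkt_queue is also filled from hard-irq context via
           netif_rx(), so dequeue with local interrupts disabled */
        local_irq_disable();
        skb = __skb_dequeue(&queue->input_pkt_queue);
        if (!skb) {
            /* queue drained: remove the backlog NAPI from poll_list
               and clear its SCHED bit */
            __napi_complete(napi);
            local_irq_enable();
            break;
        }
        local_irq_enable();

        /* hand the packet to the protocol stack */
        netif_receive_skb(skb);
    } while (++work < quota);

    return work;
}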



The NAPI data structure:


Each NAPI instance needs the following elements:

1. A poll function that processes packets by polling.

2. A state field that records the NAPI's scheduling status, i.e. whether it is currently scheduled or not.

3. An upper bound on how many packets one poll pass may handle; processing must not go on indefinitely, otherwise the NAPI could monopolize the CPU and starve other processes.

4. An association with a network device.

5. Support for multiple queues: the NAPI design takes multi-queue devices into account. A network device may have several packet queues, so one device can be associated with several NAPI instances, each handling only the packets of a particular queue. On a multi-core system this lets several CPU cores process packets in parallel; dedicated multi-core network processors commonly work this way.

6. When a NAPI instance has packets pending, it should be linked onto the poll_list of softnet_data.


Based on the above, the NAPI structure is defined as follows:


struct napi_struct
{
    struct list_head   poll_list; // linked onto softnet_data's poll_list
    unsigned long      state;     // NAPI scheduling state
    int                weight;    // max packets handled per poll pass
    int (*poll)(struct napi_struct *, int); // the poll function
    struct net_device *dev;       // the associated network device
    struct list_head   dev_list;  // node in the device's list of NAPI instances
    /* the remaining fields are used by GRO and are not discussed here */
};


The NAPI scheduling state:

NAPI_STATE_SCHED, when set, indicates that this NAPI instance has packets to receive. The bit must be set when the NAPI is linked onto softnet_data, and cleared when processing is finished and the NAPI is removed from softnet_data.
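In practice this bit is tested and set atomically before a NAPI is scheduled, so the same instance is never queued twice. A minimal sketch of that check (modeled on the 2.6-era napi_schedule_prep; scheduling succeeds only if the NAPI is not being disabled and was not already scheduled):

static inline int napi_disable_pending(struct napi_struct *n)
{
    return test_bit(NAPI_STATE_DISABLE, &n->state);
}

static inline int napi_schedule_prep(struct napi_struct *n)
{
    return !napi_disable_pending(n) &&
           !test_and_set_bit(NAPI_STATE_SCHED, &n->state);
}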



A closer look at some NAPI functions:


1. netif_napi_add(): binds a net_device and a NAPI structure together.



void netif_napi_add(struct net_device *dev,
       struct napi_struct *napi,
       int (*poll)(struct napi_struct *, int), int weight)
{
    INIT_LIST_HEAD(&napi->poll_list);
    napi->poll = poll;
    napi->weight = weight;
    /* add the NAPI to the list of NAPI instances associated with
       this network device */
    list_add(&napi->dev_list, &dev->napi_list);
    napi->dev = dev;
    /* mark the NAPI as already scheduled at registration time, which
       keeps it disabled; it is enabled later by explicitly clearing
       this bit */
    set_bit(NAPI_STATE_SCHED, &napi->state);
}


2. Enabling and disabling a NAPI:


static inline void napi_disable(struct napi_struct *n)
{
    /* first mark the NAPI as disable-pending */
    set_bit(NAPI_STATE_DISABLE, &n->state);

    /* wait in a loop until any in-progress scheduling finishes and we
       can grab the SCHED bit ourselves */
    while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
        msleep(1);

    /* clear the DISABLE bit */
    clear_bit(NAPI_STATE_DISABLE, &n->state);
}
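The enable side is the mirror image: it simply clears the SCHED bit that was set in netif_napi_add() or napi_disable(). A sketch along the lines of the 2.6-era napi_enable:

static inline void napi_enable(struct napi_struct *n)
{
    /* re-arm a NAPI that is currently held in the "scheduled"
       (i.e. disabled) state */
    BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
    smp_mb__before_clear_bit();
    clear_bit(NAPI_STATE_SCHED, &n->state);
}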


3. Scheduling a NAPI:


void __napi_schedule(struct napi_struct *n)
{
    unsigned long flags;
    /* list manipulation must be done with local interrupts disabled
       to avoid being preempted by a hard irq */
    local_irq_save(flags);

    /* add the NAPI to the poll_list of this CPU's softnet_data */
    list_add_tail(&n->poll_list,
                  &__get_cpu_var(softnet_data).poll_list);
    /* raise the RX softirq */
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);

    local_irq_restore(flags);
}
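Drivers normally call __napi_schedule() only after winning the SCHED bit; the usual wrapper combines the two steps. A minimal sketch (modeled on napi_schedule, using the napi_schedule_prep check sketched earlier):

static inline void napi_schedule(struct napi_struct *n)
{
    /* queue the NAPI only if it was not already scheduled */
    if (napi_schedule_prep(n))
        __napi_schedule(n);
}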


4. Completing a NAPI: this function is called when one poll pass has processed all packets in the queue.


void __napi_complete(struct napi_struct *n)
{
    BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
    BUG_ON(n->gro_list);

    /* remove the NAPI from softnet_data's poll_list */
    list_del(&n->poll_list);

    /* re-enable the NAPI so it can be scheduled again */
    smp_mb__before_clear_bit();
    clear_bit(NAPI_STATE_SCHED, &n->state);
}
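__napi_complete() expects local interrupts to be off; drivers usually call the napi_complete() wrapper (as in the driver example below), which takes care of that itself. A sketch along the lines of the 2.6-era napi_complete, with the GRO flush omitted:

void napi_complete(struct napi_struct *n)
{
    unsigned long flags;

    /* disable local interrupts around the list/state manipulation */
    local_irq_save(flags);
    __napi_complete(n);
    local_irq_restore(flags);
}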


Typical NAPI steps in a NIC driver:

1. Each NIC driver defines its own per-device structure. Some embed both the net_device and the napi_struct in their own structure; others place their structure in the priv area of the net_device. A sketch of such a private structure follows.
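A hypothetical example of such a driver-private structure (the name my_nic_priv and its fields are made up for illustration), stored in the priv area of the net_device:

/* hypothetical driver-private structure, allocated together with the
   net_device and reachable via netdev_priv() */
struct my_nic_priv {
    struct net_device *netdev;   /* back-pointer to the net_device    */
    struct napi_struct napi;     /* this device's (single-queue) NAPI */
    void __iomem      *regs;     /* mapped NIC registers              */
};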


2. Implement the NAPI poll function, rx_poll(struct napi_struct *napi, int weight); for example:


int rx_poll(struct napi_struct *napi, int weight)
{
    int rx_cnt = 0;
    struct sk_buff *skb;

    /* stop when either the budget is used up or the hardware has no
       more packets; hw_rx_pending(), RX_BUF_SIZE, copy_skb_from_hw()
       and enable_rx_irq() are driver-specific placeholders */
    while (rx_cnt < weight && hw_rx_pending()) {
        skb = netdev_alloc_skb(napi->dev, RX_BUF_SIZE);
        copy_skb_from_hw(skb);   /* copy the packet from the NIC buffer into the skb */
        rx_cnt++;
        netif_receive_skb(skb);  /* hand the packet straight to the protocol stack */
    }

    if (rx_cnt < weight) {
        /* fewer packets than the budget were processed, i.e. this poll
           pass drained the queue: complete the NAPI and re-enable the
           RX interrupt */
        napi_complete(napi);
        enable_rx_irq();
    }

    return rx_cnt;
}


3. Associate the network device with the NAPI: netif_napi_add(netdev, napi, rx_poll, weight);


4. Register the NIC's RX interrupt, for example:

   request_irq(irq, &rx_irq_handle, 0, netdev->name, netdev);


In the interrupt handler:

static irqreturn_t rx_irq_handle(int irq, void *data)
{
    struct net_device *netdev = data;
    /* the driver-private structure from step 1 holds the napi_struct */
    struct my_nic_priv *priv = netdev_priv(netdev);
    struct napi_struct *napi = &priv->napi;

    /* mask the RX interrupt; pending packets are picked up by polling */
    disable_rx_irq();
    __napi_schedule(napi);

    return IRQ_HANDLED;
}
