Linux Kernel Packet Receive and Transmit Flow

https://blog.csdn.net/kklvsports/article/details/74452953

Receive path:

The traditional receive path and the NAPI receive path differ, as the figure shows.

[Figure: traditional and NAPI receive paths]

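Traditional reception is interrupt driven. Once the driver has handled the interrupt it calls netif_rx directly to hand the packet to the kernel, and the kernel appends the packet's skb to the input_pkt_queue queue of that CPU's softnet_data structure. To unify the handling of traditional and NAPI devices, the kernel provides a virtual device, called the backlog device, for all drivers that do not use NAPI. Each CPU has one backlog device, held in softnet_data->backlog_dev (in newer kernels such as 3.2.x the field is softnet_data->backlog, a struct napi_struct). input_pkt_queue is that device's backlog queue. It stores skbs in a doubly linked list, organized as shown below. The interrupt top half only enqueues the packet and links the backlog instance onto poll_list. The bottom-half softirq then walks poll_list, and net_rx_action-->process_backlog processes the packet further.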

input_pkt_queue structure

+------------------------------------------------------------+
|                                                            |
|  sk_buff_head         sk_buff              sk_buff         |
|    _______       _______________       _______________     |
+-->|  next |---->|           next|---->|           next|----+
+---|  prev |<----|           prev|<----|           prev|<---+
|   |_len=2_|     |_______________|     |_______________|    |
|                                                            |
+------------------------------------------------------------+
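The circular, doubly linked layout in the diagram can be sketched in user-space C. This is only an illustration, not kernel code: `struct node`, `struct pkt`, `struct pkt_queue` and the three functions are hypothetical stand-ins for `struct sk_buff`, `struct sk_buff_head` and the kernel's queue helpers. The head acts as a sentinel, so an empty queue is a head that points to itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
struct node {
    struct node *next;
    struct node *prev;
};

struct pkt {
    struct node link;   /* must stay first so a node pointer maps back to the pkt */
    int id;             /* placeholder for packet data */
};

struct pkt_queue {
    struct node head;   /* sentinel: head.next is the first packet, head.prev the last */
    unsigned int qlen;  /* number of queued packets */
};

/* An empty queue is a sentinel that points to itself in both directions. */
void pkt_queue_init(struct pkt_queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    q->qlen = 0;
}

/* Append at the tail, which is where the interrupt top half adds new packets. */
void pkt_queue_tail(struct pkt_queue *q, struct pkt *p)
{
    struct node *n = &p->link;

    assert(q != NULL && p != NULL);
    n->next = &q->head;
    n->prev = q->head.prev;
    q->head.prev->next = n;
    q->head.prev = n;
    q->qlen++;
}

/* Remove from the front, which is where the softirq side consumes packets. */
struct pkt *pkt_dequeue(struct pkt_queue *q)
{
    struct node *n = q->head.next;

    if (n == &q->head)
        return NULL;            /* queue is empty */
    q->head.next = n->next;
    n->next->prev = &q->head;
    n->next = NULL;
    n->prev = NULL;
    q->qlen--;
    return (struct pkt *)n;     /* valid because link is the first member */
}
```

Queuing two packets and then dequeuing twice returns them in arrival order, and after the two appends the length field reads 2, which is the state the diagram depicts.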

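Traditional reception raises an interrupt for every packet. When packets arrive too fast, interrupts become so frequent that the CPU spends all its time servicing them and other tasks cannot be scheduled. NAPI (New API) was introduced to address this: it receives packets with a combination of interrupts and polling to improve throughput.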
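To get a feel for the difference, here is a toy model that counts interrupts under both schemes. It is a simplified sketch under stated assumptions, not a measurement: time is divided into ticks, `arrivals[t]` is the number of packets that arrive in tick t, and the NAPI side polls at most `weight` packets per tick while the device's receive interrupt stays masked. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional scheme: every packet raises its own interrupt. */
unsigned int irqs_per_packet(const unsigned int *arrivals, size_t ticks)
{
    unsigned int irqs = 0;

    assert(arrivals != NULL);
    for (size_t t = 0; t < ticks; t++)
        irqs += arrivals[t];
    return irqs;
}

/*
 * NAPI-style scheme (simplified): the first packet raises an interrupt,
 * the receive interrupt is then masked and the ring is polled, at most
 * `weight` packets per tick, until a poll does less than a full weight
 * of work. Only then is the interrupt enabled again.
 */
unsigned int irqs_napi(const unsigned int *arrivals, size_t ticks,
                       unsigned int weight)
{
    unsigned int irqs = 0;
    unsigned int ring = 0;   /* packets waiting in the receive ring */
    int polling = 0;         /* nonzero while the interrupt is masked */

    assert(arrivals != NULL && weight > 0);
    for (size_t t = 0; t < ticks; t++) {
        ring += arrivals[t];
        if (!polling && ring > 0) {
            irqs++;          /* interrupt fires and schedules polling */
            polling = 1;
        }
        if (polling) {
            unsigned int work = ring < weight ? ring : weight;

            ring -= work;
            if (work < weight)
                polling = 0; /* ring drained: back to interrupt mode */
        }
    }
    return irqs;
}
```

For the arrival pattern {100, 100, 0, 0, 1} and a weight of 64, the per-packet model counts 201 interrupts and the NAPI model counts 2: one for the burst and one for the late single packet.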

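NAPI reception needs support from the NIC driver. With the Intel e1000 family, for example, the receive interrupt handler e1000_intr_msix_rx adds the NIC's napi instance to the poll_list of softnet_data and then sets the NET_RX_SOFTIRQ softirq flag, so that net_rx_action runs once the pending softirqs are checked. When does the softirq run? There are two occasions:

1. When the interrupt top half exits, do_IRQ-->irq_exit-->do_softirq-->call_softirq-->__do_softirq invokes the softirq handler net_rx_action, which walks the NICs on poll_list.
2. __do_softirq reruns the pending handlers, net_rx_action among them, up to MAX_SOFTIRQ_RESTART = 10 times. If packets are still pending after that, wakeup_softirqd wakes the ksoftirqd kernel thread, which receives them through run_ksoftirqd-->__do_softirq-->net_rx_action.

The net_rx_action source below is from kernel version 3.2.x.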
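Before the listing, the restart-then-defer behavior of the second case can be sketched in user-space C. This is a model under assumptions rather than the kernel's code: `sim_do_softirq`, `sim_rx_handler` and both structs are hypothetical, the handler stands in for net_rx_action, and "waking ksoftirqd" is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_SOFTIRQ_RESTART 10   /* mirrors MAX_SOFTIRQ_RESTART = 10 */

struct sim_softirq_result {
    int handler_runs;     /* how many times the handler was invoked */
    int woke_ksoftirqd;   /* 1 if leftover work was deferred to the thread */
};

/* A handler returns nonzero when work is still pending after it ran. */
typedef int (*sim_handler)(void *ctx);

struct sim_softirq_result sim_do_softirq(sim_handler handler, void *ctx)
{
    struct sim_softirq_result r = {0, 0};
    int max_restart = SIM_MAX_SOFTIRQ_RESTART;
    int pending;

    assert(handler != NULL);
    do {
        pending = handler(ctx);
        r.handler_runs++;
    } while (pending && --max_restart);

    if (pending)
        r.woke_ksoftirqd = 1;   /* stand-in for wakeup_softirqd() */
    return r;
}

/* Hypothetical receive handler: drains up to per_run packets per call. */
struct sim_backlog {
    int remaining;
    int per_run;
};

int sim_rx_handler(void *ctx)
{
    struct sim_backlog *b = ctx;
    int work = b->remaining < b->per_run ? b->remaining : b->per_run;

    b->remaining -= work;
    return b->remaining > 0;
}
```

With 500 packets and 300 per run, the handler runs twice and nothing is deferred. With 5000 packets it runs ten times, 2000 packets remain, and the flag that stands in for waking ksoftirqd is set.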

static void net_rx_action(struct softirq_action *h)
{
    struct softnet_data *sd = &__get_cpu_var(softnet_data);
    unsigned long time_limit = jiffies + 2;
    int budget = netdev_budget; // max skbs handled in one softirq run; default 300, sysctl net.core.netdev_budget = 300
    void *have;

    local_irq_disable(); // disable interrupts while accessing softnet_data

    while (!list_empty(&sd->poll_list)) {
        struct napi_struct *n;
        int work, weight;

        /* If softirq window is exhausted then punt.
         * Allow this to run for 2 jiffies since which will allow
         * an average latency of 1.5/HZ.
         */
        // poll for no more than 2 jiffies and stop once the budget (300 skbs) is used up
        if (unlikely(budget <= 0 || time_after_eq(jiffies, time_limit)))
            goto softnet_break;

        local_irq_enable();

        /* Even though interrupts have been re-enabled, this
         * access is safe because interrupts can only add new
         * entries to the tail of this list, and only ->poll()
         * calls can remove this head entry from the list.
         */
        // take the head of poll_list, i.e. the napi instance of some NIC
        n = list_first_entry(&sd->poll_list, struct napi_struct, poll_list);

        have = netpoll_poll_lock(n);

        weight = n->weight; // max packets this NIC handles in one poll, typically 64

        /* This NAPI_STATE_SCHED test is for avoiding a race
         * with netpoll's poll_napi().  Only the entity which
         * obtains the lock and sees NAPI_STATE_SCHED set will
         * actually make the ->poll() call.  Therefore we avoid
         * accidentally calling ->poll() when NAPI is not scheduled.
         */
        work = 0;
        if (test_bit(NAPI_STATE_SCHED, &n->state)) {
            // call the device-specific poll function; if poll receives everything
            // (work < weight) the driver calls napi_complete(), which removes the
            // device from poll_list; non-NAPI devices use process_backlog here
            work = n->poll(n, weight);
            trace_napi_poll(n);
        }

        WARN_ON_ONCE(work > weight);

        budget -= work;

        local_irq_disable();

        /* Drivers must not modify the NAPI state if they
         * consume the entire weight.  In such cases this code
         * still "owns" the NAPI instance and therefore can
         * move the instance around on the list at-will.
         */
        // the whole weight was used, so the device probably needs more polling:
        // move this napi to the tail of poll_list; packets still held by GRO are
        // flushed to the protocol stack (only the old ones when HZ >= 1000)
        if (unlikely(work == weight)) {
            if (unlikely(napi_disable_pending(n))) {
                local_irq_enable();
                napi_complete(n);
                local_irq_disable();
            } else {
                if (n->gro_list) {
                    /* flush too old packets
                     * If HZ < 1000, flush all packets.
                     */
                    local_irq_enable();
                    napi_gro_flush(n, HZ >= 1000);
                    local_irq_disable();
                }
                list_move_tail(&n->poll_list, &sd->poll_list);
            }
        }

        netpoll_poll_unlock(have);
    }
out:
    net_rps_action_and_irq_enable(sd);

#ifdef CONFIG_NET_DMA
    /*
     * There may not be any more sk_buffs coming right now, so push
     * any pending DMA copies to hardware
     */
    dma_issue_pending_all();
#endif

    return;

softnet_break:
    sd->time_squeeze++;
    // this round did not finish: raise NET_RX_SOFTIRQ again so that a later
    // softirq run calls net_rx_action for the remaining work
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    goto out;
}
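The interplay of budget, weight and the round-robin poll_list is easier to follow in a stripped-down model. The sketch below is a user-space simulation under assumptions, not the kernel function: all `sim_` names are hypothetical, the poll list is a small array used as a FIFO, and the 2-jiffy time limit, locking, GRO and netpoll are left out.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_DEVS 8

struct sim_napi {
    int pending;     /* packets waiting in this device's ring */
    int weight;      /* max packets handled in one poll */
    int scheduled;   /* nonzero while the device is on the poll list */
};

struct sim_result {
    int processed;     /* total packets handled in this run */
    int time_squeeze;  /* 1 if the run stopped with work left */
};

/* Device poll: handle up to weight packets; complete if the ring drained. */
static int sim_poll(struct sim_napi *n, int weight)
{
    int work = n->pending < weight ? n->pending : weight;

    n->pending -= work;
    if (work < weight)
        n->scheduled = 0;   /* stand-in for napi_complete() */
    return work;
}

struct sim_result sim_net_rx_action(struct sim_napi *devs, int ndevs, int budget)
{
    int list[SIM_MAX_DEVS];   /* FIFO of device indexes, models poll_list */
    int head = 0, count = 0;
    struct sim_result r = {0, 0};

    assert(devs != NULL && ndevs <= SIM_MAX_DEVS);
    for (int i = 0; i < ndevs; i++)
        if (devs[i].scheduled)
            list[(head + count++) % SIM_MAX_DEVS] = i;

    while (count > 0) {
        if (budget <= 0) {        /* budget used up: leave the rest for later */
            r.time_squeeze = 1;
            break;
        }

        int idx = list[head];
        int work = sim_poll(&devs[idx], devs[idx].weight);

        budget -= work;
        r.processed += work;

        head = (head + 1) % SIM_MAX_DEVS;   /* take the device off the head */
        count--;
        if (work == devs[idx].weight)       /* full weight used: requeue at tail */
            list[(head + count++) % SIM_MAX_DEVS] = idx;
    }
    return r;
}
```

With two devices holding 500 and 10 packets, weight 64 and budget 300, the run handles 330 packets, leaves 180 on the first device and reports a squeeze. The budget is only checked before each poll, so in this model a run can overshoot it by less than one weight.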

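After the softirq, the packet enters the kernel protocol stack for processing. Along the way it also passes through netfilter, xfrm (IPsec) and similar processing, which will be analyzed in detail later.

An IP packet is processed as follows: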

Hard interrupt:
hardware interrupt-->do_IRQ-->handle_irq-->e1000_intr_msix_rx-->__napi_schedule(&adapter->napi)-->
____napi_schedule-->__raise_softirq_irqoff(NET_RX_SOFTIRQ)

Softirq and IP receive:
do_IRQ-->irq_exit-->do_softirq-->call_softirq-->__do_softirq-->
net_rx_action-->e1000e_poll-->e1000_receive_skb-->napi_gro_receive-->
netif_receive_skb-->__netif_receive_skb-->__netif_receive_skb_core-->
deliver_skb-->ip_rcv-->NF_HOOK(NF_INET_PRE_ROUTING)-->
ip_rcv_finish-->dst_input

Local delivery (from dst_input):
ip_local_deliver-->NF_HOOK(NF_INET_LOCAL_IN)-->ip_local_deliver_finish-->ipprot->handler()

Forwarding and transmit (from dst_input):
ip_forward-->NF_HOOK(NF_INET_FORWARD)-->ip_forward_finish-->
dst_output-->dst->output-->ip_output-->NF_HOOK_COND(NF_INET_POST_ROUTING)-->
ip_finish_output-->ip_finish_output2-->__ipv4_neigh_lookup_noref-->
dst_neigh_output-->neigh_hh_output-->dev_queue_xmit-->dev_hard_start_xmit-->ndo_start_xmit
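The split after dst_input comes from a function pointer: dst_input calls the input handler stored in the packet's route entry, which the routing step has set to ip_local_deliver for local destinations or to ip_forward otherwise. The sketch below models just that dispatch in user-space C. Every `sim_` name is hypothetical, and the address comparison is a deliberately naive stand-in for a real route lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sim_path { SIM_PATH_LOCAL, SIM_PATH_FORWARD };

struct sim_skb;
typedef enum sim_path (*sim_input_fn)(struct sim_skb *skb);

struct sim_dst {
    sim_input_fn input;   /* handler chosen by the routing step */
};

struct sim_skb {
    uint32_t daddr;       /* destination IPv4 address */
    struct sim_dst dst;   /* route entry attached to the packet */
};

/* Stand-in for ip_local_deliver: the packet is for this host. */
enum sim_path sim_local_deliver(struct sim_skb *skb)
{
    (void)skb;
    return SIM_PATH_LOCAL;
}

/* Stand-in for ip_forward: the packet is routed onward. */
enum sim_path sim_forward(struct sim_skb *skb)
{
    (void)skb;
    return SIM_PATH_FORWARD;
}

/* Naive stand-in for the route lookup: only compares against one local address. */
void sim_route_input(struct sim_skb *skb, uint32_t local_addr)
{
    skb->dst.input = (skb->daddr == local_addr) ? sim_local_deliver
                                                : sim_forward;
}

/* Stand-in for dst_input: just call whatever handler routing attached. */
enum sim_path sim_dst_input(struct sim_skb *skb)
{
    assert(skb != NULL && skb->dst.input != NULL);
    return skb->dst.input(skb);
}
```

A packet addressed to the local address takes the local path and any other packet takes the forward path. The caller of sim_dst_input never needs to know which one was chosen.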

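I found a very good diagram of the protocol stack's receive and transmit flow online. Thanks to the original author.

[Figure: protocol stack receive and transmit flow]

Reference: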

http://blog.csdn.net/hui6075/article/details/51196056

Source: https://blog.csdn.net/wangyangzhizunwudi/article/details/99864501