Linux CONFIG_HZ and I/O interrupt latency: how to reduce the inbound softirq share

We recently ran into a problem: while receiving TCP packets, the inbound softirq share on our servers was very high.

We know that NAPI already reduces the inbound softirq share during packet reception, so can NAPI itself be tuned further? This article analyzes the 2.6.32-358 kernel:

static void net_rx_action(struct softirq_action *h)
{
	struct list_head *list = &__get_cpu_var(softnet_data).poll_list;
	unsigned long time_limit = jiffies + 2;
	int budget = netdev_budget;	/* tunable via /proc/sys/net/core/netdev_budget, default 300 */
	void *have;
	int select;
	struct rps_remote_softirq_cpus *rcpus;

	local_irq_disable();

	while (!list_empty(list)) {	/* keep looping while the poll list is not empty */
		struct napi_struct *n;
		int work, weight;

		/* If softirq window is exhuasted then punt.
		 *
		 * Allow this to run for 2 jiffies since which will allow
		 * an average latency of 1.5/HZ.
		 */
		if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
			goto softnet_break;	/* time window or budget exhausted: stop polling */

		local_irq_enable();

		/* Even though interrupts have been re-enabled, this
		 * access is safe because interrupts can only add new
		 * entries to the tail of this list, and only ->poll()
		 * calls can remove this head entry from the list.
		 */
		n = list_first_entry(list, struct napi_struct, poll_list);

		have = netpoll_poll_lock(n);

		/* weight is the per-poll quota configured in netif_napi_add();
		 * together with the global budget above it bounds how many
		 * packets one run can harvest. */
		weight = n->weight;

		/* This NAPI_STATE_SCHED test is for avoiding a race
		 * with netpoll's poll_napi(). Only the entity which
		 * obtains the lock and sees NAPI_STATE_SCHED set will
		 * actually make the ->poll() call. Therefore we avoid
		 * accidently calling ->poll() when NAPI is not scheduled.
		 */
		work = 0;
		if (test_bit(NAPI_STATE_SCHED, &n->state)) {
			/* driver-specific poll callback, e.g. Intel's ixgbe_poll */
			work = n->poll(n, weight);
			trace_napi_poll(n);
		}

		WARN_ON_ONCE(work > weight);

		budget -= work;

		local_irq_disable();

		/* Drivers must not modify the NAPI state if they
		 * consume the entire weight. In such cases this code
		 * still "owns" the NAPI instance and therefore can
		 * move the instance around on the list at-will.
		 */
		if (unlikely(work == weight)) {
			if (unlikely(napi_disable_pending(n))) {
				local_irq_enable();
				napi_complete(n);
				local_irq_disable();
			} else
				list_move_tail(&n->poll_list, list);
		}

		netpoll_poll_unlock(have);
	}
out:
	rcpus = &__get_cpu_var(rps_remote_softirq_cpus);
	select = rcpus->select;
	rcpus->select ^= 1;

	local_irq_enable();

	net_rps_action(&rcpus->mask[select]);

#ifdef CONFIG_NET_DMA
	/*
	 * There may not be any more sk_buffs coming right now, so push
	 * any pending DMA copies to hardware
	 */
	dma_issue_pending_all();
#endif
	return;

softnet_break:
	__get_cpu_var(netdev_rx_stat).time_squeeze++;
	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
	goto out;
}
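To make the two-level quota concrete, here is a minimal userspace sketch, not kernel code: it simulates two NAPI instances with made-up backlogs and shows that each ->poll() cleans at most weight packets, while the shared budget caps the whole run; when the budget runs out, the real kernel re-raises the softirq and bumps time_squeeze.

#include <stdio.h>

#define NETDEV_BUDGET 300	/* default of /proc/sys/net/core/netdev_budget */
#define NAPI_WEIGHT    64	/* the weight ixgbe passes to netif_napi_add() */

int main(void)
{
	int pending[2] = { 500, 90 };	/* hypothetical per-instance backlog */
	int budget = NETDEV_BUDGET;
	int polls = 0;

	/* like the kernel, the budget is checked at the top of the loop,
	 * so it may dip slightly below zero after the last poll */
	while ((pending[0] > 0 || pending[1] > 0) && budget > 0) {
		for (int i = 0; i < 2 && budget > 0; i++) {
			if (pending[i] == 0)
				continue;
			/* one ->poll() may clean at most NAPI_WEIGHT packets */
			int work = pending[i] < NAPI_WEIGHT ? pending[i] : NAPI_WEIGHT;
			pending[i] -= work;
			budget -= work;
			polls++;
			printf("poll #%d: instance %d cleaned %d, budget left %d\n",
			       polls, i, work, budget);
		}
	}
	if (pending[0] > 0 || pending[1] > 0)
		printf("budget exhausted: softirq re-raised, time_squeeze++\n");
	return 0;
}

With the defaults, the run stops after a little over 300 packets even though work remains, which is exactly the softnet_break path above.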

The code shows that the only things limiting one net_rx_action() call are the 2-jiffy time window and netdev_budget. If we enlarge netdev_budget, can a single invocation harvest more packets, meaning the softirq fires fewer times? The answer is yes.
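The knob itself is just a procfs file. A minimal sketch of bumping it at runtime (equivalent to sysctl -w net.core.netdev_budget=600; the value 600 is only an example, not a recommendation):

#include <stdio.h>

int main(void)
{
	/* write the new global budget; needs root, and the setting
	 * does not persist across reboots */
	FILE *f = fopen("/proc/sys/net/core/netdev_budget", "w");

	if (!f) {
		perror("open /proc/sys/net/core/netdev_budget");
		return 1;
	}
	fprintf(f, "%d\n", 600);
	fclose(f);
	return 0;
}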

Looking at the defaults, netdev_budget (300) is larger than the weight NAPI configures (64 in the ixgbe driver):

/* initialize NAPI */
netif_napi_add(adapter->netdev, &q_vector->napi,
	       ixgbe_poll, 64);	/* a weight of 64 is passed in */

void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
		    int (*poll)(struct napi_struct *, int), int weight)
{
	INIT_LIST_HEAD(&napi->poll_list);
	napi->gro_count = 0;
	napi->gro_list = NULL;
	napi->skb = NULL;
	napi->poll = poll;
	if (weight > NAPI_POLL_WEIGHT)
		/* warns about an oversized weight, but does not clamp it */
		pr_err_once("netif_napi_add() called with weight %d on device %s\n",
			    weight, dev->name);
	napi->weight = weight;
	list_add(&napi->dev_list, &dev->napi_list);
	napi->dev = dev;
#ifdef CONFIG_NETPOLL
	spin_lock_init(&napi->poll_lock);
	napi->poll_owner = -1;
#endif
	set_bit(NAPI_STATE_SCHED, &napi->state);
}

We raised the 64 passed in to 256 (the one-line change is sketched after ixgbe_poll() below), because inside ixgbe_poll():

int ixgbe_poll(struct napi_struct *napi, int budget)
{
	struct ixgbe_q_vector *q_vector =
				container_of(napi, struct ixgbe_q_vector, napi);
	struct ixgbe_adapter *adapter = q_vector->adapter;
	struct ixgbe_ring *ring;
	int per_ring_budget;
	bool clean_complete = true;

#ifdef CONFIG_IXGBE_DCA
	if (adapter->flags & IXGBE_FLAG_DCA_ENABLED)
		ixgbe_update_dca(q_vector);
#endif

	/* walk the tx queues; not relevant to this discussion */
	ixgbe_for_each_ring(ring, q_vector->tx)
		clean_complete &= !!ixgbe_clean_tx_irq(q_vector, ring);

	if (!ixgbe_qv_lock_napi(q_vector))
		return budget;

	/* attempt to distribute budget to each queue fairly, but don't allow
	 * the budget to go below 1 because we'll exit polling */
	if (q_vector->rx.count > 1)
		/* raising budget from 64 to 256 raises how many packets
		 * one polling pass can clean */
		per_ring_budget = max(budget / q_vector->rx.count, 1);
	else
		per_ring_budget = budget;

	/* walk the rx queues */
	ixgbe_for_each_ring(ring, q_vector->rx)
		clean_complete &= (ixgbe_clean_rx_irq(q_vector, ring,
						      per_ring_budget) < per_ring_budget);

	ixgbe_qv_unlock_napi(q_vector);

	/* If all work not completed, return budget and keep polling */
	if (!clean_complete)
		return budget;

	/* all work done, exit the polling mode */
	napi_complete(napi);
	if (adapter->rx_itr_setting & 1)
		ixgbe_set_itr(q_vector);
	if (!test_bit(__IXGBE_DOWN, &adapter->state))
		ixgbe_irq_enable_queues(adapter, ((u64)1 << q_vector->v_idx));

	return 0;
}
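For reference, the change amounts to a one-line edit at the driver's netif_napi_add() call site, sketched here in diff form against the snippet quoted above (the exact file and surrounding context vary with the ixgbe version):

-	netif_napi_add(adapter->netdev, &q_vector->napi,
-		       ixgbe_poll, 64);
+	netif_napi_add(adapter->netdev, &q_vector->napi,
+		       ixgbe_poll, 256);

After this change the budget that reaches ixgbe_poll() is 256, so, for example, a vector driving two rx rings gets per_ring_budget = max(256/2, 1) = 128 instead of 32.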

After raising the budget from 64 to 256, with the same inbound traffic the softirq share dropped from 25% to 20%, a very noticeable improvement.

Can we then raise it without limit? Clearly not. For one thing, with a larger weight it becomes hard to balance packet reception across queues, because a queue's poll cannot return until it has consumed its whole quota (unless the ring is drained first), and that keeps interrupts disabled for too long.
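To verify whether the quota is actually being exhausted before and after tuning, watch the time_squeeze counter that net_rx_action() bumps at softnet_break. It is exported per CPU as the third hex column of /proc/net/softnet_stat. A minimal reader, assuming that standard column layout:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/net/softnet_stat", "r");
	char line[512];
	unsigned int processed, dropped, squeezed;
	int cpu = 0;

	if (!f) {
		perror("open /proc/net/softnet_stat");
		return 1;
	}
	/* one line per CPU: processed, dropped, time_squeeze, ... (hex) */
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "%x %x %x", &processed, &dropped, &squeezed) == 3)
			printf("cpu%d: processed=%u dropped=%u time_squeeze=%u\n",
			       cpu++, processed, dropped, squeezed);
	}
	fclose(f);
	return 0;
}

If time_squeeze keeps climbing under load, net_rx_action() is still being cut off by the budget or the 2-jiffy window.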
