Background: we reproduced this issue on CentOS 7.6, but it actually exists on many kernel versions. How to monitor and control the various Linux caches has long been a hot topic in cloud computing, yet these needs belong to niche scenarios and rarely make it into the mainline kernel. As eBPF gradually stabilizes, programming against and observing a stock Linux kernel may open up new possibilities. Below is how we tracked down and resolved this problem.
1. Symptoms
The OPPO cloud kernel team noticed that snmpd's CPU usage had spiked across a cluster:
snmpd was pinning almost a full core for long stretches. perf showed the following hot spots:
+   92.00%   3.96%  [kernel]  [k] __d_lookup
-   48.95%  48.95%  [kernel]  [k] _raw_spin_lock
   - 20.95% 0x70692f74656e2f73
        __fopen_internal
        __GI___libc_open
        system_call
        sys_open
        do_sys_open
        do_filp_open
        path_openat
        link_path_walk
      + lookup_fast
-   45.71%  44.58%  [kernel]  [k] proc_sys_compare
   - 5.48% 0x70692f74656e2f73
        __fopen_internal
        __GI___libc_open
        system_call
        sys_open
        do_sys_open
        do_filp_open
        path_openat
      + 1.13% proc_sys_compare
Almost all of the time is spent in the kernel inside __d_lookup. strace then showed where the latency sits:
open("/proc/sys/net/ipv4/neigh/kube-ipvs0/retrans_time_ms", O_RDONLY) = 8 <0.000024>------v4的比较快
open("/proc/sys/net/ipv6/neigh/ens7f0_58/retrans_time_ms", O_RDONLY) = 8 <0.456366>-------v6很慢
Probing by hand confirmed that merely entering the ipv6 directory is slow:
time cd /proc/sys/net
real 0m0.000s
user 0m0.000s
sys 0m0.000s
time cd /proc/sys/net/ipv6
real 0m2.454s
user 0m0.000s
sys 0m0.509s
time cd /proc/sys/net/ipv4
real 0m0.000s
user 0m0.000s
sys 0m0.000s
As you can see, entering the ipv6 path costs far more time than entering the ipv4 path.
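To take snmpd and shell overhead out of the picture, a small user-space timer reproduces the same asymmetry. This is illustrative only; the two paths are the exact files seen in the strace above, so substitute interfaces that exist on your host:

/* open_timer.c: average the cost of open()+close() on a sysctl file */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static double time_open(const char *path, int iters)
{
	struct timespec t0, t1;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		int fd = open(path, O_RDONLY);
		if (fd >= 0)
			close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return ((t1.tv_sec - t0.tv_sec) +
		(t1.tv_nsec - t0.tv_nsec) / 1e9) / iters;
}

int main(void)
{
	/* the files observed in the strace above */
	const char *v4 = "/proc/sys/net/ipv4/neigh/kube-ipvs0/retrans_time_ms";
	const char *v6 = "/proc/sys/net/ipv6/neigh/ens7f0_58/retrans_time_ms";

	printf("ipv4: %.6f s per open\n", time_open(v4, 100));
	printf("ipv6: %.6f s per open\n", time_open(v6, 100));
	return 0;
}

Compile with gcc -O2 open_timer.c -o open_timer. Going by the strace numbers, the gap is roughly four orders of magnitude.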
2. Analysis
We need to understand why the perf hot spot shows so much time in proc_sys_compare under __d_lookup, and what that code path looks like.
proc_sys_compare has only one call path: the d_compare callback. From the call chain:
__d_lookup ---> if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
struct dentry *__d_lookup(const struct dentry *parent, const struct qstr *name)
{
	.....
	hlist_bl_for_each_entry_rcu(dentry, node, b, d_hash) {

		if (dentry->d_name.hash != hash)
			continue;

		spin_lock(&dentry->d_lock);
		if (dentry->d_parent != parent)
			goto next;
		if (d_unhashed(dentry))
			goto next;

		/*
		 * It is safe to compare names since d_move() cannot
		 * change the qstr (protected by d_lock).
		 */
		if (parent->d_flags & DCACHE_OP_COMPARE) {
			int tlen = dentry->d_name.len;
			const char *tname = dentry->d_name.name;
			if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
				goto next; /* caq: a return value of 1 means the names differ */
		} else {
			if (dentry->d_name.len != len)
				goto next;
			if (dentry_cmp(dentry, str, len))
				goto next;
		}
		....
next:
		spin_unlock(&dentry->d_lock); /* caq: move on to the next element of the chain */
	}
	.....
}
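For dentries under /proc/sys, DCACHE_OP_COMPARE is set and d_compare points at proc_sys_compare. As a sketch of why each chain element costs real work, this is roughly what proc_sys_compare does in mainline kernels of this era (the exact signature varies across versions):

static int proc_sys_compare(const struct dentry *parent,
		const struct dentry *dentry,
		unsigned int len, const char *str, const struct qstr *name)
{
	struct ctl_table_header *head;
	struct inode *inode;

	/* Although proc doesn't have negative dentries, rcu-walk means
	 * that the inode here can be NULL */
	inode = ACCESS_ONCE(dentry->d_inode);
	if (!inode)
		return 1;
	if (name->len != len)
		return 1;
	if (memcmp(name->name, str, len))
		return 1;
	/* even on a name match, the sysctl entry must still be
	 * visible in the caller's namespace */
	head = rcu_dereference(PROC_I(inode)->sysctl);
	return !head || !sysctl_is_seen(head);
}

Note also that __d_lookup takes dentry->d_lock for every chain element whose hash matches, which is why _raw_spin_lock shows up nearly as hot as proc_sys_compare: both costs scale with the length of the collision chain.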
Machines in the cluster with identical hardware run the same snmpd flow, so the natural suspicion was that hlist_bl_for_each_entry_rcu was looping too many times, forcing parent->d_op->d_compare to grind through a long collision chain again and again. If entering ipv6 triggers a very large number of comparisons, the walk will also suffer plenty of cache misses along the list, and with enough chain elements that would produce exactly this profile. Let's verify, with a counter that walks the same hash bucket a lookup would walk:
#define COUNT_THRES 512 /* caq: report only suspiciously long chains; threshold is illustrative */

static inline long hlist_count(const struct dentry *parent, const struct qstr *name)
{
	long count = 0;
	unsigned int hash = name->hash;
	struct hlist_bl_head *b = d_hash(parent, hash);
	struct hlist_bl_node *node;
	struct dentry *dentry;

	rcu_read_lock();
	hlist_bl_for_each_entry_rcu(dentry, node, b, d_hash) {
		count++;
	}
	rcu_read_unlock();

	if (count > COUNT_THRES) {
		printk("hlist_bl_head=%p,count=%ld,name=%s,hash=%u\n",
		       b, count, name->name, name->hash);
	}
	return count;
}
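To make hlist_count fire on every lookup, we hooked it onto __d_lookup with a kprobe. Below is a minimal sketch of such a probe module, under loudly stated assumptions: an x86_64 3.10-era kernel (arguments in rdi/rsi, kallsyms_lookup_name still exported) whose d_hash() matches fs/dcache.c of that era; dentry_hashtable, d_hash_shift and d_hash_mask are static in fs/dcache.c, so the sketch resolves them at load time:

/* dhash_probe.c: sketch of a kprobe module that runs hlist_count()
 * on every __d_lookup(); assumes an x86_64 3.10-era kernel */
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>
#include <linux/dcache.h>
#include <linux/list_bl.h>
#include <linux/rculist_bl.h>
#include <linux/cache.h>

/* fs/dcache.c keeps these static, so resolve them when loading */
static struct hlist_bl_head *dentry_hashtable;
static unsigned int d_hash_shift, d_hash_mask;

/* mirrors d_hash() in fs/dcache.c of 3.10-era kernels */
static inline struct hlist_bl_head *d_hash(const struct dentry *parent,
					   unsigned int hash)
{
	hash += (unsigned long)parent / L1_CACHE_BYTES;
	hash = hash + (hash >> d_hash_shift);
	return dentry_hashtable + (hash & d_hash_mask);
}

/* ... hlist_count() exactly as above ... */

/* x86_64 calling convention: arg0 = parent in rdi, arg1 = name in rsi */
static int on_d_lookup(struct kprobe *p, struct pt_regs *regs)
{
	hlist_count((const struct dentry *)regs->di,
		    (const struct qstr *)regs->si);
	return 0;
}

static struct kprobe kp = {
	.symbol_name = "__d_lookup",
	.pre_handler = on_d_lookup,
};

static int __init dhash_probe_init(void)
{
	unsigned long addr;

	addr = kallsyms_lookup_name("dentry_hashtable");
	if (!addr)
		return -ENOENT;
	dentry_hashtable = *(struct hlist_bl_head **)addr;

	addr = kallsyms_lookup_name("d_hash_shift");
	if (!addr)
		return -ENOENT;
	d_hash_shift = *(unsigned int *)addr;

	addr = kallsyms_lookup_name("d_hash_mask");
	if (!addr)
		return -ENOENT;
	d_hash_mask = *(unsigned int *)addr;

	return register_kprobe(&kp);
}

static void __exit dhash_probe_exit(void)
{
	unregister_kprobe(&kp);
}

module_init(dhash_probe_init);
module_exit(dhash_probe_exit);
MODULE_LICENSE("GPL");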
The kprobe reported:
[20327461.948219] hlist_bl_head=ffffb0d7029ae3b0 count = 799259,name=ipv6/neigh
Almost 800,000 dentries were hashed onto the single chain that a lookup of ipv6/neigh has to walk, which matches both the perf hot spots and the second-scale latency of entering the directory.