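Tracing slab allocation call stacks (perf, systemtap)
Memory leaks are among the trickier problems encountered when troubleshooting kernel failures. Once the memory usage figures point to a particular slab cache, tracing the call stacks that allocate from that cache is the first step toward finding the cause of the leak. Depending on the kernel version, there are two ways to trace slab allocations, described below.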
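Before tracing, it helps to confirm which slab cache is actually growing. The following is a minimal sketch, not part of the original note: it compares two saved copies of /proc/slabinfo and prints the caches whose active object count increased. The function name slab_growth and the snapshot file names are hypothetical, and the sketch assumes the usual slabinfo layout in which column 1 is the cache name and column 2 is active_objs.

```shell
#!/bin/bash
# Sketch: list caches whose active_objs grew between two /proc/slabinfo snapshots.
# Assumes column 1 = cache name and column 2 = active_objs (slabinfo 2.x layout).
# Caches that exist only in the second snapshot are ignored.
slab_growth() {
    awk '
        /^slabinfo/ || /^#/ { next }              # skip the two header lines
        NR == FNR { before[$1] = $2 + 0; next }   # first file: remember active_objs
        ($1 in before) && ($2 + 0 > before[$1]) {
            printf "%d %s\n", $2 - before[$1], $1 # growth, then cache name
        }
    ' "$1" "$2" | sort -k1,1nr -k2,2              # largest growth first
}
```

A possible way to use it (as root, since /proc/slabinfo is usually readable only by root): save one copy with cat /proc/slabinfo > /tmp/slab.before, wait while the suspected leak is active, save a second copy to /tmp/slab.after, then run slab_growth /tmp/slab.before /tmp/slab.after | head. The cache at the top of the list is a reasonable candidate to pass to the tracing methods below.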
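1. When perf supports the probe subcommand (kernel 3.10 or later), perf probe can be used to trace the allocation call stacks of a given slab cache.
Reference: https://access.redhat.com/solutions/2850631
The script is as follows (vim trace_slab_cache.sh):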
#!/usr/bin/bash
# Trace the call stacks that allocate from one slab cache, using perf probe.
[ $# -eq 0 ] && {
    echo "Usage: $0 <slab_cache_name> <timer>
Note, the cache names can be found in '/proc/slabinfo'"
    exit 1
}
SLAB="$1"
TIMER=30    # default collection time in seconds
[ $# -eq 2 ] && { TIMER=$2; }
# The cache name is the first column of /proc/slabinfo; match it exactly.
grep -q "^$SLAB " /proc/slabinfo || {
    echo "error: no '$SLAB' slab cache exists"
    exit 2
}
UNAME_R=$(uname -r)
echo "$UNAME_R" | grep -q el5 && { RHEL_VER=el5; }
echo "$UNAME_R" | grep -q el6 && { RHEL_VER=el6; }
echo "$UNAME_R" | grep -q el7 && { RHEL_VER=el7; }
echo "$UNAME_R" | grep -q el8 && { RHEL_VER=el8; }
# Remove any stale probes left over from a previous run.
perf probe -d 'kmem_cache_alloc*' 2>/dev/null
# The struct kmem_cache argument is named cachep on RHEL 5/6 and s on RHEL 7/8.
case "$RHEL_VER" in
    el5|el6)
        perf probe kmem_cache_alloc 'cachep->name:string' 2>/dev/null
        ;;
    el7|el8)
        perf probe kmem_cache_alloc 's->name:string' 2>/dev/null
        ;;
    *)
        # Unknown release: no probe is added, so the check below reports the error.
        ;;
esac
grep -q 'probe/kmem_cache_alloc' /sys/kernel/debug/tracing/kprobe_events || {
    echo "error: failed to add the probe"
    exit 3
}
echo "collecting the data for $TIMER seconds, stand by..."
# Record system-wide with call graphs, keeping only events for the requested cache.
perf record -a -g -e probe:kmem_cache_alloc --filter "name == \"$SLAB\"" sleep "$TIMER"
perf probe -d 'kmem_cache_alloc*' 2>/dev/null
echo "creating the archive with debugging symbols..."
rpm -q "kernel-debuginfo-$UNAME_R" >/dev/null \
    || echo "warning: package kernel-debuginfo-$UNAME_R is not installed"
perf archive >/dev/null
echo "done: please share both perf.data and" perf.data.tar.* "with Red Hat support for analysis"
echo "note: if there is no perf.data.tar.* generated, $SLAB might not be in use during $TIMER seconds
or $SLAB might not be used by kmem_cache_alloc function"
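1) Running the script above collects the call-stack data for the given slab cache into the file perf.data. Usage: sh trace_slab_cache.sh kmalloc-512 20 (kmalloc-512 is the slab cache name, which can be taken from the NAME column in the output of slabtop -o -s c; 20 is the collection time in seconds).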
2) Run perf script in the directory containing perf.data to list every call stack that allocated from the kmalloc-512 slab cache.
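The raw perf script output can be long, so it may help to count which functions call kmem_cache_alloc most often. The following is a rough sketch under stated assumptions, not part of the original procedure: it assumes that each sample is printed as a non-indented header line followed by indented frame lines of the form "address symbol+offset (dso)", innermost frame first, so that the second frame is the direct caller of the probed function. The function name top_callers is hypothetical, and the output layout may differ between perf versions.

```shell
#!/bin/bash
# Sketch: count the direct callers of the probed function in `perf script` text output.
# Assumed layout: a non-indented header line per sample, then indented frame lines
# "<address> <symbol+offset> (<dso>)", innermost frame first.
top_callers() {
    awk '
        /^[^ \t]/ { depth = 0; next }            # header line starts a new sample
        /^[ \t]+[0-9a-f]+ / {                    # indented stack frame
            depth++
            if (depth == 2) {                    # frame 2 = caller of the probed function
                sym = $2
                sub(/\+0x[0-9a-f]+$/, "", sym)   # drop the +0x... offset
                callers[sym]++
            }
        }
        END { for (s in callers) printf "%d %s\n", callers[s], s }
    ' | sort -k1,1nr -k2,2                       # most frequent caller first
}
```

One possible invocation is perf script -i perf.data | top_callers | head. A caller that dominates the list is a natural place to start checking whether every allocation has a matching free.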
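2. When perf does not support the probe subcommand (kernels older than 3.10; this method also works on newer kernels), the only option is to place probes with systemtap, which requires the kernel-debuginfo, kernel-debuginfo-common, and kernel-devel packages. A simple example follows: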
vim trace_slab_cache.stp
#!/usr/bin/env stap
# scsi_data_buffer is the name of the slab cache being traced.
# On newer kernels the argument is named s, so use $s->name instead of $cachep->name.
global count
probe kernel.function("kmem_cache_alloc")
{
    if (kernel_string($cachep->name) == "scsi_data_buffer") {
        count++;  // +1 on every allocation
        printf("======execname:%s======cache name:%s======\n", execname(), kernel_string($cachep->name));
        printf("=================alloc backtrace====================\n");
        print_backtrace();
    }
}
probe kernel.function("kmem_cache_free")
{
    if (kernel_string($cachep->name) == "scsi_data_buffer") {
        count--;  // -1 on every free
        printf("======execname:%s======cache name:%s======\n", execname(), kernel_string($cachep->name));
        printf("=================free backtrace====================\n");
        print_backtrace();
    }
}
probe timer.s(10)
{
    // Print the net allocation count every 10 seconds. This is only a rough figure:
    // make sure the workload under test is a single, well-defined flow first.
    // A value that stays positive suggests a possible leak.
    printf("====================count:%d===============\n", count);
}
Run stap --all-modules trace_slab_cache.stp to print the traced call stacks and the allocation count.
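Because the argument name differs between kernels (cachep on older ones, s on newer ones), the .stp file has to be adjusted before it is run on a different release. The sketch below shows one way this could be automated, reusing the el5/el6 versus el7/el8 mapping from the perf script earlier in this note. The function names slab_param_name and retarget_stp are hypothetical, and releases outside that mapping are deliberately rejected rather than guessed.

```shell
#!/bin/bash
# Sketch: pick the kmem_cache argument name for a kernel release string,
# following the el5/el6 -> cachep and el7/el8 -> s mapping used above.
slab_param_name() {
    case "$1" in
        *el5*|*el6*) echo cachep ;;
        *el7*|*el8*) echo s ;;
        *) echo "unknown release: $1" >&2; return 1 ;;
    esac
}

# Rewrite "$cachep->" in a SystemTap script read from stdin so that it uses
# the argument name that matches the given kernel release.
retarget_stp() {
    local param
    param=$(slab_param_name "$1") || return 1
    sed "s/\\\$cachep->/\$${param}->/g"
}
```

For example, retarget_stp "$(uname -r)" < trace_slab_cache.stp > trace_slab_cache.local.stp would produce a copy suited to the running kernel, which can then be passed to stap --all-modules in the same way.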