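I. Classifying the memory leak
Memory leaks come in many forms, so start with a coarse triage: determine whether the leak is caused by a driver or by a user-space program.
The following checks help classify the leak.
1. Check whether the driver leaks memory
1.1 Load and unload the driver 10000 times in a row and check whether memory has leaked.
1.2 If 1.1 reveals no leak, write a simple test application: load the driver, run a simple workload, unload the driver, and then check whether memory has leaked.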
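The load/unload loop from step 1.1 can be scripted roughly as follows. This is only a sketch: the function names meminfo_kb and cycle_module are made up for illustration, and the module path and module name in the usage comment are placeholders. It assumes SUnreclaim in /proc/meminfo is a reasonable indicator to watch, which depends on where the driver actually allocates.

```shell
#!/bin/sh
# Sketch: cycle a driver through load/unload and report the memory delta.

# Print the value (in kB) of one /proc/meminfo field, e.g. MemAvailable.
# The optional second argument overrides the file to read (handy for testing).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Run the load command, then the unload command, COUNT times.
# Stops at the first failing command and returns non-zero.
# $1 = count, $2 = load command, $3 = unload command
cycle_module() {
    i=0
    while [ "$i" -lt "$1" ]; do
        sh -c "$2" || return 1
        sh -c "$3" || return 1
        i=$((i + 1))
    done
    echo "completed $i cycles"
}

# Example (as root on the target; path and module name are placeholders):
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   before=$(meminfo_kb SUnreclaim)
#   cycle_module 10000 "insmod /path/to/driver.ko" "rmmod driver"
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   after=$(meminfo_kb SUnreclaim)
#   echo "SUnreclaim grew by $((after - before)) kB"
```

A delta that keeps growing in proportion to the cycle count points at the driver's load or unload path; a small one-off difference is more likely noise.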
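2. Check whether the application leaks memory
Run the application's workload continuously as a soak test. Note that if the application works together with a driver, it is advisable to rule out a driver leak first. Application-level leak analysis is covered in a separate article.
3. Analyze process memory details with pmap
Running pmap -X -p `pidof xxx` prints the address-space mappings of a process, which lets you examine its memory usage in detail. The listing below shows the memory usage of the vdec_unit_test program after startup. It was captured on an ARM target, where the pmap options differ slightly from the usual ones, so the command used is pmap -x.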
~ # pmap -x `pidof vdec_unit_test`
131: ./vdec_unit_test --vdec-cfgpath=config/vdec_unittest_path.cfg --gtest_filter=vdec_unitest.testcase_spec_typical_resolution_h264
Address Kbytes PSS Dirty Swap Mode Mapping
0000000000400000 1044 704 704 0 r-xp /usr/vdec/vdec_unit_test
0000000000514000 28 28 28 0 r--p /usr/vdec/vdec_unit_test
000000000051b000 4 4 4 0 rw-p /usr/vdec/vdec_unit_test
000000000051c000 28 8 8 0 rw-p [ anon ]
000000001defa000 332 220 220 0 rw-p [heap]
0000ffff84000000 132 8 8 0 rw-p [ anon ]
0000ffff84021000 65404 0 0 0 ---p [ anon ]
0000ffff8c000000 132 4 4 0 rw-p [ anon ]
0000ffff8c021000 65404 0 0 0 ---p [ anon ]
0000ffff90482000 4 0 0 0 ---p [ anon ]
0000ffff90483000 8192 12 12 0 rw-p [ anon ]
0000ffff90c83000 4 0 0 0 ---p [ anon ]
0000ffff90c84000 8192 12 12 0 rw-p [ anon ]
0000ffff91484000 4 0 0 0 ---p [ anon ]
0000ffff91485000 8192 12 12 0 rw-p [ anon ]
0000ffff91c85000 24 24 24 0 r-xp /lib/librt-2.25.so
0000ffff91c8b000 60 0 0 0 ---p /lib/librt-2.25.so
0000ffff91c9a000 4 4 4 0 r--p /lib/librt-2.25.so
0000ffff91c9b000 4 4 4 0 rw-p /lib/librt-2.25.so
0000ffff91c9c000 8 8 8 0 r-xp /lib/libdl-2.25.so
0000ffff91c9e000 60 0 0 0 ---p /lib/libdl-2.25.so
0000ffff91cad000 4 4 4 0 r--p /lib/libdl-2.25.so
0000ffff91cae000 4 4 4 0 rw-p /lib/libdl-2.25.so
0000ffff91caf000 1224 179 0 0 r-xp /lib/libc-2.25.so
0000ffff91de1000 64 0 0 0 ---p /lib/libc-2.25.so
0000ffff91df1000 16 16 16 0 r--p /lib/libc-2.25.so
0000ffff91df5000 8 8 8 0 rw-p /lib/libc-2.25.so
0000ffff91df7000 16 16 16 0 rw-p [ anon ]
0000ffff91dfb000 72 20 20 0 r-xp /lib/libgcc_s.so.1
0000ffff91e0d000 64 0 0 0 ---p /lib/libgcc_s.so.1
0000ffff91e1d000 4 4 4 0 r--p /lib/libgcc_s.so.1
0000ffff91e1e000 4 4 4 0 rw-p /lib/libgcc_s.so.1
0000ffff91e1f000 640 9 0 0 r-xp /lib/libm-2.25.so
0000ffff91ebf000 60 0 0 0 ---p /lib/libm-2.25.so
0000ffff91ece000 4 4 4 0 r--p /lib/libm-2.25.so
0000ffff91ecf000 4 4 4 0 rw-p /lib/libm-2.25.so
0000ffff91ed0000 1500 1280 1280 0 r-xp /lib/libstdc++.so.6.0.24
0000ffff92047000 64 0 0 0 ---p /lib/libstdc++.so.6.0.24
0000ffff92057000 40 40 40 0 r--p /lib/libstdc++.so.6.0.24
0000ffff92061000 8 8 8 0 rw-p /lib/libstdc++.so.6.0.24
0000ffff92063000 12 12 12 0 rw-p [ anon ]
0000ffff92066000 336 208 208 0 r-xp /lib/libGAL.so
0000ffff920ba000 64 0 0 0 ---p /lib/libGAL.so
0000ffff920ca000 4 4 4 0 r--p /lib/libGAL.so
0000ffff920cb000 60 60 60 0 rw-p /lib/libGAL.so
0000ffff920da000 72 64 64 0 r-xp /lib/libgpe.so
0000ffff920ec000 64 0 0 0 ---p /lib/libgpe.so
0000ffff920fc000 4 4 4 0 r--p /lib/libgpe.so
0000ffff920fd000 4 4 4 0 rw-p /lib/libgpe.so
0000ffff920fe000 40 40 40 0 r-xp /lib/libvenc.so
0000ffff92108000 64 0 0 0 ---p /lib/libvenc.so
0000ffff92118000 4 4 4 0 r--p /lib/libvenc.so
0000ffff92119000 4 4 4 0 rw-p /lib/libvenc.so
0000ffff9211a000 8 8 8 0 r-xp /lib/libsysapi.so
0000ffff9211c000 60 0 0 0 ---p /lib/libsysapi.so
0000ffff9212b000 4 4 4 0 r--p /lib/libsysapi.so
0000ffff9212c000 4 4 4 0 rw-p /lib/libsysapi.so
0000ffff9212d000 88 88 88 0 r-xp /lib/libpthread-2.25.so
0000ffff92143000 64 0 0 0 ---p /lib/libpthread-2.25.so
0000ffff92153000 4 4 4 0 r--p /lib/libpthread-2.25.so
0000ffff92154000 4 4 4 0 rw-p /lib/libpthread-2.25.so
0000ffff92155000 16 4 4 0 rw-p [ anon ]
0000ffff92159000 12 12 12 0 r-xp /lib/libbaseapi.so
0000ffff9215c000 60 0 0 0 ---p /lib/libbaseapi.so
0000ffff9216b000 4 4 4 0 r--p /lib/libbaseapi.so
0000ffff9216c000 4 4 4 0 rw-p /lib/libbaseapi.so
0000ffff9216d000 12 12 12 0 r-xp /lib/libvdec.so
0000ffff92170000 60 0 0 0 ---p /lib/libvdec.so
0000ffff9217f000 4 4 4 0 r--p /lib/libvdec.so
0000ffff92180000 4 4 4 0 rw-p /lib/libvdec.so
0000ffff92181000 112 16 0 0 r-xp /lib/ld-2.25.so
0000ffff921a0000 40 36 36 0 rw-p [ anon ]
0000ffff921aa000 4 1 0 0 r--p [vvar]
0000ffff921ab000 4 0 0 0 r-xp [vdso]
0000ffff921ac000 4 4 4 0 r--p /lib/ld-2.25.so
0000ffff921ad000 8 8 8 0 rw-p /lib/ld-2.25.so
0000ffffe8b06000 132 12 12 0 rw-p [stack]
---------------- ------ ------ ------ ------
total 162504 3289 3084 0
~ #
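A long pmap listing is easier to compare across runs once it is aggregated. The sketch below sums the PSS column per mapping from a saved copy of the output; the function name pmap_pss_by_mapping is hypothetical, and the script assumes the column layout shown in the listing above (Address, Kbytes, PSS, Dirty, Swap, Mode, Mapping).

```shell
#!/bin/sh
# Sum the PSS column of saved `pmap -x` output per mapping, largest first.
# $1 = file holding the pmap output
pmap_pss_by_mapping() {
    awk '
        # Data rows start with a hex address followed by a numeric size;
        # the command line, header, separator and total rows do not match.
        $1 ~ /^[0-9a-f]+$/ && $2 ~ /^[0-9]+$/ && NF >= 7 {
            name = $7
            for (i = 8; i <= NF; i++) name = name " " $i   # e.g. "[ anon ]"
            pss[name] += $3
        }
        END { for (n in pss) printf "%8d kB  %s\n", pss[n], n }
    ' "$1" | sort -rn
}

# Example:
#   pmap -x `pidof vdec_unit_test` > /tmp/pmap.1
#   pmap_pss_by_mapping /tmp/pmap.1
```

Taking two snapshots some time apart and comparing the totals for [heap] and [ anon ] is one way to see whether user-space memory keeps growing.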
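II. Memory leak analysis (applies to both user space and the kernel)
Use cat /proc/meminfo to check whether the system as a whole is leaking memory. Before taking each reading, drop the caches manually with echo 3 > /proc/sys/vm/drop_caches. Comparing meminfo before and after the leak shows roughly where the memory went.
Both MemFree and MemAvailable dropped by about 140 MB. The detailed breakdown shows that almost all of it went to slab: Slab grew by 139 MB, and on closer inspection SUnreclaim accounts for the full 139 MB. The memory consumption reported by free before and after the test is consistent with these numbers.
This essentially confirms that the leak is caused by unreclaimable (SUnreclaim) slab memory.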
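The before/after comparison of /proc/meminfo can be automated along these lines. It is a minimal sketch: the function name meminfo_diff and the snapshot paths in the usage comment are assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Print every /proc/meminfo field whose value changed between two snapshots.
# $1 = snapshot taken before the test, $2 = snapshot taken after
meminfo_diff() {
    awk '
        NR == FNR { before[$1] = $2; next }     # first file: remember values
        ($1 in before) && $2 != before[$1] {
            printf "%-16s %10d -> %10d  (%+d kB)\n", $1, before[$1], $2, $2 - before[$1]
        }
    ' "$1" "$2"
}

# Example (as root):
#   sync; echo 3 > /proc/sys/vm/drop_caches; cat /proc/meminfo > /tmp/meminfo.before
#   ... run the workload under test ...
#   sync; echo 3 > /proc/sys/vm/drop_caches; cat /proc/meminfo > /tmp/meminfo.after
#   meminfo_diff /tmp/meminfo.before /tmp/meminfo.after
```

Fields such as Slab and SUnreclaim that grow by about the same amount as MemFree shrinks are the ones worth following up.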
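III. Locating the slab leak
The approach to analyzing slab is as follows: first inspect /proc/slabinfo, then use what it shows to drill down in /sys/kernel/slab.
Start by comparing /proc/slabinfo before and after the leak:
Taking ext4_inode_cache as an example, the fields mean the following:
name: name of the slab object cache;
active_objs: number of active objects;
num_objs: total number of objects;
objsize: size of each object in bytes;
objperslab: number of ext4_inode_cache objects per slab;
pagesperslab: number of memory pages per slab. One slab holds 880 × 4 = 3520 bytes, which is less than the 4096-byte page size, so one slab occupies a single page;
tunables: the fields that follow (limit, batchcount, sharedfactor) are tunable parameters;
limit: maximum number of objects that can be cached per CPU;
batchcount: maximum number of objects moved from the global cache into a per-CPU cache when the latter is empty;
sharedfactor: describes the sharing behavior on symmetric multiprocessing (SMP) systems;
active_slabs: number of active slabs;
num_slabs: total number of slabs;
From this information, kmalloc-4k, kmalloc-128, and seq_file are calculated to consume 103,661,568 bytes (about 103 MB), 5,234,688 bytes (about 5 MB), and 1,750,736 bytes (about 1.7 MB) respectively.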
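The per-cache figures can be derived from two saved copies of /proc/slabinfo. The sketch below estimates growth as the change in num_objs multiplied by objsize; that formula, the function name slab_growth, and the snapshot paths are assumptions made for illustration, since the exact calculation is not spelled out above.

```shell
#!/bin/sh
# List slab caches by growth in bytes between two /proc/slabinfo snapshots.
# Growth is estimated as (num_objs_after - num_objs_before) * objsize.
# $1 = snapshot before the test, $2 = snapshot after
slab_growth() {
    awk '
        /^slabinfo/ || /^#/ { next }            # skip the two header lines
        NR == FNR { before[$1] = $3; next }     # first file: num_objs per cache
        {
            delta = ($3 - before[$1]) * $4      # $3 = num_objs, $4 = objsize
            if (delta > 0) printf "%12d %s\n", delta, $1
        }
    ' "$1" "$2" | sort -rn
}

# Example (reading /proc/slabinfo normally requires root):
#   cat /proc/slabinfo > /tmp/slabinfo.before
#   ... run the workload under test ...
#   cat /proc/slabinfo > /tmp/slabinfo.after
#   slab_growth /tmp/slabinfo.before /tmp/slabinfo.after | head
```

Caches that did not exist in the first snapshot are treated as having started from zero objects.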
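We focus on kmalloc-4k. The directory /sys/kernel/slab/kmalloc-4k/ provides two nodes, alloc_calls and free_calls, which record the functions that allocated and freed the memory currently in use.
If reading one of these nodes produces the following message: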
# cat /sys/kernel/slab/kmem_cache/alloc_calls
cat: read error: Function not implemented
then the kernel option CONFIG_SLUB_DEBUG_ON is not enabled. Enable it (set CONFIG_SLUB_DEBUG_ON=y in .config) and rebuild the kernel; the call sites then become visible.
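Once alloc_calls is readable, the heaviest call sites can be pulled out with a numeric sort. This sketch assumes each line of alloc_calls begins with an object count followed by the calling function; the function name top_alloc_calls and its third argument (an alternative sysfs root, used only for testing) are hypothetical.

```shell
#!/bin/sh
# Show the N call sites holding the most objects in a SLUB cache.
# $1 = cache name (e.g. kmalloc-4k), $2 = number of lines (default 10),
# $3 = sysfs root (default /sys/kernel/slab)
top_alloc_calls() {
    f="${3:-/sys/kernel/slab}/$1/alloc_calls"
    [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    sort -rn "$f" | head -n "${2:-10}"
}

# Example (as root on the target):
#   top_alloc_calls kmalloc-4k 5
```

A call site whose count keeps rising between two runs of the workload is a strong candidate for the leak.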
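IV. Adding kernel log output to find the exact call stack
In principle there are two methods: 1. add a dump_stack() call to kvmalloc_node in the kernel source, so that the call stack is printed once the kernel is rebuilt; 2. enable slab tracing with echo 1 > /sys/kernel/slab/kmalloc-4k/trace, which also prints the call stacks of slab operations.
The call stacks are shown below. The output is verbose and needs careful reading, but at this point you will most likely see your own code paths and can check whether allocated memory is never freed. Search for the keyword TRACE kmalloc-4k to see whether the alloc and free addresses pair up, which addresses are leaked, and which functions made the calls.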
[47653.311169] TRACE kmalloc-4k alloc 0x0000000035b8bec4 inuse=2 fp=0x0000000000000000
[47653.319089] CPU: 0 PID: 18328 Comm: rmmod Tainted: G O 5.4.94 #2
[47653.326544] Hardware name: Visinextek Technologies, Inc. vs819-emulation (DT)
[47653.333802] Call trace:
[47653.336520] dump_backtrace+0x0/0x150
[47653.340364] show_stack+0x14/0x20
[47653.343877] dump_stack+0xbc/0x118
[47653.347522] alloc_debug_processing+0x58/0x198
[47653.352189] ___slab_alloc.constprop.102+0x66c/0x6c8
[47653.357378] __slab_alloc.isra.96.constprop.101+0x30/0x48
[47653.362989] kmem_cache_alloc+0x214/0x228
[47653.367211] kobject_uevent_env+0xb0/0x510
[47653.371507] kobject_uevent+0x10/0x18
[47653.375402] device_release_driver_internal+0x174/0x1c8
[47653.380841] driver_detach+0x4c/0xd8
[47653.384617] bus_remove_driver+0x54/0xb0
[47653.388752] driver_unregister+0x2c/0x58
[47653.392857] platform_driver_unregister+0x10/0x18
[47653.399318] vs_vdec_exit+0x2c/0x48 [vs_vdec]
[47653.403955] __arm64_sys_delete_module+0x154/0x238
[47653.408984] el0_svc_common.constprop.2+0x64/0x168
[47653.413984] el0_svc_handler+0x20/0x80
[47653.417910] el0_svc+0x8/0x204
[47653.429722] TRACE kmalloc-4k alloc 0x00000000a7d685c8 inuse=2 fp=0x0000000000000000
[47653.437702] CPU: 0 PID: 78 Comm: mdev Tainted: G O 5.4.94 #2
[47653.444810] Hardware name: Visinextek Technologies, Inc. vs819-emulation (DT)
[47653.452072] Call trace:
[47653.454801] dump_backtrace+0x0/0x150
[47653.458642] show_stack+0x14/0x20
[47653.462151] dump_stack+0xbc/0x118
[47653.465786] alloc_debug_processing+0x58/0x198
[47653.470461] ___slab_alloc.constprop.102+0x66c/0x6c8
[47653.475649] __slab_alloc.isra.96.constprop.101+0x30/0x48
[47653.481253] __kmalloc+0x28c/0x2a0
[47653.484872] kvmalloc_node+0x90/0xa8
[47653.488673] seq_read+0x398/0x4f8
[47653.492171] kernfs_fop_read+0x14c/0x218
[47653.496303] __vfs_read+0x18/0x38
[47653.499812] vfs_read+0x74/0xd8
[47653.503146] ksys_read+0x68/0xf8
[47653.506574] __arm64_sys_read+0x18/0x20
[47653.510633] el0_svc_common.constprop.2+0x64/0x168
[47653.515624] el0_svc_handler+0x20/0x80
[47653.519549] el0_svc+0x8/0x204
[47650.701235] TRACE kmalloc-4k free 0x00000000c766e47e inuse=2 fp=0x0000000000000000
[47650.709122] Object 00000000c766e47e: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47650.718628] Object 00000000cf371536: 00 00 00 00 00 00 00 00 1c 11 86 85 00 00 ff ff ................
[47650.728119] Object 00000000cd4063bd: 2a 11 86 85 00 00 ff ff 4d 11 86 85 00 00 ff ff *.......M.......
[47650.737603] Object 000000006fa8585c: 5c 11 86 85 00 00 ff ff 65 11 86 85 00 00 ff ff ................
[47653.135133] CPU: 0 PID: 18328 Comm: rmmod Tainted: G O 5.4.94 #2
[47653.142579] Hardware name: Visinextek Technologies, Inc. vs819-emulation (DT)
[47653.149836] Call trace:
[47653.152574] dump_backtrace+0x0/0x150
[47653.156422] show_stack+0x14/0x20
[47653.159935] dump_stack+0xbc/0x118
[47653.163574] free_debug_processing+0x24c/0x400
[47653.168228] __slab_free+0x2e0/0x418
[47653.172002] kfree+0x28c/0x298
[47653.175274] kobject_uevent_env+0x100/0x510
[47653.179661] kobject_uevent+0x10/0x18
[47653.183526] device_del+0x23c/0x348
[47653.187202] device_destroy+0x50/0x88
[47653.191051] misc_deregister+0x78/0x128
[47653.197282] vdec_driver_remove+0x80/0xe8 [vs_vdec]
[47653.202395] platform_drv_remove+0x24/0x50
[47653.206720] device_release_driver_internal+0xf4/0x1c8
[47653.212070] driver_detach+0x4c/0xd8
[47653.215851] bus_remove_driver+0x54/0xb0
[47653.219983] driver_unregister+0x2c/0x58
[47653.224086] platform_driver_unregister+0x10/0x18
[47653.230433] vs_vdec_exit+0x2c/0x48 [vs_vdec]
[47653.235055] __arm64_sys_delete_module+0x154/0x238
[47653.240080] el0_svc_common.constprop.2+0x64/0x168
[47653.245074] el0_svc_handler+0x20/0x80
[47653.249004] el0_svc+0x8/0x204
[47653.306749] vdec deinit OK
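Matching alloc and free addresses by eye is tedious with this much output, so the pairing can be scripted. The sketch below assumes the log has been saved to a file in chronological order and that each trace line has the form "TRACE <cache> alloc|free <address>" as in the output above; the function name unmatched_allocs is hypothetical.

```shell
#!/bin/sh
# Print object addresses that were allocated but not freed afterwards,
# according to the TRACE lines of one cache in a saved kernel log.
# $1 = cache name (e.g. kmalloc-4k), $2 = saved log file (e.g. from dmesg)
unmatched_allocs() {
    awk -v cache="$1" '
        {
            # Locate the TRACE keyword; the timestamp may span several fields.
            for (i = 1; i + 3 <= NF; i++) {
                if ($i == "TRACE" && $(i + 1) == cache) {
                    op = $(i + 2); addr = $(i + 3)
                    if (op == "alloc") live[addr] = 1
                    else if (op == "free") delete live[addr]
                    break
                }
            }
        }
        END { for (a in live) print a }
    ' "$2" | sort
}

# Example:
#   dmesg > /tmp/trace.log
#   unmatched_allocs kmalloc-4k /tmp/trace.log
```

The result is only a list of candidates: objects that are still legitimately in use when the log was captured also show up, and lines dropped from the kernel log buffer can hide a matching free.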