1. Enable the kernel's SLUB debug options
+CONFIG_SLUB_DEBUG=y
+CONFIG_SLUB_DEBUG_ON=y
2. Observe slabinfo
cat /proc/slabinfo
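Record slabinfo right after boot. Let the system run for a while, then look at slabinfo again.
Find the slab caches whose object counts have grown noticeably.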
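Comparing two snapshots by eye is tedious, so the comparison can be sketched as a small helper. This is only an illustration: the function name slab_growth and the snapshot file names are made up here, and it assumes the usual /proc/slabinfo layout in which the first column is the cache name and the second is active_objs. On recent kernels reading /proc/slabinfo may require root.

```shell
#!/bin/sh
# slab_growth BEFORE AFTER   (hypothetical helper, not part of any tool)
# Print "<growth in active_objs> <cache name>" for every cache whose
# active object count increased between the two snapshots, largest first.
#
# Example:
#   cat /proc/slabinfo > slab-before.txt
#   ... let the system run ...
#   cat /proc/slabinfo > slab-after.txt
#   slab_growth slab-before.txt slab-after.txt | head
slab_growth() {
    awk '
        # Skip the two header lines ("slabinfo - version ..." and "# name ...").
        $1 == "slabinfo" || $1 == "#" { next }
        # First file: remember active_objs for each cache name.
        NR == FNR { before[$1] = $2 + 0; next }
        # Second file: report caches that grew since the first snapshot.
        ($1 in before) && ($2 + 0 > before[$1]) { print $2 - before[$1], $1 }
    ' "$1" "$2" | sort -rn
}
```

Caches that only appear in the second snapshot are ignored by this sketch; a steadily growing cache near the top of the list is the candidate to trace in the next step.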
3. Enable slab trace
echo 1 > /sys/kernel/slab/<leaking_slab>/trace
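Once enabled, slab trace prints to the console.
If the console is a serial port, the volume of output can easily make the system unresponsive. It is best to use a script that turns the trace off again after it has run for a while: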
echo 1 > /sys/kernel/slab/<leaking_slab>/trace
sleep 60
echo 0 > /sys/kernel/slab/<leaking_slab>/trace
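Because a flooded console can make it hard to type the final command, a slightly more defensive variant of the script above may be useful. The following is a sketch only; trace_window is a hypothetical name, and the trace file path is passed in as an argument rather than hard-coded.

```shell
#!/bin/sh
# trace_window TRACE_FILE SECONDS   (hypothetical helper)
# Switch the slab trace on, wait, and switch it off again.
#
# Example:
#   trace_window /sys/kernel/slab/<leaking_slab>/trace 60
trace_window() {
    trace_file=$1
    seconds=$2
    # Make sure tracing is switched off if the script gets Ctrl-C or kill.
    trap 'echo 0 > "$trace_file"; exit 1' INT TERM
    echo 1 > "$trace_file"
    sleep "$seconds"
    echo 0 > "$trace_file"
    # Restore default signal handling once the window is closed.
    trap - INT TERM
}
```

Afterwards, assuming the kernel log buffer is large enough to still hold the whole window, dmesg > kmalloc-t.txt saves the trace to a file for analysis.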
4. Analysis
The printed slab trace looks roughly like this:
[47744.480000] TRACE kmalloc-128 alloc 0x83df8300 inuse=16 fp=0x (null)
[47744.480000] Call Trace:
[47744.480000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.480000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.480000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.480000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.480000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.480000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.480000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.480000]
[47744.530000] TRACE kmalloc-128 free 0x83df8300 inuse=16 fp=0x (null)
[47744.530000] Object 83df8300: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.530000] Object 83df8310: 35 30 00 00 00 00 00 00 00 00 00 00 00 00 40 00 50............@.
[47744.530000] Object 83df8320: 00 00 ff ff ff ff ff ff 78 11 dc 0c 55 34 ff ff ........x...U4..
[47744.530000] Object 83df8330: ff ff ff ff 70 ad 00 08 63 68 5f 42 38 5f 32 47 ....p...ch_B8_2G
[47744.530000] Object 83df8340: 01 08 8b 96 82 84 0c 18 30 60 32 04 6c 12 24 48 ........0`2.l.$H
[47744.530000] Object 83df8350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Object 83df8360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Object 83df8370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.530000] Call Trace:
[47744.530000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.530000] [<8027d81c>] free_debug_processing+0x19c/0x218
[47744.530000] [<8027d8dc>] __slab_free+0x44/0x280
[47744.530000] [<82aba324>] osif_forward_mgmt_to_app+0x11c/0x280 [umac]
[47744.530000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.530000]
[47744.650000] TRACE kmalloc-128 alloc 0x830e0b00 inuse=16 fp=0x (null)
[47744.650000] Call Trace:
[47744.650000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.650000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.650000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.650000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.650000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.650000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.650000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.650000]
[47744.700000] TRACE kmalloc-128 free 0x830e0b00 inuse=10 fp=0x830e0300
[47744.700000] Object 830e0b00: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.700000] Object 830e0b10: 38 36 00 00 00 00 00 00 00 00 00 00 00 00 40 00 86............@.
[47744.700000] Object 830e0b20: 00 00 ff ff ff ff ff ff 78 11 dc 32 e2 53 ff ff ........x..2.S..
[47744.700000] Object 830e0b30: ff ff ff ff f0 8f 00 0d 58 69 61 6f 6d 69 5f 46 ........Xiaomi_F
[47744.700000] Object 830e0b40: 61 6d 69 6c 79 01 08 02 04 0b 0c 12 16 18 24 03 amily.........$.
[47744.700000] Object 830e0b50: 01 04 2d 1a 00 00 03 ff 00 00 00 00 00 00 00 00 ..-.............
[47744.700000] Object 830e0b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 32 04 ..............2.
[47744.700000] Object 830e0b70: 30 48 60 6c 00 00 00 00 00 00 00 00 00 00 00 00 0H`l............
[47744.700000] Call Trace:
[47744.700000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.700000] [<8027d81c>] free_debug_processing+0x19c/0x218
[47744.700000] [<8027d8dc>] __slab_free+0x44/0x280
[47744.700000] [<82aba324>] osif_forward_mgmt_to_app+0x11c/0x280 [umac]
[47744.700000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.700000]
[47744.810000] TRACE kmalloc-128 alloc 0x830e0b00 inuse=16 fp=0x (null)
[47744.810000] Call Trace:
[47744.810000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.810000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.810000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.810000] [<800dec80>] kmem_cache_alloc+0x3c/0xe4
[47744.810000] [<801c53b8>] sock_alloc_inode+0x4c/0xc4
[47744.810000] [<800f9080>] alloc_inode+0x28/0xac
[47744.810000] [<800fa328>] new_inode_pseudo+0x10/0x30
[47744.810000] [<801c6560>] sock_alloc+0x1c/0x80
[47744.810000] [<801c6b30>] __sock_create+0x8c/0x1cc
[47744.810000] [<801c6cec>] sock_create+0x38/0x44
[47744.810000] [<801c7294>] sys_socket+0x38/0x7c
[47744.810000] [<8006d8c4>] stack_done+0x20/0x40
[47744.810000]
[47744.900000] TRACE kmalloc-128 alloc 0x830e0500 inuse=16 fp=0x (null)
[47744.900000] Call Trace:
[47744.900000] [<8027c4b4>] dump_stack+0x8/0x34
[47744.900000] [<8027d5fc>] alloc_debug_processing+0xf8/0x17c
[47744.900000] [<8027decc>] __slab_alloc.constprop.65+0x2e0/0x350
[47744.900000] [<800df2c0>] __kmalloc+0x98/0x148
[47744.900000] [<8308ad74>] amalloc_private+0x38/0x13c [asf]
[47744.900000] [<82aba2a8>] osif_forward_mgmt_to_app+0xa0/0x280 [umac]
[47744.900000] [<82aba478>] osif_forward_mgmt_to_app+0x270/0x280 [umac]
[47744.900000]
[47744.950000] TRACE kmalloc-128 free 0x830e0500 inuse=11 fp=0x830e0300
[47744.950000] Object 830e0500: 4d 61 6e 61 67 65 2e 70 72 6f 62 5f 72 65 71 20 Manage.prob_req
[47744.950000] Object 830e0510: 37 39 00 00 00 00 00 00 00 00 00 00 00 00 40 00 79............@.
[47744.950000] Object 830e0520: 00 00 ff ff ff ff ff ff f0 b4 29 07 10 22 ff ff ..........).."..
[47744.950000] Object 830e0530: ff ff ff ff 00 9f 00 06 4d 49 2d 4d 41 43 01 08 ........MI-MAC..
[47744.950000] Object 830e0540: 02 04 0b 0c 12 16 18 24 03 01 03 2d 1a 00 00 03 .......$...-....
[47744.950000] Object 830e0550: ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[47744.950000] Object 830e0560: 00 00 00 00 00 00 00 32 04 30 48 60 6c 00 00 00 .......2.0H`l...
[47744.950000] Object 830e0570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Analyzing this by eye is difficult, so I wrote a crude script. Save the trace as kmalloc-t.txt, then run:
grep "TRACE kmalloc-128 alloc" kmalloc-t.txt | awk '{print $5}' | sort > alloc.txt
grep "TRACE kmalloc-128 free" kmalloc-t.txt | awk '{print $5}' | sort > free.txt
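This simply sorts the alloc and free addresses. Then use bcompare or vimdiff to check whether the alloc and free entries for the same object address appear in pairs.
This shows fairly clearly which memory blocks were never freed.
Then go back to kmalloc-t.txt, look up the blocks that are missing from free.txt, and check by hand whether their allocation sites are likely memory leak points.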
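The pairing check can also be automated instead of diffing the two sorted files. The sketch below is an assumption-laden illustration, not the original method: unfreed_objects is a hypothetical name, and it relies on the "TRACE <cache> alloc|free <address>" line format shown in the log above. Because the allocator reuses addresses, it keeps a per-address balance of allocs minus frees rather than a plain list.

```shell
#!/bin/sh
# unfreed_objects LOG CACHE   (hypothetical helper)
# Print "<address> <outstanding allocs>" for every object address that ends
# the saved trace with more alloc events than free events.
#
# Example:
#   unfreed_objects kmalloc-t.txt kmalloc-128
unfreed_objects() {
    awk -v cache="$2" '
        {
            # Locate the TRACE keyword instead of relying on fixed field
            # numbers, since the timestamp may or may not contain spaces.
            for (i = 1; i <= NF; i++)
                if ($i == "TRACE" && $(i + 1) == cache) {
                    if ($(i + 2) == "alloc") balance[$(i + 3)]++
                    else if ($(i + 2) == "free") balance[$(i + 3)]--
                    break
                }
        }
        END {
            # Objects freed inside the window but allocated before it have a
            # negative balance and are ignored.
            for (addr in balance)
                if (balance[addr] > 0) print addr, balance[addr]
        }
    ' "$1" | sort
}
```

Run against the excerpt above, this reports 0x830e0b00, whose last alloc (from sock_alloc_inode) has no matching free in the window. An outstanding object is only a candidate: it may simply still be in use when tracing stops, so the call trace of each reported address still needs the manual check described above.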
Done