Debugging Native Process Memory Leaks on Android (Part 1): libc debug

agwtpcbox

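This article describes how to use the libc.debug.malloc property for memory debugging on Android, covering memory leak detection, out-of-bounds access checks, and buffer overrun detection.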

libc.debug.malloc

// 1 - For memory leak detections.
// 5 - For filling allocated / freed memory with patterns defined by
// CHK_SENTINEL_VALUE, and CHK_FILL_FREE macros.
// 10 - For adding pre-, and post- allocation stubs in order to detect
// buffer overruns.
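Android integrates several debugging mechanisms into its allocator. The code lives in bionic/libc/bionic/malloc_debug_common.c/h/cpp and is controlled by a system property:
setprop libc.debug.malloc 0: the default level; only the most basic checks are performed
setprop libc.debug.malloc 1: records the call stack at each malloc, for analyzing memory leaks
setprop libc.debug.malloc 5: fills memory with a fixed pattern after allocation, to detect out-of-bounds access
setprop libc.debug.malloc 10: adds header and footer guards around each allocation and records the call stack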

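Android's DDMS can show native (C/C++) heap usage, but it needs some configuration and the phone must be rooted.
1. Add "native=true" to ~/.android/ddms.cfg. Only then does DDMS show the native heap tab.
2. Run the following adb commands to turn on malloc debug mode:
  adb root
  adb shell setprop libc.debug.malloc 1
  adb shell stop
  adb shell start
3. Open the standalone DDMS (not the one inside Eclipse, but the separate application shipped in the SDK directory). The native heap tab then shows the native heap allocations.
On many phones nothing shows up even after running these commands, because the debug versions of the malloc library (libc_malloc_debug_leak.so and libc_malloc_debug_qemu.so) are not installed.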

Based on Android 5.0

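Android provides convenient memory leak information and tools (such as MAT) for Java programs. For native processes written purely in C/C++, leaks are harder to track down. Traditional C/C++ programs can use valgrind or a code checking tool. Fortunately, Google's bionic library offers a very useful API for this purpose: get_malloc_leak_info. With it, we can locate suspected leaks by obtaining their backtraces.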

How the code works

We can set the memory debug level (debug_level) with adb shell setprop libc.debug.malloc 1. The levels are explained in more detail in the comment in bionic/libc/bionic/malloc_debug_common.cpp:

    // Handle to shared library where actual memory allocation is implemented.
    // This library is loaded and memory allocation calls are redirected there
    // when libc.debug.malloc environment variable contains value other than
    // zero:
    // 1 - For memory leak detections.
    // 5 - For filling allocated / freed memory with patterns defined by
    // CHK_SENTINEL_VALUE, and CHK_FILL_FREE macros.
    // 10 - For adding pre-, and post- allocation stubs in order to detect
    // buffer overruns.
    // Note that emulator's memory allocation instrumentation is not controlled by
    // libc.debug.malloc value, but rather by emulator, started with -memcheck
    // option. Note also, that if emulator has started with -memcheck option,
    // emulator's instrumented memory allocation will take over value saved in
    // libc.debug.malloc. In other words, if emulator has started with -memcheck
    // option, libc.debug.malloc value is ignored.
    // Actual functionality for debug levels 1-10 is implemented in
    // libc_malloc_debug_leak.so, while functionality for emulator's instrumented
    // allocations is implemented in libc_malloc_debug_qemu.so and can be run inside
    // the emulator only.

get_malloc_leak_info() is also located in malloc_debug_common.cpp; see the source for the details of its implementation.

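For each memory debug level (debug_level), malloc_dispatch_table points to a different set of allocator functions, so allocation and release go through different function versions depending on the debug level.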
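
The idea of a dispatch table can be illustrated with a small, portable sketch. Everything here (DispatchSketch, select_table, the counting allocator) is a hypothetical stand-in rather than bionic's actual table layout; it only shows how swapping a table of function pointers changes which allocator every caller reaches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified stand-in for a malloc dispatch table.
struct DispatchSketch {
    void* (*malloc_fn)(size_t);
    void (*free_fn)(void*);
};

// Counts calls that went through the instrumented allocator.
static size_t g_tracked_calls = 0;

static void* plain_malloc(size_t n) { return std::malloc(n); }
static void plain_free(void* p) { std::free(p); }

// "Debug" version: does extra bookkeeping, then forwards to the real allocator.
static void* counting_malloc(size_t n) {
    ++g_tracked_calls;
    return std::malloc(n);
}
static void counting_free(void* p) { std::free(p); }

static const DispatchSketch kDefaultTable = {plain_malloc, plain_free};
static const DispatchSketch kDebugTable = {counting_malloc, counting_free};

// Assumption for this sketch: level 0 keeps the default allocator and any
// other level selects the instrumented one.
const DispatchSketch* select_table(int debug_level) {
    return debug_level == 0 ? &kDefaultTable : &kDebugTable;
}
```

Callers never name an allocator directly; they call through whichever table was selected at startup, which is why the debug level has to be set before the process starts.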

The detailed code path is as follows:

    // Initializes memory allocation framework.
    // This routine is called from __libc_init routines implemented
    // in libc_init_static.c and libc_init_dynamic.c files.
    extern "C" __LIBC_HIDDEN__ void malloc_debug_init() {
    #if !defined(LIBC_STATIC)
        static pthread_once_t malloc_init_once_ctl = PTHREAD_ONCE_INIT;
        if (pthread_once(&malloc_init_once_ctl, malloc_init_impl)) {
            error_log("Unable to initialize malloc_debug component.");
        }
    #endif // !LIBC_STATIC
    }

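As the code comment says, the __libc_init() routines (in libc_init_static.c and libc_init_dynamic.c) call malloc_debug_init to initialize the framework, which in turn calls malloc_init_impl (pthread_once guarantees that it runs only once per process).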

malloc_init_impl() first opens the .so library, resolves the malloc_debug_initialize symbol from it, and then calls it. When debug_level is 1, 5, or 10, it opens libc_malloc_debug_leak.so, and malloc_debug_initialize() is implemented in malloc_debug_check.cpp. When debug_level is 20, it opens libc_malloc_debug_qemu.so, and malloc_debug_initialize() is implemented in malloc_debug_qemu.cpp.
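
The level-to-library mapping described above can be written down as a small lookup. This is only a sketch: debug_library_for_level is a hypothetical helper, and returning nullptr for every other level is an assumption of the sketch, not a statement about how bionic handles unknown values.

```cpp
#include <cassert>
#include <cstring>

// Returns the name of the debug library that the text says is opened for a
// given debug level, or nullptr when no debug library applies (assumption).
const char* debug_library_for_level(int debug_level) {
    switch (debug_level) {
        case 1:   // leak detection
        case 5:   // fill patterns
        case 10:  // pre/post allocation guards
            return "libc_malloc_debug_leak.so";
        case 20:  // emulator instrumentation
            return "libc_malloc_debug_qemu.so";
        default:
            return nullptr;
    }
}
```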

Next, it resolves the malloc/free/calloc/realloc/memalign implementations that match the debug_level. For levels 1, 5, and 10, the various versions of these functions are implemented in bionic/libc/bionic/malloc_debug_leak.cpp and malloc_debug_check.cpp.

When debug_level is 1 (memory leak debugging), the implementation records a backtrace for every allocation.

leak_malloc() is implemented as follows:

    extern "C" void* leak_malloc(size_t bytes) {
        if (DebugCallsDisabled()) {
            return g_malloc_dispatch->malloc(bytes);
        }

        // allocate enough space infront of the allocation to store the pointer for
        // the alloc structure. This will making free'ing the structer really fast!

        // 1. allocate enough memory and include our header
        // 2. set the base pointer to be right after our header

        size_t size = bytes + sizeof(AllocationEntry);
        if (size < bytes) { // Overflow.
            errno = ENOMEM;
            return NULL;
        }

        void* base = g_malloc_dispatch->malloc(size);
        if (base != NULL) {
            ScopedPthreadMutexLocker locker(&g_hash_table->lock);

            uintptr_t backtrace[BACKTRACE_SIZE];
            size_t numEntries = GET_BACKTRACE(backtrace, BACKTRACE_SIZE);

            AllocationEntry* header = reinterpret_cast<AllocationEntry*>(base);
            header->entry = record_backtrace(backtrace, numEntries, bytes);
            header->guard = GUARD;

            // now increment base to point to after our header.
            // this should just work since our header is 8 bytes.
            base = reinterpret_cast<AllocationEntry*>(base) + 1;
        }

        return base;
    }

    extern bool g_backtrace_enabled;

    #define GET_BACKTRACE(bt, depth) \
        (g_backtrace_enabled ? get_backtrace(bt, depth) : 0)

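This version of malloc allocates an extra AllocationEntry in front of the bytes actually requested. Once the allocation succeeds, it declares an array of 32 (BACKTRACE_SIZE) pointers for the call stack and calls the GET_BACKTRACE macro to capture the stack, storing each return address in backtrace. It then calls record_backtrace to record the stack and points the entry member of the AllocationEntry at the result. record_backtrace looks the stack up by hash in the global call stack table, g_hash_table; if it is not found, a new call stack record is created and added to the table. Finally, base is advanced past the header, and that address is returned as the allocated memory.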

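So this version of malloc additionally records call stack information. The header placed in front of the allocated block stores the entry used to find that call stack in the hash table.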
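
The header-in-front-of-the-block trick can be tried out on any platform with a minimal sketch like the one below. HeaderSketch, kGuardSketch, and the sketch_* functions are hypothetical names, and the guard value is arbitrary; the real AllocationEntry and GUARD are defined in bionic and are not reproduced here.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical header stored in front of every block handed to the caller.
struct HeaderSketch {
    void* entry;     // would point at the recorded call stack
    uint32_t guard;  // marks the block as one allocated by sketch_malloc
};

static const uint32_t kGuardSketch = 0x48564d44;  // arbitrary illustrative value

// Allocates room for the header plus the payload and returns the payload.
void* sketch_malloc(size_t bytes, void* entry) {
    size_t size = bytes + sizeof(HeaderSketch);
    if (size < bytes) {  // the addition wrapped around
        errno = ENOMEM;
        return nullptr;
    }
    void* base = std::malloc(size);
    if (base == nullptr) return nullptr;
    HeaderSketch* header = static_cast<HeaderSketch*>(base);
    header->entry = entry;
    header->guard = kGuardSketch;
    return header + 1;  // the caller sees the memory right after the header
}

// Steps back from a payload pointer to its header.
HeaderSketch* sketch_to_header(void* mem) {
    return static_cast<HeaderSketch*>(mem) - 1;
}

// Frees a block; returns false if the guard shows it is not one of ours.
bool sketch_free(void* mem) {
    if (mem == nullptr) return true;
    HeaderSketch* header = sketch_to_header(mem);
    if (header->guard != kGuardSketch) return false;
    header->guard = 0;
    std::free(header);
    return true;
}
```

Because the header sits at a fixed offset before the payload, the free path finds the bookkeeping record with one pointer subtraction instead of a table lookup.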

Next, look at record_backtrace. Before analyzing its code, here are the structures it uses (from malloc_debug_common.h):

    #define HASHTABLE_SIZE 1543

    // =============================================================================
    // Structures
    // =============================================================================

    struct HashEntry {
        size_t slot;
        HashEntry* prev;
        HashEntry* next;
        size_t numEntries;
        // fields above "size" are NOT sent to the host
        size_t size;
        size_t allocations;
        uintptr_t backtrace[0];
    };

    struct HashTable {
        pthread_mutex_t lock;
        size_t count;
        HashEntry* slots[HASHTABLE_SIZE];
    };

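Each process has one global variable, g_hash_table, that records the call stacks of whoever ultimately called malloc. Its type is HashTable. It contains an array, slots, of HASHTABLE_SIZE (1543) HashEntry pointers, one per bucket; the count member records how many HashEntry items the table currently holds. A hash value is computed from backtrace and numEntries, and that value modulo HASHTABLE_SIZE gives the index of the entry's bucket, so the bucket can be found quickly from the entry's own contents.
The other structure is HashEntry. Because it has prev and next pointers, it also forms a doubly linked list: entries that map to the same slot are chained together, with each new entry inserted at the head of the chain. The first member, slot, is the entry's own index in the array, derived from the hash. The last member, backtrace[0], is a variable-length array holding the return addresses of the call stack; numEntries records how many it contains. size is the size of the allocation, and allocations is the number of live allocations of that size made along the same call path.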

The relationship between the two structures is shown in the figure below:

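When leak_malloc calls record_backtrace, the hash is first computed from backtrace and numEntries, and the hash modulo HASHTABLE_SIZE gives the slot index in g_hash_table. The function then checks whether a matching entry already exists, that is, a HashEntry with the same allocation size, the same call path, and the same number of recorded frames. If so, allocations on the existing entry is incremented. Otherwise a new entry is created: memory is allocated for the HashEntry, and the call stack is copied into its last member, backtrace. Finally, the table's count is incremented.

In this way record_backtrace adds backtrace information to the global table: it either adds a new HashEntry or increments the allocation count of an existing one.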

    static HashEntry* record_backtrace(uintptr_t* backtrace, size_t numEntries, size_t size) {
        size_t hash = get_hash(backtrace, numEntries);
        size_t slot = hash % HASHTABLE_SIZE;

        if (size & SIZE_FLAG_MASK) {
            debug_log("malloc_debug: allocation %zx exceeds bit width\n", size);
            abort();
        }

        if (gMallocLeakZygoteChild) {
            size |= SIZE_FLAG_ZYGOTE_CHILD;
        }

        HashEntry* entry = find_entry(g_hash_table, slot, backtrace, numEntries, size);

        if (entry != NULL) {
            entry->allocations++;
        } else {
            // create a new entry
            entry = static_cast<HashEntry*>(g_malloc_dispatch->malloc(sizeof(HashEntry) + numEntries*sizeof(uintptr_t)));
            if (!entry) {
                return NULL;
            }
            entry->allocations = 1;
            entry->slot = slot;
            entry->prev = NULL;
            entry->next = g_hash_table->slots[slot];
            entry->numEntries = numEntries;
            entry->size = size;

            memcpy(entry->backtrace, backtrace, numEntries * sizeof(uintptr_t));

            g_hash_table->slots[slot] = entry;

            if (entry->next != NULL) {
                entry->next->prev = entry;
            }

            // we just added an entry, increase the size of the hashtable
            g_hash_table->count++;
        }

        return entry;
    }

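
record_backtrace relies on two helpers, get_hash and find_entry, whose bodies are not shown in this article. The following is a rough sketch of what such helpers could look like. The hash formula, the EntrySketch type, and the function names are assumptions made for illustration; consult the bionic source for the real definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static const size_t kTableSize = 1543;  // HASHTABLE_SIZE in the article

// Hypothetical, trimmed-down entry: just what the lookup needs.
struct EntrySketch {
    EntrySketch* next;
    size_t numEntries;
    size_t size;
    const uintptr_t* backtrace;
};

// Assumed hash: fold every return address of the stack into one value.
size_t sketch_hash(const uintptr_t* backtrace, size_t numEntries) {
    size_t hash = 0;
    for (size_t i = 0; i < numEntries; ++i) {
        hash = hash * 33 + (backtrace[i] >> 2);
    }
    return hash;
}

// Walks one bucket's chain and returns the entry that has the same size,
// the same frame count, and identical frames, or nullptr if none matches.
EntrySketch* sketch_find(EntrySketch* bucket_head, const uintptr_t* backtrace,
                         size_t numEntries, size_t size) {
    for (EntrySketch* e = bucket_head; e != nullptr; e = e->next) {
        if (e->size == size && e->numEntries == numEntries &&
            std::memcmp(e->backtrace, backtrace,
                        numEntries * sizeof(uintptr_t)) == 0) {
            return e;
        }
    }
    return nullptr;
}
```

The bucket index is then sketch_hash(...) % kTableSize, mirroring the hash % HASHTABLE_SIZE step in record_backtrace. Comparing the size as well as the frames is what makes two allocations of different sizes from the same call path land in separate entries.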
leak_free(), in turn, releases the corresponding call stack entries from the global hash table:

    extern "C" void leak_free(void* mem) {
        if (DebugCallsDisabled()) {
            return g_malloc_dispatch->free(mem);
        }

        if (mem == NULL) {
            return;
        }

        ScopedPthreadMutexLocker locker(&g_hash_table->lock);

        // check the guard to make sure it is valid
        AllocationEntry* header = to_header(mem);

        if (header->guard != GUARD) {
            // could be a memaligned block
            if (header->guard == MEMALIGN_GUARD) {
                // For memaligned blocks, header->entry points to the memory
                // allocated through leak_malloc.
                header = to_header(header->entry);
            }
        }

        if (header->guard == GUARD || is_valid_entry(header->entry)) {
            // decrement the allocations
            HashEntry* entry = header->entry;
            entry->allocations--;
            if (entry->allocations <= 0) {
                remove_entry(entry);
                g_malloc_dispatch->free(entry);
            }

            // now free the memory!
            g_malloc_dispatch->free(header);
        } else {
            debug_log("WARNING bad header guard: '0x%x'! and invalid entry: %p\n",
                      header->guard, header->entry);
        }
    }

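The argument is the address that malloc() returned. The function first checks whether mem is NULL and returns immediately if so. Otherwise it recovers the AllocationEntry header and, from it, the HashEntry* entry. It then decrements allocations; if that count reaches 0, the entry is removed from the hash table and its memory is freed. Finally, whatever the value of allocations, the memory allocated by malloc() is freed.
So the entries that remain in the global table are the call stacks of malloc calls whose memory has been allocated but not yet freed.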
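
The bookkeeping that leak_malloc and leak_free perform together amounts to reference counting per (size, call stack) pair. The sketch below models that with standard containers instead of bionic's hand-rolled hash table; LiveSiteTable and its methods are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Key: (allocation size, call stack). Value: number of live allocations.
using SiteKey = std::pair<size_t, std::vector<uintptr_t>>;

class LiveSiteTable {
public:
    // Called on allocation: create the entry or bump its count.
    void on_alloc(size_t size, const std::vector<uintptr_t>& stack) {
        ++sites_[SiteKey(size, stack)];
    }

    // Called on free: drop the count and erase the entry when it reaches zero.
    void on_free(size_t size, const std::vector<uintptr_t>& stack) {
        auto it = sites_.find(SiteKey(size, stack));
        if (it == sites_.end()) return;
        if (--it->second == 0) sites_.erase(it);
    }

    // Number of distinct (size, stack) pairs that still own memory.
    size_t live_sites() const { return sites_.size(); }

    // Live allocation count for one (size, stack) pair, 0 if absent.
    size_t live_count(size_t size, const std::vector<uintptr_t>& stack) const {
        auto it = sites_.find(SiteKey(size, stack));
        return it == sites_.end() ? 0 : it->second;
    }

private:
    std::map<SiteKey, size_t> sites_;
};
```

Whatever is still in the table at inspection time corresponds to memory that was allocated and not freed, which is exactly the set of leak candidates.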

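How, then, do we obtain a process's malloc allocation state? bionic provides the get_malloc_leak_info() API for retrieving leak information. Call stacks are recorded at allocation time and cleared at release time, so whatever remains is a likely source of a leak.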

    // Retrieve native heap information.
    //
    // "*info" is set to a buffer we allocate
    // "*overallSize" is set to the size of the "info" buffer
    // "*infoSize" is set to the size of a single entry
    // "*totalMemory" is set to the sum of all allocations we're tracking; does
    // not include heap overhead
    // "*backtraceSize" is set to the maximum number of entries in the back trace

    // =============================================================================
    // Exported for use by ddms.
    // =============================================================================
    extern "C" void get_malloc_leak_info(uint8_t** info, size_t* overallSize,
            size_t* infoSize, size_t* totalMemory, size_t* backtraceSize) {
        // Don't do anything if we have invalid arguments.
        if (info == NULL || overallSize == NULL || infoSize == NULL ||
                totalMemory == NULL || backtraceSize == NULL) {
            return;
        }
        *totalMemory = 0;

        ScopedPthreadMutexLocker locker(&g_hash_table.lock);
        if (g_hash_table.count == 0) {
            *info = NULL;
            *overallSize = 0;
            *infoSize = 0;
            *backtraceSize = 0;
            return;
        }

        HashEntry** list = static_cast<HashEntry**>(Malloc(malloc)(sizeof(void*) * g_hash_table.count));

        // Get the entries into an array to be sorted.
        size_t index = 0;
        for (size_t i = 0 ; i < HASHTABLE_SIZE ; ++i) {
            HashEntry* entry = g_hash_table.slots[i];
            while (entry != NULL) {
                list[index] = entry;
                *totalMemory = *totalMemory + ((entry->size & ~SIZE_FLAG_MASK) * entry->allocations);
                index++;
                entry = entry->next;
            }
        }

        // XXX: the protocol doesn't allow variable size for the stack trace (yet)
        *infoSize = (sizeof(size_t) * 2) + (sizeof(uintptr_t) * BACKTRACE_SIZE);
        *overallSize = *infoSize * g_hash_table.count;
        *backtraceSize = BACKTRACE_SIZE;

        // now get a byte array big enough for this
        *info = static_cast<uint8_t*>(Malloc(malloc)(*overallSize));
        if (*info == NULL) {
            *overallSize = 0;
            Malloc(free)(list);
            return;
        }

        qsort(list, g_hash_table.count, sizeof(void*), hash_entry_compare);

        uint8_t* head = *info;
        const size_t count = g_hash_table.count;
        for (size_t i = 0 ; i < count ; ++i) {
            HashEntry* entry = list[i];
            size_t entrySize = (sizeof(size_t) * 2) + (sizeof(uintptr_t) * entry->numEntries);
            if (entrySize < *infoSize) {
                // We're writing less than a full entry, clear out the rest.
                memset(head + entrySize, 0, *infoSize - entrySize);
            } else {
                // Make sure the amount we're copying doesn't exceed the limit.
                entrySize = *infoSize;
            }
            memcpy(head, &(entry->size), entrySize);
            head += *infoSize;
        }

        Malloc(free)(list);
    }

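get_malloc_leak_info() takes five parameters, each the address of a variable that is filled in by the call. As the code comment says:
*info points to a single block of memory allocated inside the function, of size overallSize;
the block consists of a number of records, each infoSize bytes long. A record has the same layout as HashEntry from its size member onward: first the allocation size, then allocations (the number of allocations sharing the same call stack), and finally backtrace, with room for 32 (BACKTRACE_SIZE) pointer values. The block that *info points to therefore contains overallSize/infoSize records. Note that the backtrace array in HashEntry is allocated at its actual length, whereas here every record reserves 32 slots; unused slots are set to 0;
totalMemory is the total size of all tracked allocations that have not been freed;

the last parameter, backtraceSize, is 32 (BACKTRACE_SIZE).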

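The function first validates its arguments and checks whether the global table holds any entries. It then allocates an array, list, with one pointer for each HashEntry in g_hash_table. While walking the hash table it fills list and accumulates totalMemory, the amount of memory allocated but not yet freed, which is returned to the caller. Next it sets infoSize, overallSize, and backtraceSize, and allocates overallSize bytes for info. At this point list holds every HashEntry; the function sorts list, then walks it and copies the size, allocations, and backtrace of each HashEntry (padded to 32 slots) into the memory that info points to. info is returned to the caller, so a call to get_malloc_leak_info() yields the call stacks of the process's outstanding malloc allocations. Its counterpart, free_malloc_leak_info(), frees the memory that info points to.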
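
The record layout described above (size, allocations, then a zero-padded backtrace) can be decoded with a short routine. This is a sketch under stated assumptions: parse_leak_info and LeakRecord are hypothetical names, and the flag bit cleared from size (kZygoteChildFlagSketch, bit 31) is an assumed value standing in for SIZE_FLAG_MASK, which this article does not define.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed position of the zygote-child flag inside the size field.
static const size_t kZygoteChildFlagSketch = static_cast<size_t>(1) << 31;

struct LeakRecord {
    size_t size;                       // allocation size with the flag cleared
    size_t allocations;                // live allocations sharing this stack
    std::vector<uintptr_t> backtrace;  // frames, trailing zero padding dropped
};

// Decodes a buffer laid out as overallSize / infoSize fixed-size records.
std::vector<LeakRecord> parse_leak_info(const uint8_t* info, size_t overallSize,
                                        size_t infoSize, size_t backtraceSize) {
    std::vector<LeakRecord> records;
    if (info == nullptr || infoSize == 0) return records;
    // Each record must be able to hold two size_t values plus the frames.
    if (infoSize < 2 * sizeof(size_t) + backtraceSize * sizeof(uintptr_t)) {
        return records;
    }
    const size_t count = overallSize / infoSize;
    for (size_t i = 0; i < count; ++i) {
        const uint8_t* p = info + i * infoSize;
        LeakRecord rec;
        size_t raw_size = 0;
        std::memcpy(&raw_size, p, sizeof(size_t));
        std::memcpy(&rec.allocations, p + sizeof(size_t), sizeof(size_t));
        rec.size = raw_size & ~kZygoteChildFlagSketch;
        const uint8_t* frames = p + 2 * sizeof(size_t);
        for (size_t f = 0; f < backtraceSize; ++f) {
            uintptr_t pc = 0;
            std::memcpy(&pc, frames + f * sizeof(uintptr_t), sizeof(uintptr_t));
            if (pc == 0) break;  // the rest of the record is zero padding
            rec.backtrace.push_back(pc);
        }
        records.push_back(rec);
    }
    return records;
}
```

On a device, the buffer and the three sizes would come from get_malloc_leak_info(), and the buffer would be released with free_malloc_leak_info() once parsing is done. The routine itself has no Android dependency, so it can be exercised on a desktop with a hand-built buffer.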

Summary

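When a program finishes, all of its memory should normally have been freed, so at that point we can call get_malloc_leak_info to get the call stack entries that were not released. In principle, these are the leaks. In practice, though, some memory may legitimately still be in use at the moment get_malloc_leak_info runs.
In addition, the process under inspection is sometimes a daemon that never exits, so some of its memory is meant to be held indefinitely. In that case, pick a well-defined state and take snapshots there: for example, call get_malloc_leak_info once the process first becomes idle, then call it again after the process has performed some operations and returned to idle. Comparing the two results, the call stack entries that appear only in the second snapshot are the leak suspects.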
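
The two-snapshot comparison can be sketched as a set difference over (size, call stack) keys. Snapshot, SnapshotKey, and diff_snapshots are hypothetical names. Reporting entries whose count grew, in addition to entries that are entirely new, is a choice made for this sketch that goes slightly beyond the comparison described in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using SnapshotKey = std::pair<size_t, std::vector<uintptr_t>>;  // (size, stack)
using Snapshot = std::map<SnapshotKey, size_t>;                 // live count

// Returns the entries that are new in `after`, or whose live count grew,
// mapped to the number of additional allocations since `before`.
Snapshot diff_snapshots(const Snapshot& before, const Snapshot& after) {
    Snapshot suspects;
    for (const auto& item : after) {
        auto it = before.find(item.first);
        const size_t old_count = (it == before.end()) ? 0 : it->second;
        if (item.second > old_count) {
            suspects[item.first] = item.second - old_count;
        }
    }
    return suspects;
}
```

Taking both snapshots in the same idle state matters: allocations that are legitimately held for the lifetime of the daemon appear in both and cancel out, leaving only what accumulated in between.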