GDB内存调试初探八

Linux/amd64的调用规则

为了方便调试,笔者在PC机上直接调试简单的内存相关的应用;这需要了解x86_64ABI,该文档对函数调用制定了一些限定规则,其中重要的有两点,第一点是参数的传参(非浮点参数):

User-level applications use as integer registers
for passing the sequence: %rdi, %rsi, %rdx, %rcx, %r8 and %r9.
The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.

在函数入口处加入gdb断点,可以访问这些寄存器得到函数的入参。第二点是关于栈指针的对齐要求,在本文后面编写的汇编代码中要注意:

The end of the input argument area shall be aligned on a 16 (32, if __m256 is
passed on stack) byte boundary. In other words, the value (%rsp + 8) is always
a multiple of 16 (32) when control is transferred to the function entry point. The
stack pointer, %rsp, always points to the end of the latest allocated stack frame.

使用GDB获取所有malloc调用的信息

对于glibc中的ptmalloc模块,其提供的malloc/calloc/realloc/free等常用的内存分配释放的函数,实际上以__libc_为前缀的函数的别名:

/* malloc/malloc.c */
strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc)
strong_alias (__libc_free, __free) strong_alias (__libc_free, free)
strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)
strong_alias (__libc_memalign, __memalign)
weak_alias (__libc_memalign, memalign)
strong_alias (__libc_realloc, __realloc) strong_alias (__libc_realloc, realloc)
strong_alias (__libc_valloc, __valloc) weak_alias (__libc_valloc, valloc)

注意,动态链接器ld.so也提供了malloc等函数,但导出的是弱符号;这些内存分配函数在动态链接器加载libc.so之前使用(或者在链接器内部使用)。因此笔者常常通过带有__libc_前缀的函数名查找这些内存分配的函数,以确定是ptmalloc模块提供的函数:

$ nm -D --defined-only /usr/lib/x86_64-linux-gnu/ld-2.31.so | grep \
  -e malloc -e calloc -e realloc -e free
000000000001d5b0 W calloc
0000000000019250 T _dl_exception_free
000000000001d5f0 W free
000000000001d490 W malloc
000000000001d7e0 W realloc

GDB提供了commands,在触发断点时可以自动执行GDB命令。当该命令列表中包含continue命令时,GDB会继续将调用进程恢复运行,不需要人工干预。笔者获取所有malloc调用的操作如下:

$ gdb -q ./multi-thread-memory
Reading symbols from ./multi-thread-memory...
(gdb) break main
Breakpoint 1 at 0x17e7: file multi-thread-memory.c, line 283.
(gdb) run
Starting program: /home/yejq/program/blogs/20210912/multi-thread-memory 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, main (argc=1, argv=0x7fffffffde98) at multi-thread-memory.c:283
283	{
(gdb) info address __libc_malloc
Symbol "__libc_malloc" is at 0x7ffff7e35260 in a file compiled without debugging.
(gdb) break *0x7ffff7e35260
Breakpoint 2 at 0x7ffff7e35260: file malloc.c, line 3023.
(gdb) commands 2
Type commands for breakpoint(s) 2, one per line.
End with a line saying just "end".
>info register rdi rsp
>x/1xg $rsp
>bt 4
>continue
>end
(gdb) set pagination off
(gdb) c
Continuing.

注意,上面禁用了pagination;这是因为GDB会有大量的调试信息自动输出。笔者先在main函数入口加断点,而未直接在__libc_malloc函数处加断点,是因为此时libc.so动态库未加载;当应用运行到main函数入口时,libc.so动态库就已加载了。此外,笔者没用使用break __libc_malloc命令加断点,是因为该命令可能不会在__libc_malloc函数第一条机器码加断点。下面是调试的结果:

Breakpoint 2, __GI___libc_malloc (bytes=1024) at malloc.c:3023
3023	malloc.c: No such file or directory.
rdi            0x400               1024
rsp            0x7fffffffd458      0x7fffffffd458
0x7fffffffd458:	0x00007ffff7e1ce84
#0  __GI___libc_malloc (bytes=1024) at malloc.c:3023
#1  0x00007ffff7e1ce84 in __GI__IO_file_doallocate (fp=0x7ffff7f846a0 <_IO_2_1_stdout_>) at filedoalloc.c:101
#2  0x00007ffff7e2d050 in __GI__IO_doallocbuf (fp=fp@entry=0x7ffff7f846a0 <_IO_2_1_stdout_>) at libioP.h:948
#3  0x00007ffff7e2c0b0 in _IO_new_file_overflow (f=0x7ffff7f846a0 <_IO_2_1_stdout_>, ch=-1) at fileops.c:745
thread[0] => allocMax: 2 MB, mbMax: 1 KB
thread[1] => allocMax: 4 MB, mbMax: 2 KB
thread[2] => allocMax: 8 MB, mbMax: 8 KB
thread[3] => allocMax: 16 MB, mbMax: 16 KB
thread[4] => allocMax: 32 MB, mbMax: 64 KB
thread[5] => allocMax: 64 MB, mbMax: 128 KB
thread[6] => allocMax: 128 MB, mbMax: 512 KB
thread[7] => allocMax: 256 MB, mbMax: 1024 KB
[New Thread 0x7ffff7d94700 (LWP 3022)]
[New Thread 0x7ffff7593700 (LWP 3023)]
[New Thread 0x7ffff6d92700 (LWP 3024)]
[Switching to Thread 0x7ffff7d94700 (LWP 3022)]

Thread 2 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=352) at malloc.c:3023
3023	in malloc.c
rdi            0x160               352
rsp            0x7ffff7d93e68      0x7ffff7d93e68
0x7ffff7d93e68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=352) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=320, ranfd=3) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdc20) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[New Thread 0x7ffff6591700 (LWP 3025)]
[New Thread 0x7ffff5d90700 (LWP 3026)]
[Switching to Thread 0x7ffff7593700 (LWP 3023)]

Thread 3 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=2032) at malloc.c:3023
3023	in malloc.c
rdi            0x7f0               2032
rsp            0x7ffff7592e68      0x7ffff7592e68
0x7ffff7592e68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=2032) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=2000, ranfd=5) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdc48) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[New Thread 0x7ffff558f700 (LWP 3027)]
[Switching to Thread 0x7ffff6591700 (LWP 3025)]

Thread 5 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=3624) at malloc.c:3023
3023	in malloc.c
rdi            0xe28               3624
rsp            0x7ffff6590e68      0x7ffff6590e68
0x7ffff6590e68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=3624) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=3592, ranfd=6) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdc98) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[New Thread 0x7ffff4d8e700 (LWP 3028)]
[Switching to Thread 0x7ffff6d92700 (LWP 3024)]

Thread 4 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=5200) at malloc.c:3023
3023	in malloc.c
rdi            0x1450              5200
rsp            0x7ffff6d91e68      0x7ffff6d91e68
0x7ffff6d91e68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=5200) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=5168, ranfd=4) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdc70) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[Switching to Thread 0x7ffff5d90700 (LWP 3026)]

Thread 6 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=51216) at malloc.c:3023
3023	in malloc.c
rdi            0xc810              51216
rsp            0x7ffff5d8fe68      0x7ffff5d8fe68
0x7ffff5d8fe68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=51216) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=51184, ranfd=7) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdcc0) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[New Thread 0x7fffeffff700 (LWP 3029)]
All memory threads created and running...
[Switching to Thread 0x7ffff558f700 (LWP 3027)]

Thread 7 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=56816) at malloc.c:3023
3023	in malloc.c
rdi            0xddf0              56816
rsp            0x7ffff558ee68      0x7ffff558ee68
0x7ffff558ee68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=56816) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=56784, ranfd=8) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdce8) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[Switching to Thread 0x7ffff4d8e700 (LWP 3028)]

Thread 8 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=15912) at malloc.c:3023
3023	in malloc.c
rdi            0x3e28              15912
rsp            0x7ffff4d8de68      0x7ffff4d8de68
0x7ffff4d8de68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=15912) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=15880, ranfd=9) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdd10) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[Switching to Thread 0x7fffeffff700 (LWP 3029)]

Thread 9 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=662080) at malloc.c:3023
3023	in malloc.c
rdi            0xa1a40             662080
rsp            0x7fffefffee68      0x7fffefffee68
0x7fffefffee68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=662080) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=662048, ranfd=10) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdd38) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[Switching to Thread 0x7ffff7d94700 (LWP 3022)]

Thread 2 "multi-thread-me" hit Breakpoint 2, __GI___libc_malloc (bytes=496) at malloc.c:3023
3023	in malloc.c
rdi            0x1f0               496
rsp            0x7ffff7d93e68      0x7ffff7d93e68
0x7ffff7d93e68:	0x00005555555554e9
#0  __GI___libc_malloc (bytes=496) at malloc.c:3023
#1  0x00005555555554e9 in memblock_create (memlen=464, ranfd=3) at multi-thread-memory.c:72
#2  thread_func (tharg=0x7fffffffdc20) at multi-thread-memory.c:235
#3  0x00007ffff7f93609 in start_thread (arg=<optimized out>) at pthread_create.c:477
[Switching to Thread 0x7ffff7593700 (LWP 3023)]

遗憾的是,该调试过程不会得到malloc函数返回的内存指针。函数入口是固定的,但函数的返回之处可能有多个地址;若要得到malloc的返回值,需要在多个地方加断点,断点的位置也不易确定。这个遗憾在本文后面的调试中仍将持续。一种可行的方案是在返回地址处加入临时断点tbreak,并查看rax寄存器的值;不过这种调试方式是不推荐的,不仅会严重影响应用的运行效率,而且不确定其可行性(需要大量地、动态地插入断点)。

malloc函数动态添加钩子函数

现有的一些调试工具(如DTrace等)可以实现malloc等内存分配函数的返回值的跟踪、记录,笔者未曾实践过,本文暂不讨论。上面的GDB调试存在一个缺陷,它会(严重地)降低被调用应用的运行速度。对于一些大型的嵌入应用,应用的运行效率过于低下会导致运行异常。每一个断点的触发,会导致应用暂停,GDB调试器通过ptrace系统调用读取相关信息,之后修改应用的地址空间(把断点机器指令替换为原来的指令),最后恢复应用的运行。这一系列操作虽是自动化的,但效率极低。

在笔者以往的文章中,使用LD_PRELOAD环境预加载了钩子函数,替换了malloc/calloc等函数。其优点是可以获得到内存分配的返回指针;但其要求钩住了函数符号是可见的——如何不使用LD_PRELOAD预加载动态库的方法,钩住应用使用的一些(内部)函数?

一种可行的方案是在应用运行过程中,直接修改malloc等函数入口的汇编指令,添加钩子函数。这些钩子函数因添加在函数入口,因此不能获取内存分配的返回指针。一般情况下,钩子函数会完全替代被钩住的函数;但该情况下,钩子函数在执行之后,仍需要跳转回原处继续执行;这给钩子的实现带来很大的难度。首先,笔者编写的钩子注入函数全部代码如下:

/* injection.h */
#ifndef MALLOC_INJECTION_H
#define MALLOC_INJECTION_H 1
#ifdef __cplusplus
extern "C" {
#endif

enum inj_type {
    inj_func_malloc,
    inj_func_calloc,
    inj_func_realloc,
    inj_func_free,
    inj_func_end,
};

int malloc_inject(enum inj_type type);
int malloc_deject(enum inj_type type);
#endif

/* injection.c */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <dlfcn.h>

#include "injection.h"

#define INJ_CODE_MAX_LEN  256
#define INJ_PAGE_SIZE     4096

struct injection_code {
    const unsigned char * origin_code;
    const unsigned char * new_code;
    size_t                origin_len;
    size_t                new_len;
    const char *          func_name;
    const unsigned char * jmp_func;
};

extern void phony_malloc(void);
extern void phony_mallocp(void);

extern void phony_calloc(void);
extern void phony_callocp(void);

extern void phony_realloc(void);
extern void phony_reallocp(void);

extern void phony_free(void);
extern void phony_freep(void);

extern unsigned long phony_callback(unsigned long,
    unsigned long, unsigned long);

/*
(gdb) disassemble /r __libc_malloc
Dump of assembler code for function __GI___libc_malloc:
   0x00007ffff7e35260 <+0>:	f3 0f 1e fa	endbr64 
   0x00007ffff7e35264 <+4>:	48 8b 05 85 dc 14 00	mov    0x14dc85(%rip),%rax        # 0x7ffff7f82ef0
   0x00007ffff7e3526b <+11>:	41 54	push   %r12
   0x00007ffff7e3526d <+13>:	55	push   %rbp
   0x00007ffff7e3526e <+14>:	48 89 fd	mov    %rdi,%rbp
   0x00007ffff7e35271 <+17>:	53	push   %rbx
   0x00007ffff7e35272 <+18>:	48 8b 00	mov    (%rax),%rax

Dump of assembler code for function __libc_calloc:
   0x00007ffff7e36c90 <+0>:	f3 0f 1e fa	endbr64 
   0x00007ffff7e36c94 <+4>:	41 55	push   %r13
   0x00007ffff7e36c96 <+6>:	48 89 f8	mov    %rdi,%rax
   0x00007ffff7e36c99 <+9>:	41 54	push   %r12
   0x00007ffff7e36c9b <+11>:	55	push   %rbp
   0x00007ffff7e36c9c <+12>:	53	push   %rbx

Dump of assembler code for function __GI___libc_realloc:
   0x00007ffff7e36000 <+0>:	f3 0f 1e fa	endbr64 
   0x00007ffff7e36004 <+4>:	41 57	push   %r15
   0x00007ffff7e36006 <+6>:	41 56	push   %r14
   0x00007ffff7e36008 <+8>:	41 55	push   %r13
   0x00007ffff7e3600a <+10>:	41 54	push   %r12

Dump of assembler code for function __GI___libc_free:
   0x00007ffff7e35850 <+0>:	f3 0f 1e fa	endbr64 
   0x00007ffff7e35854 <+4>:	48 83 ec 18	sub    $0x18,%rsp
   0x00007ffff7e35858 <+8>:	48 8b 05 99 d6 14 00	mov    0x14d699(%rip),%rax        # 0x7ffff7f82ef8
   0x00007ffff7e3585f <+15>:	48 8b 00	mov    (%rax),%rax

*/

extern unsigned long phony_callback(unsigned long arg0,
    unsigned long arg1, unsigned long arg2);

static const struct injection_code inj_codes[] = {
    [inj_func_malloc] = {
        .origin_code = (const unsigned char *)
            "\xf3\x0f\x1e\xfa"
            "\x48\x8b\x05\x85\xdc\x14\x00"
            "\x41\x54"
            "\x55"
            "\x48\x89\xfd"
            "\x53"
            "\x48\x8b\x00",
        .new_code = (const unsigned char *) phony_malloc,
        .origin_len = 21,
        .new_len = 12,
        .func_name = "__libc_malloc",
        .jmp_func = (const unsigned char *) phony_mallocp,
    },

    [inj_func_calloc] = {
        .origin_code = (const unsigned char *)
            "\xf3\x0f\x1e\xfa"
            "\x41\x55"
            "\x48\x89\xf8"
            "\x41\x54"
            "\x55"
            "\x53",
        .new_code = (const unsigned char *) phony_calloc,
        .origin_len = 13,
        .new_len = 12,
        .func_name = "__libc_calloc",
        .jmp_func = (const unsigned char *) phony_callocp,
    },

    [inj_func_realloc] = {
        .origin_code = (const unsigned char *)
            "\xf3\x0f\x1e\xfa"
            "\x41\x57"
            "\x41\x56"
            "\x41\x55"
            "\x41\x54",
        .new_code = (const unsigned char *) phony_realloc,
        .origin_len = 12,
        .new_len = 12,
        .func_name = "__libc_realloc",
        .jmp_func = (const unsigned char *) phony_reallocp,
    },

    [inj_func_free] = {
        .origin_code = (const unsigned char *)
            "\xf3\x0f\x1e\xfa"
            "\x48\x83\xec\x18"
            "\x48\x8b\x05\x99\xd6\x14\x00"
            "\x48\x8b\x00",
        .new_code = (const unsigned char *) phony_free,
        .origin_len = 18,
        .new_len = 12,
        .func_name = "__libc_free",
        .jmp_func = (const unsigned char *) phony_freep,
    },
};

int malloc_inject(enum inj_type type)
{
    void * glibc;
    size_t off_set;
    int itype, rval;
    unsigned long faddr;
    unsigned char * funcaddr;
    const struct injection_code * injcode;
    unsigned char opcode[INJ_CODE_MAX_LEN];

    itype = (int) type;
    if (itype < (int) inj_func_malloc ||
        itype > (int) inj_func_free)
        return 1;

    injcode = &inj_codes[itype];
    if (injcode->origin_len < injcode->new_len ||
        injcode->origin_len >= INJ_CODE_MAX_LEN)
        return 2;

    glibc = dlopen("libc.so.6", RTLD_LAZY | RTLD_GLOBAL | RTLD_NODELETE);
    if (glibc == NULL)
        return 3;

    funcaddr = (unsigned char *) dlsym(glibc, injcode->func_name);
    if (funcaddr == NULL)
        return 4;
    faddr = (unsigned long) funcaddr;
    off_set = (size_t) (faddr & (INJ_PAGE_SIZE - 1));
    if (off_set != 0) {
        faddr &= ~(INJ_PAGE_SIZE - 1);
        funcaddr = (unsigned char *) faddr;
    }

    rval = mprotect(funcaddr, INJ_PAGE_SIZE * 2, PROT_READ | PROT_WRITE | PROT_EXEC);
    if (rval != 0)
        return 5;

    memset(opcode, 0x90, sizeof(opcode));
    memcpy(opcode, injcode->new_code, injcode->new_len);
    *((unsigned long *) &(opcode[0x2])) = (unsigned long) injcode->jmp_func;
    memcpy(funcaddr + off_set, opcode, injcode->origin_len);

    rval = mprotect(funcaddr, INJ_PAGE_SIZE * 2, PROT_READ | PROT_EXEC);
    if (rval != 0)
        return 6;

    return 0;
}

int malloc_deject(enum inj_type type)
{
    void * glibc;
    size_t off_set;
    int itype, rval;
    unsigned long faddr;
    unsigned char * funcaddr;
    const struct injection_code * injcode;

    itype = (int) type;
    if (itype < (int) inj_func_malloc ||
        itype > (int) inj_func_free)
        return 1;

    injcode = &inj_codes[itype];
    if (injcode->origin_len < injcode->new_len ||
        injcode->origin_len >= INJ_CODE_MAX_LEN)
        return 2;

    glibc = dlopen("libc.so.6", RTLD_LAZY | RTLD_GLOBAL | RTLD_NODELETE);
    if (glibc == NULL)
        return 3;

    funcaddr = (unsigned char *) dlsym(glibc, injcode->func_name);
    if (funcaddr == NULL)
        return 4;

    if (memcmp(funcaddr, injcode->origin_code, injcode->origin_len) == 0)
        return 0;

    faddr = (unsigned long) funcaddr;
    off_set = (size_t) (faddr & (INJ_PAGE_SIZE - 1));
    if (off_set != 0) {
        faddr &= ~(INJ_PAGE_SIZE - 1);
        funcaddr = (unsigned char *) faddr;
    }

    rval = mprotect(funcaddr, INJ_PAGE_SIZE * 2, PROT_READ | PROT_WRITE | PROT_EXEC);
    if (rval != 0)
        return 5;

    memcpy(funcaddr + off_set, injcode->origin_code, injcode->origin_len);
    rval = mprotect(funcaddr, INJ_PAGE_SIZE * 2, PROT_READ | PROT_EXEC);
    if (rval != 0)
        return 6;

    return -1; /* -1 indicates code recovery actually happens */
}

unsigned long phony_callback(unsigned long arg0,
    unsigned long arg1, unsigned long retaddr)
{
    fprintf(stderr, "In [%s], return address: %p, arg0: %lx, arg1: %lx\n",
        __FUNCTION__, (void *) retaddr, arg0, arg1);
    fflush(stderr);
    return 0;
}

其中,phony_callback是钩子函数都会调用;通过retaddr参数可以确定是哪一个钩子函数调用的,判断的代码如下:

if (retaddr == ((unsigned long) __libc_malloc + 0xc)) {
    ....
} else if (retaddr == ((unsigned long) __libc_calloc + 0xc)) {
    ....
} else if (retaddr == ((unsigned long) __libc_realloc + 0xc)) {
    ....
} else if (retaddr == ((unsigned long) __libc_free + 0xc)) {
    ....
} else {
    /* impossible */
}

修改phony_callback函数,可以增加栈指针的获取功能,回溯函数栈上保存的函数返回地址可以得到哪些地址处调用了malloc/calloc等函数。上面代码的偏移量0xc是钩子函数的大小,这些钩子函数分别为:

phony_malloc:
    mov rax, 0x1234567890 ; phony_mallocp
    call rax
phony_calloc:
    mov rax, 0x1234567890 ; phony_callocp
    call rax
phony_realloc:
    mov rax, 0x1234567890 ; phony_reallocp
    call rax
phony_free:
    mov rax, 0x1234567890 ; phony_freep
    call rax

这四个钩子在注入时会被修改,上面的代码中,jmp_func指定了写入rax寄存器的跳转地址:

*((unsigned long *) &(opcode[0x2])) = (unsigned long) injcode->jmp_func;

这样做是必须的,因为带有偏移量的call汇编指令跳转范围是有限制的,必需写入运行时的地址,通过call rax来实现间接的跳转。这样四个钩子函数的定义是相同的,两条汇编指令的机器码长度为0xc。完整的汇编代码如下:

    BITS 64
    GLOBAL phony_malloc:function
    GLOBAL phony_mallocp:function

    GLOBAL phony_calloc:function
    GLOBAL phony_callocp:function

    GLOBAL phony_realloc:function
    GLOBAL phony_reallocp:function

    GLOBAL phony_free:function
    GLOBAL phony_freep:function
    EXTERN phony_callback
    SECTION .text

phony_all:
    push rbp
    mov rbp, rsp
    push rdi
    push rsi
    push rdx
    mov rdx, rcx
    call phony_callback wrt ..plt
    pop rdx
    pop rsi
    pop rdi
    mov rsp, rbp
    pop rbp
    ret

phony_mallocp:
    endbr64
    sub rsp, 0x8
    mov rcx, [rsp + 0x8]
    call phony_all
    mov rcx, [rsp + 0x8]
    add rsp, 0x10
    push r12
    push rbp
    mov rbp, rdi
    push rbx
    xor rax, rax
    jmp rcx

phony_callocp:
    endbr64
    sub rsp, 0x8
    mov rcx, [rsp + 0x8]
    call phony_all
    mov rcx, [rsp + 0x8]
    add rsp, 0x10
    push r13
    mov rax, rdi
    push r12
    push rbp
    push rbx
    jmp rcx

phony_reallocp:
    endbr64
    sub rsp, 0x8
    mov rcx, [rsp + 0x8]
    call phony_all
    mov rcx, [rsp + 0x8]
    add rsp, 0x10
    push r15
    push r14
    push r13
    push r12
    jmp rcx

phony_freep:
    endbr64
    sub rsp, 0x8
    mov rcx, [rsp + 0x8]
    call phony_all
    mov rcx, [rsp + 0x8]
    add rsp, 0x10
    sub rsp, 0x18
    xor rax, rax
    jmp rcx

phony_malloc:
    mov rax, 0x1234567890 ; phony_mallocp
    call rax

phony_calloc:
    mov rax, 0x1234567890 ; phony_callocp
    call rax

phony_realloc:
    mov rax, 0x1234567890 ; phony_reallocp
    call rax

phony_free:
    mov rax, 0x1234567890 ; phony_freep
    call rax

上面的汇编代码调用了定义于C代码中的函数phony_callback;因phony_callback函数被编译为动态库,因此汇编代码为:

call phony_callback wrt ..plt

值得一提的是,钩子函数因替换了malloc/calloc等函数入口的指令,在返回call rax之后继续执行前,需要补充被替换的机器指令,且不能用ret指令返回(上面是通过jmp rcx来返回的),因为补充的指令会操作函数栈,而ret指令需要从栈上弹出返回地址并跳转。这些操作类似汇编的杂技,通常编写汇编代码不会这样写。笔者编写了简单的测试应用,可以测试钩子是否可用:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include "injection.h"

int main(int argc, char *argv[])
{
    int ret;
    void * ptrs[3];

    ret = setvbuf(stdout, NULL, _IONBF, 0);
    if (ret == 0)
        ret = setvbuf(stderr, NULL, _IONBF, 0);

    fprintf(stdout, "Buffer all disabled: %d\n", ret);
    fflush(stdout);

    ptrs[0] = NULL;
    ptrs[1] = NULL;
    ptrs[2] = NULL;

    ret = malloc_inject(inj_func_malloc);
    fprintf(stdout, "Malloc hooked: %d\n", ret);
    ret = malloc_inject(inj_func_calloc);
    fprintf(stdout, "calloc hooked: %d\n", ret);
    ret = malloc_inject(inj_func_realloc);
    fprintf(stdout, "realloc hooked: %d\n", ret);
    ret = malloc_inject(inj_func_free);
    fprintf(stdout, "free hooked: %d\n", ret);

    ptrs[0] = malloc(100);
    fprintf(stdout, "malloc(...):   %p\n", ptrs[0]);
    ptrs[1] = calloc(1, 100);
    fprintf(stdout, "calloc(...):   %p\n", ptrs[1]);
    ptrs[2] = realloc(NULL, 100);
    fprintf(stdout, "realloc(...):  %p\n", ptrs[2]);

    free(ptrs[0]); ptrs[0] = NULL;
    free(ptrs[1]); ptrs[1] = NULL;
    free(ptrs[2]); ptrs[2] = NULL;

    ret = malloc_deject(inj_func_malloc);
    fprintf(stdout, "Malloc unhooked: %d\n", ret);
    ret = malloc_deject(inj_func_calloc);
    fprintf(stdout, "calloc unhooked: %d\n", ret);
    ret = malloc_deject(inj_func_realloc);
    fprintf(stdout, "realloc unhooked: %d\n", ret);
    ret = malloc_deject(inj_func_free);
    fprintf(stdout, "free unhooked: %d\n", ret);

    return 0;
}

编译和运行结果如下:

$ make
gcc -Wall -D_GNU_SOURCE -I. -fPIC -O1 -ggdb -c -o main.o main.c
gcc -Wall -D_GNU_SOURCE -I. -fPIC -O1 -ggdb -c -o injection.o injection.c
nasm -f elf64 -g -o test.o test.S
gcc -ggdb -shared -o libinjection.so -Wl,-soname=libinjection.so injection.o test.o -ldl
gcc -ggdb -o testinj main.o -L. "-Wl,-rpath=\$ORIGIN" -linjection
$ ./testinj 
Buffer all disabled: 0
Malloc hooked: 0
calloc hooked: 0
realloc hooked: 0
free hooked: 0
In [phony_callback], return address: 0x7eff3aed626c, arg0: 64, arg1: 7ffe8d1b1320
malloc(...):   0x55a2c4f80340
In [phony_callback], return address: 0x7eff3aed7c9c, arg0: 1, arg1: 64
calloc(...):   0x55a2c4f803b0
In [phony_callback], return address: 0x7eff3aed626c, arg0: 64, arg1: 7ffe8d1b1320
realloc(...):  0x55a2c4f80420
In [phony_callback], return address: 0x7eff3aed685c, arg0: 55a2c4f80340, arg1: 7ffe8d1b1320
In [phony_callback], return address: 0x7eff3aed685c, arg0: 55a2c4f803b0, arg1: 55a2c4f80340
In [phony_callback], return address: 0x7eff3aed685c, arg0: 55a2c4f80420, arg1: 55a2c4f803b0
Malloc unhooked: -1
calloc unhooked: -1
realloc unhooked: -1
free unhooked: -1

可以用GDB查看被钩住的函数的反汇编:

(gdb) disassemble /r __libc_malloc
Dump of assembler code for function __GI___libc_malloc:
   0x00007ffff7e53260 <+0>:	48 b8 c7 55 fc f7 ff 7f 00 00	movabs $0x7ffff7fc55c7,%rax
   0x00007ffff7e5326a <+10>:	ff d0	callq  *%rax
   0x00007ffff7e5326c <+12>:	90	nop
   0x00007ffff7e5326d <+13>:	90	nop
(gdb) disassemble /r __libc_calloc
Dump of assembler code for function __libc_calloc:
   0x00007ffff7e54c90 <+0>:	48 b8 ee 55 fc f7 ff 7f 00 00	movabs $0x7ffff7fc55ee,%rax
   0x00007ffff7e54c9a <+10>:	ff d0	callq  *%rax
   0x00007ffff7e54c9c <+12>:	90	nop
   0x00007ffff7e54c9d <+13>:	48 83 ec 08	sub    $0x8,%rsp
(gdb) disassemble /r __libc_realloc
Dump of assembler code for function __GI___libc_realloc:
   0x00007ffff7e54000 <+0>:	48 b8 14 56 fc f7 ff 7f 00 00	movabs $0x7ffff7fc5614,%rax
   0x00007ffff7e5400a <+10>:	ff d0	callq  *%rax
   0x00007ffff7e5400c <+12>:	49 89 f4	mov    %rsi,%r12
   0x00007ffff7e5400f <+15>:	55	push   %rbp
(gdb) disassemble /r __libc_free
Dump of assembler code for function __GI___libc_free:
   0x00007ffff7e53850 <+0>:	48 b8 39 56 fc f7 ff 7f 00 00	movabs $0x7ffff7fc5639,%rax
   0x00007ffff7e5385a <+10>:	ff d0	callq  *%rax
   0x00007ffff7e5385c <+12>:	90	nop
   0x00007ffff7e5385d <+13>:	90	nop

总结

本文记录了笔者为malloc/calloc等函数添加钩子进行内存分配信息的获取的过程。其缺点是不能获取到内存分配的返回指针;相比于GDB调试,其优点是不会影响被调试应用的运行效率。此外,还需要熟悉汇编并编写可用的钩子函数。这种调试方法是不推荐的,建议先尝试DTrace等调试工具;走投无路时可以考虑该方法。

  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值