9.12. Advanced Interception Using Global Offset Table
The method of function interception described here (using weak symbols) is very powerful but requires starting the executable under the LD_PRELOAD environment variable. What if a process is already running?
Consider the following simple program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main( )
{
   char *ptr ;

   while ( 1 )
   {
      ptr = (char *)malloc( 1024 ) ;
      sleep(5) ;
   }

   return 0 ;
}
Ignore the obvious memory leak. The purpose of this program is to illustrate a debugging technique. The program mimics a real program that may call malloc over and over again (as many/most programs do).
After the program is running, it is too late to start it under LD_PRELOAD; however, if the program was built with -ldl (i.e., it will load the libdl library), we can use gdb to dynamically load a shared library into the address space of the process and then redirect the global offset table (GOT) entry for malloc to a function in this new shared library.
penguin> g++ alloc.C -o alloc -ldl
penguin> gdb alloc
GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under
certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "i586-suse-linux"...
(gdb) break main
Breakpoint 1 at 0x80483d2
(gdb) run
Starting program: /home/wilding/src/Linuxbook/ELF/alloc
Breakpoint 1, 0x080483d2 in main ()
In another window, we can use readelf to find the global offset table entry for malloc:
penguin> readelf -r alloc
Relocation section '.rel.dyn' at offset 0x290 contains 1 entries:
Offset Info Type Sym.Value Sym. Name
08049578 00000606 R_386_GLOB_DAT 00000000 __gmon_start__
Relocation section '.rel.plt' at offset 0x298 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
0804956c 00000107 R_386_JUMP_SLOT 080482d8 malloc
08049570 00000207 R_386_JUMP_SLOT 080482e8 sleep
08049574 00000307 R_386_JUMP_SLOT 080482f8 __libc_start_main
From the output, we can see that the address of the GOT slot for malloc is 0x0804956c. We’ll let the program handle the first call to malloc as normal (as it would if we were attaching to it at some point in its lifetime).
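The Info column in the readelf output packs two values into one word: the dynamic symbol index and the relocation type. The following is a minimal sketch of how that word can be decoded with the macros from <elf.h>; the names RelInfo, decode_rel_info, and is_jump_slot are illustrative and do not come from the original program. Note that <elf.h> spells the constant R_386_JMP_SLOT, while readelf prints it as R_386_JUMP_SLOT.

```cpp
#include <elf.h>      // ELF32_R_SYM, ELF32_R_TYPE, R_386_JMP_SLOT, R_386_GLOB_DAT
#include <cstdint>

// Illustrative holder for the two halves of an Elf32_Rel r_info word.
struct RelInfo {
    std::uint32_t sym_index;  // index into the dynamic symbol table
    std::uint32_t type;       // relocation type, e.g. R_386_JMP_SLOT
};

// Split the value that readelf -r prints in its "Info" column.
RelInfo decode_rel_info(std::uint32_t info) {
    return RelInfo{ ELF32_R_SYM(info), ELF32_R_TYPE(info) };
}

// True when the entry describes a PLT jump slot, the kind of GOT entry
// that holds the address of a function such as malloc.
bool is_jump_slot(std::uint32_t info) {
    return ELF32_R_TYPE(info) == R_386_JMP_SLOT;
}
```

For the malloc entry above, the Info value 00000107 decodes to symbol index 1 and relocation type 7, which is the jump-slot type.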
(gdb) cont
Continuing.
Program received signal SIGINT, Interrupt.
0x401a3d01 in nanosleep () from /lib/libc.so.6
Okay, if the program is sleeping, it has made at least one call to malloc, resolving the address of malloc and placing it into the GOT. Let’s confirm by looking at the value at the corresponding slot in the GOT:
(gdb) x/x 0x0804956c
0x804956c <_GLOBAL_OFFSET_TABLE_+12>: 0x40173d70
(gdb) disass 0x40173d70 0x40173d80
Dump of assembler code from 0x40173d70 to 0x40173d80:
0x40173d70 <malloc>: push %ebp
0x40173d71 <malloc+1>: mov %esp,%ebp
0x40173d73 <malloc+3>: sub $0x28,%esp
0x40173d76 <malloc+6>: mov %ebx,0xfffffff4(%ebp)
0x40173d79 <malloc+9>: mov %esi,0xfffffff8(%ebp)
0x40173d7c <malloc+12>: mov %edi,0xfffffffc(%ebp)
0x40173d7f <malloc+15>: call 0x40177193 <malloc_extend_top+707>
End of assembler dump.
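The behavior just observed, where the slot holds the real address of malloc only after the first call, can be modeled in a few lines. This is a toy sketch, not the dynamic linker's actual mechanism: got_slot, resolver_stub, and real_function are hypothetical names that stand in for the GOT entry, the PLT resolution path, and malloc respectively.

```cpp
// Signature shared by the stub and the real function in this toy model.
using fn_t = int (*)(int);

// Stands in for the real function in libc.
int real_function(int x) { return x + 1; }

// Stands in for the resolution path taken on the first call.
int resolver_stub(int x);

// One "GOT slot": before the first call it points at the resolver stub.
fn_t got_slot = resolver_stub;

// Counts how many times resolution ran.
int resolve_count = 0;

int resolver_stub(int x) {
    ++resolve_count;
    got_slot = real_function;   // patch the slot once
    return got_slot(x);         // then complete the original call
}

// Every call site goes through the slot, much as a PLT entry does.
int call_through_slot(int x) { return got_slot(x); }
```

After the first call through the slot, later calls reach real_function directly, and the slot can be overwritten again without the caller noticing.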
As expected, the malloc slot in the GOT is pointing to the function malloc. Now let’s proceed with the steps to redirect this call to a custom function in a shared library that is not yet loaded into the address space. The first step is to get this shared library into the address space.
(gdb) info func dlopen
All functions matching regular expression "dlopen":
Non-debugging symbols:
0x40025ec0 dlopen_doit
0x40025f10 dlopen_check
0x40025f10 dlopen@@GLIBC_2.1
0x400269f0 dlopen_doit
0x40026a40 dlopen_nocheck
0x40026a40 dlopen@GLIBC_2.0
0x401fdf40 do_dlopen
0x401fe010 __libc_dlopen
Several functions have names that contain dlopen. The last one, __libc_dlopen, is the actual symbol for dlopen (see the previous section on weak and strong symbols in libc). We’re going to use this function to load a shared library into the address space:
(gdb) call __libc_dlopen( "/home/wilding/src/Linuxbook/ELF/intercept.so", 2 )
$1 = 134519336
(gdb) printf "0x%x\n", 134519336
0x8049a28
This call to __libc_dlopen uses the same arguments as defined in the man page for dlopen. For the second argument, we used the value for RTLD_NOW as defined in dlfcn.h:
penguin> egrep RTLD_NOW /usr/include/bits/dlfcn.h
#define RTLD_NOW 0x00002 /* Immediate function call binding. */
The return value from the dlopen function call is shown on the second line as $1 = 134519336. This is the handle returned by dlopen, which we don’t care about right now (as long as it is not NULL).
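For comparison, the same load expressed as ordinary code rather than a gdb call might look like the sketch below. It uses the public dlopen interface from <dlfcn.h> instead of __libc_dlopen; load_library is a hypothetical helper name, and on older C libraries the program has to be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlerror, RTLD_NOW
#include <cstdio>

// Load a shared library with immediate binding (RTLD_NOW).
// Returns the handle from dlopen, or nullptr if the load failed.
void* load_library(const char* path) {
    void* handle = dlopen(path, RTLD_NOW);
    if (handle == nullptr) {
        // dlerror() describes the most recent dl* failure.
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    }
    return handle;
}
```

As in the gdb session, the only thing checked is that the handle is not NULL; the handle's numeric value carries no further meaning for this technique.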
At this point, the shared library should be loaded into the address space. Let’s confirm and get the address where it was loaded.
(gdb) info shared
From To Syms Read Shared Object Library
0x40025dd0 0x40026ae0 Yes /lib/libdl.so.2
0x40062700 0x400ba2c0 Yes /usr/lib/libstdc++.so.5
0x400de740 0x400f6470 Yes /lib/libm.so.6
0x400ff350 0x40104210 Yes /lib/libgcc_s.so.1
0x4011e2e0 0x40200b94 Yes /lib/libc.so.6
0x40001290 0x4000f12d Yes /lib/ld-linux.so.2
0x400146c0 0x400148f0 Yes /home/wilding/src/Linuxbook/ELF/intercept.so
The next step is to find the address of our malloc function in the shared library, intercept.so. In this example, the function to which we redirect malloc calls does not have to be called malloc. Because we redirect the calls manually, the target function could be named anything we want.
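The source of intercept.so is not listed in this section, so the following is only a guess at what a replacement function could look like. The name traced_malloc is hypothetical, chosen to show that the target need not be called malloc, and the message text is modeled on the output shown later in the session.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical replacement target. extern "C" keeps the symbol name
// unmangled so it is easy to find with "info func" in gdb.
extern "C" void* traced_malloc(std::size_t size) {
    std::printf("malloc : Requested block size: %zu\n", size);

    // When this code is built into a shared library, this call goes
    // through the library's own GOT slot, which we do not modify, so it
    // still reaches the C library's malloc and does not recurse.
    void* ptr = std::malloc(size);

    std::printf("malloc : ptr of allocated block: %p\n", ptr);
    return ptr;
}
```

Such a file could be built as a shared library with something like g++ -shared -fPIC, and its signature must match malloc's because the caller in the executable is unaware of the substitution.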
(gdb) info func malloc
All functions matching regular expression "malloc":
Non-debugging symbols:
0x080482d8 malloc
0x40172ec0 ptmalloc_lock_all
<...>
0x4000cbc0 malloc
0x400147a4 malloc
From the output, the last malloc function is the one we want. The address range for intercept.so is 0x400146c0 to 0x400148f0, and the address of the last malloc function, 0x400147a4, is in this range. Now we need to change the value of the GOT slot for malloc (in the executable alloc) to point to our function:
(gdb) set $a=0x0804956c
(gdb) set *($a) = 0x400147a4
(gdb) x 0x0804956c
0x804956c <_GLOBAL_OFFSET_TABLE_+12>: 0x400147a4
If we continue the process now, a call to malloc from the executable alloc should call our version of malloc. Let’s confirm:
(gdb) cont
Continuing.
malloc : Requested block size: 1024
malloc : ptr of allocated block: 0x8049cb0
As expected, we’re now in control of the malloc call. Every call to malloc made from the executable alloc will call our function because we changed the slot for malloc in the GOT to our own function.
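The effect of overwriting the slot can be illustrated inside a single program with an ordinary function pointer. This is a model of the idea under stated assumptions, not a way to patch a real GOT: malloc_slot stands in for the slot at 0x0804956c, counting_malloc for the function at 0x400147a4, and all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

using alloc_fn = void* (*)(std::size_t);

// Stand-in for the executable's GOT slot for malloc.
alloc_fn malloc_slot = std::malloc;

// Number of calls that reached the replacement function.
int intercepted_calls = 0;

// Stand-in for the replacement function in the loaded library.
void* counting_malloc(std::size_t size) {
    ++intercepted_calls;
    return std::malloc(size);
}

// What the executable does: it never names the target directly,
// it only calls through the slot.
void* app_allocate(std::size_t size) { return malloc_slot(size); }

// Counterpart of "set *($a) = 0x400147a4": overwrite the slot and
// return the previous value so it can be restored later.
alloc_fn redirect_slot(alloc_fn replacement) {
    alloc_fn previous = malloc_slot;
    malloc_slot = replacement;
    return previous;
}
```

Calling redirect_slot(counting_malloc) changes where app_allocate ends up without touching app_allocate itself, and passing the returned pointer back to redirect_slot undoes the change.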
Getting this to work without linking the executable with -ldl is possible but much more complicated. Fortunately, many programs are linked with -ldl, which makes this useful for those really challenging problems when all other methods fail.
9.13. Source Files
foo.C
#include <stdio.h>
#include "foo.h"

static myClass myObj ;
myClass myObj2 ;

int globInt = 5 ;
static int staticInt = 5 ;
const int constInt = 5 ;
int noValueGlobInt ;
const char *constString = "This is a constant string!";
int list[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;

inline int foo( int a )
{
   int b = 0 ;
   for ( b = a ; b < 100 ; b ++ ) ;
   noValueGlobInt = a ;
   return b + a ;
}

static int bar( int c )
{
   int d = 0;
   d = foo( c ) + globInt ;
   d += staticInt ;
   d += constInt ;
   return d ;
}

int baz( int val)
{
   bar( val ) ;
   printf( "This is a printf format string in baz\n" ) ;
   return 0 ;
}
foo.h
class myClass
{
   public:
      int myVar ;

      myClass() {
         myVar = 5 ;
      }
};

extern int globInt;
extern myClass myObj2 ;
extern int baz( int val ) ;
main.C
#include <stdio.h>
#include <unistd.h>
#include <sys/utsname.h>
#include "foo.h"

myClass myObj3 ;

int main()
{
   struct utsname uInfo ;

   uname( &uInfo ) ;
   baz( 15 ) ;
   printf( "This is a printf format string in main\n" ) ;
   sleep( 1010 ) ;
   return 0 ;
}
9.14. ELF APIs
The ELF APIs provide detailed functionality to read, create, and manipulate ELF files. Unfortunately, some distributions do not install the ELF library by default. Here is the URL for the ELF library on Linux in case you need it:
http://www.stud.uni-hannover.de/~michael/software/english.html.
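When the ELF library is not installed, simple read-only checks can still be done with the structure definitions in <elf.h>. The sketch below does not use the ELF library API referred to above; it only reads the identification bytes at the start of a file, and elf_class_of is a hypothetical helper name.

```cpp
#include <elf.h>      // EI_NIDENT, ELFMAG, SELFMAG, EI_CLASS, ELFCLASS32, ELFCLASS64
#include <cstring>
#include <fstream>
#include <string>

// Returns "ELF32" or "ELF64" for an ELF file, or an empty string when
// the file cannot be read or does not start with the ELF magic number.
std::string elf_class_of(const std::string& path) {
    unsigned char ident[EI_NIDENT] = {};
    std::ifstream in(path, std::ios::binary);
    if (!in.read(reinterpret_cast<char*>(ident), EI_NIDENT)) {
        return "";
    }
    if (std::memcmp(ident, ELFMAG, SELFMAG) != 0) {
        return "";
    }
    switch (ident[EI_CLASS]) {
        case ELFCLASS32: return "ELF32";
        case ELFCLASS64: return "ELF64";
        default:         return "";
    }
}
```

Anything beyond this kind of inspection, such as creating or rewriting sections, is where a full ELF library becomes worthwhile.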
9.15. Other Information
These will help you if you need to read or manipulate ELF files from a program.
http://www.caldera.com/developers/gabi/
http://www.caldera.com/developers/devspecs/
http://www.x86-64.org/documentation/abi-0.96.pdf
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
9.16. Conclusion
ELF is one of those overlooked and under-appreciated aspects of the Linux operating system (as well as many other OSs). However, a solid understanding of ELF and how to use it to increase your debugging options is absolutely critical for any Linux expert. Hopefully this chapter provided an in-depth and useful look behind the scenes of the all-but-forgotten details of ELF.