I. Introduction to ASan
Google's ASan tool
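ASan, short for AddressSanitizer, is a tool for detecting memory errors such as buffer overflows and illegal accesses through dangling pointers.
ASan combines compiler-level hooking and instrumentation, supported today by the mainstream compilers Clang, GCC, and MSVC, with a runtime that checks shadow memory and prints diagnostics. The two halves work together and the overall results are good. The project cites a performance overhead of roughly 2x, and a memory overhead ranging from 1/8 (the shadow memory itself) up to about 2x.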
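Two main implementation techniques:
Instrumentation of memory operations: new, malloc, delete, free, memcpy, and other memory accesses are replaced or wrapped with inserted code at compile time. This part is done by the compiler.
Memory mapping and diagnosis: a shadow copy of the application's memory is maintained according to a fixed algorithm. It is not a 1:1 copy. It is 1/8 the size of the original, with poisoning and marking applied, which keeps the memory cost down. Before a normal memory access, the shadow memory is checked first, and if the shadow value is wrong ASan reports a diagnostic.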
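ASan's diagnostic capabilities:
Use after free (dangling pointer)
Stack buffer overflow
Heap buffer overflow
Global buffer overflow
Use after return
Use after scope
Initialization order bugs
Memory leaks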
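ASan's limitations:
ASan is not perfect. It is used in Chrome and Android and has reportedly found dozens of bugs there, but it has some limitations:
Overflow detection: it depends on the size of the poisoned zones placed on both sides of a valid block, for example 128 bytes. The size is tunable, but an out-of-bounds access that jumps past the red zone still goes undetected.
Use-after-free detection: the freed memory is currently quarantined and its shadow memory is marked 0xFD, but the quarantine cannot last forever. Once the memory is reused, a stale pointer can still cause serious memory problems, much like the crashes seen with memory-pool reuse.
Performance: instrumentation adds many machine instructions (on Android there is also an extra shared library). Its speed and memory use compare reasonably with other tools, but it is still only suitable for internal or debug environments and cannot be deployed directly to production.
To use ASan you need a compiler that supports it, such as Clang or GCC, with the ASan compile options enabled.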
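II. How ASan works
As mentioned above, ASan has two main technical components:
1. The runtime library: libasan.so.x
libasan.so.x takes over memory functions such as malloc and free. After malloc runs, the areas immediately before and after the allocated block (called "red zones") are marked as "poisoned". Freed memory is placed in quarantine (it is not handed out again for a while) and is also marked as "poisoned".
2. The compiler instrumentation module:
With the ASan compile options enabled, the compiler rewrites every memory access in the code as follows:
Before compilation: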
*address = ...; // or ... = *address;
After compilation:
if (IsPoisoned(address)) { // check whether the memory is poisoned
ReportError(address, kAccessSize, kIsWrite);
}
*address = ...; // or: ... = *address;
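The key to this scheme is that every read or write first checks whether the address is "poisoned". The challenge is to make IsPoisoned very fast and ReportError very compact, so that the inserted code stays small.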
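ASan protects against buffer overflows in the following basic steps:
- Create red zones, marked as poisoned, around each protected stack variable, global variable, and heap block. The shadow bytes for a red zone hold a special value, called the "shadow value", such as fa or fd.
- Map the buffer and its red zones into shadow memory, with 1 shadow byte for every 8 application bytes. The function that computes the shadow address is MemToShadow.
- If a red zone is read or written, ASan detects it through ShadowIsPoisoned and reports an error.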
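About shadow values
For any 8-byte-aligned group of application memory there are 9 distinct shadow values:
All 8 bytes are unpoisoned (accessible): the shadow value is 00.
All 8 bytes are poisoned (inaccessible): the shadow value is negative.
The first k bytes are unpoisoned and the remaining 8-k bytes are poisoned: the shadow value is k. This works because malloc always returns 8-byte-aligned blocks, so a partial group can only occur at the tail of an allocation. For example, malloc(13) yields one fully unpoisoned group, 00, followed by a group whose first 5 bytes are unpoisoned and last 3 bytes are poisoned, 05. The shadow bytes are therefore 00 05.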
III. ASan error types in detail
1. heap-buffer-overflow (heap overflow)
1) Code
hello.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
int name;
int score[3];
} Person;
int main(int argc, char* argv[])
{
int a = 1;
int b = 1;
Person *people = (Person *)malloc(sizeof(Person));
people->name = 1;
people->score[4] = 100;
Person tmp = {0};
(void)memcpy(&tmp, people+1, sizeof(Person));
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g hello.c -o hello
-fsanitize=address is the ASan compile option that enables ASan.
The -g option generates debug symbols, which makes it possible to debug and locate errors.
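3) Run the executable to produce the ASan report
Once compilation finishes, run the generated executable. ASan monitors the program's memory accesses at run time and, when it finds an error, prints detailed information including the error's location and type.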
root@10:/home/code/exec/asan/1.heap_buffer_overflow# ./hello
=================================================================
==12634==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb54007c4 at pc 0x00480307 bp 0xbfc9b9b8 sp 0xbfc9b9ac
WRITE of size 4 at 0xb54007c4 thread T0
#0 0x480306 in main /home/code/exec/asan/1.heap_buffer_overflow/hello.c:17
#1 0xb7841e45 in __libc_start_main ../csu/libc-start.c:308
#2 0x4800f0 in _start (/home/code/exec/asan/1.heap_buffer_overflow/hello+0x10f0)
0xb54007c4 is located 4 bytes to the right of 16-byte region [0xb54007b0,0xb54007c0)
allocated by thread T0 here:
#0 0xb7abb4bb in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x48028a in main /home/code/exec/asan/1.heap_buffer_overflow/hello.c:15
#2 0xb7841e45 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/code/exec/asan/1.heap_buffer_overflow/hello.c:17 in main
Shadow bytes around the buggy address:
0x36a800a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a800b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a800c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a800d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a800e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x36a800f0: fa fa fa fa fa fa 00 00[fa]fa fa fa fa fa fa fa
0x36a80100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a80110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a80120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a80130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36a80140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==12634==ABORTING
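4) Analysis
4.1 Preliminary analysis
1. The error is a heap overflow, meaning an out-of-bounds access to dynamically allocated memory. Focus on variables backed by dynamic allocation, such as people.
2. The call stack places the fault at line 17. That line does access a member of people, which is consistent with the previous point.
3. The access itself is a WRITE of 4 bytes, so it is an assignment.
4. Why does it overflow? people was allocated 16 bytes, and people->score[4] is the 4-byte element that starts 4 bytes past the end of those 16 bytes, so the access is out of bounds.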
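4.2 In-depth analysis
Learning to read the shadow table
The shadow table is a mapping of normal memory: 1 shadow byte represents 8 bytes of normal memory.
In the report, 12634 is the process ID, and WRITE of size 4 means that a 4-byte write caused the overflow.
In the shadow table, 00 means accessible memory. The report above shows two such bytes, 00 00, which stand for 16 bytes of normal memory. fa marks the memory surrounding an allocated heap block (the red zone, i.e. the poisoned area). The [fa] to the right of the 00 bytes shows that the access landed to the right of the valid addresses, inside the poisoned area. In other words, it is an overflow past the end of the block.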
2. stack-buffer-overflow (stack overflow)
1) Code
main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[])
{
int a[3];
printf("%d.\n", a[3]);
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g main.c -o main
3) Run the executable to produce the ASan report
=================================================================
==1240==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xbfbd9c8c at pc 0x004fe2ad bp 0xbfbd9c38 sp 0xbfbd9c2c
READ of size 4 at 0xbfbd9c8c thread T0
#0 0x4fe2ac in main /home/code/exec/asan/2.stack-buffer-overflow/main.c:9
#1 0xb780ae45 in __libc_start_main ../csu/libc-start.c:308
#2 0x4fe0f0 in _start (/home/code/exec/asan/2.stack-buffer-overflow/main+0x10f0)
Address 0xbfbd9c8c is located in stack of thread T0 at offset 44 in frame
#0 0x4fe208 in main /home/code/exec/asan/2.stack-buffer-overflow/main.c:7
This frame has 1 object(s):
[32, 44) 'a' (line 8) <== Memory access at offset 44 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/code/exec/asan/2.stack-buffer-overflow/main.c:9 in main
Shadow bytes around the buggy address:
0x37f7b340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b380: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
=>0x37f7b390: 00[04]f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b3a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b3b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b3c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x37f7b3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==1240==ABORTING
4) Disassembly
objdump -s -d main > main.txt
000011f9 <main>:
11f9: 8d 4c 24 04 lea 0x4(%esp),%ecx
11fd: 83 e4 f0 and $0xfffffff0,%esp
1200: ff 71 fc pushl -0x4(%ecx)
1203: 55 push %ebp
1204: 89 e5 mov %esp,%ebp
1206: 57 push %edi
1207: 56 push %esi
1208: 53 push %ebx
1209: 51 push %ecx
120a: 83 ec 58 sub $0x58,%esp
120d: e8 ee fe ff ff call 1100 <__x86.get_pc_thunk.bx>
1212: 81 c3 ee 2d 00 00 add $0x2dee,%ebx
1218: 8d 75 a8 lea -0x58(%ebp),%esi
121b: 89 75 9c mov %esi,-0x64(%ebp)
121e: 8b 83 f4 ff ff ff mov -0xc(%ebx),%eax
1224: 83 38 00 cmpl $0x0,(%eax)
1227: 74 13 je 123c <main+0x43>
1229: 83 ec 0c sub $0xc,%esp
122c: 6a 40 push $0x40
122e: e8 4d fe ff ff call 1080 <__asan_stack_malloc_0@plt>
1233: 83 c4 10 add $0x10,%esp
1236: 85 c0 test %eax,%eax
1238: 74 02 je 123c <main+0x43>
123a: 89 c6 mov %eax,%esi
123c: 8d 46 40 lea 0x40(%esi),%eax
123f: 89 c1 mov %eax,%ecx
1241: 89 4d a0 mov %ecx,-0x60(%ebp)
1244: c7 06 b3 8a b5 41 movl $0x41b58ab3,(%esi)
124a: 8d 83 20 e0 ff ff lea -0x1fe0(%ebx),%eax
1250: 89 46 04 mov %eax,0x4(%esi)
1253: 8d 83 f9 d1 ff ff lea -0x2e07(%ebx),%eax
1259: 89 46 08 mov %eax,0x8(%esi)
125c: 89 f7 mov %esi,%edi
125e: c1 ef 03 shr $0x3,%edi
1261: c7 87 00 00 00 20 f1 movl $0xf1f1f1f1,0x20000000(%edi)
1268: f1 f1 f1
126b: c7 87 04 00 00 20 00 movl $0xf3f30400,0x20000004(%edi)
1272: 04 f3 f3
1275: 8d 41 e0 lea -0x20(%ecx),%eax
1278: 83 c0 0c add $0xc,%eax
127b: 89 c1 mov %eax,%ecx
127d: 89 c8 mov %ecx,%eax
127f: c1 e8 03 shr $0x3,%eax
1282: 05 00 00 00 20 add $0x20000000,%eax
1287: 0f b6 10 movzbl (%eax),%edx
128a: 84 d2 test %dl,%dl
128c: 0f 95 45 a7 setne -0x59(%ebp)
1290: 89 c8 mov %ecx,%eax
1292: 83 e0 07 and $0x7,%eax
1295: 83 c0 03 add $0x3,%eax
1298: 38 d0 cmp %dl,%al
129a: 0f 9d c0 setge %al
129d: 22 45 a7 and -0x59(%ebp),%al
12a0: 84 c0 test %al,%al
12a2: 74 09 je 12ad <main+0xb4>
12a4: 83 ec 0c sub $0xc,%esp
12a7: 51 push %ecx
12a8: e8 93 fd ff ff call 1040 <__asan_report_load4@plt>
12ad: 8b 45 a0 mov -0x60(%ebp),%eax
12b0: 8b 40 ec mov -0x14(%eax),%eax
12b3: 83 ec 08 sub $0x8,%esp
12b6: 50 push %eax
12b7: 8d 83 40 e0 ff ff lea -0x1fc0(%ebx),%eax
12bd: 50 push %eax
12be: e8 cd fd ff ff call 1090 <printf@plt>
12c3: 83 c4 10 add $0x10,%esp
12c6: b8 00 00 00 00 mov $0x0,%eax
12cb: 89 c2 mov %eax,%edx
12cd: 39 75 9c cmp %esi,-0x64(%ebp)
12d0: 74 22 je 12f4 <main+0xfb>
12d2: c7 06 0e 36 e0 45 movl $0x45e0360e,(%esi)
12d8: c7 87 00 00 00 20 f5 movl $0xf5f5f5f5,0x20000000(%edi)
12df: f5 f5 f5
12e2: c7 87 04 00 00 20 f5 movl $0xf5f5f5f5,0x20000004(%edi)
12e9: f5 f5 f5
12ec: 8b 46 3c mov 0x3c(%esi),%eax
12ef: c6 00 00 movb $0x0,(%eax)
12f2: eb 14 jmp 1308 <main+0x10f>
12f4: c7 87 00 00 00 20 00 movl $0x0,0x20000000(%edi)
12fb: 00 00 00
12fe: c7 87 04 00 00 20 00 movl $0x0,0x20000004(%edi)
1305: 00 00 00
1308: 89 d0 mov %edx,%eax
130a: 8d 65 f0 lea -0x10(%ebp),%esp
130d: 59 pop %ecx
130e: 5b pop %ebx
130f: 5e pop %esi
1310: 5f pop %edi
1311: 5d pop %ebp
1312: 8d 61 fc lea -0x4(%ecx),%esp
1315: c3 ret
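3. heap-use-after-free (dangling pointer)
Heap memory is used after it has been freed: a dynamically allocated pointer is released and then used again.
1) Code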
main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
char *name;
int score[3];
} Class;
int main(int argc, char* argv[])
{
Class *a = (Class *)malloc(sizeof(Class));
a->score[0] = 10;
printf("%d.\n", a->score[0]);
free(a);
printf("%d.\n", a->score[0]);
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g main.c -o main
3) Run the executable to produce the ASan report
According to the ASan log, the memory is freed at line 16 and used again at line 18.
4. global-buffer-overflow (out-of-bounds access to a global object)
1) Code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int g_aaa[3];
int main(int argc, char* argv[])
{
printf("%d.\n", g_aaa[3]);
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g main.c -o main
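3) Run the executable to produce the ASan report
Out-of-bounds access to a global array.
In the shadow table, [04] means that the first k bytes are unpoisoned and the remaining 8-k bytes are poisoned, with k = 4. The 12-byte accessible region is therefore encoded as 00 04: 00 covers 8 accessible bytes, and 04 means the first 4 bytes are accessible while the remaining 8 - 4 = 4 bytes are poisoned.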
5. memory_leaks (memory leak)
Memory is allocated but never freed, which causes a leak.
1) Code
main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
char *name;
int score[3];
} Class;
int main(int argc, char* argv[])
{
Class *a = (Class *)malloc(sizeof(Class));
a->score[0] = 10;
printf("%d.\n", a->score[0]);
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g main.c -o main
3) Run the executable to produce the ASan report
6. SEGV on unknown address (null pointer access)
1) Code
main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
char *name;
int score[3];
} Class;
int main(int argc, char* argv[])
{
Class *a = NULL;
printf("%d.\n", a->score[0]);
return 0;
}
2) Compile and link to produce the executable
gcc -fsanitize=address -g main.c -o main
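3) Run the executable to produce the ASan report
Access to an unknown address.
IV. ASan in a real project
In our projects the code is mostly packaged and run as shared objects (.so), so this part looks at locating ASan issues by building an .so and running it.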