Finding memory bugs with AddressSanitizer

2 篇文章 0 订阅

AddressSanitizer (ASan) is an instrumentation tool created by Google security researchers to identify memory access problems in C and C++ programs.

在这里插入图片描述

When the source code of a C/C++ application is compiled with AddressSanitizer enabled, the program will be instrumented at runtime to identify and report memory access errors.

But what are memory access errors and how can AddressSanitizer help to identify them?

Memory access errors and AddressSanitizer

C and C++ are very insecure and error-prone languages. And one of the main sources of problems is memory access errors.

Different kind of bugs in the source code could trigger a memory access error, including:

  • Buffer overflow or buffer overrun occurs when a program overruns a buffer’s boundary and overwrites adjacent memory locations.
  • Stack overflow is when a program crosses the boundary of function’s stack.
  • Heap overflow is when a program overruns a buffer allocated in the heap.
  • Memory leak is when a program allocates memory but does not deallocate.
  • Use after free (dangling pointer) is when a program uses memory regions already deallocated.
  • Uninitialized variable is when a program reads a memory location before it is initialized.

All these errors are due to programming bugs. They could prevent the application from executing, cause invalid results or expose a vulnerability that could be exploited by a malicious actor. They are usually very hard to reproduce, debug and fix.

That is why we need tools. And AddressSanitizer is one of them.

AddressSanitizer in a nutshell

AddressSanitizer is implemented through compilation flags. To use AddressSanitizer, we need to compile and link the program using the -fsanitize=address switch.

For example, can you find a memory access error in the C program below?

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, const char *argv[])
{
    char *msg = "Hello world!";
    char *ptr = NULL;

    ptr = malloc(strlen(msg));

    strcpy(ptr, msg);

    printf("%s\n", ptr);

    return 0;
}

This program compiles without any warning and runs:

$ gcc main.c -o main -Wall -Werror -g
% ./main
Hello world!

But the program has a heap buffer overflow error and AddressSanitizer can identify and report the problem. We just need to compile the program with the -fsanitize=address switch:

$ gcc main.c -o main -Wall -Werror -g -fsanitize=address

Now, when we run the application, the memory access problem will be identified and a report of the error will be displayed in the terminal:

$ ./main
=================================================================
==30259==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000001c at pc 0x7f64667423a6 bp 0x7ffd069d7380 sp 0x7ffd069d6b28
WRITE of size 13 at 0x60200000001c thread T0
    #0 0x7f64667423a5  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x663a5)
    #1 0x55934507fa26 in main /home/sprado/Temp/main.c:12
    #2 0x7f646630cb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #3 0x55934507f8f9 in _start (/home/sprado/Temp/main+0x8f9)

0x60200000001c is located 0 bytes to the right of 12-byte region [0x602000000010,0x60200000001c)
allocated by thread T0 here:
    #0 0x7f64667bab50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x55934507fa0f in main /home/sprado/Temp/main.c:10
    #2 0x7f646630cb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x663a5)
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00[04]fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30259==ABORTING

If you compile the application with debugging symbols (-g switch), the tool will also be able to convert addresses into file names and line numbers. That way, we can easily identify the line of the source code that caused the error (see line 6 in the listing above).

So can you see the error now? We overflowed the ptr pointer because we forgot to allocate an extra byte for the null character ('\0’).

In this other example, we have a memory leak problem:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void alloc()
{
    char *ptr;

    ptr = malloc(10);

    memset(ptr, 0, 10);
}

int main(int argc, const char *argv[])
{
    int i;

    for (i = 0; i < 3; i++)
        alloc();

    printf("OK!\n");

    return 0;
}

Again, the program is compiled without warnings and runs:

$ gcc main.c -o main -Wall -Werror -g
sprado@sprado-office:~/Temp$ ./main
OK!

But the memory leak problems are quickly identified when we compile and run the program with AddressSanitizer enabled:

$ gcc main.c -o main -Wall -Werror -g -fsanitize=address

$ ./main
==20677==WARNING: Trying to symbolize code, but external symbolizer is not initialized!

=================================================================
==20677==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 30 byte(s) in 3 object(s) allocated from:
#0 0x465319 (/tmp/main+0x465319)
#1 0x47b588 (/tmp/main+0x47b588)
#2 0x47b8c7 (/tmp/main+0x47b8c7)
#3 0x7f5e28457f44 (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)

SUMMARY: AddressSanitizer: 30 byte(s) leaked in 3 allocation(s).

The memory leak check is enabled by default on x86_64. But depending on the architecture, to check for a memory leak we may need to add detect_leaks=1 to the environment variable ASAN_OPTIONS. Check the documentation for more information about this feature.

More about AddressSanitizer

AddressSanitizer works on x86 (32-bit and 64-bit), ARM (32-bit and 64-bit), MIPS (32-bit and 64-bit) and PowerPC64. The supported operating systems are Linux, Darwin (OS X and iOS Simulator), FreeBSD and Android.

Another interesting fact is that AddressSanitizer is implemented via some libraries, which are kept in the LLVM repository and shared with the GCC project. This is a clear example that, in recent years, there has been an increasing collaboration between the communities of GCC and Clang.

It is also important to note that these memory checks add considerable processing overhead to the application, and should only be used during development and testing.

According to the benchmarks published in the project documentation, AddressSanitizer could decrease the application’s execution time by up to 2x. Which is not good for production code, but not that bad for testing purposes. In fact, the performance is impressive, considering that the tool intercepts and checks every memory access made by the application.

So are you curious about how AddressSanitizer works?

How does AddressSanitizer work?

To instrument memory allocation and identify leaks, the malloc and free family of functions are replaced, so every memory allocation/deallocation is monitored by the tool.

Easy, right? But how to identify buffer overflows?

First, all memory that shouldn’t be accessed is poisoned. That includes memory around allocated regions, deallocated memory, and memory around variables in the stack.

Then every read or write memory access …

*ptr = ...;

… will be compiled to a code that will check if that memory address is poisoned or not. If poisoned, it will report an error.

if (IsPoisoned(ptr)) {
  ReportError(ptr, kAccessSize, kIsWrite);
}
*ptr = ...;

According to the documentation, the tricky part is how to implement IsPoisoned() very fast and ReportError() very compact.

Basically, the virtual address space of an application is divided into the main application memory that is used by the application code and a shadow memory that stores metadata about poisoned (not addressable) memory.

AddressSanitizer maps every 8 bytes of application memory into 1 byte of shadow memory. If a memory address is unpoisoned (i.e. addressable) the bit in the shadow memory is 0. If a memory address is poisoned (i.e. not addressable) the bit in the shadow memory is 1. That way, AddressSanitizer can identify which memory access is allowed or not and report errors.

If you want to get into the details about the implementation, read the documentation of the AddressSanitizer algorithm.

What about Valgrind?

You may know Valgrind, a very popular instrumentation framework that is also able to identify and report memory access problems.

The great advantage of Valgrind is that it can instrument the code without the need to recompile it.

However, the tradeoff is a big performance hit. According to this presentation, while AddressSanitizer execution overhead is around 2x, Valgrind’s overhead could be more than 20x!

In addition to AddressSanitizer, there are also another sanitizers provided by the project:

Support for AddressSanitizer exists in Clang since version 3.1 and GCC since version 4.8. If any of your projects use GCC or Clang, you should really stop what you are doing right now, enable the -fsanitize=address compiler switch and test your code. Do it. You may be surprised with the result!

About the author: Sergio Prado has been working with embedded systems for more than 20 years. If you want to know more about his work, please visit the About page or Embedded Labworks website.

Please email your comments to sergio at embeddedbits.org or sign up the newsletter to receive updates.


See also
  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Finding bugs(寻找错误)是指在软件开发过程中,为了保证软件的质量和稳定性,通过一系列的测试和调试过程,找出软件中存在的错误和缺陷,并进行修复的活动。 寻找错误是软件开发过程中必不可少的一步。在软件开发过程中,无论是编写代码、设计界面还是实施功能,都可能出现各种各样的错误。这些错误可能导致软件无法正常运行、功能异常或者性能低下。为了及时发现和修复这些错误,需要进行系统而全面的错误寻找工作。 寻找错误的方法和技巧有很多种。其中一种常用的方法是黑盒测试。黑盒测试是指在不了解软件内部结构和具体实现的情况下,通过输入一些指定的测试用例,观察软件的输出结果,并与预期结果进行对比,从而判断软件是否存在错误。另外一种方法是白盒测试。白盒测试是指在了解软件内部结构和具体实现的情况下,通过对代码进行逐行逐句的检查,发现其中潜在的错误。 除了以上的方法,还可以使用自动化的测试工具来辅助寻找错误。这些工具能够模拟用户的操作,快速地执行大量的测试用例,并生成详细的测试报告,帮助开发人员准确定位和修复错误。 在寻找错误的过程中,要保持耐心和专注。有时候错误可能隐藏得很深,需要仔细地分析和调试。同时,还要注重记录和总结错误,以便后续的修复工作。 总之,寻找错误是软件开发过程中不可或缺的一环。通过系统而全面的测试和调试工作,可以及时发现和修复软件中存在的错误和缺陷,提高软件的质量和可靠性。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值