Memory – Part 5: Debugging Tools

CaspianSea

Introduction

Here we are! We spent four articles explaining what memory is, how to deal with it, and what kinds of problems you can expect from it. Even the best developers write bugs. A commonly accepted estimate seems to be around a few tens of bugs per thousand lines of code, which is quite a lot. As a consequence, even if you have mastered all the concepts covered by our articles, you’ll probably still have a few memory-related bugs.

Memory-related bugs may be particularly hard to spot and fix. Let’s take the following program as an example:

#include <stdio.h>

#define MAX_LINE_SIZE 32

static const char *build_message(const char *name)
{
    char message[MAX_LINE_SIZE];

    sprintf(message, "hello %s!\n", name);
    return message;
}

int main(int argc, char *argv[])
{
    fputs(build_message(argc > 1 ? argv[1] : "world"), stdout);
    return 0;
}

This program is supposed to take a message as argument and print “hello <message>!” (the default message being “world”).

The behavior of this program is undefined: it is buggy, yet it will probably not crash. The function build_message returns a pointer to memory allocated in its own stack frame. Because of how the stack works, that memory is very likely to be overwritten by a later function call, possibly by fputs itself. As a consequence, if fputs internally uses enough stack memory to overwrite the message, the output will be corrupted (and the program may even crash); otherwise the program will print the expected message. Moreover, the program may overflow its buffer, because the unsafe sprintf function puts no limit on the number of bytes written.

So the behavior of the program varies depending on the size of the message given on the command line, the value of MAX_LINE_SIZE and the implementation of fputs. What’s annoying with this kind of bug is that the result may not be obvious: the program “works” well enough with simple use cases and will only fail the day it receives a parameter with the right properties to exhibit the issue. That’s why it’s important for developers to be at ease with tools that help them validate (or debug) memory management.

In this last article, we will cover some free tools that we consider should be part of the minimal toolkit of a C (and C++) developer.

Debugger

The first of these tools is the debugger. On Linux this will probably be gdb. Most developers know at least the basics of gdb: inspecting a backtrace (bt, up, down, frame <id>, …), adding a breakpoint (break <function|line>, continue, …), executing step-by-step (step, next, fin, …), inspecting memory (print <expr>, call <func>, x/<FMT> <addr>, …), … The debugger is the tool of choice of most developers when the program crashes with a segmentation fault. The debugger then automatically catches the signal and allows inspecting the state of the program at that instant. A lot of segmentation faults are obvious (uninitialized pointer, NULL pointer dereference, …) and require little work in the debugger.

Less known, however, is the ability to place a watchpoint: a dynamic breakpoint that interrupts the program every time the result of an expression changes. This is extremely useful to find the origin of a memory corruption: place a watchpoint on the content of the memory that gets corrupted, and the program will be interrupted each time the content of that memory changes. This has very little impact on the performance of the program because, as long as you don’t monitor too many memory addresses, the watchpoint is managed directly by the hardware.

Let’s return to the example given in the introduction: we do run fputs, which prints the content of the pointer given as its first argument; however, the string actually printed is not the one we wrote in build_message. Here is a small debugging session.

First we set a breakpoint on build_message and check that sprintf properly built our message:

(gdb) break build_message
Breakpoint 1 at 0x400598: file blah.c, line 7.
(gdb) run
Starting program: /home/fruneau/blah
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Breakpoint 1, build_message (name=0x4006bf "world") at blah.c:7
7           sprintf(message, "hello %s!\n", name);
(gdb) n
8           return message;
(gdb) p message
$1 = "hello world!\n\000\000\000\001\000\000\000\000\000\000\000m\006@\000\000\000\000"

In order to be notified whenever the message gets modified, we place a watchpoint on the first character of the string and let the program continue. The debugger tells us that it successfully set a hardware watchpoint, which is nice, because a software watchpoint would have a much more noticeable impact on overall performance.

(gdb) watch $1[0]
Hardware watchpoint 2: $1[0]
(gdb) c
Continuing.

The watchpoint interrupts the execution of the program. The debugger prints the old and the new value and we can easily inspect the program. A quick look at the backtrace lets us know that we are somewhere in the code of the dynamic linker (probably during the resolution of the symbol for fputs).

Hardware watchpoint 2: $1[0]
Old value = 104 'h'
New value = 32 ' '
0x00007ffff7def1fc in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x00007ffff7def1fc in ?? () from /lib64/ld-linux-x86-64.so.2
#1  0x00000000004005ff in main (argc=1, argv=0x7fffffffe258) at blah.c:13

Here, the debugger tells us where the memory gets changed; however, making sense of that still requires some knowledge of what is going on. The debugger provides raw information, and the developer remains in charge of the analysis. More generally, the debugger is a good tool whenever you know what to look for.

`valgrind`

valgrind is a kind of Swiss Army knife for the C/C++ developer. It provides various tools such as a memory checker (memcheck), a memory profiler (massif), a cache profiler (cachegrind), a CPU profiler (callgrind), some thread checkers (helgrind, DRD, tsan), …

valgrind is basically a virtual machine that monitors every interaction with the operating system and the virtualized hardware. In order to achieve this, it takes an unmodified executable and wraps every single CPU instruction and every system call with an instrumented version. It is extremely configurable: you can define the exact desired behavior of your virtual machine: the number of cores, the size of the caches, the behavior of the system calls (for system calls whose behavior varies from one kernel version to another)… The main drawback, however, is that since the code is not directly executed, valgrind has a significant overhead and causes a substantial slowdown that varies from 5x to 50x, depending on the tool and the chosen options.

Running valgrind is easy. It requires no modification of your program or of your build system in order to work (however, it can benefit from making some code valgrind-aware). The most basic incantation is just: valgrind --tool=<toolname> <yourprogram and arguments>.

`memcheck`

memcheck is the default tool of valgrind. It’s a memory checker that tracks every single memory access and allocation looking for management errors such as:

accessing unallocated memory
making the program behavior depend on uninitialized memory
leaking some allocated memory

To do this, the first thing memcheck does is to maintain a registry of all allocated memory. Every time a new chunk of memory is allocated, memcheck keeps track of it by remembering the returned pointer, the amount of memory allocated, and the backtrace from which it was allocated. Additionally, it adds redzones around the allocated memory: areas that are never handed out to the program, so that out-of-buffer accesses are easy to detect.

Needless to say, it also catches every single deallocation in order to keep its registry up to date. A deallocation does not immediately remove the entry from the registry: the entry is marked as deallocated and the deallocation backtrace is recorded. By putting deallocated memory in quarantine, memcheck ensures that use-after-free accesses can be caught as such, since that memory cannot be reused for other purposes too quickly.

At the end of the execution of the program, memcheck dumps its registry: every entry that is not marked as deallocated is a leaked allocation. The report of leaked allocations also states whether each block is still referenced or not. Memory that is no longer pointed to by the program is considered definitely lost.

Additionally, for every single allocated byte, memcheck also maintains an initialization state: memory is considered initialized if, and only if, its value is the result of a computation that uses only initialized bytes. As soon as an uninitialized byte is used in a computation, the result of that computation is undefined, and if the program behavior depends on that result, the behavior of the program is itself considered undefined.

Overall, at the cost of a massive slowdown and some memory overhead, memcheck detects most errors related to dynamic allocation. However, it is far less effective at detecting errors in code that uses static memory or stack-allocated memory, because memcheck has very little insight into the internals of the program: it does not know about the various variables that are put on the stack, and thus cannot check that you are not overflowing from a stack-allocated buffer into a nearby variable.

A good standard is to require that all code be memcheck-clean (or valgrind-clean): a program is not good enough if it produces errors when run within valgrind. That does not guarantee the program is bug-free, but it ensures that allocations are handled properly. That standard is often hard to reach, however, because for real-life programs the slowdown of memcheck reaches 40x, which makes it almost impossible to run often. Thankfully, tools such as ASan (covered later in this post) can be used for this purpose.

The documentation of memcheck is full of small examples, so let’s stop paraphrasing the upstream documentation and see what memcheck produces on our small buggy program:

==368== Memcheck, a memory error detector
==368== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==368== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==368== Command: ./blah
==368==
==368== Invalid read of size 1
==368==    at 0x4C2C872: __GI_strlen (mc_replace_strmem.c:400)
==368==    by 0x4E9ED5D: fputs (iofputs.c:36)
==368==    by 0x40060D: main (blah.c:15)
==368==  Address 0x7fefffc10 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
==368==
==368== Invalid read of size 1
==368==    at 0x4C2C884: __GI_strlen (mc_replace_strmem.c:400)
==368==    by 0x4E9ED5D: fputs (iofputs.c:36)
==368==    by 0x40060D: main (blah.c:15)
==368==  Address 0x7fefffc11 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
==368==
==368== Invalid read of size 1
==368==    at 0x4EAB29C: _IO_default_xsputn (genops.c:481)
==368==    by 0x4EA9972: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1364)
==368==    by 0x4E9EDDB: fputs (iofputs.c:41)
==368==    by 0x40060D: main (blah.c:15)
==368==  Address 0x7fefffc10 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
==368==
g==368==
==368== HEAP SUMMARY:
==368==     in use at exit: 0 bytes in 0 blocks
==368==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==368==
==368== All heap blocks were freed -- no leaks are possible
==368==
==368== For counts of detected and suppressed errors, rerun with: -v
==368== ERROR SUMMARY: 5 errors from 3 contexts (suppressed: 2 from 2)

This is a bit more meaningful than the debugging session in gdb. It tells us that fputs is calling strlen (which is quite obviously needed to compute the length of the string it should print), but that strlen reaches some memory that is just below the stack pointer (it actually went two bytes below the stack pointer). This still requires some analysis, but this time it is quite easy: we are computing the length of a string that lives in the stack’s memory region, but that seems to lie partially outside of the live stack.

A last useful trick with valgrind is its ability to interact with a debugger. Start your program with valgrind --db-attach=yes <yourprogram>. Then every time memcheck reports an error, you’ll be asked whether or not you’d like to debug that error in a debugger. (This option exists in the Valgrind 3.8 series used here; later releases dropped it in favor of the built-in gdbserver, enabled with --vgdb=yes --vgdb-error=0.)

`massif`

massif is a different kind of tool: it is a memory profiler. It also tracks memory allocations and deallocations, but instead of checking every memory address, it builds a timeline of memory usage. For some chosen moments of the program (such as the moment at which the program had its highest memory usage), it keeps the count of allocations for every single backtrace.

At the end of the run, massif dumps its report, by default in a file named massif.out.<pid>. The report is a list of snapshots of the distribution of memory allocations. It is hard to process manually, but tools such as ms_print produce reports that are easier to understand. The output of ms_print starts with an ASCII-art histogram that visually shows the memory usage:

MB
    1.093^                                                                       #
         |                                                               @:@@:@@:#
         |                                                        @::::@:@:@@:@@:#
         |                                                    ::::@::::@:@:@@:@@:#
         |                                            :::::@::::::@::::@:@:@@:@@:#
         |                                    :@:::@::::: :@::::::@::::@:@:@@:@@:#
         |                               ::::::@:::@::::: :@::::::@::::@:@:@@:@@:#
         |                       @@::::::::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         |                    ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         |              @@::::::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         |        ::@@:@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@@:: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
         | ::::@ :: @ :@@ ::: ::@@ :::: :::: ::@:::@::::: :@::::::@::::@:@:@@:@@:#
       0 +----------------------------------------------------------------------->Gi
         0                                                                   2.093

The # column represents the peak of memory usage, while the @ columns are the detailed snapshots available in the report. If your report looks like this one, you probably have a memory leak in your program, and you should consider fixing it.

The diagram is followed by a table containing the memory usage at every snapshot. It looks like this:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1      3,168,761               56               36            20            0
  2      3,174,580            1,752            1,662            90            0
  3      3,212,178            1,936            1,838            98            0
  4      3,237,022            3,720            3,534           186            0
  5      3,265,560            5,096            4,942           154            0
  6      3,292,818            6,536            6,382           154            0
  7      3,311,840            6,784            6,622           162            0
  8      3,339,186            8,888            8,622           266            0
  9      3,357,738           17,768           17,470           298            0
 10      3,379,121           18,080           17,774           306            0
 11      3,401,477           18,104           17,789           315            0
 12      3,418,256           86,864           86,339           525            0
 13      3,446,913           87,680           87,078           602            0
 14      3,547,965           87,680           87,078           602            0
99.31% (87,078B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->87.85% (77,024B) 0x467826: libc_malloc (core-mem.c:50)
| ->74.74% (65,536B) 0x4688FD: mem_alloc (core-mem.h:290)
| | ->74.74% (65,536B) 0x42AB07: master_load_conf.isra.15 (master.blk:2612)
| |   ->74.74% (65,536B) 0x42F88E: platform_start_master (master.blk:2748)
| |     ->74.74% (65,536B) 0x42419E: main (master-main.c:120)
| |
| ->10.07% (8,832B) 0x45A240: thr_initialize (core-mem.h:290)
| | ->10.07% (8,832B) 0x43BD38: platform_initialize (platform.blk:856)
| |   ->10.07% (8,832B) 0x42F85B: platform_start_master (master.blk:2743)
| |     ->10.07% (8,832B) 0x42419E: main (master-main.c:120)
| |
| ->01.92% (1,680B) 0x42DAD0: platform_master_register_service (core-mem.h:290)
| | ->01.92% (1,680B) in 7 places, all below massif's threshold (01.00%)
| |
| ->01.11% (976B) in 6 places, all below massif's threshold (01.00%)
|
->05.64% (4,945B) 0x4678DE: libc_realloc (core-mem.c:74)
| ->03.54% (3,104B) 0x46326F: qhash_resize_start (core-mem.h:310)
| | ->03.54% (3,104B) 0x464A91: __qhash_put_vec (container-qhash.in.c:258)
| |   ->01.77% (1,552B) 0x44E44F: iop_register_class (iop.blk:40)
| |   | ->01.77% (1,552B) 0x456241: iop_register_packages (iop.blk:2924)
| |   | | ->01.77% (1,552B) 0x435018: bon_initialize (base.c:27)
| |   | |   ->01.77% (1,552B) 0x424143: main (master-main.c:109)
| |   | |
| |   | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |
| |   ->01.77% (1,552B) 0x44E540: iop_register_class (iop.blk:62)
| |     ->01.77% (1,552B) 0x456241: iop_register_packages (iop.blk:2924)
| |     | ->01.77% (1,552B) 0x435018: bon_initialize (base.c:27)
| |     |   ->01.77% (1,552B) 0x424143: main (master-main.c:109)
| |     |
| |     ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->01.77% (1,552B) 0x463309: qhash_resize_start (core-mem.h:310)
| | ->01.77% (1,552B) 0x464A91: __qhash_put_vec (container-qhash.in.c:258)
| |   ->01.77% (1,552B) in 2 places, all below massif's threshold (01.00%)
| |
| ->00.33% (289B) in 1+ places, all below ms_print's threshold (01.00%)
|
->02.19% (1,923B) 0x5719785: __tzfile_read (tzfile.c:291)
| ->02.19% (1,923B) 0x5718DAE: tzset_internal (tzset.c:444)
|   ->02.19% (1,923B) 0x571903E: tzset (tzset.c:597)
|     ->02.19% (1,923B) 0x437D87: log_initialize (log.blk:932)
|       ->02.19% (1,923B) 0x43BD3D: platform_initialize (platform.blk:857)

The first 14 snapshots here (0 to 13) are simple snapshots that only report heap consumption, while snapshot 14 is followed by a detailed report about the allocations. We can see that at that point, most of the allocated memory was the consequence of the configuration loading.

`Address Sanitizer`

Address Sanitizer (or ASan) is a much more recent tool. It was initiated by Google to provide good memory checking without the performance drawback of memcheck for large projects such as WebKit or Chromium. ASan still slows the program down, but by a factor of about 2, not 40. The tradeoff, however, is that ASan won’t detect some errors that memcheck can detect, such as uses of uninitialized variables or leaks (leak detection has since been added on some platforms through LeakSanitizer); on the other hand, it can detect more errors related to static or stack memory. ASan was first introduced in LLVM/clang 3.1 and has since made its way into GCC with GCC 4.8.

ASan is a pair of tools: a compiler pass and a runtime. The runtime of ASan allocates a shadow memory: a huge chunk of memory used to store a single byte for every 8-byte word of memory. Each shadow byte describes the state of its word: 0 means that the 8 bytes are fully addressable, a value from 1 to 7 means that only the first bytes of the word are addressable, and other values mark memory that must not be accessed, such as redzones or freed regions. When memory gets allocated or freed, the runtime updates the corresponding shadow bytes. It also overloads the allocators in order to track the allocations and deallocations of memory. Just like memcheck, it puts deallocated memory in quarantine in order to detect use-after-free accesses.

Then, each time a memory access is performed, the values of the associated shadow bytes are checked, and if the access is disallowed, ASan aborts the execution of the program. Since ASan crashes the program on the first error, the program is forced to be ASan-clean.

Overall, the runtime of ASan is less feature-complete than valgrind: it cannot detect accesses to uninitialized memory, nor, in the version described here, memory leaks. However, most of the power of ASan comes from its compiler-side component. The fact that ASan is intrusive may seem inconvenient, but it allows a closer integration with the program itself. On the other hand, it only checks code that has been instrumented, and won’t catch errors that occur in third-party libraries (for example, in the libc).

The main role of the compiler pass is to wrap every single memory access in a small branch that checks that the access is allowed by looking at the content of the shadow memory. But, since it is in the compiler, it has access to a lot of information, such as which memory we are accessing and what the layout of the variables (or structure members) is, and it can also alter all of this. That is where ASan shines: it can add redzones between global variables or between variables that are put on the stack in order to make bad accesses to those variables easy to detect.

ASan can detect both issues in our example. However, since the faulty accesses occur only inside functions of the libc, which are not instrumented, they will not be reported as-is. At Intersec, we have our own implementation of sprintf, which causes it to be instrumented by ASan along with the program. Here is the output of ASan with a too-long string passed as argument (after running asan_symbolize.py on the output to get the symbol names):

% ./blah "coucoeucoijfmiqjfmiqjfmqifjqemfijeqmfijeqfmiqejfmqesifjqemsifj" 2>&1 | asan_symbolize.py
=================================================================
==17688==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffff9fc8c0 at pc 0x425633 bp 0x7fffff9f98f0 sp 0x7fffff9f90b0
WRITE of size 62 at 0x7fffff9fc8c0 thread T0
    #0 0x425632 in memcpy ??:0
    #1 0x4555ea in fmt_output_chunk /home/fruneau/dev/mmsx/lib-common/str-iprintf.c:372
    #2 0x438aa8 in fmt_output /home/fruneau/dev/mmsx/lib-common/str-iprintf.c:453
    #3 0x44b2dc in isprintf /home/fruneau/dev/mmsx/lib-common/str-iprintf.c:1275
    #4 0x4373c5 in build_message /home/fruneau/dev/mmsx/lib-common/blah.c:9
    #5 0x436faa in main /home/fruneau/dev/mmsx/lib-common/blah.c:22
    #6 0x7ff113f1da54 in __libc_start_main /home/adconrad/eglibc-2.17/csu/libc-start.c:260
    #7 0x436c6c in _start ??:0
Address 0x7fffff9fc8c0 is located in stack of thread T0 at offset 128 in frame
    #0 0x4372cf in build_message /home/fruneau/dev/mmsx/lib-common/blah.c:6
  This frame has 2 object(s):
    [32, 40) ''
    [96, 128) 'message'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Shadow bytes around the buggy address:
  0x10007ff378c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007ff378d0: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4 f4 f4
  0x10007ff378e0: f2 f2 f2 f2 00 00 00 f4 f2 f2 f2 f2 04 f4 f4 f4
  0x10007ff378f0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007ff37900: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4
=>0x10007ff37910: f2 f2 f2 f2 00 00 00 00[f3]f3 f3 f3 00 00 00 00
  0x10007ff37920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007ff37930: f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4
  0x10007ff37940: f2 f2 f2 f2 00 f4 f4 f4 f3 f3 f3 f3 00 00 00 00
  0x10007ff37950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007ff37960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==17688==ABORTING

Doing the same thing with a short string and a reimplementation of fputs gives the same kind of result:

% ./blah 2>&1 | asan_symbolize.py
=================================================================
==17891==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffd15a3ce4 at pc 0x426ae6 bp 0x7fffd15a3bd0 sp 0x7fffd15a3ba8
READ of size 14 at 0x7fffd15a3ce4 thread T0
    #0 0x426ae5 in strlen ??:0
    #1 0x43719a in my_fputs /home/fruneau/dev/mmsx/lib-common/blah.c:15
    #2 0x436fe7 in main /home/fruneau/dev/mmsx/lib-common/blah.c:22
    #3 0x7f1a99108a54 in __libc_start_main /home/adconrad/eglibc-2.17/csu/libc-start.c:260
    #4 0x436c6c in _start ??:0
Address 0x7fffd15a3ce4 is located in stack of thread T0 at offset 164 in frame
    #0 0x43704f in my_fputs /home/fruneau/dev/mmsx/lib-common/blah.c:14
  This frame has 3 object(s):
    [32, 40) ''
    [96, 104) ''
    [160, 164) 'len'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Shadow bytes around the buggy address:
  0x10007a2ac740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac780: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4
=>0x10007a2ac790: f2 f2 f2 f2 00 f4 f4 f4 f2 f2 f2 f2[04]f4 f4 f4
  0x10007a2ac7a0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac7b0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4
  0x10007a2ac7c0: f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 00 f4 f4 f4
  0x10007a2ac7d0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007a2ac7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==17891==ABORTING

Still, as the previous examples show, such a report is only a hint: it tells you where the invalid access happened, not what is actually wrong with the program.

Conclusion

Memory is a fundamental resource for any computer program, but it is hard to understand and manage. Tools exist to help the developer and the system administrator but their output requires some brain juice in order to be really meaningful.

This series of articles tried to cover a large range of subjects; a lot more could be said (and a lot more has already been said by others). The topics we selected are what we consider the minimal toolkit for both developers and system administrators, both in terms of raw knowledge and for understanding the various limitations. We just hope this has been helpful.

https://techtalk.intersec.com/2013/12/memory-part-5-debugging-tools/
