Debugging Memory Errors in C/C++

This page describes a few key techniques I've learned about how to debug programs that are suspected of containing memory errors. Principally, this includes using memory after it has been freed, and writing beyond the end of an array. Memory leaks are considered briefly at the end.

It's of course rather presumptuous to even write these up, since so much has already been written. I'm not intending to write the be-all and end-all article, just to write up a few of the techniques I use, since I recently had the opportunity to help a friend debug such an error. There are also some links at the end to other resources.

Note that I'm only interested here in memory errors that trash part of the heap. Overwriting the stack may be a cracker's favorite technique, but when it happens in front of the programmer it's usually very easy to track down.

The first thing to understand about memory errors is why they're different from other bugs. I claim the main reason they are harder to debug is that they are fragile. By fragile, I mean the bug will often only show up under certain conditions, and that attempts to isolate the bug by changing the program or its input often mask its effects. Since the programmer is then forced to find the needle in the haystack, and cannot use techniques to cut down on the size of the haystack, locating the cause of the problem is very difficult.

Consequently, the first priority when tracking down suspected memory errors is to make the bug more robust. There is a bug in your code, but you need to do something so that the bug's effects cannot be masked by other actions of the program.

I know of two main techniques for reducing the fragility of a memory bug:

  • Don't re-use memory.
  • Put empty space between memory blocks.

Why do these techniques help? First, by not re-using memory, we can eliminate temporal dependencies between the bug and the surrounding program. That is, if memory is not re-used, then it no longer matters in what order the relevant blocks are allocated and deallocated.

Second, by putting empty space between blocks, we ensure that writing past the end (or before the beginning) of one block won't corrupt another. Thus, we break spatial dependencies involving the bug. The space between the blocks should be filled with a known value, and the space should be periodically checked (at least when free is called on that block) to see if the known value has been changed.

With temporal and spatial dependencies reduced, it's less likely that a change to the program or its input will disturb the evidence of the bug's presence.

Of course, your machine must have enough spare memory to run the experiment. But, by making the bug more robust, we can now cut down on the input size! Thus in the end using more space in the short term can lead to using less space in the final, minimized input test case.

The above two techniques are easily implemented in any debug heap implementation. I've modified Doug Lea's malloc to implement the features; my modified version is here: malloc.c and ckheap.h. To compile with the debug features described, set the preprocessor variables DEBUG and DEBUG_HEAP. But of course you can use any implementation, and the debug versions can simply be wrappers around the real malloc.
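To make the two techniques concrete, here is a minimal sketch of such a wrapper around the real malloc. It is not the code in the modified malloc.c; the names dbg_malloc, dbg_trashed and dbg_free, the 128-byte zone size, and the 16-byte header are all assumptions made for this example (the 0xAA fill matches the value seen in the gdb session later on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16    /* holds the payload size; keeps the payload offset a multiple of 16 */
#define ZONE_SIZE   128   /* padding on each side of the payload (assumed value) */
#define ZONE_FILL   0xAA  /* known byte written into the padding */

/* Block layout: [header: payload size][left zone][payload][right zone] */

/* Allocate `size` bytes with a filled guard zone on each side. */
void *dbg_malloc(size_t size)
{
    unsigned char *raw = malloc(HEADER_SIZE + ZONE_SIZE + size + ZONE_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);                    /* remember the payload size */
    memset(raw + HEADER_SIZE, ZONE_FILL, ZONE_SIZE);    /* left zone */
    memset(raw + HEADER_SIZE + ZONE_SIZE + size, ZONE_FILL, ZONE_SIZE); /* right zone */
    return raw + HEADER_SIZE + ZONE_SIZE;
}

/* Count guard bytes around payload `p` that no longer hold ZONE_FILL. */
size_t dbg_trashed(const void *p)
{
    const unsigned char *payload = p;
    const unsigned char *left = payload - ZONE_SIZE;
    size_t size, i, trashed = 0;

    memcpy(&size, left - HEADER_SIZE, sizeof size);
    for (i = 0; i < ZONE_SIZE; i++) {
        if (left[i] != ZONE_FILL)
            trashed++;
        if (payload[size + i] != ZONE_FILL)
            trashed++;
    }
    return trashed;
}

/* Check the zones, then deliberately keep the block so it is never re-used. */
void dbg_free(void *p)
{
    if (p != NULL)
        assert(dbg_trashed(p) == 0 && "guard zone trashed");
}
```

With this sketch, writing p[10] into a 10-byte block lands in the right zone, so dbg_trashed(p) reports one trashed byte and dbg_free(p) aborts. Because dbg_free never hands memory back, the order of later allocations cannot hide the damage, at the cost of steadily growing memory use.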

Intel-compatible x86 processors include debug registers capable of watching up to four addresses. Whenever a read or write to any of the watched addresses happens, the program traps, and the debugger gets control. The debug registers offer a powerful way to find out what line of code is overwriting a given byte, once you know which byte is being overwritten.

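In gdb, the notation for using hardware watchpoints is a little odd, because gdb likes to think of its input as a C expression. Note that gdb's watch command stops when the location is written and its value changes; rwatch stops on reads, and awatch stops on either. If you want to stop when address 0xABCDEF is written, then at the gdb prompt type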

  (gdb) watch *((int*)0xABCDEF)

One difficulty is that you can't begin watching an address until the memory it refers to has been mapped (requested from the operating system for use by the program). The usual solution is to step through the program at a rather coarse granularity (skipping over most function calls) until you find a point in time where the address is mapped but has not yet been trashed. Add the watchpoint, then let the program run until the address is accessed.

Suppose I have a program with a suspected memory error. I compile it with the debug malloc.c, and when I run it I see:

  $ ./tmalloc
  trashed 1 bytes
  tmalloc: malloc.c:1591: checkZones: Assertion `!"right allocated zone trashed"' failed.
  Aborted

I first run the program in the debugger to find the offending address:

  (gdb) run
  Starting program: /home/scott/wrk/cplr/smbase/tmalloc
  trashed 1 bytes
  tmalloc: malloc.c:1591: checkZones: Assertion `!"right allocated zone trashed"' failed.

  Program received signal SIGABRT, Aborted.
  0x400539f1 in __kill () from /lib/libc.so.6
  (gdb) up
  #1  0x400536d4 in raise (sig=6) at ../sysdeps/posix/raise.c:27
  27      ../sysdeps/posix/raise.c: No such file or directory.
  (gdb) up
  #2  0x40054e31 in abort () at ../sysdeps/generic/abort.c:88
  88      ../sysdeps/generic/abort.c: No such file or directory.
  (gdb) up
  #3  0x4004dfd2 in __assert_fail () at assert.c:60
  60      assert.c: No such file or directory.
  (gdb) up
  #4  0x8048d55 in checkZones (p=0x8050838 "\016\001", bytes=270)
      at malloc.c:1591
  (gdb) print p[bytes-1-i]
  $1 = 7 '\a'                 <----- trashed! should be 0xAA
  (gdb) print p+bytes-1-i
  $2 = (unsigned char *) 0x80508c6 "\a", '\252' <repeats 127 times>
  (gdb)                  ^^^^^^^^^
                         this is the trashed address

Now I restart the program and attempt to set a hardware watchpoint:

  (gdb) break main
  Breakpoint 1 at 0x8048b91: file tmalloc.c, line 81.
  (gdb) run
  The program being debugged has been started already.
  Start it from the beginning? (y or n) y

  Starting program: /home/scott/wrk/cplr/smbase/tmalloc

  Breakpoint 1, main () at tmalloc.c:81
  (gdb) watch *((int*)0x80508c6)
  Cannot access memory at address 0x80508c6
  (gdb)

Ok, the memory isn't mapped yet. Single-stepping through main a few times, I find a place where I can insert the watchpoint but the memory in question hasn't yet been trashed. When I then continue the program, the debugger next stops at the bug.

  (gdb) watch *((int*)0x80508c6)
  Hardware watchpoint 3: *(int *) 134547654
  (gdb) c
  Continuing.
  Hardware watchpoint 3: *(int *) 134547654

  Old value = -1431655766
  New value = -1431655929
  offEnd () at tmalloc.c:33
  (gdb) print /x -1431655766
  $1 = 0xaaaaaaaa              <--- what it should be
  (gdb) print /x -1431655929
  $2 = 0xaaaaaa07              <--- what it became after trashing
  (gdb) list
  28
  29      void offEnd()
  30      {
  31        char *p = malloc(10);
  32        p[10] = 7;    // oops       <--- the bug
  33        free(p);
  34      }
  35
  36      void offEndCheck()
  37      {
  (gdb)

In this small program the bug would have been obvious upon inspection, but the technique of course generalizes to cases that are much more complicated.

As mentioned above, a debug heap shouldn't re-use memory. Going one step further, my debug malloc.c overwrites free()'d memory with another known pattern (but does not actually free it). Then, if the program continues to use the memory the mistake will become clear, especially if it tries to interpret the values it finds as pointers (they'll segfault). Double-deallocation is also easy to identify with this scheme.
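A rough sketch of the overwrite-on-free idea follows. Again, this is an illustration and not the code in malloc.c: the names pz_malloc and pz_free, the 0xDD fill byte, and the two tag values are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FREED_FILL 0xDD          /* byte written over freed payloads (assumed value) */
#define TAG_LIVE   0x4C495645UL  /* header tag while the block is allocated */
#define TAG_DEAD   0x44454144UL  /* header tag once the block has been freed */

/* Header placed just before each payload; the extra members only force alignment. */
typedef union {
    struct { size_t size; unsigned long tag; } info;
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} pz_header;

/* Allocate `size` bytes preceded by a header that records size and state. */
void *pz_malloc(size_t size)
{
    pz_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.size = size;
    h->info.tag = TAG_LIVE;
    return h + 1;
}

/* Returns 0 on success, -1 if `p` does not carry a live tag
   (already freed, or not allocated by pz_malloc). */
int pz_free(void *p)
{
    pz_header *h;

    if (p == NULL)
        return 0;
    h = (pz_header *)p - 1;
    if (h->info.tag != TAG_LIVE)
        return -1;
    h->info.tag = TAG_DEAD;
    memset(p, FREED_FILL, h->info.size);  /* poison; the block is never returned to malloc */
    return 0;
}
```

After pz_free(p), every byte of the old payload reads 0xDD, so a stale pointer fetched from it is unlikely to be valid, and a second pz_free(p) returns -1 instead of corrupting the heap. Checking the tag of a pointer that never came from pz_malloc reads memory the wrapper does not own, so that case is only caught on a best-effort basis.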

I usually debug memory leaks by printing statistics about calls to malloc and free before and after certain sections of code. If there are more calls to malloc, but the code isn't supposed to be creating long-lived data, then that points to a potential problem. This doesn't easily generalize to long-running programs, but if the program can be broken into units and the leak properties of each unit checked in isolation, most leaks can be found relatively easily.
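The before-and-after bookkeeping can be sketched with two counters. The names below (counted_malloc, counted_free, outstanding_blocks, blocks_leaked_by) and the deliberately leaky unit are hypothetical, invented for this example; a real debug heap would keep the counters inside malloc and free themselves.

```c
#include <stdio.h>
#include <stdlib.h>

static unsigned long malloc_calls = 0;
static unsigned long free_calls = 0;

/* malloc wrapper that counts successful allocations. */
void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        malloc_calls++;
    return p;
}

/* free wrapper that counts deallocations of non-null pointers. */
void counted_free(void *p)
{
    if (p != NULL)
        free_calls++;
    free(p);
}

/* Blocks allocated but not yet freed. */
long outstanding_blocks(void)
{
    return (long)malloc_calls - (long)free_calls;
}

/* Run one unit of work and report how many blocks it left behind. */
long blocks_leaked_by(void (*unit)(void))
{
    long before = outstanding_blocks();
    unit();
    long leaked = outstanding_blocks() - before;
    fprintf(stderr, "unit left %ld block(s) allocated\n", leaked);
    return leaked;
}

/* Hypothetical unit that forgets to free one of its two blocks. */
void leaky_unit(void)
{
    char *a = counted_malloc(16);
    char *b = counted_malloc(32);
    counted_free(a);
    (void)b;  /* never freed: this is the leak */
}
```

Calling blocks_leaked_by(leaky_unit) reports a difference of one block. A nonzero result is only a hint: a unit that is supposed to build long-lived data will also show more mallocs than frees.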

The C and C++ languages are much-maligned for lack of memory safety, but too often this is seen as a greater problem than it really is (setting security issues aside for the moment). Debugging memory errors requires a different approach than debugging other kinds of errors, but with a little practice they can actually be easier and faster to find, simply because the same techniques (and tools!) can be used over and over.

I'm not the first or last to write about methods for debugging memory errors. Here are some links to other people who aren't the first or last either (actually, only the first link really matches this description).

  • Debugging Tools for Dynamic Storage Allocation and Memory Management: Ben Zorn's long list of tools people have written to help debug memory errors.
  • Doug Lea's malloc: Doug Lea's implementation of malloc.
  • malloc.c: My modified version of Doug Lea's malloc, version 2.7.0. I've added:
    • -DDEBUG_HEAP: don't re-use memory, put empty zones on both sides of allocated space, overwrite deallocated space
    • Statistics to track the number of calls to malloc and free.
    • A heap walker interface.
    • -DTRACE_MALLOC_CALLS: print a message to stderr on every malloc and free
  • The above malloc.c also needs the header ckheap.h. That's an oversight I plan to correct, but in the meantime this should be enough to compile malloc.c.
  • gdb: The GNU debugger. The de-facto standard on Linux, for better or worse.
  • Rational: The makers of Purify, one of the best-known tools for finding memory errors. Purify doesn't require recompiling the program, which certainly has its advantages, but as such it is limited in the ways it can make memory bugs more robust. I think sometimes people reach for a heavyweight solution like Purify when a simple debug heap would be faster and easier.
  • CCured: I'd be remiss if I didn't mention CCured, a research project I've done quite a bit of work on. CCured instruments the entire program so it can catch a wide variety of bugs, in a way that is sound: if CCured does not report a problem, then no problem occurred during that run of the program. I can't recommend it as the first solution to reach for during debugging, since it takes a fair bit of time and effort to get a program working under CCured. But in the long run, if you can use CCured, it provides a level of assurance well beyond that of any other current technique.

Introduction

The usual implementations of malloc and free are unforgiving of errors in their callers' code, including cases where the programmer overflows an array, forgets to free memory, or frees a memory block twice. Such an error often does not affect the program immediately: the damage shows up only when the corrupted memory is used later (in the case of overwrites), or allocated but unused blocks gradually accumulate. Thus, debugging can be extremely difficult.

In this assignment, you will write a wrapper for the malloc package that will catch errors in the code that calls malloc and free. The skills you will have learned upon the completion of this exercise are pointer arithmetic and a greater understanding of the consequences of subtle memory mistakes.

Logistics

Unzip debugging_malloc.zip into an empty directory. The files contained are as follows:

  • debugmalloc.c: Contains the implementation of the three functions you will be writing. This is the one file you will be editing and handing in.
  • debugmalloc.h: Contains the declaration of the functions, as well as the macros that will call them.
  • driver.c: Contains the main procedure and the code that will be calling the functions in the malloc package.
  • dmhelper.c, dmhelper.h: Contain the helper functions and macros that you will be calling from your code.
  • grader.pl: Perl script that runs your code for the various test cases and gives you feedback based on your current code.
  • debugmalloc.dsp: Exercise 3 project file.
  • debugmalloc.dsw: Exercise 3 workspace file.
  • tailor.h, getopt.c, getopt.h: Tools that are used only by the driver program for I/O purposes. You will not need to know what the code in these files does.
  • Others: Required by Visual C++. You do not need to understand their purpose.

Specification

Programs that use this package will call the macros MALLOC and FREE. MALLOC and FREE are used exactly the same way as the malloc() and free() functions in the standard C malloc package.
That is, the line

  void *ptr = MALLOC(n);

will allocate a payload of at least n bytes, and ptr will point to the front of this block. The line

  FREE(ptr);

will cause the payload pointed to by ptr to be deallocated and become available for later use. The macros are defined as follows:

  #define MALLOC(s) MyMalloc(s, __FILE__, __LINE__)
  #define FREE(p) MyFree(p, __FILE__, __LINE__)

The __FILE__ macro resolves to the filename and __LINE__ resolves to the current line number. The debugmalloc.c file contains three functions that you are required to implement, as shown:

  void *MyMalloc(size_t size, char *filename, int linenumber);
  void MyFree(void *ptr, char *filename, int linenumber);
  int AllocatedSize();

Using the macros above allows MyMalloc() and MyFree() to be called with the filename and line number of the actual MALLOC and FREE calls, while retaining the same form as the usual malloc package. By default, MyMalloc() and MyFree() simply call malloc() and free(), respectively, and return immediately. AllocatedSize() should return the number of bytes currently allocated by the user: the sum of the bytes requested through MALLOC minus the bytes freed using FREE. By default, it simply returns 0 and thus is unimplemented. The definitions are shown below:

  void *MyMalloc(size_t size, char *filename, int linenumber) {
    return (malloc(size));
  }

  void MyFree(void *ptr, char *filename, int linenumber) {
    free(ptr);
  }

  int AllocatedSize() {
    return 0;
  }

Your job is to modify these functions so that they will catch a number of errors that will be described in the next section. There are also two optional functions in the debugmalloc.c file that you can implement:

  void PrintAllocatedBlocks();
  int HeapCheck();

PrintAllocatedBlocks() should print out information about all currently allocated blocks. HeapCheck() should check all the blocks for possible memory overwrites.
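The way such macros capture the call site can be seen in a small standalone sketch. It deliberately uses its own hypothetical names (TraceMalloc, TRACE_MALLOC, trace_demo) rather than the assignment's MyMalloc, and it only logs and records the location instead of implementing any of the required checks.

```c
#include <stdio.h>
#include <stdlib.h>

/* Call site recorded by the most recent TRACE_MALLOC (for demonstration only). */
static const char *last_file = "";
static int last_line = 0;

/* Log and remember where the allocation was requested, then allocate. */
void *TraceMalloc(size_t size, const char *filename, int linenumber)
{
    last_file = filename;
    last_line = linenumber;
    fprintf(stderr, "malloc(%lu) called at %s, line %d\n",
            (unsigned long)size, filename, linenumber);
    return malloc(size);
}

/* The macro expands at the call site, so __FILE__ and __LINE__ name the caller. */
#define TRACE_MALLOC(s) TraceMalloc((s), __FILE__, __LINE__)

/* Returns 1 if the recorded line is the line of the TRACE_MALLOC call below. */
int trace_demo(void)
{
    int call_line = __LINE__ + 1;
    void *p = TRACE_MALLOC(8);
    int ok = (last_line == call_line);
    free(p);
    return ok;
}
```

Had TraceMalloc used __LINE__ in its own body instead, every report would name the same line inside the wrapper, which is why the location has to be passed in through a macro.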

Implementation Details

To catch the errors, you will allocate a slightly larger amount of space and insert a header and a footer around the "requested payload". MyMalloc() will insert information into this area, and MyFree() will check that the information has not changed. The organization of the complete memory block is as shown below:

  [ Header: Checksum ... Fence ][ Payload ][ Footer: Fence ]

Note: MyMalloc() returns a pointer to the payload, not the beginning of the whole block. Also, the ptr parameter passed into MyFree(void *ptr) will point to the payload, not the beginning of the block.

Information that you might want to store in this extra (header, footer) area includes:

  • a "fence" immediately around the requested payload with a known value like 0xCCDEADCC, so that you can check if it has been changed when the block is freed
  • the size of the block
  • a checksum for the header to ensure that it has not been corrupted (A checksum of a sequence of bits is calculated by counting the number of "1" bits in the stream. For example, the checksum for "1000100010001000" is 4. It is a simple error detection mechanism.)
  • the filename and line number of the MALLOC() call

The errors that can occur are:

  • Error #1: Writing past the beginning of the user's block (through the fence)
  • Error #2: Writing past the end of the user's block (through the fence)
  • Error #3: Corrupting the header information
  • Error #4: Attempting to free an unallocated or already-freed block
  • Error #5: Memory leak detection (user can use ALLOCATEDSIZE to check for leaks at the end of the program)

To report the first four errors, call one of these two functions:

  void error(int errorcode, char *filename, int linenumber);

errorcode is the number assigned to the error as stated above. filename and linenumber contain the filename and line number of the line (the free call) in which the error is invoked. For example, call error(2, filename, linenumber) if you come across a situation where the footer fence has been changed.

  void errorfl(int errorcode, char *filename_malloc, int linenumber_malloc, char *filename_free, int linenumber_free);

This is the same as error(), except there are two sets of filenames and line numbers, one for the statement in which the block was malloc'd, and the other for the statement in which the block was free'd (and the error was invoked).

The fact that MyMalloc() and MyFree() are given the filename and line number of the MALLOC() and FREE() call can prove to be very useful when you are reporting errors. The more information you print out, the easier it will be for the programmer to locate the error. Use errorfl() instead of error() whenever possible. errorfl() obviously cannot be used in situations where FREE() is called on an unallocated block, since the block was never MALLOC'd.

Note: You will only be reporting errors from MyFree(). None of the errors can be caught in MyMalloc().

In the case of memory leaks, the driver program will call AllocatedSize(), and the grader will look at its return value and possible output. AllocatedSize() should return the number of bytes currently allocated from MALLOC and FREE calls. For example, the code segment:

  void *ptr1 = MALLOC(10), *ptr2 = MALLOC(8);
  FREE(ptr2);
  printf("%d\n", AllocatedSize());

should print out "10".

Once you have gotten to the point where you can catch all of the errors, you can go an optional step further and create a global list of allocated blocks. This will allow you to perform analysis of memory leaks and currently allocated memory. You can implement the void PrintAllocatedBlocks() function, which prints out the filename and line number where all currently allocated blocks were MALLOC()'d. A macro is provided for you to use to print out information about a single block in a readable and gradeable format:

  PRINTBLOCK(int size, char *filename, int linenumber)

Also, you can implement the int HeapCheck() function.
This should check all of the currently allocated blocks and return -1 if there is an error and 0 if all blocks are valid. In addition, it should print out the information about all of the corrupted blocks, using the macro PRINTERROR(int errorcode, char *filename, int linenumber), with errorcode equal to the number of the error (according to the list described earlier) that the block has suffered. You may find that this global list can also allow you to be more specific in your error messages, as it is otherwise difficult to determine the difference between an overwrite of a non-payload area and an attempted FREE() of an unallocated block.

Evaluation

You are given 7 test cases to work with, plus 1 extra for testing a global list. You can type "debugmalloc -t n" to run the n-th test. You can see the code that is being run in driver.c. If you have Perl installed on your machine, use grader.pl to run all the tests and print out a table of results. There are a total of 100 possible points. Here is a rundown of the test cases and desired output (do not worry about the path of the filename):

Test case #1

  Code:
    char *str = (char *) MALLOC(12);
    strcpy(str, "123456789");
    FREE(str);
    printf("Size: %d\n", AllocatedSize());
    PrintAllocatedBlocks();
  Error #: None
  Correct output:
    Size: 0
  Points worth: 10
  Details: 10 points for not reporting an error and returning 0 in AllocatedSize()

Test case #2

  Code:
    char *str = (char *) MALLOC(8);
    strcpy(str, "12345678");
    FREE(str);
  Error #: 2
  Correct output:
    Error: Ending edge of the payload has been overwritten.
    in block allocated at driver.c, line 21
    and freed at driver.c, line 23
  Points worth: 15
  Details: 6 pts for catching error; 3 pts for printing the filename/line numbers; 6 pts for correct error message

Test case #3

  Code:
    char *str = (char *) MALLOC(2);
    strcpy(str, "12");
    FREE(str);
  Error #: 2
  Correct output:
    Error: Ending edge of the payload has been overwritten.
    in block allocated at driver.c, line 28
    and freed at driver.c, line 30
  Points worth: 15
  Details: 6 pts for catching error; 3 pts for printing the filename/line numbers; 6 pts for correct error message

Test case #4

  Code:
    void *ptr = MALLOC(4), *ptr2 = MALLOC(6);
    FREE(ptr);
    printf("Size: %d\n", AllocatedSize());
    PrintAllocatedBlocks();
  Error #: None
  Correct output:
    Size: 6
    Currently allocated blocks:
    6 bytes, created at driver.c, line 34
  Points worth: 15
  Details: 15 pts for not reporting an error and returning 6 from AllocatedSize(); extra for printing out the extra block

Test case #5

  Code:
    void *ptr = MALLOC(4);
    FREE(ptr);
    FREE(ptr);
  Error #: 4
  Correct output:
    Error: Attempting to free an unallocated block.
    in block freed at driver.c, line 43
  Points worth: 15
  Details: 15 pts for catching error; extra for correct error message

Test case #6

  Code:
    char *ptr = (char *) MALLOC(4);
    *((int *) (ptr - 8)) = 8 + (1 << 31);
    FREE(ptr);
  Error #: 1 or 3
  Correct output:
    Error: Header has been corrupted.
    or
    Error: Starting edge of the payload has been overwritten.
    in block allocated at driver.c, line 47
    and freed at driver.c, line 49
  Points worth: 15
  Details: 9 pts for catching error; 6 pts for a correct error message

Test case #7

  Code:
    char ptr[5];
    FREE(ptr);
  Error #: 4
  Correct output:
    Error: Attempting to free an unallocated block.
    in block freed at driver.c, line 54
  Points worth: 15
  Details: 15 pts for recognizing error; extra for printing correct error message

Test case #8 (Optional)

  Code:
    int i;
    int *intptr = (int *) MALLOC(6);
    char *str = (char *) MALLOC(12);
    for(i = 0; i < 6; i++) {
      intptr[i] = i;
    }
    if (HeapCheck() == -1) {
      printf("\nCaught Errors\n");
    }
  Error #: None
  Correct output:
    Error: Ending edge of the payload has been overwritten.
    Invalid block created at driver.c, line 59

    Caught Errors
  Points worth: Extra
  Details: "Caught Errors" indicates that the HeapCheck() function worked correctly. Extra points possible.

Your instructor may give you extra credit for implementing a global list and the PrintAllocatedBlocks() and HeapCheck() functions.
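
The bit-counting checksum described under Implementation Details can be sketched in a few lines. The function name bit_checksum is made up for this example, and the handout does not say which header bytes the checksum should cover, so the caller decides that here.

```c
#include <stddef.h>

/* Count the "1" bits in the `len` bytes starting at `data`. */
unsigned bit_checksum(const void *data, size_t len)
{
    const unsigned char *bytes = data;
    unsigned count = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char b = bytes[i];
        while (b != 0) {
            count += b & 1u;  /* add the lowest bit */
            b >>= 1;          /* then shift it out */
        }
    }
    return count;
}
```

For the two bytes 0x88 0x88, which are the bit string "1000100010001000" from the handout, bit_checksum returns 4. A checksum like this only detects changes that alter the number of set bits, so it is a cheap sanity check and not a strong integrity guarantee.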