Introduction  

  In this article I will discuss a way to build a memory leak detection program for C and C++ applications.
This is not an attempt at writing a fully fledged leak detector but rather an introduction to one (of many) ways of finding leaks.

  The approach I've gone for is library injection, and even though the C++ source code provided is for Linux (tested on Ubuntu 11.10), the method should work on any platform that allows library injection.

What is a leak

  There are many ways in which an application can leak resources, and there are many different resources that can be leaked. In this article I'll only look for memory leaks (not file handles, sockets or any other resource that might be allocated), but even when constraining the scope to just memory there are still many different kinds of leaks. When talking about memory leaks in C++, most people think of scenarios like this (a very simplified one):

void foo()
{
  int* my_pointer = new int;
}   // my_pointer is never deleted, we've just leaked sizeof(int) bytes!

  Memory is allocated but is not deallocated before the last reference to it goes out of scope. This means that the memory is unavailable for use, and for further allocations, until the program terminates.

  But memory leaks can appear in other, more subtle ways as well. Sometimes the memory is still referenced but just not used, such as when items are regularly added to a std::vector without ever being released. If such a std::vector is never cleared and never again looked at by the application, it can be considered a leak even though the memory is still referenced.
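  The second kind of leak can be illustrated with a minimal sketch. The class request_log and the function simulate_requests are hypothetical names made up for this illustration and are not part of leakfinder. In a long-running program such a log would live as long as the process, so everything it retains stays allocated although nothing ever reads it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class, not part of leakfinder: it keeps every message it
// has ever seen, although nothing ever reads them back.
class request_log
{
public:
  void handle(const std::string& message)
  {
    history.push_back(message);   // grows on every call, never cleared
  }

  std::size_t retained() const
  {
    return history.size();
  }

private:
  std::vector<std::string> history;
};

// Simulates a request loop and reports how many entries are retained.
std::size_t simulate_requests(std::size_t count)
{
  request_log log;
  for (std::size_t i = 0; i < count; ++i)
    log.handle("request");

  // Every handled request is still held in memory at this point
  assert(log.retained() == count);
  return log.retained();
}
```

  A tracker that only matches malloc against free will not flag this pattern, since the vector releases its memory when it is eventually destroyed.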

  In this article, for simplicity, I'll only look at the first scenario; allocated memory that is not deallocated. In short, I'll show a way of tracking each call to malloc and free and record information so that they can be married up.

  You might argue that you're using C++ and are therefore allocating and deallocating your memory using new and delete, but in most cases the implementation of those will still call the C versions malloc and free.
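  As a rough illustration of why tracking malloc and free usually covers new and delete too, here is a simplified sketch of the forwarding pattern. The names sketch_new and sketch_delete are hypothetical stand-ins; this is not the source of any particular standard library, and a real operator new also deals with the new-handler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for operator new, forwarding to the C allocator.
void* sketch_new(std::size_t size)
{
  if (size == 0)
    size = 1;                       // a zero-sized request still gets a valid pointer

  void* ptr = std::malloc(size);    // an injected malloc would see this call
  if (ptr == nullptr)
    throw std::bad_alloc();
  return ptr;
}

// Hypothetical stand-in for operator delete.
void sketch_delete(void* ptr) noexcept
{
  std::free(ptr);                   // free(nullptr) is a no-op
}
```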

Library Injection

  Library injection is when a user tells the OS to look in LibraryA for a function before looking in the "standard" libraries, where LibraryA is a library containing a replacement for a system function. On some Linux and Unix distributions this can be achieved using the LD_PRELOAD environment variable to set the library that is to be injected.

  To track memory allocations and deallocations, a shared library containing replacements for malloc and free must be created and then injected, so that whenever an allocation of memory is requested the custom version is hit first.

  The first step is to create a C++ file, leakfinder.cpp, that will contain the two functions:

#include<iostream>

void* malloc(size_t size)
{
  std::cout << "malloc called" << std::endl;
  // Somehow track this allocation
}

void free(void* ptr)
{
  std::cout << "free called" << std::endl;
  // Somehow marry up this deallocation to a previous allocation
}

  The above example can be compiled into a shared library using g++ (which is the compiler I'll use for the C++ code throughout this article) by running:

  ~/SomeFolder$g++ -shared -fPIC leakfinder.cpp -o leakfinder.so -ldl

  This can be tested using an example C program that is built to leak. The example program is called c_example and it looks like this:

#include <stdlib.h>
#include <stdio.h>

void foo(int size)
{
  int* data = malloc(sizeof(int) * size);
  // Uncomment this to stop leak
  //free(data);
}

void bar(int size)
{
  char *data = malloc(sizeof(char) * size);
  foo(size);
  // Uncomment this to stop leak
  //free(data);
}

void foobar(int size)
{
  bar(size);  
}

int main(void)
{
  printf("leakfinder C example app\n");
  printf("This application is expected to leak\n");

  foobar(8);
  foobar(16);

  printf("leakfinder C example app all done\n");
  return 0;
}

  Important!
    It is important to test the injection in a terminal other than the one used to compile the source code, or the injection might (and most likely will) interfere with the compiler and any other commands that use malloc or free.
    Therefore, when working with this example make sure you have one terminal open for compiling and building (both the leak finder and the example applications), and one where the LD_PRELOAD environment variable is set and where you can run the test applications.

  Another way of making sure the preload only applies to the c_example application is to set the variable for that single command:

    LD_PRELOAD=./leakfinder.so ./c_example

  Set LD_PRELOAD in your terminal to point to leakfinder.so and run the c_example test application using:

    ~/SomeFolder$export LD_PRELOAD=./leakfinder.so 
 
    ~/SomeFolder$./c_example 

  When first run it turns out that no allocations or deallocations are detected. This is because of the linkage: as g++ (a C++ and not a C compiler) was used to compile leakfinder.cpp, it has applied name mangling, which means the function names do not match the intended system functions malloc and free. To resolve this issue the functions have to be declared with C linkage:

#include<iostream>

extern "C" void* malloc(size_t size)
{
  std::cout << "malloc called" << std::endl;
  // Somehow track this allocation
}

extern "C" void free(void* ptr)
{
  std::cout << "free called" << std::endl;
  // Somehow marry up this deallocation to a previous allocation
}

  When the leak finder with the correct linkage is preloaded and the example is run, the output shows loads and loads of intercepted allocations and deallocations.
  This is because even though the test application only explicitly allocates a few things, other parts also allocate and deallocate memory, such as the printf calls and even the cout calls in the injected code.

  The printout ends with a segmentation fault, which is of course because you cannot hope to replace an actual memory allocation with a print statement and expect things to still work, but the method is proved to work.

Actual implementation

  As it is clearly very important for the overriding malloc and free functions to still do the work they are intended to do (as in allocating and deallocating memory on the heap), the injected code must look up pointers to the actual implementations and delegate the calls to these before tracking them.

Finding it

  The actual implementation is found by calling dlsym (declared in dlfcn.h), which returns the address of a symbol in a loaded library. Passing the constant RTLD_NEXT instead of a library handle makes dlsym return the next occurrence of the symbol in the search order after the current library, which is the version that would have been used had leakfinder not been injected. That pointer can be stored in a static variable to cache it, so the lookup doesn't have to take place every time:

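#include <dlfcn.h>    // dlsym, dlerror, RTLD_NEXT
#include <iostream>   // cout, cerr

using namespace std;

static void* (*sys_malloc)(size_t) = 0;
static void (*sys_free)(void*) = 0;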

static void initialize_functions(void)
{
  sys_malloc = reinterpret_cast<void*(*)(size_t)>(dlsym(RTLD_NEXT, "malloc"));
  if (sys_malloc == 0)
    cerr << "leakfinder failed to read malloc function; " << dlerror() << endl;

  sys_free = reinterpret_cast<void(*)(void*)>(dlsym(RTLD_NEXT, "free"));
  if (sys_free == 0)
    cerr << "leakfinder failed to read free function; " << dlerror() << endl;
}

extern "C" void* malloc(size_t size)
{
  cout << "malloc called" << endl;
  if (sys_malloc == 0)
    initialize_functions();

  void* ptr = sys_malloc(size);

  return ptr;
}

extern "C" void free(void* ptr)
{
  cout << "free called" << endl;

  if (sys_free == 0)
    initialize_functions();

  sys_free(ptr);
}

  In the code above the system (or standard) versions of malloc and free are looked up using dlsym(RTLD_NEXT, [name of function]) and then stored in sys_malloc and sys_free respectively.
  Once the system versions of the functions are cached they're called to perform the allocation or deallocation, and after that the leak finder can track the information required to compile a list of leaks.

Allocation Info

  To allow the user to fix leaks, each leak needs to be associated with additional information that makes it easy to track down and correct. In leakfinder I track four different things:

  • References
  • Stacktrace
  • Size
  • Thread

References

  In order to marry up a deallocation to a previous allocation there needs to be something that is unique and consistent across malloc and free calls, and one thing that can be used is the address of the allocated memory. It has to be unique as no two live allocations are allowed to get the same piece of memory, and it's consistent across calls as it's the return value of the allocation and the parameter to the deallocation.
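  A small sketch of using the address as a key follows. The helper names address_key and live_allocations_have_distinct_keys are hypothetical, and std::uintptr_t is used here as an assumption about a suitable integer type for holding a pointer value. Note that the uniqueness only holds while an allocation is alive; once freed, the same address may be handed out again, which is why a tracker has to drop the entry on free.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: turns a pointer into an integer key.
std::uintptr_t address_key(const void* ptr)
{
  return reinterpret_cast<std::uintptr_t>(ptr);
}

// Two allocations that are alive at the same time never share an address.
bool live_allocations_have_distinct_keys()
{
  void* first = std::malloc(16);
  void* second = std::malloc(16);
  assert(first != nullptr && second != nullptr);

  const bool distinct = address_key(first) != address_key(second);

  std::free(first);
  std::free(second);
  return distinct;
}
```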

Stacktrace

  In order for a memory leak detector to be useful it needs to be able to tell the user where the leaked memory was allocated. One way to get a stacktrace in Linux is to use the backtrace and backtrace_symbols functions in execinfo.h.

  The backtrace function gets the return addresses of all the functions that are currently active on the stack in a particular thread, which is essentially the stacktrace.
  In order to get a more readable version of the stacktrace, the return addresses from backtrace can be fed into backtrace_symbols to get the names of the functions on the stack.

  void* frames[max_frame_depth];
  size_t stack_size = backtrace(frames, max_frame_depth);
  char** stacktrace = backtrace_symbols(frames, stack_size);
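  The three lines above can be wrapped into a self-contained helper, sketched below under the assumption of a glibc system where execinfo.h is available. The name capture_stacktrace is hypothetical. The array returned by backtrace_symbols is obtained with malloc, so only that array (not the individual strings) is released with free.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

#include <execinfo.h>

// Hypothetical helper: returns the current call stack as strings.
std::vector<std::string> capture_stacktrace()
{
  const int max_frame_depth = 128;
  void* frames[max_frame_depth];

  // backtrace returns the number of frames actually written
  const int depth = backtrace(frames, max_frame_depth);

  std::vector<std::string> result;
  char** symbols = backtrace_symbols(frames, depth);
  if (symbols == nullptr)
    return result;               // backtrace_symbols could not allocate

  for (int i = 0; i < depth; ++i)
    result.push_back(symbols[i]);

  // Only the array itself is freed, not the individual strings
  std::free(symbols);
  return result;
}
```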

Size and Thread

  Less important but still sometimes useful is to record the size of the allocation and which thread it was allocated on.

  The size is passed in as a size_t argument to malloc so that is easy enough to grab and record.

  Recording the current thread id might be slightly harder depending on the thread library used. In this example I've used the pthread library, so I get the thread id as a pthread_t by calling pthread_self().
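  When a recorded pthread_t is later compared against another thread id, the portable way is pthread_equal rather than ==, since POSIX treats pthread_t as an opaque type. A minimal sketch, with the hypothetical helper name allocated_on_current_thread:

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical helper: true when the recorded id belongs to the calling thread.
bool allocated_on_current_thread(pthread_t recorded)
{
  // pthread_equal returns non-zero when both ids refer to the same thread
  return pthread_equal(recorded, pthread_self()) != 0;
}
```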

allocation_info

  The information listed above (reference, stacktrace, size and thread) is stored in leakfinder in a class called allocation_info.

allocation_info.hpp
#ifndef __allocation_info
#define __allocation_info

#include <vector>
#include <string>

#include <pthread.h>

namespace bornander
{
  namespace memory
  {
    class allocation_info
    {
    public:
      typedef long long address_type;

    private:
      allocation_info::address_type address;
      size_t size;
      std::vector<std::string> stacktrace;
      pthread_t thread_id;

    public:
      allocation_info(void* address, size_t size, char** stacktrace, size_t depth, pthread_t thread_id);

      allocation_info::address_type get_address() const;
      size_t get_size() const;
      std::vector<std::string> get_stacktrace() const;
      pthread_t get_thread_id() const;
    };
  }
}

#endif

allocation_info.cpp
#include "allocation_info.hpp"

namespace bornander
{
  namespace memory
  {
    allocation_info::allocation_info(void* address, size_t size, char** stacktrace, size_t depth, pthread_t thread_id)
    {
      this->address = reinterpret_cast<allocation_info::address_type>(address);
      this->size = size;
      this->thread_id = thread_id;

      // Skip the first frame as that is the overridden malloc function
      for (size_t i = 1; i < depth; ++i)
      {
        std::string frame = stacktrace[i];
        this->stacktrace.push_back(frame);
      }
    }

    allocation_info::address_type allocation_info::get_address() const
    {
      return address;
    }

    size_t allocation_info::get_size() const
    {
      return size;
    }

    std::vector<std::string> allocation_info::get_stacktrace() const
    {
      return stacktrace;
    }

    pthread_t allocation_info::get_thread_id() const
    {
      return thread_id;
    }
  }
}

Tracking leaks

  Armed with a way of intercepting allocations (library injection) and a way to store allocation information (allocation_info) we're now ready to implement our basic memory leak detector.

Allocations

  The first problem with tracking allocations inside the allocation function is that memory needs to be allocated to store the allocation_info, which means that every allocation requires another allocation. Since the additional allocation uses malloc as well, the approach leads to infinite recursion and eventually a stack overflow.

  The solution is to only track allocations that originate from outside leakfinder. By declaring a static bool called isExternalSource that is set to false before the allocation is recorded and back to true when done, it is possible to exclude allocations that arise from recording the source allocation. The overriding malloc function then looks something like this:

static bool isExternalSource = true;
static void* (*sys_malloc)(size_t) = 0;

extern "C" void* malloc(size_t size)
{
  // Make sure we're initialized
  if (sys_malloc == 0)
    initialize_functions();
  // Call the actual malloc and keep the result
  void* ptr = sys_malloc(size);

  if (isExternalSource)
  {
    isExternalSource = false;
    // Record the details of this allocation in an allocation_info
    isExternalSource = true;
  }

  return ptr;
}

  This takes care of the exclusion of internal allocations but suffers from threading issues, as the static isExternalSource might be read and written by two threads at the same time, causing undefined behaviour.

  By guarding the inside of the if-statement with a lock (using pthread threads) the malloc method changes to this;

static pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER;

static bool isExternalSource = true;
static void* (*sys_malloc)(size_t) = 0;

extern "C" void* malloc(size_t size)
{
  // Make sure we're initialized
  if (sys_malloc == 0)
    initialize_functions();

  // Call the actual malloc and keep the result
  void* ptr = sys_malloc(size);

  if (isExternalSource)
  {
    pthread_t thread_id = pthread_self();
    pthread_mutex_lock(&cs_mutex);
    isExternalSource = false;

    // Record the allocation here
    isExternalSource = true;
    pthread_mutex_unlock(&cs_mutex);
  }

  return ptr;
}

  Now the malloc implementation is thread safe (or thread-safeish; it still suffers from some issues but for the sake of simplicity I'm going to keep it this way for this article).
  The rest of the implementation is a matter of grabbing the stacktrace and storing it along with the reference, size and thread id. The size is already passed in and the thread id has been grabbed using pthread_self. The remaining thing to store is simply the reference, which is the address held in ptr as returned by the actual implementation of malloc.

  All of the above yields a malloc function that looks like this;

static pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER;

static size_t allocation_count = 0;
static vector<allocation_info> allocations;

static const size_t max_frame_depth = 128;
static bool isExternalSource = true;
static void* (*sys_malloc)(size_t) = 0;

extern "C" void* malloc(size_t size)
{
  if (sys_malloc == 0)
    initialize_functions();

  void* ptr = sys_malloc(size);

  if (isExternalSource)
  {
    pthread_t thread_id = pthread_self();
    pthread_mutex_lock(&cs_mutex);
    isExternalSource = false;

    // Used for summary statistics
    ++allocation_count;

    // Grab stacktrace
    void* frames[max_frame_depth];
    size_t stack_size = backtrace(frames, max_frame_depth);
    char** stacktrace = backtrace_symbols(frames, stack_size);
    allocation_info allocation(ptr, size, stacktrace, stack_size, thread_id);

    allocations.push_back(allocation);

    // Make sure to release the memory allocated by backtrace_symbols
    sys_free(stacktrace);  
    isExternalSource = true;

    pthread_mutex_unlock(&cs_mutex);
  }

  return ptr;
}
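  One of the remaining threading issues is that the shared isExternalSource flag is read before the lock is taken, and that allocations made by other threads while one thread is recording are skipped. A possible alternative, sketched here as an assumption rather than as what leakfinder does, is a per-thread reentrancy guard using C++11 thread_local. The names inside_tracker, reentrancy_guard and count_tracked_levels are hypothetical. In a real preloaded library, access to thread-local storage may itself need care, so treat this only as an illustration of the guard pattern.

```cpp
#include <cassert>

// Per-thread flag: true while this thread is inside the tracking code.
thread_local bool inside_tracker = false;

// Hypothetical RAII guard: only the outermost guard on a thread "enters".
class reentrancy_guard
{
public:
  reentrancy_guard() : entered_(!inside_tracker)
  {
    if (entered_)
      inside_tracker = true;
  }

  ~reentrancy_guard()
  {
    if (entered_)
      inside_tracker = false;   // only the outermost guard resets the flag
  }

  bool entered() const { return entered_; }

private:
  bool entered_;
};

// Simulates an allocation that triggers another allocation while it is
// being recorded, and counts how many of the two levels get tracked.
int count_tracked_levels()
{
  int tracked = 0;
  reentrancy_guard outer;
  if (outer.entered())
  {
    ++tracked;                  // the external allocation is recorded
    reentrancy_guard inner;     // allocation made while recording
    if (inner.entered())
      ++tracked;                // not reached: the inner call is skipped
  }
  return tracked;
}
```

  Only the outer level is counted, and because the flag is per thread, a recording in progress on one thread does not hide allocations made on another. The shared allocations container would still need the mutex.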

Deallocations

  Marrying up a deallocation to a previous allocation is simple. The free function uses the same mechanism to skip internal calls and to stay thread safe, and in addition to this it's just a matter of finding the allocation_info in the allocations vector. The unique key is the reference or address:

extern "C" void free(void* ptr)
{
  if (sys_free == 0)
    initialize_functions();

  allocation_info::address_type address = reinterpret_cast<allocation_info::address_type>(ptr);
  sys_free(ptr);

  if (isExternalSource)
  {
    pthread_mutex_lock(&cs_mutex);
    isExternalSource = false;
    for (int i = 0; i < allocations.size(); ++i)
    {
      allocation_info allocation = allocations[i];
      if (allocation.get_address() == address)
      {
        allocations.erase(allocations.begin() + i);
        break;
      }
    }
    isExternalSource = true;
    pthread_mutex_unlock(&cs_mutex);
  }
}

Summing it up

  After all the allocations and deallocations have been tracked and matched off, the ones that were never deallocated must be reported on, so when is a suitable time to do that?
  Since it's pretty much technically impossible to tell if a leak has happened before the program terminates, leakfinder uses program exit to sum up the leaks. One might argue that the leak happens when the pointer referencing the memory goes out of scope if the memory has not been deallocated, but since it is possible to store the pointer value in just about any other data structure that can be hard to rely on.

C style destructor

  There are different ways to hook into the termination of a program, which one to pick depends on platform and personal taste.   
  One approach is to use a pragma directive;

  #pragma fini (some_exit_handler)

  but for leakfinder I've gone for a C style destructor;

  static void compile_allocation() __attribute__((destructor));

  Using this approach, the function compile_allocation is executed when the shared library is unloaded, which is typically at program exit.
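  For comparison, here is a sketch that puts the destructor attribute (a GCC and Clang extension) next to the standard std::atexit registration. The atexit variant is an assumption added for illustration and is not what leakfinder uses; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Runs when the containing program or shared library is unloaded.
// GCC/Clang extension, the same mechanism leakfinder relies on.
static void report_from_destructor() __attribute__((destructor));

static void report_from_destructor()
{
  std::puts("destructor attribute: writing report");
}

// Standard alternative: a handler registered with std::atexit.
static void report_from_atexit()
{
  std::puts("atexit handler: writing report");
}

// Hypothetical helper: returns true when the handler was registered.
bool install_exit_report()
{
  return std::atexit(report_from_atexit) == 0;
}
```

  The attribute needs no registration call, which is convenient for an injected library that has no natural start-up point of its own.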

  Since all the allocations that were never deallocated are held in the vector allocations at program exit, the work of the compile_allocation function is just to iterate through the leaks and somehow output the leak information.
  I am not entirely sure where the best place to output the leak information is. In certain scenarios a file would be convenient, but for simplicity I've decided to let leakfinder just dump the summary to standard out.

  To avoid extra work isExternalSource needs to be set to false at the beginning of compile_allocation, as printing the summary requires allocations to take place.

  To print thread ids and addresses in hexadecimal, the stream manipulators hex and dec (available through iostream) are used.

void compile_allocation()
{
  isExternalSource = false;
  if (allocations.empty())
  {
    cout << "leakfinder found no leaks, not one of the " << allocation_count;
    cout << " allocations was not released." << endl;
  }
  else
  {
    cout << "leakfinder detected that " << allocations.size();
    cout << " out of " << allocation_count << " allocations was not released." << endl;
    for (int i = 0; i < allocations.size(); ++i)
    {
      allocation_info allocation = allocations[i];
      cout << "Leak " << (i+1) << "@0x" << hex << allocation.get_thread_id() << dec;
      cout << "; leaked " << allocation.get_size() << " bytes at position 0x";
      cout << hex << allocation.get_address() << dec << endl;

      vector<string> stacktrace = allocation.get_stacktrace();
      for (int j = 0; j < stacktrace.size(); ++j)
      {
        cout << "\t" << stacktrace[j] << endl;
      }
    }
  }
}
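  The switching between hex and dec in the summary can be isolated in a small sketch. The helper name format_leak is hypothetical. The point it shows is that hex is sticky: without switching back with dec, every following number on the stream would also be printed in hexadecimal.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper mirroring part of one summary line.
std::string format_leak(unsigned long long address, std::size_t size)
{
  std::ostringstream out;
  out << "position 0x" << std::hex << address << std::dec;  // back to decimal
  out << "; leaked " << size << " bytes";
  return out.str();
}
```

  For example, format_leak(0x8e632d0, 32) yields "position 0x8e632d0; leaked 32 bytes"; without the dec the size would have been printed as 20.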

Trying it out

  To try it out, first build the leakfinder shared library by typing (there's also a makefile included for those who don't care for compiling by hand):

  g++ -shared -fPIC allocation_info.cpp leakfinder.cpp -o leakfinder.so -lpthread -ldl

  Then build the c_example application by typing:

  cc c_example.c -o c_example

  Then, open up a different terminal and set the LD_PRELOAD;

  export LD_PRELOAD=./leakfinder.so

  Lastly, run the c_example in the terminal where LD_PRELOAD was set;

  ./c_example

  Running that application should produce an output similar to this;

  fredrik@ubuntu-01:~/Development/leakfinder$ ./c_example
  leakfinder C example app
  This application is expected to leak
  leakfinder C example app all done
  leakfinder detected that 4 out of 4 allocations was not released.
  Leak 1@0xb77876d0; leaked 8 bytes at position 0x8e63020
    ./c_example() [0x804843e]
    ./c_example() [0x804845f]
    ./c_example() [0x804848e]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xaef113]
    ./c_example() [0x8048381]
  Leak 2@0xb77876d0; leaked 32 bytes at position 0x8e632d0
    ./c_example() [0x8048428]
    ./c_example() [0x804844c]
    ./c_example() [0x804845f]
    ./c_example() [0x804848e]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xaef113]
    ./c_example() [0x8048381]
  Leak 3@0xb77876d0; leaked 16 bytes at position 0x8e63230
    ./c_example() [0x804843e]
    ./c_example() [0x804845f]
    ./c_example() [0x804849a]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xaef113]
    ./c_example() [0x8048381]
  Leak 4@0xb77876d0; leaked 64 bytes at position 0x8e63318
    ./c_example() [0x8048428]
    ./c_example() [0x804844c]
    ./c_example() [0x804845f]
    ./c_example() [0x804849a]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xaef113]
    ./c_example() [0x8048381]
  fredrik@ubuntu-01:~/Development/leakfinder$

  While that has indeed first printed the output of c_example followed by the summary of the four leaks, the stacktrace isn't very easy to decipher. That is because the program is lacking symbols, but this can be corrected by compiling it again (in the first terminal) with the option -rdynamic:

  cc -rdynamic c_example.c -o c_example

  A more useful stacktrace is provided;

  fredrik@ubuntu-01:~/Development/leakfinder$ ./c_example
  leakfinder C example app
  This application is expected to leak
  leakfinder C example app all done
  leakfinder detected that 4 out of 4 allocations was not released.
  Leak 1@0xb76fb6d0; leaked 8 bytes at position 0x91d9020
    ./c_example(bar+0x11) [0x804860e]
    ./c_example(foobar+0x11) [0x804862f]
    ./c_example(main+0x2d) [0x804865e]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x144113]
    ./c_example() [0x8048551]
  Leak 2@0xb76fb6d0; leaked 32 bytes at position 0x91d92f8
    ./c_example(foo+0x14) [0x80485f8]
    ./c_example(bar+0x1f) [0x804861c]
    ./c_example(foobar+0x11) [0x804862f]
    ./c_example(main+0x2d) [0x804865e]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x144113]
    ./c_example() [0x8048551]
  Leak 3@0xb76fb6d0; leaked 16 bytes at position 0x91d9258
    ./c_example(bar+0x11) [0x804860e]
    ./c_example(foobar+0x11) [0x804862f]
    ./c_example(main+0x39) [0x804866a]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x144113]
    ./c_example() [0x8048551]
  Leak 4@0xb76fb6d0; leaked 64 bytes at position 0x91d9340
    ./c_example(foo+0x14) [0x80485f8]
    ./c_example(bar+0x1f) [0x804861c]
    ./c_example(foobar+0x11) [0x804862f]
    ./c_example(main+0x39) [0x804866a]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x144113]
    ./c_example() [0x8048551]
  fredrik@ubuntu-01:~/Development/leakfinder$

  Now the function name is included and that makes it a whole lot easier to read. This works for C++ programs as well; their stacktraces are still a bit messy because of name mangling, but very much readable when compiled with -rdynamic, as seen below in the output from cpp_example (also included in the download):

  fredrik@ubuntu-01:~/Development/leakfinder$ ./cpp_example
  leakfinder C++ thread example app
  This application is expected to leak
  leakfinder detected that 4 out of 5 allocations was not released.
  Leak 1@0xb77e66d0; leaked 4 bytes at position 0x8e79020
    /usr/lib/i386-linux-gnu/libstdc++.so.6(_Znwj+0x27) [0x2f19d7]
    ./cpp_example(_ZN10my_class_aC1Ev+0x12) [0x8048c64]
    ./cpp_example(_ZN10my_class_bC1Ev+0x11) [0x8048ce9]
    ./cpp_example(main+0x5f) [0x8048b97]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x346113]
    ./cpp_example() [0x8048a81]
  Leak 2@0xb77e66d0; leaked 4 bytes at position 0x8e79108
    /usr/lib/i386-linux-gnu/libstdc++.so.6(_Znwj+0x27) [0x2f19d7]
    ./cpp_example(_ZN10my_class_b3fooEv+0x12) [0x8048caa]
    ./cpp_example(_ZN10my_class_b3barEv+0x11) [0x8048cc1]
    ./cpp_example(_ZN10my_class_b6foobarEv+0x11) [0x8048cd5]
    ./cpp_example(main+0x6b) [0x8048ba3]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x346113]
    ./cpp_example() [0x8048a81]
  Leak 3@0xb77e66d0; leaked 4 bytes at position 0x8e791f0
    /usr/lib/i386-linux-gnu/libstdc++.so.6(_Znwj+0x27) [0x2f19d7]
    ./cpp_example(_ZN10my_class_b3fooEv+0x12) [0x8048caa]
    ./cpp_example(main+0x77) [0x8048baf]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x346113]
    ./cpp_example() [0x8048a81]
  Leak 4@0xb77e66d0; leaked 1 bytes at position 0x8e79250
    /usr/lib/i386-linux-gnu/libstdc++.so.6(_Znwj+0x27) [0x2f19d7]
    ./cpp_example(_Z12cpp_functionv+0x12) [0x8048b26]
    ./cpp_example(c_function+0xb) [0x8048b36]
    ./cpp_example(main+0x7c) [0x8048bb4]
    /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x346113]
    ./cpp_example() [0x8048a81]
  fredrik@ubuntu-01:~/Development/leakfinder$
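  The mangled names in the C++ stacktrace could be made readable with abi::__cxa_demangle from cxxabi.h, which is available with g++. The sketch below is an optional extra and not part of leakfinder; the helper name demangle is hypothetical, and extracting the mangled name from the full backtrace_symbols line is left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

#include <cxxabi.h>

// Hypothetical helper: returns the readable form of a mangled name, or the
// input unchanged when it cannot be demangled.
std::string demangle(const std::string& mangled)
{
  int status = 0;
  char* readable = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || readable == nullptr)
    return mangled;

  std::string result(readable);
  std::free(readable);          // the buffer is allocated with malloc
  return result;
}
```

  With this, _ZN10my_class_b3fooEv from the output above becomes my_class_b::foo(). Since the helper allocates, it would have to run with isExternalSource set to false if called from inside leakfinder.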

Points of interest

  Like I stated at the beginning of the article, this is not an attempt at a finished leak detection product but an explanation of how such a product can be created. The observant reader will have realised that the implementation included with this article is lacking in many areas: it does not explicitly cater for calloc (for example), and using a std::vector to store the allocation_info objects results in a linear performance penalty on the lookups.
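  One possible way around the linear lookup is a hash map keyed by address, sketched below. This is an assumption about how the storage could be changed, not the code shipped with the article; the class name allocation_table is hypothetical and it stores only the size rather than a full allocation_info to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical replacement for the vector: a hash map keyed by address.
class allocation_table
{
public:
  // Remembers a live allocation and its size.
  void record(const void* ptr, std::size_t size)
  {
    live[key(ptr)] = size;
  }

  // Returns true when the address matched a recorded allocation.
  bool release(const void* ptr)
  {
    return live.erase(key(ptr)) == 1;
  }

  // Number of allocations that have not been released yet.
  std::size_t outstanding() const
  {
    return live.size();
  }

private:
  static std::uintptr_t key(const void* ptr)
  {
    return reinterpret_cast<std::uintptr_t>(ptr);
  }

  std::unordered_map<std::uintptr_t, std::size_t> live;
};
```

  Lookups and removals are then constant time on average. The map allocates memory itself, so inside an injected malloc it still needs the same protection against tracking its own allocations.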

  Regardless of this, I hope the article has provided a certain amount of insight into how leaks can be spotted in a non-intrusive way (in the sense that the application code does not need to be instrumented or otherwise augmented). 

And weirdly enough, the Code Project article submission wizard does not allow me to upload .tar files, so if you want one of those instead; you can get it from here (the allocation_info.cpp file is broken but I'll fix that at a later time, the .zip file is correct); https://sites.google.com/site/fredrikbornander/Home/leakfinder.tar?attredirects=0&d=1 

 

  Any comments are most welcome.

History

2012-05-29; First version.
2012-06-06; Second version, fixed formatting and added suggestion made by mossaiby, and fixed the broken cpp file in the .zip file as pointed out by ian4264