The Linux Programming Interface notes: the implementation of malloc and free in C

Article link: https://blog.csdn.net/libinjlu/article/details/54867374

Although malloc() and free() provide an interface for allocating memory that is
much easier to use than brk() and sbrk(), it is still possible to make various programming
errors when using them. Understanding how malloc() and free() are implemented
provides us with insights into the causes of these errors and how we can avoid them.

The implementation of malloc() is straightforward. It first scans the list of memory
blocks previously released by free() in order to find one whose size is larger than
or equal to its requirements. (Different strategies may be employed for this scan,
depending on the implementation; for example, first-fit or best-fit.) If the block is
exactly the right size, then it is returned to the caller. If it is larger, then it is split, so
that a block of the correct size is returned to the caller and a smaller free block is
left on the free list.
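
The two scan strategies and the split step can be sketched in a few lines. This is only an illustration, assuming the free list is modeled as a plain array of block sizes; the names first_fit, best_fit, and split_block are hypothetical and are not part of any real malloc implementation.

```c
#include <stddef.h>

/* First-fit: index of the first free block with size >= want, or -1. */
int first_fit(const size_t *sizes, int n, size_t want)
{
    for (int i = 0; i < n; i++)
        if (sizes[i] >= want)
            return i;
    return -1;
}

/* Best-fit: index of the smallest free block that is still large
   enough, or -1 if no block fits. */
int best_fit(const size_t *sizes, int n, size_t want)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (sizes[i] >= want && (best == -1 || sizes[i] < sizes[best]))
            best = i;
    return best;
}

/* Split: hand `want` bytes of block i to the caller and leave the
   remainder on the list. The caller must ensure sizes[i] >= want.
   Returns the size of the free block that is left behind. */
size_t split_block(size_t *sizes, int i, size_t want)
{
    sizes[i] -= want;
    return sizes[i];
}
```

With free blocks of 64, 16, 32, and 128 bytes and a request for 24 bytes, first-fit picks the 64-byte block, while best-fit picks the 32-byte block and leaves an 8-byte remainder after the split.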

If no block on the free list is large enough, then malloc() calls sbrk() to allocate
more memory. To reduce the number of calls to sbrk(), rather than allocating
exactly the number of bytes required, malloc() increases the program break in
larger units (some multiple of the virtual memory page size), putting the excess
memory onto the free list.
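
The rounding involved can be sketched as follows. This is a simplified illustration that assumes growth by the smallest whole number of pages covering the request; real implementations may use larger multiples. The helper names round_up and heap_growth_for are hypothetical, and the 4096-byte fallback is an assumption.

```c
#include <stddef.h>
#include <unistd.h>

/* Round nbytes up to a whole number of `unit`-byte chunks (unit > 0). */
size_t round_up(size_t nbytes, size_t unit)
{
    return (nbytes + unit - 1) / unit * unit;
}

/* Amount by which a toy malloc() would raise the program break for a
   request of nbytes, assuming it grows the heap in whole pages. */
size_t heap_growth_for(size_t nbytes)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;    /* assumed fallback if the page size is unknown */
    return round_up(nbytes, (size_t) page);
}
```

With 4096-byte pages, a 100-byte request grows the heap by one full page; the 100 bytes go to the caller and the rest lands on the free list.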

Looking at the implementation of free(), things start to become more interesting.
When free() places a block of memory onto the free list, how does it know what
size that block is? This is done via a trick. When malloc() allocates the block, it
allocates extra bytes to hold an integer containing the size of the block. This integer is
located at the beginning of the block; the address actually returned to the caller
points to the location just past this length value, as shown in Figure 7-1.
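
The length-header trick can be imitated on top of the standard allocator. The sketch below is not how glibc lays out its blocks; toy_alloc, toy_size, and toy_free are hypothetical wrappers, and the header is padded to max_align_t (an assumption of this sketch) so that the address handed to the caller stays suitably aligned.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header stored immediately before the caller's bytes. */
typedef union {
    size_t size;        /* length of the block that follows */
    max_align_t pad;    /* keeps the caller's bytes aligned */
} toy_header;

/* Allocate room for the header plus the caller's bytes, record the
   size, and return the address just past the header. */
void *toy_alloc(size_t size)
{
    toy_header *h = malloc(sizeof(toy_header) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;
}

/* Step back over the header to recover the recorded size. */
size_t toy_size(const void *p)
{
    return ((const toy_header *) p - 1)->size;
}

/* Step back over the header and release the whole region. */
void toy_free(void *p)
{
    if (p != NULL)
        free((toy_header *) p - 1);
}
```

Given only the pointer that the caller holds, toy_free() and toy_size() can find the block size because they know it sits just before that address.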

When a block is placed on the (doubly linked) free list, free() uses the bytes of the
block itself in order to add the block to the list, as shown in Figure 7-2.
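
One way to picture this reuse of the block's own bytes is the following sketch. The struct layout and the names free_block, toy_carve, toy_release, list_insert, and list_remove are all hypothetical; the sketch also assumes that every block is large enough to hold two pointers.

```c
#include <stddef.h>

/* A free block: the length comes first, and the two list pointers
   occupy bytes that belong to the caller while the block is in use. */
typedef struct free_block {
    size_t length;
    struct free_block *prev;
    struct free_block *next;
} free_block;

free_block *free_list = NULL;   /* head of the doubly linked free list */

/* Insert a block at the head of the free list. */
void list_insert(free_block *b)
{
    b->prev = NULL;
    b->next = free_list;
    if (free_list != NULL)
        free_list->prev = b;
    free_list = b;
}

/* Unlink a block from the free list (as a toy malloc() would do when
   it hands the block back out). */
void list_remove(free_block *b)
{
    if (b->prev != NULL)
        b->prev->next = b->next;
    else
        free_list = b->next;
    if (b->next != NULL)
        b->next->prev = b->prev;
}

/* Write a length header at mem and return the address that a toy
   malloc() would hand to the caller (just past the length). */
void *toy_carve(void *mem, size_t length)
{
    free_block *b = mem;
    b->length = length;
    return &b->prev;
}

/* Toy free(): step back from the caller's pointer to the length, then
   overwrite the caller's old bytes with the list pointers. */
void toy_release(void *p)
{
    free_block *b = (free_block *) ((char *) p - offsetof(free_block, prev));
    list_insert(b);
}
```

Because the links live inside the freed block, the free list needs no storage of its own; the cost is that any stray write into freed memory can corrupt the list.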

As blocks are deallocated and reallocated over time, the blocks of the free list will
become intermingled with blocks of allocated, in-use memory, as shown in Figure 7-3.

Now consider the fact that C allows us to create pointers to any location in the
heap, and modify the locations they point to, including the length, previous free block,
and next free block pointers maintained by free() and malloc(). Add this to the preceding
description, and we have a fairly combustible mix when it comes to creating
obscure programming bugs. For example, if, via a misdirected pointer, we accidentally
increase one of the length values preceding an allocated block of memory, and
subsequently deallocate that block, then free() will record the wrong size block of
memory on the free list. Subsequently, malloc() may reallocate this block, leading to
a scenario where the program has pointers to two blocks of allocated memory that
it understands to be distinct, but which actually overlap. Many other failure
scenarios of this kind are possible.

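
The overlap scenario can be simulated safely inside a private byte array instead of the real heap. This is only a model: the arena, the HDR macro, and the helpers set_len, get_len, and demo_overlap are hypothetical, and the toy "free" and "malloc" steps are reduced to the arithmetic that matters.

```c
#include <stddef.h>
#include <string.h>

#define HDR (sizeof(size_t))        /* size of a length header */

static unsigned char arena[128];    /* stands in for the heap */

/* Store or load the length header that precedes the block whose user
   bytes start at offset off. */
static void set_len(size_t off, size_t len)
{
    memcpy(arena + off - HDR, &len, HDR);
}

static size_t get_len(size_t off)
{
    size_t len;
    memcpy(&len, arena + off - HDR, HDR);
    return len;
}

/* Returns 1 if, after the corruption, two "allocated" blocks overlap. */
int demo_overlap(void)
{
    size_t a = HDR;                 /* block A: 16 user bytes */
    set_len(a, 16);
    size_t b = a + 16 + HDR;        /* block B directly follows A */
    set_len(b, 16);

    /* The bug: a misdirected write increases A's length value. */
    set_len(a, 16 + HDR + 16);

    /* Toy free(A) trusts the header and records the wrong size. */
    size_t freed_len = get_len(a);

    /* Toy malloc(freed_len) reuses the block from the free list. */
    size_t c = a;

    /* C spans [c, c + freed_len); B spans [b, b + length of B). */
    return c < b + get_len(b) && b < c + freed_len;
}
```

Here demo_overlap() returns 1: the block handed out after the corrupted free() extends across the header and data of block B, which the program still believes it owns separately.
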
To avoid these types of errors, we should observe the following rules:
> After we allocate a block of memory, we should be careful not to touch any
bytes outside the range of that block. This could occur, for example, as a result
of faulty pointer arithmetic or off-by-one errors in loops updating the contents of
a block.
> It is an error to free the same piece of allocated memory more than once. With
glibc on Linux, we often get a segmentation violation (SIGSEGV signal), and recent
glibc versions frequently detect the double free themselves and call abort(). Either
outcome is good, because it alerts us that we’ve made a programming error. However,
more generally, freeing the same memory twice leads to unpredictable behavior.
> We should never call free() with a pointer value that wasn’t obtained by a call to
one of the functions in the malloc package.
> If we are writing a long-running program (e.g., a shell or a network daemon
process) that repeatedly allocates memory for various purposes, then we
should ensure that we deallocate any memory after we have finished using it.
Failure to do so means that the heap will steadily grow until we reach the limits
of available virtual memory, at which point further attempts to allocate memory
fail. Such a condition is known as a memory leak.
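
A small sketch of coding habits that follow these rules is given below. The helper names copy_string and release are hypothetical. Resetting the pointer to NULL is one common convention, not a complete defense: it protects only that one variable, not other copies of the same address.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a string into a fresh heap block. The + 1 leaves room for the
   terminating '\0', avoiding an off-by-one write past the block. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Free *pp and reset it to NULL. A second call through the same
   variable is then harmless, because free(NULL) does nothing. */
void release(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```

Pairing every successful allocation with exactly one release() call, through the same variable, addresses both the double-free rule and the memory-leak rule.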

Tools and libraries for malloc debugging

Failure to observe the rules listed above can lead to the creation of bugs that are
obscure and difficult to reproduce. The task of finding such bugs can be eased
considerably by using the malloc debugging tools provided by glibc or one of a number
of malloc debugging libraries that are designed for this purpose.

Among the malloc debugging tools provided by glibc are the following:

> The mtrace() and muntrace() functions allow a program to turn tracing of memory
allocation calls on and off. These functions are used in conjunction with
the MALLOC_TRACE environment variable, which should be defined to contain the
name of a file to which tracing information should be written. When mtrace() is
called, it checks to see whether this variable is defined and names a file that can be
opened for writing; if so, then all calls to functions in the malloc package are traced and
recorded in the file. Since the resulting file is not easily human-readable, a
script, also called mtrace, is provided to analyze the file and produce a readable
summary. For security reasons, calls to mtrace() are ignored by set-user-ID and
set-group-ID programs.

> The mcheck() and mprobe() functions allow a program to perform consistency
checks on blocks of allocated memory; for example, catching errors such as
attempting to write to a location past the end of a block of allocated memory.
These functions provide functionality that somewhat overlaps with the malloc
debugging libraries described below. Programs that employ these functions
must be linked with the mcheck library using the cc -lmcheck option.

> The MALLOC_CHECK_ environment variable (note the trailing underscore) serves a
similar purpose to mcheck() and mprobe(). (One notable difference between the
two techniques is that using MALLOC_CHECK_ doesn’t require modification and
recompilation of the program.) By setting this variable to different integer values,
we can control how a program responds to memory allocation errors. Possible
settings are: 0, meaning ignore errors; 1, meaning print diagnostic errors on
stderr; and 2, meaning call abort() to terminate the program. Not all memory
allocation and deallocation errors are detected via the use of MALLOC_CHECK_; it
finds just the common ones. However, this technique is fast, easy to use, and
has low run-time overhead compared with the use of malloc debugging libraries.
For security reasons, the setting of MALLOC_CHECK_ is ignored by set-user-ID and
set-group-ID programs.

Further information about all of the above features can be found in the glibc manual.
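
As an illustration of the first of these tools, the sketch below brackets a few allocations with mtrace() and muntrace(). It assumes glibc (both functions are declared in <mcheck.h>); the function name traced_work is hypothetical. If MALLOC_TRACE is not set, the calls have no effect, and on recent glibc releases tracing may additionally require preloading the library's separate malloc debugging object.

```c
#include <mcheck.h>
#include <stdlib.h>

/* Trace the allocation calls made between mtrace() and muntrace().
   Output is produced only if MALLOC_TRACE names a writable file.
   Returns 0 on success, -1 if an allocation failed. */
int traced_work(void)
{
    mtrace();                       /* turn tracing on */

    char *a = malloc(32);
    char *b = malloc(64);
    int status = (a != NULL && b != NULL) ? 0 : -1;

    free(a);                        /* free(NULL) is harmless */
    free(b);

    muntrace();                     /* turn tracing off */
    return status;
}
```

A typical session would run the program with MALLOC_TRACE set to a file name of our choosing and then pass the program and that file to the mtrace script to obtain the readable summary.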

A malloc debugging library offers the same API as the standard malloc package,
but does extra work to catch memory allocation bugs. In order to use such a
library, we link our application against that library instead of the malloc package in
the standard C library. Because these libraries typically operate at the cost of slower
run-time operation, increased memory consumption, or both, we should use them
only for debugging purposes, and then return to linking with the standard malloc
package for the production version of an application. Among such libraries are
Electric Fence (http://www.perens.com/FreeSoftware/), dmalloc (http://dmalloc.com/),
Valgrind (http://valgrind.org/), and Insure++ (http://www.parasoft.com/).

Both Valgrind and Insure++ are capable of detecting many other kinds of bugs
aside from those associated with heap allocation. See their respective web sites
for details.

Controlling and monitoring the malloc package

The glibc manual describes a range of nonstandard functions that can be used to
monitor and control the allocation of memory by functions in the malloc package,
including the following:

> The mallopt() function modifies various parameters that control the algorithm
used by malloc(). For example, one such parameter specifies the minimum
amount of releasable space that must exist at the end of the free list before
sbrk() is used to shrink the heap. Another parameter specifies an upper limit
for the size of blocks that will be allocated from the heap; blocks larger than
this are allocated using the mmap() system call (refer to Section 49.7).

> The mallinfo() function returns a structure containing various statistics about
the memory allocated by malloc().

Many UNIX implementations provide versions of mallopt() and mallinfo().
However, the interfaces offered by these functions vary across implementations, so they
are not portable.
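
For glibc specifically, the two parameters mentioned above correspond to M_TRIM_THRESHOLD and M_MMAP_THRESHOLD. The sketch below shows how they might be set; the function name tune_malloc and the chosen values are arbitrary assumptions for illustration, not recommendations.

```c
#include <malloc.h>

/* Adjust two glibc malloc parameters. mallopt() returns 1 on success
   and 0 on error. Returns 0 if both settings were accepted. */
int tune_malloc(void)
{
    /* Shrink the heap only when at least 256 kB is free at its top. */
    if (mallopt(M_TRIM_THRESHOLD, 256 * 1024) != 1)
        return -1;

    /* Serve requests of 256 kB or more with mmap() instead of the heap. */
    if (mallopt(M_MMAP_THRESHOLD, 256 * 1024) != 1)
        return -1;

    return 0;
}
```

Since these interfaces are nonstandard, code like this is best isolated behind a conditional or a small wrapper so that the rest of the program remains portable.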