Chapter 9-04

Please indicate the source: http://blog.csdn.net/gaoxiangnumber1.

9.9 Dynamic Memory Allocation

- A dynamic memory allocator maintains an area of a process's virtual memory known as the heap. Details vary from system to system, but without loss of generality, we will assume that the heap is an area of demand-zero memory that begins immediately after the uninitialized .bss area and grows upward (toward higher addresses). For each process, the kernel maintains a variable brk (pronounced "break") that points to the top of the heap.

- An allocator maintains the heap as a collection of various-sized blocks. Each block is a contiguous chunk of virtual memory that is either allocated or free.

  - An allocated block has been explicitly reserved for use by the application. An allocated block remains allocated until it is freed, either explicitly by the application or implicitly by the memory allocator itself.

  - A free block is available to be allocated. A free block remains free until it is explicitly allocated by the application.

- Allocators come in two basic styles. Both styles require the application to explicitly allocate blocks. They differ in which entity is responsible for freeing allocated blocks.

1. Explicit allocators require the application to explicitly free any allocated blocks. For example, the C standard library provides an explicit allocator called the malloc package. C programs allocate a block by calling the malloc function and free a block by calling the free function. The new and delete operators in C++ are comparable.

2. Implicit allocators require the allocator to detect when an allocated block is no longer being used by the program and then free the block. Implicit allocators are also known as garbage collectors, and the process of automatically freeing unused allocated blocks is known as garbage collection. For example, higher-level languages such as Lisp, ML, and Java rely on garbage collection to free allocated blocks.

9.9.1 The malloc and free Functions

- The C standard library provides an explicit allocator known as the malloc package. Programs allocate blocks from the heap by calling the malloc function.

#include <stdlib.h>

void *malloc(size_t size);

Returns: ptr to allocated block if OK, NULL on error

- The malloc function returns a pointer to a block of memory of at least size bytes that is suitably aligned for any kind of data object that might be contained in the block. On Unix systems, malloc returns a block that is aligned to an 8-byte (double-word) boundary.
If malloc encounters a problem (e.g., the program requests a block of memory that is larger than the available virtual memory), then it returns NULL and sets errno.

- malloc does not initialize the memory it returns. Applications that want initialized dynamic memory can use calloc, a thin wrapper around malloc that initializes the allocated memory to zero. Applications that want to change the size of a previously allocated block can use the realloc function.

- Dynamic memory allocators such as malloc can allocate or deallocate heap memory explicitly by using the mmap and munmap functions, or they can use the sbrk function:

#include <unistd.h>

void *sbrk(intptr_t incr);

Returns: old brk pointer on success, −1 on error

- The sbrk function grows or shrinks the heap by adding incr to the kernel's brk pointer. If successful, it returns the old value of brk; otherwise, it returns −1 and sets errno to ENOMEM.
If incr is zero, then sbrk returns the current value of brk.
Calling sbrk with a negative incr is legal but tricky because the return value (the old value of brk) points abs(incr) bytes past the new top of the heap.

- Programs free allocated heap blocks by calling the free function.

#include <stdlib.h>

void free(void *ptr);

Returns: nothing

- The ptr argument must point to the beginning of an allocated block that was obtained from malloc, calloc, or realloc. If not, then the behavior of free is undefined. Even worse, since it returns nothing, free gives no indication to the application that something is wrong.

- Figure 9.34 shows how an implementation of malloc and free might manage a heap of 16 words for a C program. Each box represents a 4-byte word. The heavy-lined rectangles correspond to allocated blocks (shaded) and free blocks (unshaded). Initially, the heap consists of a single 16-word, double-word-aligned free block.

  - (d): The program frees the six-word block that was allocated in (b). Notice that after the call to free returns, the pointer p2 still points to the freed block. It is the responsibility of the application not to use p2 again until it is reinitialized by a new call to malloc.

  - (e): The program requests a two-word block. In this case, malloc allocates a portion of the block that was freed in the previous step and returns a pointer to this new block.

9.9.2 Why Dynamic Memory Allocation?

- The most important reason that programs use dynamic memory allocation is that often they do not know the sizes of certain data structures until the program actually runs.

9.9.3 Allocator Requirements and Goals

- Explicit allocators must operate within some constraints.

  - Handling arbitrary request sequences.
An application can make an arbitrary sequence of allocate and free requests, subject to the constraint that each free request must correspond to a currently allocated block obtained from a previous allocate request. Thus, the allocator cannot make any assumptions about the ordering of allocate and free requests.

  - Making immediate responses to requests.
The allocator must respond immediately to allocate requests. Thus, the allocator is not allowed to reorder or buffer requests in order to improve performance.

  - Using only the heap.
In order for the allocator to be scalable, any nonscalar data structures used by the allocator must be stored in the heap itself.

  - Aligning blocks (alignment requirement).
The allocator must align blocks in such a way that they can hold any type of data object. On most systems, this means that the block returned by the allocator is aligned on an 8-byte (double-word) boundary.

  - Not modifying allocated blocks.
Allocators can only manipulate or change free blocks. In particular, they are not allowed to modify or move blocks once they are allocated. Thus, techniques such as compaction of allocated blocks are not permitted.

- Working within these constraints, the author of an allocator attempts to meet the often conflicting performance goals of maximizing throughput and memory utilization.

Goal 1: Maximizing throughput.

  - Given some sequence of n allocate and free requests R0, R1, . . . , Rk, . . . , Rn−1, we would like to maximize an allocator's throughput, which is defined as the number of requests that it completes per unit time. For example, if an allocator completes 500 allocate requests and 500 free requests in 1 second, then its throughput is 1,000 operations per second. In general, we can maximize throughput by minimizing the average time to satisfy allocate and free requests.

Goal 2: Maximizing memory utilization.

  - The total amount of virtual memory allocated by all of the processes in a system is limited by the amount of swap space on disk.

  - There are a number of ways to characterize how efficiently an allocator uses the heap. In our experience, the most useful metric is peak utilization.

  - We are given some sequence of n allocate and free requests R0, R1, . . . , Rk, . . . , Rn−1. If an application requests a block of p bytes, then the resulting allocated block has a payload of p bytes. After request Rk has completed, let the aggregate payload, denoted Pk, be the sum of the payloads of the currently allocated blocks, and let Hk denote the current size of the heap.

  - Then the peak utilization over the first k requests, denoted by Uk, is given by

    Uk = (max_{i≤k} Pi) / Hk

  - The objective of the allocator then is to maximize the peak utilization Un−1 over the entire sequence.

- There is a tension between maximizing throughput and utilization. One challenge in any allocator design is finding an appropriate balance between the two goals.

9.9.4 Fragmentation

- The primary cause of poor heap utilization is a phenomenon known as fragmentation, which occurs when otherwise unused memory is not available to satisfy allocate requests.

- There are two forms of fragmentation: internal fragmentation and external fragmentation.

- Internal fragmentation occurs when an allocated block is larger than the payload.

  - This might happen for a number of reasons. For example, the implementation of an allocator might impose a minimum size on allocated blocks that is greater than some requested payload. Or the allocator might increase the block size in order to satisfy alignment constraints.

  - Internal fragmentation is straightforward to quantify. It is the sum of the differences between the sizes of the allocated blocks and their payloads. At any point in time, the amount of internal fragmentation depends only on the pattern of previous requests and the allocator implementation.

- External fragmentation occurs when there is enough aggregate free memory to satisfy an allocate request, but no single free block is large enough to handle the request.

  - External fragmentation is more difficult to quantify than internal fragmentation because it depends not only on the pattern of previous requests and the allocator implementation, but also on the pattern of future requests.

9.9.5 Implementation Issues

- A practical allocator that strikes a better balance between throughput and utilization must consider the following issues:

  - Free block organization: How do we keep track of free blocks?

  - Placement: How do we choose an appropriate free block in which to place a newly allocated block?

  - Splitting: After we place a newly allocated block in some free block, what do we do with the remainder of the free block?

  - Coalescing: What do we do with a block that has just been freed?

9.9.6 Implicit Free Lists

- Any practical allocator needs some data structure that allows it to distinguish block boundaries and to distinguish between allocated and free blocks. Most allocators embed this information in the blocks themselves. One simple approach is shown in Figure 9.35.

- A block consists of a one-word header, the payload, and possibly some padding. The header encodes the block size (including the header and any padding) as well as whether the block is allocated or free.

- If we impose a double-word alignment constraint, then the block size is always a multiple of 8 and the 3 low-order bits of the block size are always zero. Thus, we need to store only the 29 high-order bits of the block size, freeing the remaining 3 bits to encode other information. In this case, we use the least significant of these bits (the allocated bit) to indicate whether the block is allocated or free.

- Suppose we have an allocated block with a block size of 24 (0x18) bytes. Then its header would be

0x00000018 | 0x1 = 0x00000019

A free block with a block size of 40 (0x28) bytes would have a header of

0x00000028 | 0x0 = 0x00000028

- The header is followed by the payload that the application requested when it called malloc. The payload is followed by a chunk of unused padding that can be any size.

- There are a number of reasons for the padding. For example, the padding might be part of an allocator's strategy for combating external fragmentation. Or it might be needed to satisfy the alignment requirement.

- Given the block format in Figure 9.35, we can organize the heap as a sequence of contiguous allocated and free blocks, as shown in Figure 9.36.

- We call this organization an implicit free list because the free blocks are linked implicitly by the size fields in the headers. The allocator can indirectly traverse the entire set of free blocks by traversing all of the blocks in the heap. We need some kind of specially marked end block; in this example, it is a terminating header with the allocated bit set and a size of zero.

- The advantage of an implicit free list is simplicity. A disadvantage is that the cost of any operation that requires a search of the free list, such as placing allocated blocks, will be linear in the total number of allocated and free blocks in the heap.

- The system's alignment requirement and the allocator's choice of block format impose a minimum block size on the allocator. No allocated or free block may be smaller than this minimum. For example, if we assume a double-word alignment requirement, then the size of each block must be a multiple of two words (8 bytes). Thus, the block format in Figure 9.35 induces a minimum block size of two words: one word for the header and another to maintain the alignment requirement. Even if the application were to request a single byte, the allocator would still create a two-word block.

9.9.7 Placing Allocated Blocks

- When an application requests a block of k bytes, the allocator searches the free list for a free block that is large enough to hold the requested block. The manner in which the allocator performs this search is determined by the placement policy.

- Some common policies are first fit, next fit, and best fit.

  - First fit searches the free list from the beginning and chooses the first free block that fits.

  - Next fit is similar to first fit, but instead of starting each search at the beginning of the list, it starts each search where the previous search left off.

  - Best fit examines every free block and chooses the free block with the smallest size that fits.

- An advantage of first fit is that it tends to retain large free blocks at the end of the list. A disadvantage is that it tends to leave “splinters” of small free blocks toward the beginning of the list, which will increase the search time for larger blocks.

- Next fit was proposed by Donald Knuth as an alternative to first fit, motivated by the idea that if we found a fit in some free block the last time, there is a good chance that we will find a fit the next time in the remainder of the block. Next fit can run significantly faster than first fit, especially if the front of the list becomes littered with many small splinters. However, some studies suggest that next fit suffers from worse memory utilization than first fit.

- Best fit enjoys better memory utilization than either first fit or next fit. The disadvantage of best fit with simple free list organizations such as the implicit free list is that it requires an exhaustive search of the heap.

9.9.8 Splitting Free Blocks

- Once the allocator has located a free block that fits, it must make another policy decision about how much of the free block to allocate. One option is to use the entire free block. Although simple and fast, this approach has the disadvantage that it introduces internal fragmentation.

- If the fit is not good, then the allocator will usually opt to split the free block into two parts. The first part becomes the allocated block, and the remainder becomes a new free block.

- Figure 9.37 shows how the allocator might split the eight-word free block in Figure 9.36 to satisfy an application's request for three words of heap memory.

9.9.9 Getting Additional Heap Memory

- What happens if the allocator is unable to find a fit for the requested block?

- One option is to try to create some larger free blocks by merging (coalescing) free blocks that are physically adjacent in memory.

- If this does not yield a sufficiently large block, or if the free blocks are already maximally coalesced, then the allocator asks the kernel for additional heap memory by calling the sbrk function.

- The allocator transforms the additional memory into one large free block, inserts the block into the free list, and then places the requested block in this new free block.

9.9.10 Coalescing Free Blocks

- When the allocator frees an allocated block, there might be other free blocks that are adjacent to the newly freed block. Such adjacent free blocks can cause a phenomenon known as false fragmentation, where there is a lot of available free memory chopped up into small, unusable free blocks.

- To combat false fragmentation, any practical allocator must merge adjacent free blocks in a process known as coalescing. But when should coalescing be performed?

- The allocator can opt for immediate coalescing by merging any adjacent blocks each time a block is freed. Immediate coalescing is straightforward and can be performed in constant time, but with some request patterns it can introduce a form of thrashing where a block is repeatedly coalesced and then split soon thereafter.

- Or it can opt for deferred coalescing by waiting to coalesce free blocks at some later time. For example, the allocator might defer coalescing until some allocation request fails and then scan the entire heap, coalescing all free blocks.

- In our discussion of allocators, we will assume immediate coalescing, but you should be aware that fast allocators often opt for some form of deferred coalescing.

9.9.11 Coalescing with Boundary Tags

How does an allocator implement coalescing?

- Let us refer to the block we want to free as the current block. Then coalescing with the next free block in memory is straightforward and efficient. The header of the current block points to the header of the next block, which can be checked to determine whether the next block is free. If so, its size is simply added to the size in the current header, and the blocks are coalesced in constant time.

- But how would we coalesce the previous block? Given an implicit free list of blocks with headers, the only option would be to search the entire list, remembering the location of the previous block, until we reached the current block.

- With an implicit free list, this means that each call to free would require time linear in the size of the heap.

- Knuth developed a technique, known as boundary tags, that allows for constant-time coalescing of the previous block.

- The idea is to add a footer (the boundary tag) at the end of each block, where the footer is a replica of the header. If each block includes such a footer, then the allocator can determine the starting location and status of the previous block by inspecting its footer, which is always one word away from the start of the current block.

- Consider all the cases that can exist when the allocator frees the current block:

1. The previous and next blocks are both allocated.

2. The previous block is allocated and the next block is free.

3. The previous block is free and the next block is allocated.

4. The previous and next blocks are both free.

- In each case, the coalescing is performed in constant time.

- However, requiring each block to contain both a header and a footer can introduce significant memory overhead if an application manipulates many small blocks.

- There is an optimization of boundary tags that eliminates the need for a footer in allocated blocks. When we attempt to coalesce the current block with the previous and next blocks in memory, the size field in the footer of the previous block is needed only if the previous block is free. If we were to store the allocated/free bit of the previous block in one of the excess low-order bits of the current block's header, then allocated blocks would not need footers, and we could use that extra space for payload. Note, however, that free blocks would still need footers.

9.9.12 Putting It Together: Implementing a Simple Allocator

9.9.13 Explicit Free Lists

- Because block allocation time is linear in the total number of heap blocks, the implicit free list is not appropriate for a general-purpose allocator.

- A better approach is to organize the free blocks into some form of explicit data structure. Since by definition the body of a free block is not needed by the program, the pointers that implement the data structure can be stored within the bodies of the free blocks. For example, the heap can be organized as a doubly linked free list by including a pred (predecessor) and succ (successor) pointer in each free block, as shown in Figure 9.48.

- Using a doubly linked list instead of an implicit free list reduces the first fit allocation time from linear in the total number of blocks to linear in the number of free blocks.

- However, the time to free a block can be either linear or constant, depending on the policy we choose for ordering the blocks in the free list.

- One approach is to maintain the list in last-in first-out (LIFO) order by inserting newly freed blocks at the beginning of the list. With a LIFO ordering and a first fit placement policy, the allocator inspects the most recently used blocks first. In this case, freeing a block can be performed in constant time. If boundary tags are used, then coalescing can also be performed in constant time.

- Another approach is to maintain the list in address order, where the address of each block in the list is less than the address of its successor. In this case, freeing a block requires a linear-time search to locate the appropriate predecessor. The trade-off is that address-ordered first fit enjoys better memory utilization than LIFO-ordered first fit, approaching the utilization of best fit.

- A disadvantage of explicit lists in general is that free blocks must be large enough to contain all of the necessary pointers, as well as the header and possibly a footer. This results in a larger minimum block size and increases the potential for internal fragmentation.

9.9.14 Segregated Free Lists

9.10 Garbage Collection

