void *aligned_malloc(size_t required_bytes, size_t alignment) {
void *p1;
void **p2;
int offset=alignment-1+sizeof(void*);
if((p1=(void*)malloc(required_bytes+offset))==NULL)
return NULL;
p2=(void**)(((size_t)(p1)+offset)&~(alignment-1)); //line 5
p2[-1]=p1; //line 6
return p2;
}
p1 is the actual allocation. p2 is the pointer being returned, which references memory past the point of allocation and leaves enough space for both allignment AND storing the actual allocated pointer in the first place. when aligned_free() is called, p1 will be retrieved to do the "real" free().
Regarding the bit math, that gets a little more cumbersome, but it works.
p2=(void**)(((size_t)(p1)+offset)&~(alignment-1)); //line 5
Remember, p1 is the actual allocation reference. For kicks, lets assume the following, with 32bit pointers:
alignment = 64 bytes, 0x40
offset = 0x40-1+4 = 0x43
p1 = 0x20000110, a value returned from the stock malloc()
What is important is the original malloc()
that allocates an additional 0x43 bytes of space above and beyond the original request. This is to ensure both the alignment math and the space for the 32bit pointer can be accounted for:
p2=(void**)(((size_t)(p1)+offset)&~(alignment-1)); //line 5
p2 = (0x20000110 + 0x43) &~ (0x0000003F)
p2 = 0x20000153 &~ 0x0000003F
p2 = 0x20000153 & 0xFFFFFFC0
p2 = 0x20000140
p2 is aligned on a 0x40 boundary (i.e. all bits in 0x3F are 0) and enough space is left behind to store the 4-byte pointer for the original allocation, referenced by p1.
what does the code do?
- The strategy is to malloc more space than what we need (into p1), and return a p2 pointer somewhere after the beginning of the buffer.
- Since alignment is a power of two, in binary it has to form of 1 followed by zeros. e.g. if alignment is 32, it will be 00100000 in binary
- (alignment-1) in binary format will turn the 1 into 0, and all the 0's after the 1 into 1. For example: (32-1) is 00011111
- the ~ will reverse all the bits. That is: 11100000
- now, p1 is a pointer to the buffer (remember, the buffer is larger by offset than what we need). we add offset to p1: p1+offset.
- So now, (p1+offset) is greater that what we want to return. We'll nil all the insignificant bits by doing a bitwise and: (p1+offset) & ~(offset-1)
- This is p2, the pointer that we want to return. Note that because its last 5 digits are zero it is 32 aligned, as requested.
- p2 is what we'll return. However, we must be able to reach p1 when the user calls aligned_free. For this, note that we reserved location for one extra pointer when we calculated the offset (that's the sizeof(void*) in line 4.
- so, we want to put p1 immediately before p2. This is p2[-1]. This is a little bit tricky calculation. Remember that p2 is defined as void**. One way to look at it is as array of void. C array calculation say that p2[0] is exactly p2. p2[1] is p2 + sizeof(void*). In general p2[n] = p2 + n*sizeof(void*). The compiler also supports negative numbers, so p2[-1] is one void* (typically 4 bytes) before p2.
I'm going to guess that aligned_free is something like:
void aligned_free( void* p ) {
void* p1 = ((void**)p)[-1]; // get the pointer to the buffer we allocated
free( p1 );
}