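Redis is careful about memory down to the bit. It introduces the zipmap data structure, which is used in place of a hash table to save memory when the hash has only a few elements. Below is a walkthrough of the zipmap source code (zipmap.c).
At the top of the source file the author describes the zipmap data format. Given the mapping "foo" => "bar", "hello" => "world", the memory layout is: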
<zmlen(2)><len(3)>"foo"<len(3)><free(0)>"bar"<len(5)>"hello"<len(5)><free(0)>"world"
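To make the layout concrete, the same example can be written out byte by byte. This is only an illustrative sketch: the array `example_zipmap` and the helper `example_zipmap_size` are names made up here and are not part of zipmap.c. The trailing 255 is the end-of-zipmap marker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not code from zipmap.c:
 * "foo" => "bar", "hello" => "world" laid out as raw bytes. */
static const unsigned char example_zipmap[] = {
    2,                                  /* zmlen: two keys */
    3, 'f', 'o', 'o',                   /* len(3), key "foo" */
    3, 0, 'b', 'a', 'r',                /* len(3), free(0), value "bar" */
    5, 'h', 'e', 'l', 'l', 'o',         /* len(5), key "hello" */
    5, 0, 'w', 'o', 'r', 'l', 'd',      /* len(5), free(0), value "world" */
    255                                 /* end-of-zipmap marker */
};

/* Returns the total number of bytes the example occupies. */
size_t example_zipmap_size(void) {
    /* Sanity check: the last byte must be the end marker. */
    assert(example_zipmap[sizeof(example_zipmap) - 1] == 255);
    return sizeof(example_zipmap);
}
```

Counting the bytes gives 24 in total: 1 for zmlen, 9 for the first pair, 13 for the second pair, and 1 for the end marker.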
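zmlen is stored in one byte and holds the number of keys actually in the zipmap. Once the zipmap holds 254 or more keys, the zmlen field is no longer used, and the only way to get the key count is to traverse the whole zipmap, which hurts performance. In fact a ziplist could be used in place of a zipmap with no loss of performance, so personally I find zipmap somewhat redundant.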
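len is the byte length of the key or value that follows. The len field itself takes either 1 byte or 5 bytes. If its first byte is between 0 and 252, that byte is the length and len takes 1 byte. If the first byte is 253, the next four bytes hold the length of the key or value (so len takes 5 bytes in total, with the first byte serving only as a marker). A first byte of 255 marks the end of the zipmap, and 254 marks empty space that can be used to add new key/value pairs.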
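The length encoding described above can be sketched roughly as follows. The function names `sketch_encode_len` and `sketch_decode_len` and the `SKETCH_BIGLEN` constant are hypothetical, chosen for this illustration; the sketch assumes the 4-byte length is stored in host byte order, and it is not a copy of the routines in zipmap.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BIGLEN 253   /* assumed marker: a 4-byte length follows */

/* Writes len at p and returns the number of bytes used (1 or 5).
 * The caller must provide at least 5 writable bytes. */
unsigned int sketch_encode_len(unsigned char *p, uint32_t len) {
    assert(p != NULL);
    if (len < SKETCH_BIGLEN) {
        p[0] = (unsigned char)len;          /* short form: the byte is the length */
        return 1;
    }
    p[0] = SKETCH_BIGLEN;                   /* long form: marker byte first */
    memcpy(p + 1, &len, sizeof(len));       /* then 4 bytes, host byte order */
    return 1 + (unsigned int)sizeof(len);
}

/* Reads a length previously written by sketch_encode_len. */
uint32_t sketch_decode_len(const unsigned char *p) {
    uint32_t len;
    assert(p != NULL);
    if (p[0] < SKETCH_BIGLEN) return p[0];
    memcpy(&len, p + 1, sizeof(len));
    return len;
}
```

Using memcpy rather than a pointer cast avoids unaligned access, since a len field can start at any byte offset inside the zipmap.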
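free is the number of unused bytes following the value. For example, if "foo" is first mapped to "bar" and later remapped to "hi", one byte is left over in place. zipmap does not release that byte immediately; it keeps it, so the next time the value is set there may be no need to allocate memory.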
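The arithmetic behind the "bar" to "hi" example can be sketched as below. `sketch_required_len` mirrors what the post later describes for zipmapRequiredLength (the bytes one key/value entry needs), but both function names and the `SKETCH_BIGLEN` constant are assumptions made for this illustration. The sketch also ignores the trimming that zipmapSet performs when too many bytes are left over.

```c
#include <assert.h>

#define SKETCH_BIGLEN 253   /* assumed threshold for the 5-byte len form */

/* Bytes needed by one entry: key + value + two 1-byte len fields
 * + one free byte, plus 4 extra bytes for each len in long form. */
unsigned int sketch_required_len(unsigned int klen, unsigned int vlen) {
    unsigned int l = klen + vlen + 3;
    if (klen >= SKETCH_BIGLEN) l += 4;
    if (vlen >= SKETCH_BIGLEN) l += 4;
    return l;
}

/* Bytes left unused when a value is overwritten in place by a
 * value that is no longer than the old one. */
unsigned int sketch_free_after_shrink(unsigned int klen,
                                      unsigned int old_vlen,
                                      unsigned int new_vlen) {
    assert(new_vlen <= old_vlen);
    return sketch_required_len(klen, old_vlen) -
           sketch_required_len(klen, new_vlen);
}
```

For "foo" => "bar" the entry needs 9 bytes, for "foo" => "hi" it needs 8, so the free field would be set to 1.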
Lookup in a zipmap is O(N), where N is the number of keys.
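The reason lookup is linear is that entries have variable size, so the only way to reach an entry is to hop over every entry before it. A simplified sketch of such a scan is shown below. `sketch_lookup` is a hypothetical name, and the sketch assumes every len field is in the 1-byte form (lengths below 253); the real lookup routine in zipmap.c also handles the 5-byte form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_END 255   /* end-of-zipmap marker */

/* Returns a pointer to the start of the entry whose key matches,
 * or NULL if the key is absent. Simplified: 1-byte lengths only. */
const unsigned char *sketch_lookup(const unsigned char *zm,
                                   const char *key, unsigned int klen) {
    assert(zm != NULL && key != NULL);
    const unsigned char *p = zm + 1;            /* skip the zmlen byte */
    while (*p != SKETCH_END) {
        const unsigned char *entry = p;
        unsigned int l = *p++;                  /* key length */
        if (l == klen && memcmp(p, key, klen) == 0) return entry;
        p += l;                                 /* skip the key bytes */
        unsigned int vlen = p[0];               /* value length */
        unsigned int freelen = p[1];            /* unused bytes after the value */
        p += 2 + vlen + freelen;                /* skip len, free, value, slack */
    }
    return NULL;
}
```

Each iteration touches one entry, so the cost grows with the number of keys, which matches the O(N) bound stated above.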
Now let's look at the commonly used zipmap APIs.
unsigned char *zipmapNew(void); // Creates a new zipmap. It allocates just two bytes: one for zmlen, initialized to 0, and one set to 255 to mark the end.
The zipmapSet function
/* Set key to value, creating the key if it does not already exist.
 * If 'update' is not NULL, *update is set to 1 if the key was
 * already present, otherwise to 0. */
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update) {
    unsigned int zmlen, offset;
    unsigned int freelen, reqlen = zipmapRequiredLength(klen,vlen); // zipmapRequiredLength returns the total bytes the key and value need
    unsigned int empty, vempty;
    unsigned char *p;

    freelen = reqlen;
    if (update) *update = 0;
    p = zipmapLookupRaw(zm,key,klen,&zmlen); // look up key in zm and get zm's size in bytes (zmlen); if the key exists, a pointer to its entry is returned
    if (p == NULL) {
        /* Key not found: enlarge */
        zm = zipmapResize(zm, zmlen+reqlen); // calls realloc to grow zm
        p = zm+zmlen-1;
        zmlen = zmlen+reqlen; // update zm's total byte count
        /* Increase zipmap length (this is an insert) */
        if (zm[0] < ZIPMAP_BIGLEN) zm[0]++; // update the key count
    } else {
        /* Key found. Is there enough space for the new value? */
        /* Compute the total length: */
        if (update) *update = 1;
        freelen = zipmapRawEntryLength(p);
        if (freelen < reqlen) {
            /* Store the offset of this key within the current zipmap, so
             * it can be resized. Then, move the tail backwards so this
             * pair fits at the current position. */
            offset = p-zm;
            zm = zipmapResize(zm, zmlen-freelen+reqlen); // reallocate to grow zm
            p = zm+offset; // relocate the entry being updated
            /* The +1 in the number of bytes to be moved is caused by the
             * end-of-zipmap byte. Note: the *original* zmlen is used. */
            memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1)); // shift everything after this entry toward the end to make room
            zmlen = zmlen-freelen+reqlen;
            freelen = reqlen;
        }
    }

    /* We now have a suitable block where the key/value entry can
     * be written. If there is too much free space, move the tail
     * of the zipmap a few bytes to the front and shrink the zipmap,
     * as we want zipmaps to be very space efficient. */
    empty = freelen-reqlen;
    if (empty >= ZIPMAP_VALUE_MAX_FREE) { // too much slack (ZIPMAP_VALUE_MAX_FREE or more, i.e. at least 4 bytes): release it
        /* First, move the tail <empty> bytes to the front, then resize
         * the zipmap to be <empty> bytes smaller. */
        offset = p-zm;
        memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1)); // close the gap left by the free space
        zmlen -= empty;
        zm = zipmapResize(zm, zmlen); // shrink zm to its new size
        p = zm+offset;
        vempty = 0;
    } else {
        vempty = empty;
    }

    /* Just write the key + value and we are done. */
    /* Key: */
    p += zipmapEncodeLength(p,klen);
    memcpy(p,key,klen);
    p += klen;
    /* Value: */
    p += zipmapEncodeLength(p,vlen);
    *p++ = vempty; // the free field: unused bytes kept after this value
    memcpy(p,val,vlen);
    return zm;
}
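The space decision inside zipmapSet can be isolated into a small pure function to make the three cases easier to see: grow the entry, trim excess slack, or keep a little slack in the free field. The sketch below is an illustration only. `sketch_plan`, `sketch_plan_update`, and `SKETCH_VALUE_MAX_FREE` are hypothetical names, the value 4 is taken from the comment in the walkthrough above, and no memory is actually moved.

```c
#include <assert.h>

enum { SKETCH_VALUE_MAX_FREE = 4 };   /* assumed slack threshold */

typedef struct {
    unsigned int entry_len;   /* bytes the entry occupies afterwards */
    unsigned int vempty;      /* value written into the free field */
} sketch_plan;

/* freelen: bytes the existing entry occupies (pass reqlen for a new key).
 * reqlen:  bytes the new key/value needs. */
sketch_plan sketch_plan_update(unsigned int freelen, unsigned int reqlen) {
    sketch_plan plan;
    assert(reqlen >= 3);                      /* two len bytes + one free byte */
    if (freelen < reqlen) freelen = reqlen;   /* case 1: entry must grow */
    unsigned int empty = freelen - reqlen;
    if (empty >= SKETCH_VALUE_MAX_FREE) {     /* case 2: too much slack, trim it */
        plan.entry_len = reqlen;
        plan.vempty = 0;
    } else {                                  /* case 3: keep the slack */
        plan.entry_len = freelen;
        plan.vempty = empty;
    }
    return plan;
}
```

With the earlier example, replacing "bar" by "hi" gives freelen 9 and reqlen 8, so the entry stays at 9 bytes and the free field becomes 1.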
zipmapDel is the API for deleting a key/value pair.
unsigned char *zipmapDel(unsigned char *zm, unsigned char *key, unsigned int klen, int *deleted) {
    unsigned int zmlen, freelen;
    unsigned char *p = zipmapLookupRaw(zm,key,klen,&zmlen); // first locate the key
    if (p) { // key found
        freelen = zipmapRawEntryLength(p);
        memmove(p, p+freelen, zmlen-((p-zm)+freelen+1)); // shift the tail of zm forward over the entry, overwriting and thereby deleting the key/value
        zm = zipmapResize(zm, zmlen-freelen); // give the space back
        /* Decrease zipmap length */
        if (zm[0] < ZIPMAP_BIGLEN) zm[0]--; // decrement the key count
        if (deleted) *deleted = 1;
    } else {
        if (deleted) *deleted = 0;
    }
    return zm;
}
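The core of the deletion is a single memmove that slides the tail of the buffer over the removed entry. A minimal sketch of that step on a plain byte buffer is given below. `sketch_remove_span` is a hypothetical helper, not a zipmap.c function, and unlike the code above it moves the end marker along with the tail instead of leaving the marker to the resize step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Removes entry_len bytes starting at offset from a buffer holding
 * total bytes, by sliding the tail forward. Returns the new size.
 * The caller is responsible for shrinking the allocation. */
size_t sketch_remove_span(unsigned char *buf, size_t total,
                          size_t offset, size_t entry_len) {
    assert(buf != NULL);
    assert(offset + entry_len <= total);
    /* memmove is used because source and destination overlap. */
    memmove(buf + offset, buf + offset + entry_len,
            total - (offset + entry_len));
    return total - entry_len;
}
```

Applied to the "foo"/"hello" example, removing the 9-byte "foo" entry at offset 1 leaves 15 bytes, with the "hello" entry now starting right after the zmlen byte; the key count in the first byte would then be decremented separately.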
The Taobao Core System Team blog has a post on the zipmap memory layout, with diagrams that make it clear:
http://rdc.taobao.com/blog/cs/?p=1314