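Preface:
When covering the ziplist earlier, I also mentioned its shortcomings. A ziplist saves memory through its compact layout, but because of how it is structured, adding more elements or larger elements carries a risk of "cascading updates", and performance drops when one occurs.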
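The design idea behind quicklist
The idea is simple: split one long ziplist into several short ziplists, so that inserting or deleting an element does not trigger large memory copies.
A ziplist stores its data much like an array, whereas a quicklist is a true linked list. It is a chain of quicklistNode nodes, and each quicklistNode stores its data in a ziplist.
Note: unless stated otherwise, all code is located in quicklist.h/quicklist.c.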
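quicklist structure design
The quicklist struct resembles an ordinary linked-list struct: both hold a head and a tail. The difference is that the nodes of a quicklist are quicklistNode structs.
quicklistNode is defined as follows: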
/* Node, quicklist, and Iterator are the only data structures used currently. */
/* quicklistNode is a 32 byte struct describing a ziplist for a quicklist.
 * We use bit fields keep the quicklistNode at 32 bytes.
 * count: 16 bits, max 65536 (max zl bytes is 65k, so max count actually < 32k).
 * encoding: 2 bits, RAW=1, LZF=2.
 * container: 2 bits, NONE=1, ZIPLIST=2.
 * recompress: 1 bit, bool, true if node is temporary decompressed for usage.
 * attempted_compress: 1 bit, boolean, used for verifying during testing.
 * extra: 10 bits, free for future use; pads out the remainder of 32 bits */
typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? */
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;
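prev, next: pointers to the predecessor and successor nodes.
zl: the ziplist that stores the data.
sz: the number of bytes occupied by the ziplist.
count: the number of elements in the ziplist.
encoding: 2 means the node is compressed (LZF), 1 means it is not (RAW).
container: currently always 2, meaning the data is stored in a ziplist.
recompress: 1 means the node has been temporarily decompressed (for example, to read data) and must be compressed again afterwards.
attempted_compress: used only for verification during testing.
extra: reserved bits, currently unused.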
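To make the "32 bytes" claim in the source comment concrete, here is a minimal sketch that copies the field layout into a hypothetical `toy_node` type and computes the size one would expect if the six bit fields share a single `unsigned int`. The names `toy_node` and `toy_node_expected_size` are illustrative and not part of Redis; bit-field packing is implementation-defined, so this reflects common GCC/Clang behavior rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical copy of the quicklistNode layout, used only to inspect its size. */
typedef struct toy_node {
    struct toy_node *prev;
    struct toy_node *next;
    unsigned char *zl;
    unsigned int sz;
    unsigned int count : 16;
    unsigned int encoding : 2;
    unsigned int container : 2;
    unsigned int recompress : 1;
    unsigned int attempted_compress : 1;
    unsigned int extra : 10;   /* 16 + 2 + 2 + 1 + 1 + 10 = 32 bits in total */
} toy_node;

/* Size expected when all bit fields are packed into one unsigned int:
 * three pointers, the sz field, and one 32-bit word of bit fields. */
size_t toy_node_expected_size(void) {
    return 3 * sizeof(void *) + 2 * sizeof(unsigned int);
}
```

On a typical 64-bit build this comes to 3 × 8 + 2 × 4 = 32 bytes, which is where the figure in the comment comes from.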
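When the list is very long, data in the middle nodes is accessed less often. Redis can therefore compress the middle nodes to save more memory. It uses a lossless compression algorithm, LZF: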
/* quicklistLZF is a 4+N byte struct holding 'sz' followed by 'compressed'.
 * 'sz' is byte length of 'compressed' field.
 * 'compressed' is LZF data with total (compressed) length 'sz'
 * NOTE: uncompressed length is stored in quicklistNode->sz.
 * When quicklistNode->zl is compressed, node->zl points to a quicklistLZF */
typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;
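1) sz: the byte length of the compressed data held in compressed.
2) compressed: the compressed data, made up of multiple segments; each segment consists of a control field and a data field.
3) The uncompressed length of the ziplist is kept in quicklistNode->sz.
4) When the ziplist is compressed, node->zl points to a quicklistLZF.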
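The "4+N byte struct" layout can be illustrated with a small sketch. It does not perform LZF compression; it only shows how a length header and the payload can live in one allocation through a flexible array member. `toy_lzf` and `toy_lzf_wrap` are hypothetical names introduced for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for quicklistLZF: a length followed by the bytes. */
typedef struct toy_lzf {
    unsigned int sz;    /* byte length of compressed[] */
    char compressed[];  /* flexible array member, occupies no space in sizeof */
} toy_lzf;

/* Copy n already-compressed bytes into a single header+payload allocation.
 * Returns NULL if the allocation fails; the caller releases it with free(). */
toy_lzf *toy_lzf_wrap(const char *bytes, unsigned int n) {
    toy_lzf *lzf = malloc(sizeof(*lzf) + n);
    if (lzf == NULL) return NULL;
    lzf->sz = n;
    memcpy(lzf->compressed, bytes, n);
    return lzf;
}
```

Because the header and the payload are contiguous, a single pointer (node->zl in the real struct) is enough to reach both.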
quicklist is defined as follows:
/* quicklist is a 40 byte struct (on 64-bit systems) describing a quicklist.
 * 'count' is the number of total entries.
 * 'len' is the number of quicklist nodes.
 * 'compress' is: 0 if compression disabled, otherwise it's the number
 *                of quicklistNodes to leave uncompressed at ends of quicklist.
 * 'fill' is the user-requested (or default) fill factor.
 * 'bookmarks' are an optional feature that is used by realloc this struct,
 *             so that they don't consume memory when not used. */
typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* total count of all entries in all ziplists */
    unsigned long len;          /* number of quicklistNodes */
    int fill : QL_FILL_BITS;    /* fill factor for individual nodes */
    unsigned int compress : QL_COMP_BITS; /* depth of end nodes not to compress;0=off */
    unsigned int bookmark_count: QL_BM_BITS;
    quicklistBookmark bookmarks[];
} quicklist;
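head: the head node.
tail: the tail node.
count: the total number of entries across all ziplists.
len: the total number of quicklistNode nodes.
fill: the maximum capacity of each quicklist node.
compress: the compression depth of the quicklist. 0 means no node is compressed; otherwise it is the number of nodes at each end that are left uncompressed.
bookmark_count: the size of the bookmarks array.
bookmarks: an optional field used when the quicklist is reallocated; it takes no space when unused.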
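Note:
Why not compress every node, instead of leaving compress as a configurable knob?
In practice, the two ends of a list change most often: commands such as lpush, rpush, lpop and rpop all operate on the ends. Compressing and decompressing those nodes over and over would cause unnecessary performance loss.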
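The effect of the compress depth can be sketched as a small predicate. This is an idealized model under stated assumptions, not the Redis implementation: it ignores any other conditions Redis applies before compressing a node and only answers "is this position outside the uncompressed window at both ends?". `toy_node_is_compressed` is a hypothetical helper.

```c
#include <stddef.h>

/* Returns 1 if the node at zero-based position idx (counted from the head)
 * in a list of len nodes falls in the compressed interior for the given
 * depth, 0 otherwise. depth == 0 means compression is disabled. */
int toy_node_is_compressed(size_t len, size_t idx, size_t depth) {
    if (depth == 0) return 0;             /* compression off */
    if (idx >= len) return 0;             /* position out of range */
    if (idx < depth) return 0;            /* within depth nodes of the head */
    if (len - 1 - idx < depth) return 0;  /* within depth nodes of the tail */
    return 1;                             /* interior node */
}
```

With five nodes and a depth of 1, only the head and the tail stay uncompressed; with a depth of 2, only the middle node is compressed.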
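There is also the fill field, which sets the maximum capacity of each quicklistNode. Different values carry different meanings. The default is -2, and other values can be configured; they mean the following: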
-1: the ziplist in each quicklistNode may not exceed 4 KB. (recommended)
-2: the ziplist in each quicklistNode may not exceed 8 KB. (default and recommended)
-3: the ziplist in each quicklistNode may not exceed 16 KB.
-4: the ziplist in each quicklistNode may not exceed 32 KB.
-5: the ziplist in each quicklistNode may not exceed 64 KB.
Any positive number: the maximum number of entries a ziplist may contain, up to 2^15.
The per-node size limit can be changed through the Redis configuration parameter list-max-ziplist-size.
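The mapping from fill to a limit can be sketched as follows. This is a simplified model of the rule described above, assuming the node's current byte size and the size of the new entry are already known; the real check in quicklist.c is more involved. `toy_fill_byte_limit` and `toy_node_allows_insert` are hypothetical names.

```c
#include <stddef.h>

/* Byte limit for a negative fill value (-1..-5), or 0 for any other value. */
size_t toy_fill_byte_limit(int fill) {
    if (fill > -1 || fill < -5) return 0;
    return (size_t)4096 << (-fill - 1);  /* -1 -> 4 KB, -2 -> 8 KB, ... -5 -> 64 KB */
}

/* Simplified admission check: may one more entry of entry_bytes go into a
 * node that currently holds node_count entries in node_bytes bytes? */
int toy_node_allows_insert(size_t node_bytes, unsigned int node_count,
                           int fill, size_t entry_bytes) {
    if (fill > 0)
        return node_count < (unsigned int)fill;  /* positive fill caps the entry count */
    size_t limit = toy_fill_byte_limit(fill);
    if (limit == 0) return 0;                    /* unsupported fill value */
    return node_bytes + entry_bytes <= limit;    /* negative fill caps the byte size */
}
```

With the default fill of -2, a node holding 8000 bytes accepts a 100-byte entry but rejects a 200-byte one in this model.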
Source code analysis:
Inserting an element at the head of a quicklist:
/* Add new entry to head node of quicklist.
 *
 * Returns 0 if used existing head.
 * Returns 1 if new head created. */
int quicklistPushHead(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_head = quicklist->head;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->head, quicklist->fill, sz))) {
        quicklist->head->zl =
            ziplistPush(quicklist->head->zl, value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(quicklist->head);
    } else {
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeBefore(quicklist, quicklist->head, node);
    }
    quicklist->count++;
    quicklist->head->count++;
    return (orig_head != quicklist->head);
}
value, sz: the content and size of the element to insert.
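The same control flow can be shown on a much smaller structure. The sketch below is a toy model, not Redis code: each node holds a fixed-size int array instead of a ziplist, and the hypothetical constant `TOY_CAP` stands in for fill. It keeps the return convention of quicklistPushHead (0 when the existing head is reused, 1 when a new head is created) and adds -1 for allocation failure.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_CAP 4  /* hypothetical per-node capacity, stands in for fill */

typedef struct toy_chunk {
    struct toy_chunk *prev, *next;
    int items[TOY_CAP];
    unsigned int count;
} toy_chunk;

typedef struct toy_list {
    toy_chunk *head, *tail;
    unsigned long count;  /* total items in all chunks */
    unsigned long len;    /* number of chunks */
} toy_list;

/* Push value at the head. Returns 0 if the existing head chunk was used,
 * 1 if a new head chunk was created, -1 if allocation failed. */
int toy_push_head(toy_list *l, int value) {
    toy_chunk *h = l->head;
    if (h != NULL && h->count < TOY_CAP) {
        /* Head has room: shift its items right by one and store in front. */
        memmove(&h->items[1], &h->items[0], h->count * sizeof(int));
        h->items[0] = value;
        h->count++;
        l->count++;
        return 0;
    }
    /* Head is missing or full: link a fresh chunk in front of it. */
    toy_chunk *n = calloc(1, sizeof(*n));
    if (n == NULL) return -1;
    n->items[0] = value;
    n->count = 1;
    n->next = h;
    if (h != NULL) h->prev = n; else l->tail = n;
    l->head = n;
    l->len++;
    l->count++;
    return 1;
}

/* Release every chunk and reset the list to empty. */
void toy_list_free(toy_list *l) {
    toy_chunk *c = l->head;
    while (c != NULL) {
        toy_chunk *next = c->next;
        free(c);
        c = next;
    }
    l->head = l->tail = NULL;
    l->count = l->len = 0;
}
```

The memmove in the "room available" branch only ever touches one small chunk, which is the point of the design: the cost of an insert is bounded by the node size rather than by the length of the whole list.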
Inserting an element at a given position in a quicklist:
/* Insert a new entry before or after existing entry 'entry'.
 *
 * If after==1, the new value is inserted after 'entry', otherwise
 * the new value is inserted before 'entry'. */
REDIS_STATIC void _quicklistInsert(quicklist *quicklist, quicklistEntry *entry,
                                   void *value, const size_t sz, int after) {
    int full = 0, at_tail = 0, at_head = 0, full_next = 0, full_prev = 0;
    int fill = quicklist->fill;
    quicklistNode *node = entry->node;
    quicklistNode *new_node = NULL;
    if (!node) {
        /* we have no reference node, so let's create only node in the list */
        D("No node given!");
        new_node = quicklistCreateNode();
        new_node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);
        __quicklistInsertNode(quicklist, NULL, new_node, after);
        new_node->count++;
        quicklist->count++;
        return;
    }
    /* Populate accounting flags for easier boolean checks later */
    if (!_quicklistNodeAllowInsert(node, fill, sz)) {
        D("Current node is full with count %d with requested fill %lu",
          node->count, fill);
        full = 1;
    }
    if (after && (entry->offset == node->count)) {
        D("At Tail of current ziplist");
        at_tail = 1;
        if (!_quicklistNodeAllowInsert(node->next, fill, sz)) {
            D("Next node is full too.");
            full_next = 1;
        }
    }
    if (!after && (entry->offset == 0)) {
        D("At Head");
        at_head = 1;
        if (!_quicklistNodeAllowInsert(node->prev, fill, sz)) {
            D("Prev node is full too.");
            full_prev = 1;
        }
    }
    /* Now determine where and how to insert the new element */
    if (!full && after) {
        D("Not full, inserting after current position.");
        quicklistDecompressNodeForUse(node);
        unsigned char *next = ziplistNext(node->zl, entry->zi);
        if (next == NULL) {
            node->zl = ziplistPush(node->zl, value, sz, ZIPLIST_TAIL);
        } else {
            node->zl = ziplistInsert(node->zl, next, value, sz);
        }
        node->count++;
        quicklistNodeUpdateSz(node);
        quicklistRecompressOnly(quicklist, node);
    } else if (!full && !after) {
        D("Not full, inserting before current position.");
        quicklistDecompressNodeForUse(node);
        node->zl = ziplistInsert(node->zl, entry->zi, value, sz);
        node->count++;
        quicklistNodeUpdateSz(node);
        quicklistRecompressOnly(quicklist, node);
    } else if (full && at_tail && node->next && !full_next && after) {
        /* If we are: at tail, next has free space, and inserting after:
         *   - insert entry at head of next node. */
        D("Full and tail, but next isn't full; inserting next node head");
        new_node = node->next;
        quicklistDecompressNodeForUse(new_node);
        new_node->zl = ziplistPush(new_node->zl, value, sz, ZIPLIST_HEAD);
        new_node->count++;
        quicklistNodeUpdateSz(new_node);
        quicklistRecompressOnly(quicklist, new_node);
    } else if (full && at_head && node->prev && !full_prev && !after) {
        /* If we are: at head, previous has free space, and inserting before:
         *   - insert entry at tail of previous node. */
        D("Full and head, but prev isn't full, inserting prev node tail");
        new_node = node->prev;
        quicklistDecompressNodeForUse(new_node);
        new_node->zl = ziplistPush(new_node->zl, value, sz, ZIPLIST_TAIL);
        new_node->count++;
        quicklistNodeUpdateSz(new_node);
        quicklistRecompressOnly(quicklist, new_node);
    } else if (full && ((at_tail && node->next && full_next && after) ||
                        (at_head && node->prev && full_prev && !after))) {
        /* If we are: full, and our prev/next is full, then:
         *   - create new node and attach to quicklist */
        D("\tprovisioning new node...");
        new_node = quicklistCreateNode();
        new_node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);
        new_node->count++;
        quicklistNodeUpdateSz(new_node);
        __quicklistInsertNode(quicklist, node, new_node, after);
    } else if (full) {
        /* else, node is full we need to split it. */
        /* covers both after and !after cases */
        D("\tsplitting node...");
        quicklistDecompressNodeForUse(node);
        new_node = _quicklistSplitNode(node, entry->offset, after);
        new_node->zl = ziplistPush(new_node->zl, value, sz,
                                   after ? ZIPLIST_HEAD : ZIPLIST_TAIL);
        new_node->count++;
        quicklistNodeUpdateSz(new_node);
        __quicklistInsertNode(quicklist, node, new_node, after);
        _quicklistMergeNodes(quicklist, node);
    }
    quicklist->count++;
}
void quicklistInsertBefore(quicklist *quicklist, quicklistEntry *entry,
                           void *value, const size_t sz) {
    _quicklistInsert(quicklist, entry, value, sz, 0);
}
void quicklistInsertAfter(quicklist *quicklist, quicklistEntry *entry,
                          void *value, const size_t sz) {
    _quicklistInsert(quicklist, entry, value, sz, 1);
}
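Parameters and flags:
entry: a quicklistEntry struct. quicklistEntry.node is the quicklistNode the element is inserted into, and quicklistEntry.offset is the index within that node's ziplist.
after: whether to insert after the position given by quicklistEntry.offset.
full: whether the ziplist of the target node is already full.
at_tail: whether the insert lands at the tail of the ziplist.
at_head: whether the insert lands at the head of the ziplist.
full_next: whether the successor node is full.
full_prev: whether the predecessor node is full.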
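The branch structure of _quicklistInsert is easier to follow when reduced to a pure decision function over these flags. The sketch below mirrors the order of the if/else chain in the listing; the enum and the function `toy_choose_insert_action` are hypothetical names, and `has_next` / `has_prev` stand for the `node->next` / `node->prev` non-NULL checks.

```c
/* The five outcomes of the if/else chain in _quicklistInsert. */
typedef enum toy_insert_action {
    TOY_INSERT_IN_NODE,    /* node has room: insert into its own ziplist */
    TOY_INSERT_NEXT_HEAD,  /* full, at tail, next has room: push to next's head */
    TOY_INSERT_PREV_TAIL,  /* full, at head, prev has room: push to prev's tail */
    TOY_INSERT_NEW_NODE,   /* full and the neighbor is full too: create a node */
    TOY_INSERT_SPLIT       /* full, anywhere else: split the node */
} toy_insert_action;

/* Choose the action using the same conditions, in the same order,
 * as the branches of _quicklistInsert. All arguments are 0/1 flags. */
toy_insert_action toy_choose_insert_action(int full, int after,
                                           int at_tail, int at_head,
                                           int has_next, int has_prev,
                                           int full_next, int full_prev) {
    if (!full) return TOY_INSERT_IN_NODE;
    if (at_tail && has_next && !full_next && after) return TOY_INSERT_NEXT_HEAD;
    if (at_head && has_prev && !full_prev && !after) return TOY_INSERT_PREV_TAIL;
    if ((at_tail && has_next && full_next && after) ||
        (at_head && has_prev && full_prev && !after))
        return TOY_INSERT_NEW_NODE;
    return TOY_INSERT_SPLIT;
}
```

One consequence worth noticing: a full node with no neighbor on the relevant side matches none of the neighbor branches, so it ends up in the split case.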
Configuration:
list-max-ziplist-size: sets the server.list_max_ziplist_size attribute, whose value is assigned to quicklist.fill.
list-compress-depth: sets the server.list_compress_depth attribute, whose value is assigned to quicklist.compress.
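Encoding:
Because the ziplist is compact and uses memory efficiently, it is widely used in Redis and can store the data of user-facing lists, hashes, sorted sets and so on.
The list type has only one encoding, OBJ_ENCODING_QUICKLIST, which stores the data in a quicklist (redisObject.ptr points to the quicklist struct).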
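Summary:
1) A ziplist is a compact data structure that stores all the data of a list in one contiguous block of memory.
2) Elements inside a ziplist can use different encodings, to save as much memory as possible.
3) A quicklist splits the ziplist into pieces to make operations such as inserting and deleting elements faster.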