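How to start exploring Redis
To see how Redis starts up, begin with server.c. Redis also uses an event-driven network framework, whose code includes ae_epoll.c among other files. Besides the event-driven framework, the networking code also covers the low-level TCP communication and the client implementation, as shown in the figure below.
Redis's underlying data structures
The figure below shows Redis's underlying data structures.
The five common Redis data types are string, hash, list, set, and sorted set. Redis is written in C, so why does its string type use SDS (Simple Dynamic String) rather than a plain C string?
SDS makes string operations more efficient, and it can hold binary data. A C char* string is a contiguous block of memory whose end is marked by a "\0" character, so library functions such as strlen scan until they meet the first "\0". If the data to be stored contains a "\0" byte of its own, as arbitrary binary data may, the string would be treated as ending there. SDS records the length explicitly and so avoids this misjudgment.
An SDS contains a character array buf[] that holds the actual data, plus three metadata fields: len, the number of bytes currently used; alloc, the number of bytes allocated for the array; and flags, the SDS type. It also uses a compiler attribute to save memory: the struct is declared with __attribute__ ((__packed__)), which tells the compiler not to pad the sdshdr8 struct for alignment but to lay its fields out compactly, as shown in the figure below.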
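The two points above (binary safety and packed headers) can be tried out with a small sketch. It is not the Redis implementation: the names toy_hdr8, toy_hdr32_packed, toy_hdr32_padded, toy_sds_new, toy_sds_len and toy_sds_free are made up for illustration, and it assumes GCC or Clang, which accept the packed attribute.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for sdshdr8: every metadata field is one byte wide. */
struct __attribute__((__packed__)) toy_hdr8 {
    uint8_t len;          /* bytes currently used in buf */
    uint8_t alloc;        /* bytes allocated for buf */
    unsigned char flags;  /* header type tag */
    char buf[];           /* payload; may contain '\0' bytes */
};

/* Wider headers, used only to compare sizes with and without packing. */
struct __attribute__((__packed__)) toy_hdr32_packed {
    uint32_t len;
    uint32_t alloc;
    unsigned char flags;
    char buf[];
};
struct toy_hdr32_padded {
    uint32_t len;
    uint32_t alloc;
    unsigned char flags;
    char buf[];
};

/* Copy len bytes (which may include '\0') and return a pointer to buf. */
char *toy_sds_new(const void *data, uint8_t len) {
    struct toy_hdr8 *h = malloc(sizeof(*h) + len + 1);
    assert(h != NULL);
    h->len = len;
    h->alloc = len;
    h->flags = 0;
    memcpy(h->buf, data, len);
    h->buf[len] = '\0';   /* keep a terminator so C string functions still work */
    return h->buf;
}

/* O(1) length: step back from buf to the header instead of scanning for '\0'. */
size_t toy_sds_len(const char *s) {
    const struct toy_hdr8 *h =
        (const struct toy_hdr8 *)(s - sizeof(struct toy_hdr8));
    return h->len;
}

void toy_sds_free(char *s) {
    free(s - sizeof(struct toy_hdr8));
}
```

Calling toy_sds_new("ab\0cd", 5) stores all five bytes: strlen stops at the embedded "\0" and reports 2, while toy_sds_len reads the header and reports 5. On common 64-bit targets, sizeof(struct toy_hdr32_packed) is 9, while the padded variant is rounded up to a multiple of its alignment.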
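The dual-index mechanism
Now for the main topic of this article. What is an index? In a database, an index is a separate, physical storage structure that sorts the values of one or more columns of a table, and it speeds up lookups.
Several earlier articles have already covered the data structure behind ZSet. A Sorted Set uses a skiplist so that it can support range queries, and it also uses a hash table as an index so that an element's weight (score) can be fetched in constant time. With these two indexes, a range query costs O(logN + M), where M is the number of elements returned, and a single-element score lookup costs O(1).
Basic structure of the Sorted Set
The zset definition is in the server.h header. It contains a hash table dict and a skiplist zsl, as shown in the figure below.
One data structure thus holds two index structures: the hash table serves point lookups and the skiplist serves range queries, so each kind of query goes to the structure suited to it. What does each of the two store, and how are they kept consistent?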
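One common way to keep two indexes consistent is to route every write through a single function that updates both. The sketch below illustrates that idea with toy structures only: a sorted array stands in for the skiplist and a small open-addressing table stands in for the dict. All names (toy_zset, toy_add, toy_score, toy_count_in_range) are hypothetical, and nothing here is taken from the Redis source.

```c
#include <assert.h>
#include <string.h>

#define TOY_CAP 64      /* max elements in the sorted array */
#define TOY_SLOTS 128   /* hash slots: a power of two, larger than TOY_CAP */

typedef struct { const char *ele; double score; } toy_entry;

typedef struct {
    toy_entry sorted[TOY_CAP];   /* ordered by score: stands in for the skiplist */
    int len;
    toy_entry slots[TOY_SLOTS];  /* keyed by ele: stands in for the dict */
} toy_zset;                      /* must be zero-initialized before use */

static unsigned toy_hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h & (TOY_SLOTS - 1);
}

/* Find the slot holding ele, or the empty slot where it would go. */
static toy_entry *toy_slot(toy_zset *z, const char *ele) {
    unsigned i = toy_hash(ele);
    while (z->slots[i].ele && strcmp(z->slots[i].ele, ele) != 0)
        i = (i + 1) & (TOY_SLOTS - 1);
    return &z->slots[i];
}

/* Single write path: both indexes change together.
 * Returns 1 if added, 0 if ele already exists or the set is full. */
int toy_add(toy_zset *z, const char *ele, double score) {
    toy_entry *slot = toy_slot(z, ele);
    if (slot->ele || z->len == TOY_CAP) return 0;
    int pos = z->len;
    while (pos > 0 && z->sorted[pos - 1].score > score) {
        z->sorted[pos] = z->sorted[pos - 1];   /* shift larger scores right */
        pos--;
    }
    z->sorted[pos].ele = ele;
    z->sorted[pos].score = score;
    slot->ele = ele;
    slot->score = score;
    z->len++;
    return 1;
}

/* Point lookup through the hash index; writes *score when found. */
int toy_score(toy_zset *z, const char *ele, double *score) {
    toy_entry *slot = toy_slot(z, ele);
    if (!slot->ele) return 0;
    *score = slot->score;
    return 1;
}

/* Range query through the ordered index: count scores in [min, max]. */
int toy_count_in_range(const toy_zset *z, double min, double max) {
    int lo = 0, hi = z->len;
    while (lo < hi) {                     /* binary search: first score >= min */
        int mid = lo + (hi - lo) / 2;
        if (z->sorted[mid].score < min) lo = mid + 1; else hi = mid;
    }
    int n = 0;
    while (lo + n < z->len && z->sorted[lo + n].score <= max) n++;
    return n;
}
```

Because toy_add is the only function that writes, the two structures cannot drift apart: a point lookup and a range query always see the same set of elements.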
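The skiplist data structure
As mentioned in earlier articles, a skiplist is a multi-level ordered linked list, as shown in the figure below.
Here is the definition of a skiplist node: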
/* ZSETs use a specialized version of Skiplists */
typedef struct zskiplistNode {
    // The element stored in the Sorted Set
    sds ele;
    // The element's weight (score)
    double score;
    // Backward pointer
    struct zskiplistNode *backward;
    // Per-node level array; each entry holds that level's forward pointer and span
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;
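Because a skiplist is a multi-level ordered linked list, each level is itself a chain of nodes connected by pointers. The node definition therefore includes a level array of zskiplistLevel structs, and each array element corresponds to one level of the skiplist. On each level, forward points to the next node on that same level, and span records how many nodes that pointer skips over.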
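To make forward and span concrete, here is a minimal sketch that allocates nodes with a flexible level[] array and links three of them by hand. The names toy_node, toy_create_node, toy_build_demo and toy_rank_at_end are hypothetical, and a plain const char * replaces sds so the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_node {
    const char *ele;
    double score;
    struct toy_node *backward;
    struct toy_level {
        struct toy_node *forward;   /* next node on the same level */
        unsigned long span;         /* how many nodes this pointer skips over */
    } level[];
} toy_node;

/* One allocation holds the fixed fields plus 'level' entries of level[]. */
toy_node *toy_create_node(int level, double score, const char *ele) {
    toy_node *n = malloc(sizeof(*n) + level * sizeof(struct toy_level));
    assert(n != NULL);
    n->ele = ele;
    n->score = score;
    n->backward = NULL;
    for (int i = 0; i < level; i++) {
        n->level[i].forward = NULL;
        n->level[i].span = 0;
    }
    return n;
}

/* Build header -> a -> b on level 0 and header -> b on level 1.
 * Returns the header; nodes are not freed in this short demo. */
toy_node *toy_build_demo(void) {
    toy_node *h = toy_create_node(2, 0, NULL);
    toy_node *a = toy_create_node(1, 1.0, "a");
    toy_node *b = toy_create_node(2, 2.0, "b");
    h->level[0].forward = a; h->level[0].span = 1;
    a->level[0].forward = b; a->level[0].span = 1;
    h->level[1].forward = b; h->level[1].span = 2;   /* jumps over a */
    b->backward = a;
    return h;
}

/* Sum the spans along one level until that level ends. */
unsigned long toy_rank_at_end(const toy_node *header, int level) {
    unsigned long rank = 0;
    const toy_node *x = header;
    while (x->level[level].forward) {
        rank += x->level[level].span;
        x = x->level[level].forward;
    }
    return rank;
}
```

Walking level 0 takes two hops of span 1, and walking level 1 takes one hop of span 2. Both arrive at b with an accumulated rank of 2, which is why the spans can be used to compute ranks without touching every node.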
That is the node definition. The skiplist itself is defined as follows:
typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;
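The skiplist records its head node, its tail node, its length, and the highest level currently in use. Queries use each node's level array to speed up the search. In the lookup code below, the search descends level by level from the top. In the idealized layout where each level holds about half as many nodes as the level beneath it, the search resembles a binary search, so the time complexity is O(logN):
/* Finds an element by its rank. The rank argument needs to be 1-based. */
zskiplistNode* zslGetElementByRank(zskiplist *zsl, unsigned long rank) {
    zskiplistNode *x;
    unsigned long traversed = 0;
    int i;
    // Start from the skiplist header
    x = zsl->header;
    // Walk down from the highest level, one level at a time
    for (i = zsl->level-1; i >= 0; i--) {
        while (x->level[i].forward && (traversed + x->level[i].span) <= rank)
        {
            traversed += x->level[i].span;
            x = x->level[i].forward;
        }
        if (traversed == rank) {
            return x;
        }
    }
    return NULL;
}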
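To see why the descent is logarithmic, the sketch below builds an idealized list of 1024 nodes in which node i appears on level l whenever i is a multiple of 2^l, then runs the same loop shape as zslGetElementByRank with a hop counter added. The fixed-size arrays and the names toy_nodes, toy_build and toy_get_by_rank are assumptions made for the demo, not how Redis stores nodes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_N 1024       /* number of elements */
#define TOY_LEVELS 11    /* levels 0..10; level l links every 2^l-th node */

typedef struct toy_node {
    struct {
        struct toy_node *forward;
        unsigned long span;
    } level[TOY_LEVELS];
} toy_node;

/* Index 0 is the header; the node at index i has rank i. */
static toy_node toy_nodes[TOY_N + 1];

/* Build the idealized layout described above. */
void toy_build(void) {
    for (int l = 0; l < TOY_LEVELS; l++) {
        unsigned long step = 1UL << l;
        for (unsigned long i = 0; i <= TOY_N; i += step) {
            unsigned long next = i + step;
            toy_nodes[i].level[l].forward =
                (next <= TOY_N) ? &toy_nodes[next] : NULL;
            toy_nodes[i].level[l].span = (next <= TOY_N) ? step : 0;
        }
    }
}

/* Rank lookup with the same control flow as the listing above,
 * counting how many forward pointers are followed. */
toy_node *toy_get_by_rank(unsigned long rank, int *hops) {
    toy_node *x = &toy_nodes[0];
    unsigned long traversed = 0;
    *hops = 0;
    for (int i = TOY_LEVELS - 1; i >= 0; i--) {
        while (x->level[i].forward &&
               (traversed + x->level[i].span) <= rank) {
            traversed += x->level[i].span;
            x = x->level[i].forward;
            (*hops)++;
        }
        if (traversed == rank) return x;
    }
    return NULL;
}
```

After toy_build(), looking up rank 1023 follows 10 pointers instead of the 1023 a plain linked list would need, and rank 1024 is reached in a single hop along the top level. A rank beyond the end of the list returns NULL.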
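This layout speeds up lookups, but it has a cost: to keep the node counts of adjacent levels at a 2:1 ratio, the structure would have to be adjusted every time a node is added or removed, which adds overhead. To avoid this, the skiplist takes a different approach when it creates a node: it picks the node's level count at random. Adjacent levels then no longer need to keep a strict 2:1 ratio, which lowers the cost of insertion.
Here is the insertion code: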
/* Insert a new node in the skiplist. Assumes the element does not already
 * exist (up to the caller to enforce that). The skiplist takes ownership
 * of the passed SDS string 'ele'. */
zskiplistNode *zslInsert(zskiplist *zsl, double score, sds ele) {
    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x;
    unsigned int rank[ZSKIPLIST_MAXLEVEL];
    int i, level;
    serverAssert(!isnan(score));
    x = zsl->header;
    for (i = zsl->level-1; i >= 0; i--) {
        /* store rank that is crossed to reach the insert position */
        rank[i] = i == (zsl->level-1) ? 0 : rank[i+1];
        while (x->level[i].forward &&
                (x->level[i].forward->score < score ||
                    (x->level[i].forward->score == score &&
                    sdscmp(x->level[i].forward->ele,ele) < 0)))
        {
            rank[i] += x->level[i].span;
            x = x->level[i].forward;
        }
        update[i] = x;
    }
    /* we assume the element is not already inside, since we allow duplicated
     * scores, reinserting the same element should never happen since the
     * caller of zslInsert() should test in the hash table if the element is
     * already inside or not. */
    level = zslRandomLevel();
    if (level > zsl->level) {
        for (i = zsl->level; i < level; i++) {
            rank[i] = 0;
            update[i] = zsl->header;
            update[i]->level[i].span = zsl->length;
        }
        zsl->level = level;
    }
    x = zslCreateNode(level,score,ele);
    for (i = 0; i < level; i++) {
        x->level[i].forward = update[i]->level[i].forward;
        update[i]->level[i].forward = x;
        /* update span covered by update[i] as x is inserted here */
        x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]);
        update[i]->level[i].span = (rank[0] - rank[i]) + 1;
    }
    /* increment span for untouched levels */
    for (i = level; i < zsl->level; i++) {
        update[i]->level[i].span++;
    }
    x->backward = (update[0] == zsl->header) ? NULL : update[0];
    if (x->level[0].forward)
        x->level[0].forward->backward = x;
    else
        zsl->tail = x;
    zsl->length++;
    return x;
}
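The zslRandomLevel function decides how many levels a node gets. The level count starts at 1, and one more level is added each time a random draw falls below the threshold derived from ZSKIPLIST_P (the probability of growing by a level). The result is capped at ZSKIPLIST_MAXLEVEL, which is 64. The code is as follows: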
#define ZSKIPLIST_MAXLEVEL 64 /* Should be enough for 2^64 elements */
#define ZSKIPLIST_P 0.25 /* Skiplist P = 1/4 */
/* Returns a random level for the new skiplist node we are going to create.
 * The return value of this function is between 1 and ZSKIPLIST_MAXLEVEL
 * (both inclusive), with a powerlaw-alike distribution where higher
 * levels are less likely to be returned. */
int zslRandomLevel(void) {
    // Start with one level
    int level = 1;
    while ((random()&0xFFFF) < (ZSKIPLIST_P * 0xFFFF))
        level += 1;
    return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
}
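With a growth probability p = 0.25, a node reaches at least level k with probability p^(k-1), so the expected level is 1/(1-p), about 1.33. The sketch below checks this empirically. It keeps the same loop as zslRandomLevel but swaps random() for a small xorshift32 generator so that runs are reproducible; that generator and the names toy_next, toy_random_level and toy_sample_levels are assumptions of the demo, not part of Redis.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAXLEVEL 64
#define TOY_P 0.25

/* xorshift32: a stand-in for random(), chosen only for reproducibility. */
static uint32_t toy_state = 2463534242u;
static uint32_t toy_next(void) {
    toy_state ^= toy_state << 13;
    toy_state ^= toy_state >> 17;
    toy_state ^= toy_state << 5;
    return toy_state;
}

/* Same loop shape as zslRandomLevel, driven by toy_next(). */
int toy_random_level(void) {
    int level = 1;
    while ((toy_next() & 0xFFFF) < (TOY_P * 0xFFFF))
        level += 1;
    return (level < TOY_MAXLEVEL) ? level : TOY_MAXLEVEL;
}

/* Draw n levels, fill counts[1..TOY_MAXLEVEL], and return the mean level. */
double toy_sample_levels(unsigned long n,
                         unsigned long counts[TOY_MAXLEVEL + 1]) {
    unsigned long sum = 0;
    for (int i = 0; i <= TOY_MAXLEVEL; i++) counts[i] = 0;
    for (unsigned long k = 0; k < n; k++) {
        int lv = toy_random_level();
        counts[lv]++;
        sum += (unsigned long)lv;
    }
    return (double)sum / (double)n;
}
```

Sampling a large number of levels, for example toy_sample_levels(100000, counts), should give a mean close to 1.33, with roughly three quarters of the nodes at level 1 and each higher level about a quarter as populated as the one below it.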
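Combining the hash table and the skiplist
The hash table's data structure is not covered here. How are the two index structures used together?
When a zset is created, the code first calls dictCreate to create the hash table and then calls zslCreate to create the skiplist, as shown below: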
zs = zmalloc(sizeof(*zs));
zs->dict = dictCreate(&zsetDictType,NULL);
zs->zsl = zslCreate();
Inserting data into a Sorted Set calls the zsetAdd function, shown below:
/* Add a new element or update the score of an existing element in a sorted
 * set, regardless of its encoding.
 *
 * The set of flags changes the command behavior. They are passed with an integer
 * pointer since the function will clear the flags and populate them with
 * other flags to indicate different conditions.
 *
 * The input flags are the following:
 *
 * ZADD_INCR: Increment the current element score by 'score' instead of updating
 *            the current element score. If the element does not exist, we
 *            assume 0 as previous score.
 * ZADD_NX:   Perform the operation only if the element does not exist.
 * ZADD_XX:   Perform the operation only if the element already exists.
 *
 * When ZADD_INCR is used, the new score of the element is stored in
 * '*newscore' if 'newscore' is not NULL.
 *
 * The returned flags are the following:
 *
 * ZADD_NAN:     The resulting score is not a number.
 * ZADD_ADDED:   The element was added (not present before the call).
 * ZADD_UPDATED: The element score was updated.
 * ZADD_NOP:     No operation was performed because of NX or XX.
 *
 * Return value:
 *
 * The function returns 1 on success, and sets the appropriate flags
 * ADDED or UPDATED to signal what happened during the operation (note that
 * none could be set if we re-added an element using the same score it used
 * to have, or in the case a zero increment is used).
 *
 * The function returns 0 on error, currently only when the increment
 * produces a NAN condition, or when the 'score' value is NAN since the
 * start.
 *
 * The command as a side effect of adding a new element may convert the sorted
 * set internal encoding from ziplist to hashtable+skiplist.
 *
 * Memory management of 'ele':
 *
 * The function does not take ownership of the 'ele' SDS string, but copies
 * it if needed. */
int zsetAdd(robj *zobj, double score, sds ele, int *flags, double *newscore) {
    /* Turn options into simple to check vars. */
    int incr = (*flags & ZADD_INCR) != 0;
    int nx = (*flags & ZADD_NX) != 0;
    int xx = (*flags & ZADD_XX) != 0;
    *flags = 0; /* We'll return our response flags. */
    double curscore;
    /* NaN as input is an error regardless of all the other parameters. */
    if (isnan(score)) {
        *flags = ZADD_NAN;
        return 0;
    }
    /* Update the sorted set according to its encoding. */
    // Handling when the sorted set uses the ziplist encoding
    if (zobj->encoding == OBJ_ENCODING_ZIPLIST) {
        unsigned char *eptr;
        if ((eptr = zzlFind(zobj->ptr,ele,&curscore)) != NULL) {
            /* NX? Return, same element already exists. */
            if (nx) {
                *flags |= ZADD_NOP;
                return 1;
            }
            /* Prepare the score for the increment if needed. */
            if (incr) {
                score += curscore;
                if (isnan(score)) {
                    *flags |= ZADD_NAN;
                    return 0;
                }
                if (newscore) *newscore = score;
            }
            /* Remove and re-insert when score changed. */
            if (score != curscore) {
                zobj->ptr = zzlDelete(zobj->ptr,eptr);
                zobj->ptr = zzlInsert(zobj->ptr,ele,score);
                *flags |= ZADD_UPDATED;
            }
            return 1;
        } else if (!xx) {
            /* Optimize: check if the element is too large or the list
             * becomes too long *before* executing zzlInsert. */
            zobj->ptr = zzlInsert(zobj->ptr,ele,score);
            if (zzlLength(zobj->ptr) > server.zset_max_ziplist_entries ||
                sdslen(ele) > server.zset_max_ziplist_value)
                zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);
            if (newscore) *newscore = score;
            *flags |= ZADD_ADDED;
            return 1;
        } else {
            *flags |= ZADD_NOP;
            return 1;
        }
    // Handling when the sorted set uses the skiplist encoding
    } else if (zobj->encoding == OBJ_ENCODING_SKIPLIST) {
        zset *zs = zobj->ptr;
        zskiplistNode *znode;