Redis Source Code Analysis, Part 3: Basic Data Structures


fpcc


Category: Database Development. Tags: redis

Article link: https://blog.csdn.net/fpcc/article/details/108896713


I. Basic Data Structures

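With the overall architecture and flow of Redis in hand, we start by analyzing the basic data structures. This serves two purposes: later, when each module is analyzed on its own, unfamiliarity with the data structures will not make the source harder to read; and studying the basic structures gives a first feel for Redis's design style. Redis has five basic data types:
string: a string. In the KV model every key is a string, and the other data types can be seen as built on top of it. A string can hold plain text, structured text (JSON, XML), numbers, and even binary data such as images, audio, or video. Its length is limited to 512 MB.
list: a list stores multiple strings in order and can hold up to 2^32-1 elements. Its underlying encodings are linkedlist, ziplist, and (from version 3.2 on) quicklist.
hash: a hash stores a mapping of multiple key-value pairs; the values can be numbers or strings. To distinguish these pairs from the top-level KV pair, they are called field-value pairs.
set: a set stores multiple strings but, unlike a list, cannot contain duplicates. It is also unordered, so elements cannot be fetched by index. It supports intersection, union, and difference across multiple sets.
zset: a sorted set. It shares the basic properties of a set, but it is ordered. Members are all distinct, and each member carries a value called its score. Elements can be accessed by member as well as by score and score order.
Studying these types shows that the main underlying data structures are ziplist, quicklist, linkedlist, hashtable, skiplist, dict, and SDS. They are given a first look below and will be analyzed at the source level in later articles.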

II. A First Look at the Data Structures

1. string

typedef char * sds;

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

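This one is really simple: typedef redefines char * as sds, and string operations work on that pointer. For efficiency, several header types of different sizes are defined, so that a short string pays only for a small header. The header sits immediately before buf, and the low 3 bits of flags record which header type is in use.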
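The header selection and the "flags byte just before buf" layout can be sketched as follows. This is a minimal illustration rather than the Redis source: every `sk_` name is hypothetical, the type tag values are assumptions of the sketch, and only the 8-bit header is implemented.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header type tags kept in the low 3 bits of flags (values assumed for this sketch). */
enum { SK_TYPE_5 = 0, SK_TYPE_8 = 1, SK_TYPE_16 = 2, SK_TYPE_32 = 3, SK_TYPE_64 = 4 };
#define SK_TYPE_MASK 7

/* Same layout as sdshdr8 above: len, alloc, flags, then the characters. */
struct __attribute__((__packed__)) sk_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Pick the smallest header whose length field can hold `len`. */
int sk_req_type(uint64_t len) {
    if (len < (1u << 5)) return SK_TYPE_5;
    if (len < (1u << 8)) return SK_TYPE_8;
    if (len < (1u << 16)) return SK_TYPE_16;
    if (len < (1ull << 32)) return SK_TYPE_32;
    return SK_TYPE_64;
}

/* Allocate header + data + terminator and return a pointer to buf.
 * The caller must guarantee len < 256, since only the 8-bit header is used. */
char *sk_new8(const char *init, size_t len) {
    struct sk_hdr8 *h = malloc(sizeof(*h) + len + 1);
    if (h == NULL) return NULL;
    h->len = (uint8_t)len;
    h->alloc = (uint8_t)len;
    h->flags = SK_TYPE_8;
    memcpy(h->buf, init, len);
    h->buf[len] = '\0';
    return h->buf;
}

/* The flags byte is always the byte right before the string pointer. */
int sk_type(const char *s) {
    return ((unsigned char)s[-1]) & SK_TYPE_MASK;
}

/* Step back over the header to read the stored length in O(1). */
size_t sk_len8(const char *s) {
    const struct sk_hdr8 *h = (const struct sk_hdr8 *)(s - sizeof(struct sk_hdr8));
    return h->len;
}

void sk_free8(char *s) {
    free(s - sizeof(struct sk_hdr8));
}
```

Because the returned pointer addresses buf directly, it can be passed to ordinary C string functions, while the length stays available without scanning for the terminator.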

2. list

typedef struct list {
    listNode *head;
    listNode *tail;
    void *(*dup)(void *ptr);
    void (*free)(void *ptr);
    int (*match)(void *ptr, void *key);
    unsigned long len;
} list;
typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? */
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* total count of all entries in all ziplists */
    unsigned long len;          /* number of quicklistNodes */
    int fill : 16;              /* fill factor for individual nodes */
    unsigned int compress : 16; /* depth of end nodes not to compress;0=off */
} quicklist;

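In older versions (<3.2), lists were implemented with linkedlist and ziplist. Newer versions use quicklist, which combines the strengths of both: it is a doubly linked list whose nodes each hold a ziplist.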
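The idea of a linked list of small packed blocks can be sketched as follows. This is an illustration under simplifying assumptions, not the Redis implementation: a fixed-size int array stands in for the ziplist, `SK_NODE_CAP` stands in for the fill factor, and all `sk_` names are hypothetical.

```c
#include <stdlib.h>

#define SK_NODE_CAP 4 /* hypothetical per-node capacity, standing in for the fill factor */

typedef struct sk_qnode {
    struct sk_qnode *prev, *next;
    int items[SK_NODE_CAP]; /* stand-in for the packed ziplist */
    unsigned int count;     /* entries stored in this node */
} sk_qnode;

typedef struct sk_qlist {
    sk_qnode *head, *tail;
    unsigned long count; /* total entries across all nodes */
    unsigned long len;   /* number of nodes */
} sk_qlist;

/* Append a value; a new node is linked in only when the tail node is full. */
int sk_qlist_push_tail(sk_qlist *ql, int value) {
    sk_qnode *n = ql->tail;
    if (n == NULL || n->count == SK_NODE_CAP) {
        n = calloc(1, sizeof(*n));
        if (n == NULL) return -1;
        n->prev = ql->tail;
        if (ql->tail) ql->tail->next = n;
        else ql->head = n;
        ql->tail = n;
        ql->len++;
    }
    n->items[n->count++] = value;
    ql->count++;
    return 0;
}

/* Index lookup skips whole nodes by their entry count. */
int sk_qlist_index(const sk_qlist *ql, unsigned long idx, int *out) {
    if (idx >= ql->count) return -1;
    const sk_qnode *n = ql->head;
    while (idx >= n->count) {
        idx -= n->count;
        n = n->next;
    }
    *out = n->items[idx];
    return 0;
}

void sk_qlist_free(sk_qlist *ql) {
    sk_qnode *n = ql->head;
    while (n) {
        sk_qnode *next = n->next;
        free(n);
        n = next;
    }
    ql->head = ql->tail = NULL;
    ql->count = ql->len = 0;
}
```

Compared with one node per element, this layout pays the two-pointer overhead once per block, while pushes at either end still touch only one node.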

3. hash

typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
} zlentry;

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;


/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;

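A hash is implemented underneath with either a ziplist or a hashtable, chosen by the number of fields and the size of the values. By default, the ziplist is used while the hash has no more than 512 fields and every field and value is no longer than 64 bytes (the hash-max-ziplist-entries and hash-max-ziplist-value settings); otherwise the hashtable is used.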
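The encoding choice described above amounts to two threshold checks. A minimal sketch, assuming the thresholds 512 and 64 and treating a hash that exactly reaches a threshold as still compact; the function and constant names are hypothetical, not Redis identifiers:

```c
#include <stddef.h>

#define SK_HASH_MAX_ZIPLIST_ENTRIES 512 /* assumed threshold: number of fields */
#define SK_HASH_MAX_ZIPLIST_VALUE 64    /* assumed threshold: bytes per field or value */

typedef enum { SK_ENC_ZIPLIST, SK_ENC_HT } sk_hash_encoding;

/* nfields: number of field-value pairs in the hash.
 * longest_len: length in bytes of the longest field or value. */
sk_hash_encoding sk_hash_pick_encoding(size_t nfields, size_t longest_len) {
    if (nfields > SK_HASH_MAX_ZIPLIST_ENTRIES) return SK_ENC_HT;
    if (longest_len > SK_HASH_MAX_ZIPLIST_VALUE) return SK_ENC_HT;
    return SK_ENC_ZIPLIST;
}
```

Small hashes stay in the compact sequential encoding, where a linear scan is cheap; once either threshold is exceeded, the hashtable's constant-time lookup is worth its extra memory.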

4. set

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;
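An intset keeps its integers sorted in contents, with encoding giving the width of each element, so membership can be tested by binary search. A minimal sketch, assuming a fixed 32-bit element width and a plain array in place of the contents field; `sk_intset_search` is a hypothetical name:

```c
#include <stddef.h>
#include <stdint.h>

/* Binary search over a sorted array of 32-bit integers.
 * Returns 1 and stores the index in *pos when the value is present;
 * otherwise returns 0 and stores the position where it would be inserted. */
int sk_intset_search(const int32_t *sorted, uint32_t length, int32_t value, uint32_t *pos) {
    uint32_t lo = 0, hi = length; /* the search window is [lo, hi) */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < value) lo = mid + 1;
        else hi = mid;
    }
    *pos = lo;
    return lo < length && sorted[lo] == value;
}
```

Returning the insertion position on a miss lets an insert routine reuse the same search to keep the array sorted and free of duplicates.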

5. zset

/* ZSETs use a specialized version of Skiplists */

typedef struct zset {
    dict *dict;
    zskiplist *zsl;
} zset;

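A sorted set is implemented with a skip list together with a hash dictionary, which looks a little complicated: the dict maps each member to its score for direct lookup, while the skip list keeps the members ordered by score.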

6. Skip list

/* ZSETs use a specialized version of Skiplists */
typedef struct zskiplistNode {
    sds ele;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;

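A skip list is a linked list augmented with several levels of forward pointers, where the higher levels skip over whole segments of the list. This gives it a large advantage for lookups, at the cost of extra space.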
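How the upper levels shorten a search can be sketched with a stripped-down skip list. This is a simplified illustration, not the zskiplist code: it stores only scores, omits backward pointers and spans, gives every node a fixed-size level array, and assumes each extra level is kept with probability 1/4. All `sk_` names are hypothetical.

```c
#include <stdlib.h>

#define SK_MAXLEVEL 8 /* assumed cap on the number of levels */

typedef struct sk_node {
    double score;
    struct sk_node *forward[SK_MAXLEVEL]; /* forward[i] is the next node at level i */
} sk_node;

typedef struct sk_list {
    sk_node header; /* sentinel node; its score is never read */
    int level;      /* number of levels currently in use */
    unsigned long length;
} sk_list;

void sk_init(sk_list *sl) {
    for (int i = 0; i < SK_MAXLEVEL; i++) sl->header.forward[i] = NULL;
    sl->level = 1;
    sl->length = 0;
}

/* Each additional level is kept with probability 1/4 (an assumption of this sketch). */
static int sk_random_level(void) {
    int level = 1;
    while (level < SK_MAXLEVEL && (rand() & 3) == 0) level++;
    return level;
}

int sk_insert(sk_list *sl, double score) {
    sk_node *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    n->score = score;
    for (int i = 0; i < SK_MAXLEVEL; i++) n->forward[i] = NULL;

    /* Walk from the top level down, remembering the last node before the
     * insertion point at every level. */
    sk_node *update[SK_MAXLEVEL];
    sk_node *x = &sl->header;
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score < score) x = x->forward[i];
        update[i] = x;
    }

    int level = sk_random_level();
    for (int i = sl->level; i < level; i++) update[i] = &sl->header;
    if (level > sl->level) sl->level = level;

    /* Splice the new node into each level it takes part in. */
    for (int i = 0; i < level; i++) {
        n->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = n;
    }
    sl->length++;
    return 0;
}

/* Search: move right while the next score is smaller, then drop one level. */
int sk_contains(const sk_list *sl, double score) {
    const sk_node *x = &sl->header;
    for (int i = sl->level - 1; i >= 0; i--)
        while (x->forward[i] && x->forward[i]->score < score) x = x->forward[i];
    x = x->forward[0];
    return x != NULL && x->score == score;
}

void sk_free(sk_list *sl) {
    sk_node *x = sl->header.forward[0];
    while (x) {
        sk_node *next = x->forward[0];
        free(x);
        x = next;
    }
    sk_init(sl);
}
```

The fixed-size forward array keeps the sketch short; the level[] flexible array in zskiplistNode above avoids that waste by allocating only as many levels as the node actually uses.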

7. Object definition

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

Any value (the V in a K-V pair) can be wrapped in this object.
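The role of the type, encoding, and refcount fields can be sketched with a reduced version of the object. This is an illustration only: the lru field is left out, the payload is assumed to be a single malloc'd block (or NULL), and all `sk_` names are hypothetical.

```c
#include <stdlib.h>

typedef struct sk_obj {
    unsigned type : 4;     /* logical type, e.g. string or list */
    unsigned encoding : 4; /* how ptr is represented internally */
    int refcount;          /* number of holders sharing this object */
    void *ptr;             /* the underlying data structure */
} sk_obj;

/* Wrap a payload; the object takes ownership of ptr. */
sk_obj *sk_obj_create(unsigned type, unsigned encoding, void *ptr) {
    sk_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->type = type;
    o->encoding = encoding;
    o->refcount = 1;
    o->ptr = ptr;
    return o;
}

void sk_obj_incr(sk_obj *o) {
    o->refcount++;
}

/* Drop one reference; returns 1 when the object was released.
 * In this sketch the payload is assumed to be one malloc'd block or NULL. */
int sk_obj_decr(sk_obj *o) {
    if (--o->refcount == 0) {
        free(o->ptr);
        free(o);
        return 1;
    }
    return 0;
}
```

Keeping type separate from encoding is what lets one logical type switch its internal representation without callers noticing, and the reference count lets several holders share one object safely.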

III. Using the Data Structures

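string: Redis handles plain KV operations very efficiently, and this is the most common and direct way to use it in practice, for example sharing a piece of data among several processes or holding the basic state of a distributed lock. Strings are binary safe, so they can store data in any format.
hash: this one is more interesting. It is designed for storing and manipulating data efficiently, for example quickly locating a user or a product and then accessing its more detailed information. Data can also be stored in segments, so that more complex data fits in less space.
list: a list is essentially an ordinary linked list, and a doubly linked one at that. It has many uses, such as forming a queue or maintaining a range of items.
set: the key point is deduplication: a value that is already present cannot be inserted again. It also supports the basic set operations such as intersection and union. The typical use is in profiling, to compute what certain features have in common and what they cover in total.
zset: a sorted set keeps its members in order. The typical use is weighting: weights are stored as scores, and operations are carried out according to them.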