Redis源码学习

最新推荐文章于 2024-08-26 07:52:21 发布

数学工具构造器

最新推荐文章于 2024-08-26 07:52:21 发布

阅读量360

点赞数

分类专栏： C++ 文章标签： redis 数据库 memcached

本文链接：https://blog.csdn.net/TQCAI666/article/details/120631089

版权

C++ 专栏收录该内容

16 篇文章 0 订阅

订阅专栏

本文深入探讨Redis的数据结构，特别是String类型的三种encoding方式：int、embstr和raw，以及Hash的渐进式rehash机制。文章还介绍了Redis中的sds和压缩链表ziplist，分析了Redis性能优势和内存管理策略。

摘要由CSDN通过智能技术生成

文章目录

1. 数据结构

Redis 命令参考

Redis 设计与实现

https://github.com/doocs/technical-books#database

在这里插入图片描述

1. 数据结构

【大课堂】Redis 简介——为什么选择Redis

memcached和redis性能差不多（10-30w qps）

Memcached 是多线程，非阻塞 IO 复用的网络模型；Redis 使用单线程的多路 IO 复用模型（Redis 6.0 引入了多线程 IO ）

redis提供更丰富的附加功能：发布订阅模型、Lua脚本、事务等

redis过期数据的删除策略有惰性删除和定期删除，而Memcached 只有惰性删除

最详细的Redis五种数据结构详解（理论+实战）

吊打面试官-Redis 为什么那么快？数据结构篇

我们知道一般cpu从内存中读取数据会先读取到 cache line（缓存行），
一个缓存行基本占64个字节，其中redisObject最少占16个字节（根据属性的类型计算得出），所以如果要读取一个 redisObject，会发现只读取了16个字节，剩下的48个字节的空间相当于浪费，所以为了提高性能（主要减少了内存读取的次数），所以再RedisObject空间后又开辟48个字节的连续空间，将ptr指向的值存入其中，注意此处存入的是字符串类型，48个字节对应的是sdshdr8存储结构。而
sdshdr8 在不存入数据的情况下，最少要 4 个字节（其中一个字节是字符串尾部的’\0’）,那么还剩余 44 个字节，所以如果在 44 个字节以内字符串就可以放在缓存行里面，从而减少了内存I/O次数

typedef struct redisObject {

    // 类型
    unsigned type:4;

    // 编码
    unsigned encoding:4;

    // 对象最后一次被访问的时间
    unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */

    // 引用计数
    int refcount;

    // 指向实际值的指针
    void *ptr;

} robj;

在这里插入图片描述

127.0.0.1:6379> OBJECT encoding names
"embstr"
127.0.0.1:6379> set key 234
OK
127.0.0.1:6379> OBJECT encoding key
"int"
127.0.0.1:6379> type key
string

举一个简单的例子，你在Redis中设置一个字符串key 234，然后查看这个字符串的存储类型就会看到为int类型，非整数型的使用的是embstr储存类型，具体操作如下图所示：

在这里插入图片描述

1.1 string type

【大课堂】Redis中字符串的表示

1.1.1 string的3种encoding方式

1.1.1.1 int encoding

指的是 int64

127.0.0.1:6379> set int_test 9223372036854775807
OK
127.0.0.1:6379> OBJECT encoding int_test
"int"
127.0.0.1:6379> set int_test 9223372036854775808
OK
127.0.0.1:6379> OBJECT encoding int_test
"embstr"

1.1.1.2 embstr encoding

字符串长度小于等于32B

1.1.1.3 raw encoding

sds对象

1.1.2 sds

sds的优点

1.2 hash

1.2.1 hash 数据结构的定义

【大课堂】Redis底层数据存储原理

/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {

    // 数据库键空间，保存着数据库中的所有键值对
    dict *dict;                 /* The keyspace for this DB */

    // 键的过期时间，字典的键为键，字典的值为过期事件 UNIX 时间戳
    dict *expires;              /* Timeout of keys with a timeout set */

    // 正处于阻塞状态的键
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */

    // 可以解除阻塞的键
    dict *ready_keys;           /* Blocked keys that received a PUSH */

    // 正在被 WATCH 命令监视的键
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */

    struct evictionPoolEntry *eviction_pool;    /* Eviction pool of keys */

    // 数据库号码
    int id;                     /* Database ID */


    // 数据库的键的平均 TTL ，统计信息
    long long avg_ttl;          /* Average TTL, just for stats */

} redisDb;

/*
 * 字典类型特定函数
 */
typedef struct dictType {

    // 计算哈希值的函数
    unsigned int (*hashFunction)(const void *key);

    // 复制键的函数
    void *(*keyDup)(void *privdata, const void *key);

    // 复制值的函数
    void *(*valDup)(void *privdata, const void *obj);

    // 对比键的函数
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);

    // 销毁键的函数
    void (*keyDestructor)(void *privdata, void *key);
    
    // 销毁值的函数
    void (*valDestructor)(void *privdata, void *obj);

} dictType;

dictht

hash_value % size 和 hash_value & (size - 1) 是一个效果

table: 指向指针的数组


/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
/*
 * 哈希表
 *
 * 每个字典都使用两个哈希表，从而实现渐进式 rehash 。
 */
typedef struct dictht {
    
    // 哈希表数组
    dictEntry **table;

    // 哈希表大小
    unsigned long size;
    
    // 哈希表大小掩码，用于计算索引值
    // 总是等于 size - 1
    unsigned long sizemask;

    // 该哈希表已有节点的数量
    unsigned long used;

} dictht;

/*
 * 字典
 */
typedef struct dict {

    // 类型特定函数
    dictType *type;

    // 私有数据
    void *privdata;

    // 哈希表
    dictht ht[2];

    // rehash 索引
    // 当 rehash 不在进行时，值为 -1
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */

    // 目前正在运行的安全迭代器的数量
    int iterators; /* number of iterators currently running */

} dict;

/*
 * 哈希表节点
 */
typedef struct dictEntry {
    
    // 键
    void *key;

    // 值
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;

    // 指向下个哈希表节点，形成链表
    struct dictEntry *next;

} dictEntry;

1.2.2 渐进式rehash

渐进式 rehash

浅谈Redis中的Rehash机制

多步rehash定义：

在这里插入图片描述


/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * 执行 N 步渐进式 rehash 。
 *
 * 返回 1 表示仍有键需要从 0 号哈希表移动到 1 号哈希表，
 * 返回 0 则表示所有键都已经迁移完毕。
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table.
 *
 * 注意，每步 rehash 都是以一个哈希表索引（桶）作为单位的，
 * 一个桶里可能会有多个节点，
 * 被 rehash 的桶里的所有节点都会被移动到新哈希表。
 *
 * T = O(N)
 */
int dictRehash(dict *d, int n) {

    // 只可以在 rehash 进行中时执行
    if (!dictIsRehashing(d)) return 0;

    // 进行 N 步迁移
    // T = O(N)
    while(n--) {
        dictEntry *de, *nextde;

        /* Check if we already rehashed the whole table... */
        // 如果 0 号哈希表为空，那么表示 rehash 执行完毕
        // T = O(1)
        if (d->ht[0].used == 0) {
            // 释放 0 号哈希表
            zfree(d->ht[0].table);
            // 将原来的 1 号哈希表设置为新的 0 号哈希表
            d->ht[0] = d->ht[1];
            // 重置旧的 1 号哈希表
            _dictReset(&d->ht[1]);
            // 关闭 rehash 标识
            d->rehashidx = -1;
            // 返回 0 ，向调用者表示 rehash 已经完成
            return 0;
        }

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        // 确保 rehashidx 没有越界
        assert(d->ht[0].size > (unsigned)d->rehashidx);

        // 略过数组中为空的索引，找到下一个非空索引
        while(d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;

        // 指向该索引的链表表头节点
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        // 将链表中的所有节点迁移到新哈希表
        // T = O(1)
        while(de) {
            unsigned int h;

            // 保存下个节点的指针
            nextde = de->next;

            /* Get the index in the new hash table */
            // 计算新哈希表的哈希值，以及节点插入的索引位置
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;

            // 插入节点到新哈希表
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;

            // 更新计数器
            d->ht[0].used--;
            d->ht[1].used++;

            // 继续处理下个节点
            de = nextde;
        }
        // 将刚迁移完的哈希表索引的指针设为空
        d->ht[0].table[d->rehashidx] = NULL;
        // 更新 rehash 索引
        d->rehashidx++;
    }

    return 1;
}

单步rehash定义：

/* This function performs just a step of rehashing, and only if there are
 * no safe iterators bound to our hash table. When we have iterators in the
 * middle of a rehashing we can't mess with the two hash tables otherwise
 * some element can be missed or duplicated.
 *
 * 在字典不存在安全迭代器的情况下，对字典进行单步 rehash 。
 *
 * 字典有安全迭代器的情况下不能进行 rehash ，
 * 因为两种不同的迭代和修改操作可能会弄乱字典。
 *
 * This function is called by common lookup or update operations in the
 * dictionary so that the hash table automatically migrates from H1 to H2
 * while it is actively used. 
 *
 * 这个函数被多个通用的查找、更新操作调用，
 * 它可以让字典在被使用的同时进行 rehash 。
 *
 * T = O(1)
 */
static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d,1);
}