[Redis-6.0.8] Hashing in Redis

Latest recommended article published 2022-05-13 15:03:25

我要精通C++


Category: redis

Article link: https://blog.csdn.net/Edidaughter/article/details/115445365


1. Hash algorithms in Redis: SipHash and time33

redis-6.0.8/src/siphash.c (SipHash, used by the Redis server)

redis-6.0.8/deps/hiredis/dict.c (time33, used by hiredis, the client library bundled with Redis)

time33 hash source:

/* Generic hash function (a popular one from Bernstein).
 * I tested a few and this was the best. */
static unsigned int dictGenHashFunction(const unsigned char *buf, int len) {
    unsigned int hash = 5381;

    while (len--)
        hash = ((hash << 5) + hash) + (*buf++); /* hash * 33 + c */
    return hash;
}

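2. Issues Redis has to consider when designing its hashing

2.1 Requirements for the hash function

A hash table in Redis consists of an array (indexed by integers) plus a hash function, so the hash function has to meet two requirements:

(1) It must map a string to an integer, and the wider that integer is (the more bits it has), the better, because a larger output range means fewer collisions.

(2) Keys in Redis usually follow regular patterns, so the hash function must spread similar keys across the output range in a strongly random-looking way.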
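Requirement (2) can be checked empirically. The sketch below hashes a run of patterned keys with time33 and counts how many fall into each slot. The key pattern `user:<n>` and the helper `bucket_histogram` are assumptions made up for illustration; they are not part of Redis or hiredis.

```c
#include <stdio.h>
#include <string.h>

/* time33 (Bernstein) hash, same logic as the hiredis function quoted earlier. */
static unsigned int time33(const unsigned char *buf, int len) {
    unsigned int hash = 5381;
    while (len--)
        hash = ((hash << 5) + hash) + (*buf++); /* hash * 33 + c */
    return hash;
}

/* Hash the hypothetical keys "user:0" .. "user:<nkeys-1>" and count how many
 * land in each of nbuckets slots. nbuckets must be a power of two. */
static void bucket_histogram(unsigned int nkeys, unsigned int nbuckets,
                             unsigned int *counts) {
    char key[32];
    memset(counts, 0, nbuckets * sizeof(*counts));
    for (unsigned int i = 0; i < nkeys; i++) {
        int len = snprintf(key, sizeof(key), "user:%u", i);
        counts[time33((const unsigned char *)key, len) & (nbuckets - 1)]++;
    }
}
```

Calling `bucket_histogram(1000, 8, counts)` and printing `counts` gives a quick view of how evenly such keys are spread; a badly skewed histogram would indicate that the function handles this key pattern poorly.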

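2.2 Problems with hashing

The problems:

(1) Collisions. There are infinitely many strings but only finitely many 64-bit integers, so once there are enough strings, two of them must hash to the same value (the pigeonhole principle).

(2) Wasted space, and a second kind of collision introduced by the fix for it. Using 2^64 directly as the array length would waste an enormous amount of memory, so the array length is chosen according to the current amount of data.

The initial length is 4, and the hash value is reduced modulo 4 to pick a slot. Even so, collisions still occur, because the keys will not all happen to land in distinct slots.

Redis resolves hash collisions by chaining. New entries are inserted at the head of the chain: the most recently inserted entry is also the one most likely to be accessed next, which matches how a database is typically used.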
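Chaining with head insertion can be sketched as follows. This is a minimal toy table, not the Redis implementation: the names `entry`, `table`, `toy_hash`, `table_add`, `table_find` and `table_clear` are hypothetical, the table has a fixed length of 4, and keys are not copied.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 4  /* fixed length, must be a power of two */

typedef struct entry {
    const char *key;
    int val;
    struct entry *next;  /* next entry in the same bucket's chain */
} entry;

typedef struct table {
    entry *buckets[NBUCKETS];
} table;

/* Toy string hash (time33 style), for illustration only. */
static unsigned int toy_hash(const char *s) {
    unsigned int h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert at the head of the chain. Returns 0 on success, -1 if malloc fails.
 * Duplicate keys are not checked; the newest entry shadows older ones. */
static int table_add(table *t, const char *key, int val) {
    entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    unsigned int idx = toy_hash(key) & (NBUCKETS - 1);
    e->key = key;
    e->val = val;
    e->next = t->buckets[idx];  /* old head becomes the second element */
    t->buckets[idx] = e;
    return 0;
}

/* Walk the chain from the head; recently added entries are found first. */
static entry *table_find(table *t, const char *key) {
    entry *e = t->buckets[toy_hash(key) & (NBUCKETS - 1)];
    while (e != NULL && strcmp(e->key, key) != 0) e = e->next;
    return e;
}

/* Free every entry and reset the buckets. */
static void table_clear(table *t) {
    for (int i = 0; i < NBUCKETS; i++) {
        entry *e = t->buckets[i];
        while (e != NULL) {
            entry *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

Head insertion is O(1) because the chain never has to be walked to its tail, and a lookup reaches the newest entries first.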

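2.3 Expansion strategy

Take a table of length 4 as an example. While an RDB save or AOF rewrite is running, Redis avoids expanding the table: the child process shares memory pages with the parent through copy-on-write, and a rehash would force many of those pages to be copied. Expansion is therefore postponed until persistence has finished.

By that time the table may already hold 8 or 9 entries. The new capacity is derived from the number of stored entries rather than from the old length, so the table can grow to 4 times its old length (for example from 4 to 16 when it holds 8 entries) instead of simply doubling.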
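The arithmetic behind this can be sketched as below. The sketch is my reading of the logic in the 6.0.x `dict.c` (`_dictExpandIfNeeded` and `_dictNextPower`), written from memory rather than copied, so treat the constants and the `used * 2` target as assumptions; the names `next_power`, `needs_expand` and `grown_size` are hypothetical.

```c
#include <limits.h>

#define HT_INITIAL_SIZE    4UL  /* assumed to mirror DICT_HT_INITIAL_SIZE */
#define FORCE_RESIZE_RATIO 5UL  /* assumed to mirror dict_force_resize_ratio */

/* Smallest power of two that is >= size, never below the initial size. */
static unsigned long next_power(unsigned long size) {
    unsigned long i = HT_INITIAL_SIZE;
    if (size >= LONG_MAX) return LONG_MAX + 1LU;
    while (i < size) i *= 2;
    return i;
}

/* Should the table grow? can_resize is 0 while a persistence child runs;
 * in that case growth is only forced once used/size exceeds the ratio. */
static int needs_expand(unsigned long used, unsigned long size, int can_resize) {
    if (size == 0) return 1;
    return used >= size && (can_resize || used / size > FORCE_RESIZE_RATIO);
}

/* Length of the new table, assuming the target is twice the entry count. */
static unsigned long grown_size(unsigned long used) {
    return next_power(used * 2);
}
```

Under these assumptions a table of length 4 that is full (4 entries) grows to 8, while the same table holding 8 entries after persistence grows to 16, and with 9 entries the target of 18 rounds up to 32.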

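2.4 Shrink strategy

To avoid shrinking too often, and to avoid a shrink that immediately triggers another expansion, the table is shrunk only when the number of entries falls below 10% of the array length.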
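The 10% rule can be written as a small predicate. This is a sketch modelled on what I recall of `htNeedsResize` in `server.c`; the function name `needs_shrink` and both constants are assumptions made for illustration.

```c
#define HT_INITIAL_SIZE  4UL   /* assumed to mirror DICT_HT_INITIAL_SIZE */
#define MIN_FILL_PERCENT 10UL  /* assumed to mirror HASHTABLE_MIN_FILL */

/* Returns 1 when the table is larger than the initial size and its fill
 * ratio (used / size) has dropped below MIN_FILL_PERCENT percent. */
static int needs_shrink(unsigned long used, unsigned long size) {
    return size > HT_INITIAL_SIZE &&
           (used * 100 / size) < MIN_FILL_PERCENT;
}
```

For example, a table of length 16 with one entry (6% full) qualifies for shrinking, while the same table with two entries (12% full) does not, and a table of the initial length 4 is never shrunk.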

2.5 Hash table structure definition

redis-6.0.8/src/dict.h

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;      /* array of pointers to dictEntry; each dictEntry holds one key-value pair */
    unsigned long size;     /* size of the hash table, i.e. the length of the table array */
    unsigned long sizemask; /* always size - 1; combined with the hash value to pick the index in table */
    unsigned long used;     /* number of entries (key-value pairs) currently stored */
} dictht;

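Notes:

While Redis is persisting, used may become larger than size, because expansion is deferred and the chains simply grow longer.
sizemask exists because the modulo operation is relatively expensive in C/C++, and when size is a power of two it can be replaced by a bitwise AND. Taking a value modulo 4 is the same as ANDing it with 3: for example, 5 % 4 and 5 & 3 both give 1.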
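The equivalence between the modulo and the mask can be sketched as below. The helper names `is_power_of_two` and `bucket_index` are hypothetical and do not appear in the Redis source.

```c
#include <assert.h>
#include <stdint.h>

/* A power of two has exactly one bit set. */
static int is_power_of_two(unsigned long n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bucket index for a hash value, computed with a mask instead of '%'.
 * Only valid when size is a power of two. */
static unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(is_power_of_two(size));
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

The trick depends on the power-of-two length: with a length of 6 the mask would be 5, and 7 & 5 gives 5 whereas 7 % 6 gives 1. This is why table lengths are always rounded up to a power of two.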

2.6 dictEntry structure definition

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

2.7 Using dict and dictType to build a class-like encapsulation in C

typedef struct dict {
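    dictType *type; /* type-specific operation functions for this dictionary */
    void *privdata; /* private data the dictionary depends on: context for a specific operation, e.g. set key value */
    dictht ht[2];   /* hash tables holding the key-value pairs; ht[0] is the table before an expansion or shrink, ht[1] the table after it; together they allow incremental rehashing, so a large rehash does not stall the server */
    long rehashidx; /* rehashing not in progress if rehashidx == -1; otherwise the index in ht[0] of the next bucket to migrate */
    unsigned long iterators; /* number of iterators currently running (counts safe iterators) */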
} dict;
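The roles of ht[0], ht[1] and rehashidx can be illustrated with a reduced model of incremental rehashing. This is a sketch under simplifying assumptions, not the Redis code: the `mini_*` types and `mini_rehash` are hypothetical, keys are plain integers that double as their own hash value, and the limit on visited empty buckets that I believe the real `dictRehash` applies is left out.

```c
#include <stdlib.h>

typedef struct mini_entry {
    unsigned long key;
    struct mini_entry *next;
} mini_entry;

typedef struct mini_ht {
    mini_entry **table;
    unsigned long size, sizemask, used;
} mini_ht;

typedef struct mini_dict {
    mini_ht ht[2];   /* ht[0]: old table, ht[1]: new table */
    long rehashidx;  /* -1 when no rehash is in progress */
} mini_dict;

/* Migrate at most n non-empty buckets from ht[0] to ht[1].
 * Returns 1 if entries remain in ht[0], 0 once the rehash is complete. */
static int mini_rehash(mini_dict *d, int n) {
    if (d->rehashidx == -1) return 0;

    while (n-- > 0 && d->ht[0].used != 0) {
        /* Skip empty buckets; used != 0 guarantees a non-empty one exists. */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;

        mini_entry *e = d->ht[0].table[d->rehashidx];
        while (e != NULL) {
            mini_entry *next = e->next;
            unsigned long idx = e->key & d->ht[1].sizemask;
            e->next = d->ht[1].table[idx];  /* head insertion into new table */
            d->ht[1].table[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    if (d->ht[0].used == 0) {
        /* Everything has moved: the new table becomes ht[0]. */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (mini_ht){NULL, 0, 0, 0};
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

Calling `mini_rehash(d, 1)` once per operation spreads the cost of moving a large table over many requests, which is the point of keeping two tables side by side.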

// Comparable to member functions of a C++ class: think of dict as the class and dictType as the set of its member functions
typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;
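How a table of function pointers gives class-like behaviour can be sketched with a reduced version of the two structures. The names `miniType`, `miniDict`, `str_hash`, `str_compare`, `string_type`, `mini_hash_key` and `mini_keys_equal` are hypothetical; only the dispatch pattern is meant to reflect the structures above.

```c
#include <stdint.h>
#include <string.h>

/* Reduced, hypothetical counterpart of dictType with two "methods". */
typedef struct miniType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} miniType;

/* Reduced counterpart of dict: a pointer to its "method table" plus context. */
typedef struct miniDict {
    miniType *type;
    void *privdata;
} miniDict;

/* "Methods" for NUL-terminated string keys (toy hash, illustration only). */
static uint64_t str_hash(const void *key) {
    const unsigned char *p = key;
    uint64_t h = 5381;
    while (*p) h = h * 33 + *p++;
    return h;
}

static int str_compare(void *privdata, const void *key1, const void *key2) {
    (void)privdata;                  /* unused in this sketch */
    return strcmp(key1, key2) == 0;  /* non-zero means "equal" */
}

static miniType string_type = { str_hash, str_compare };

/* Generic code dispatches through the type, like a virtual call in C++. */
static uint64_t mini_hash_key(const miniDict *d, const void *key) {
    return d->type->hashFunction(key);
}

/* Falls back to pointer comparison when no keyCompare is provided. */
static int mini_keys_equal(const miniDict *d, const void *k1, const void *k2) {
    return d->type->keyCompare ? d->type->keyCompare(d->privdata, k1, k2)
                               : (k1 == k2);
}
```

A dictionary for a different key type only needs a different `miniType` instance; the generic functions stay unchanged, which is the same division of labour as between a class and its virtual methods.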

2.8 redisDb structure definition

/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
    unsigned long expires_cursor; /* Cursor of the active expire cycle. */
    list *defrag_later;         /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;