Redis Design and Implementation: Data Structures and Objects (Part 3)

Baymax_yan

Published 2019-06-15 09:45:04

Category: Database

Article link: https://blog.csdn.net/qq_40028201/article/details/92064004

Continued from: Redis Design and Implementation: Data Structures and Objects (Part 2)

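3. Dictionaries

Ways to implement a dictionary
- The simplest is a linked list or an array, but this only works well when the number of elements is small;
- To balance efficiency and simplicity, a hash table can be used;
- For more stable performance characteristics and efficient sorted operations, a more complex balanced tree can be used;

Redis chooses the second approach, the hash table.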

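/*
 * Dictionary
 *
 * Each dictionary uses two hash tables to implement incremental rehash
 */
typedef struct dict {

    // Type-specific handler functions
    dictType *type;

    // Private data for the type-specific handler functions
    void *privdata;

    // Hash tables (2 of them)
    dictht ht[2];

    // Rehash progress marker; -1 means no rehash is in progress
    int rehashidx;

    // Number of safe iterators currently running
    int iterators;

} dict;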

Overall dictionary structure

Figure: a dict (rehashidx: -1, iterators: 0) holding two hash tables.
- ht[0]: size 4, sizemask 3, used 3; its table points to a dictEntry** bucket array with slots 0 to 3
- Bucket 0 -> dictEntry (key1, value1), next = NULL
- Bucket 1 -> NULL
- Bucket 2 -> dictEntry (key2, value2), next = NULL
- Bucket 3 -> dictEntry (key3, value3), next = NULL
- ht[1]: size 0, sizemask 0, used 0; its table is NULL
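
The layout in the figure can be sketched as C types. This is only an illustration: the field names of dictht and dictEntry follow the figure's labels, while the string key/value types and the helper bucket_find are assumptions made here to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One key-value node; nodes that land in the same bucket are chained via next. */
typedef struct dictEntry {
    const char *key;          /* assumption: plain C strings as keys */
    const char *val;          /* assumption: plain C strings as values */
    struct dictEntry *next;
} dictEntry;

/* One hash table, with the fields shown in the figure. */
typedef struct dictht {
    dictEntry **table;        /* bucket array */
    unsigned long size;       /* number of buckets */
    unsigned long sizemask;   /* size - 1, turns a hash value into a bucket index */
    unsigned long used;       /* number of stored entries */
} dictht;

/* Hypothetical helper: walk the chain of one bucket looking for key. */
static dictEntry *bucket_find(const dictht *ht, unsigned long idx, const char *key) {
    assert(idx < ht->size);
    for (dictEntry *e = ht->table[idx]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            return e;
        }
    }
    return NULL;
}
```

Chaining entries through next is what lets several keys share one bucket when their hash values collide.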
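Hash algorithms used
1. The 32-bit MurmurHash2 algorithm: it offers both good distribution and good speed
2. A case-insensitive hash algorithm based on the djb algorithm
Which algorithm is used depends on the data being handled
- The command table and the Lua script cache use algorithm 2
- Algorithm 1 is used more widely: databases, cluster, hash keys, blocking operations, and other features all rely on it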
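
To make algorithm 2 concrete, here is a minimal sketch of a djb-style case-insensitive hash, plus the masking step that maps a hash value to a bucket. The function names djb_case_hash and bucket_index and the seed 5381 (the classic djb starting value) are assumptions of this sketch, not names taken from the text above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* djb-style hash: hash = hash * 33 + c, with every byte lowercased first,
 * so "GET" and "get" produce the same value. */
static unsigned int djb_case_hash(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    unsigned int hash = 5381;  /* assumption: classic djb seed */
    while (len--) {
        hash = ((hash << 5) + hash) + (unsigned int)tolower(*buf++);
    }
    return hash;
}

/* Map a hash value to a bucket index; sizemask is size - 1 for a
 * power-of-two table (for example size 4, sizemask 3). */
static unsigned long bucket_index(unsigned int hash, unsigned long sizemask) {
    return hash & sizemask;
}
```

Case-insensitivity suits a command table, where a command name should resolve to the same entry however it is capitalized.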
Detailed flow of adding a key-value pair (dictAdd)

1. If the key already exists, return NULL to indicate that the add failed;
2. If ht[0] has no space allocated yet, initialize ht[0];
3. Check whether a rehash is needed; if it is needed and no rehash is in progress, start an incremental rehash;
4. If a rehash is in progress, choose ht[1] as the target for the new key-value pair; otherwise choose ht[0];
5. Compute the hash value and the index from the given key;
6. Create a new dictEntry and store the given key-value pair in it;
7. Using the index, add the new node to the target hash table.
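
The add flow above can be sketched roughly as follows. This is a simplified illustration, not the actual Redis code: the rehash-trigger check is omitted, keys and values are plain C strings, the initial table size of 4 is an assumption, and hash_key, find_in, and dict_add_sketch are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct dictEntry { const char *key; const char *val; struct dictEntry *next; } dictEntry;
typedef struct dictht { dictEntry **table; unsigned long size, sizemask, used; } dictht;
typedef struct dict { dictht ht[2]; int rehashidx; } dict;

/* Placeholder hash (djb-style over a C string); an assumption of this sketch. */
static unsigned long hash_key(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Look for key in one table; an unallocated table never matches. */
static dictEntry *find_in(const dictht *ht, const char *key) {
    if (ht->size == 0) return NULL;
    for (dictEntry *e = ht->table[hash_key(key) & ht->sizemask]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Returns the new entry, or NULL if the key exists or allocation fails. */
static dictEntry *dict_add_sketch(dict *d, const char *key, const char *val) {
    /* Key already exists: the add fails. */
    if (find_in(&d->ht[0], key) || find_in(&d->ht[1], key)) return NULL;

    /* ht[0] has no space yet: initialize it (size 4 is an assumption). */
    if (d->ht[0].size == 0) {
        d->ht[0].table = calloc(4, sizeof(dictEntry *));
        if (d->ht[0].table == NULL) return NULL;
        d->ht[0].size = 4;
        d->ht[0].sizemask = 3;
        d->ht[0].used = 0;
    }

    /* (The "is a rehash needed?" check is omitted in this sketch.) */

    /* During a rehash new pairs go to ht[1], otherwise to ht[0]. */
    dictht *ht = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    assert(ht->size > 0);

    /* Compute the index, create the entry, and link it at the chain head. */
    unsigned long idx = hash_key(key) & ht->sizemask;
    dictEntry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    e->key = key;
    e->val = val;
    e->next = ht->table[idx];
    ht->table[idx] = e;
    ht->used++;
    return e;
}
```

Linking the new node at the head of the chain keeps insertion constant-time regardless of how long the chain already is.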

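Rehash

To keep the hash table's load factor within a reasonable range, the program expands or shrinks the hash table when it holds too many or too few key-value pairs.

Steps
1. Create an ht[1].table larger than ht[0].table, with a size of at least twice ht[0].used;
2. Migrate every key-value pair in ht[0].table to ht[1].table;
3. Clear the original ht[0], then make ht[1] the new ht[0].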
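
The three steps can be sketched as a one-shot (non-incremental) rehash. Several things here are assumptions made for illustration: keys are unsigned integers that serve as their own hash, the new size is rounded up to a power of two (so that sizemask = size - 1 works, as with size 4 and sizemask 3), the minimum size is 4, and next_power and rehash_all are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dictEntry { unsigned long key; struct dictEntry *next; } dictEntry;
typedef struct dictht { dictEntry **table; unsigned long size, sizemask, used; } dictht;

/* Smallest power of two >= n, starting from an assumed minimum of 4. */
static unsigned long next_power(unsigned long n) {
    unsigned long size = 4;
    while (size < n) size *= 2;
    return size;
}

/* Rehash ht[0] into a larger ht[1] in one go; returns 0 on success, -1 on
 * allocation failure. ht[0].table is assumed to be heap-allocated. */
static int rehash_all(dictht ht[2]) {
    assert(ht[1].table == NULL);

    /* Step 1: allocate ht[1] with at least twice ht[0].used buckets. */
    unsigned long new_size = next_power(ht[0].used * 2);
    ht[1].table = calloc(new_size, sizeof(dictEntry *));
    if (ht[1].table == NULL) return -1;
    ht[1].size = new_size;
    ht[1].sizemask = new_size - 1;
    ht[1].used = 0;

    /* Step 2: move every entry, recomputing its index for the new size. */
    for (unsigned long i = 0; i < ht[0].size; i++) {
        dictEntry *e = ht[0].table[i];
        while (e != NULL) {
            dictEntry *next = e->next;
            unsigned long idx = e->key & ht[1].sizemask;
            e->next = ht[1].table[idx];
            ht[1].table[idx] = e;
            ht[1].used++;
            e = next;
        }
        ht[0].table[i] = NULL;
    }

    /* Step 3: release the old table and make ht[1] the new ht[0]. */
    free(ht[0].table);
    ht[0] = ht[1];
    ht[1] = (dictht){NULL, 0, 0, 0};
    return 0;
}
```

With 5 stored entries, this sketch needs at least 10 buckets and therefore allocates 16.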
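Incremental rehash

Consider this scenario: in a dictionary holding many key-value pairs, a user adding a new pair triggers a rehash. If the rehash had to migrate every key-value pair before the result could be returned to the user, the delay would be very unfriendly to that user.

Likewise, blocking the server until the rehash completes is unacceptable for the Redis server itself.

Process:

During an incremental rehash the dictionary uses both ht[0] and ht[1], so dictionary operations such as lookups and deletions are performed on both tables, while new key-value pairs are added directly to ht[1]. The rehash starts when rehashidx is set to 0, and each step that migrates one bucket increments rehashidx by 1. When the migration is complete, rehashidx is set back to -1, which marks the end of the rehash.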
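
A minimal sketch of this process follows, assuming that each step handles exactly one bucket index of ht[0] (even an empty one) and that integer keys serve as their own hash. The names rehash_start and rehash_step are hypothetical, and the new size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dictEntry { unsigned long key; struct dictEntry *next; } dictEntry;
typedef struct dictht { dictEntry **table; unsigned long size, sizemask, used; } dictht;
typedef struct dict { dictht ht[2]; int rehashidx; } dict;

/* Allocate ht[1] and mark the rehash as started (rehashidx = 0).
 * new_size is assumed to be a power of two. Returns 0 on success. */
static int rehash_start(dict *d, unsigned long new_size) {
    assert(d->rehashidx == -1);
    d->ht[1].table = calloc(new_size, sizeof(dictEntry *));
    if (d->ht[1].table == NULL) return -1;
    d->ht[1].size = new_size;
    d->ht[1].sizemask = new_size - 1;
    d->ht[1].used = 0;
    d->rehashidx = 0;
    return 0;
}

/* Migrate the bucket at rehashidx, then advance rehashidx by 1.
 * Returns 1 while buckets remain, 0 once the rehash has finished. */
static int rehash_step(dict *d) {
    if (d->rehashidx == -1) return 0;
    assert((unsigned long)d->rehashidx < d->ht[0].size);

    dictEntry *e = d->ht[0].table[d->rehashidx];
    while (e != NULL) {
        dictEntry *next = e->next;
        unsigned long idx = e->key & d->ht[1].sizemask;
        e->next = d->ht[1].table[idx];
        d->ht[1].table[idx] = e;
        d->ht[0].used--;
        d->ht[1].used++;
        e = next;
    }
    d->ht[0].table[d->rehashidx] = NULL;
    d->rehashidx++;

    /* All buckets migrated: swap the tables and reset rehashidx to -1. */
    if ((unsigned long)d->rehashidx == d->ht[0].size) {
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (dictht){NULL, 0, 0, 0};
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

Calling rehash_step once per dictionary operation spreads the migration cost across many requests instead of stalling a single one.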
