Redis basic data types: use cases and underlying data structures

shaofei_huai

Article link: https://blog.csdn.net/shaofei_huai/article/details/119300559

redisObject

Every Redis object is managed through redisObject; you can think of redisObject as the parent class of all Redis objects. In the current version it is defined as follows:

#define LRU_BITS 24
typedef struct redisObject {
    unsigned type:4;       /* Redis data type */
    unsigned encoding:4;   /* internal encoding (data structure) used for the value */
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;          /* reference count */
    void *ptr;             /* pointer to the stored data */
} robj;

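The source shows that every Redis object carries a redisObject header. type takes 4 bits, encoding 4 bits and lru 24 bits, 32 bits (4 bytes) in total; together with the 4-byte refcount that comes to 8 bytes, and a 64-bit pointer takes another 8 bytes. In other words, even when nothing is stored, a redisObject occupies 16 bytes.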

Basic data types

String

Data structure

Strings are stored as SDS (simple dynamic strings); part of the source is excerpted below:

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
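    uint64_t len; /* current length of the sds (used) */
    uint64_t alloc; /* allocated size, excluding the header and null terminator */
    unsigned char flags; /* type of this sds: 3 lsb of type, 5 unused bits */
    char buf[];  /* the data actually stored in the sds */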
};

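As you can see, len, alloc and flags are all extra storage overhead. To make the most of memory, Redis optimizes the memory layout of strings with three encodings: int, embstr and raw.

int: when the value is an integer that fits in a long, the integer is stored directly in the pointer field of the redisObject. No separate SDS has to be allocated and pointed to, which saves that overhead.

embstr: holds string data when the string is <= 44 bytes. The redisObject and the SDS live in one contiguous block of memory, which avoids memory fragmentation.

raw: holds string data when the string is > 44 bytes. The redisObject and the SDS are stored separately and linked through the pointer.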

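For more details on SDS, see https://github.com/antirez/sds

Use cases

String is the jack-of-all-trades type: it can store anything. An array or an object, for example, can be serialized to JSON and stored as a String. That is a crude approach, though; in real development you should still pick the type that matches the data.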

hash

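Data structure

Stored as a ziplist (compressed list) or a hash table. The thresholds can be changed in the redis.conf configuration file:

hash-max-ziplist-entries: maximum number of entries kept in the ziplist encoding, default 512

hash-max-ziplist-value: maximum length in bytes of a single element kept in the ziplist encoding, default 64

If either limit is exceeded, the ziplist is converted to a hash table.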

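Use cases

Well suited to structured data, for example with the user id as the first-level key and attributes such as the user's ID card number and name as second-level keys (fields). Another advantage of hash is that it can stand in for certain Strings. Because the String type has a relatively large memory footprint, some ordered keys can be split up and stored in a hash instead. For example, to look up information by phone number, the number 13800001111 can be split as the situation requires: the first few digits become the first-level key, the remaining digits the field, and the value holds the information.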

list

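Data structure

Implemented as a quicklist, which is essentially a doubly linked list whose nodes are ziplists. The ziplists hold the actual data; when one is full, a new ziplist is created and linked to its neighbours through pointers. By default each ziplist is limited to 8 KB, which can be changed in the redis.conf configuration file:

list-max-ziplist-size: default -2. A positive value limits the number of entries that the ziplist on each quicklist node may hold; a negative value limits its size in bytes, as follows:

        -5: the ziplist on each quicklist node may not exceed 64 KB.
        -4: the ziplist on each quicklist node may not exceed 32 KB.
        -3: the ziplist on each quicklist node may not exceed 16 KB.
        -2: the ziplist on each quicklist node may not exceed 8 KB.
        -1: the ziplist on each quicklist node may not exceed 4 KB.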

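Use cases

Storing array-like data that changes infrequently or does not need to be paginated.

For example, take list = {A, B, C, D} and read it page by page:
first, LRANGE list 0 1 returns A and B;
then A is deleted, and LRANGE list 2 3 returns only D. Across the two reads, C is lost.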

list is not a good fit for message queues; the stream type can be used for those instead.

set

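Data structure

Stored as a hash table or an integer array (intset). The threshold can be changed in the redis.conf configuration file:

set-max-intset-entries: maximum number of elements kept in the intset encoding, default 512; beyond this, a hash table is used. The intset encoding also applies only while every member is an integer.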

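Use cases

A set does not allow duplicate elements, which makes it a natural fit for intersection, union and difference operations, such as checking whether members of a group chat have friends in common, or recommending friends.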

zset

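Data structure

Stored as a ziplist or a skiplist. The thresholds can be changed in the redis.conf configuration file:

zset-max-ziplist-entries: maximum number of entries kept in the ziplist encoding, default 128
zset-max-ziplist-value: maximum length in bytes of a single element kept in the ziplist encoding, default 64

If either limit is exceeded, the ziplist is converted to a skiplist.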

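Use cases

Everything set is suited for; in addition, because zset is a sorted set, it can also be used for ranked lists such as leaderboards.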

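Ziplist

The ziplist (compressed list) is a less common data structure that Redis developed to save memory. A ziplist can contain multiple nodes (entries), and each entry can hold either an integer or a byte array. Its layout is as follows: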

 <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>

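zlbytes: number of bytes the ziplist occupies

zltail: offset from the start of the ziplist to the last entry

zllen: number of entries in the ziplist

entry: holds the actual data

zlend: marks the end of the ziplist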

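An entry in turn consists of several fields (the English comments in the source do not translate well, so they are best read in the original). Besides the data, each entry stores the length of the previous entry and its own length. During traversal, entries are not linked to each other through pointers; instead, each entry is located by computing offsets from these lengths.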

For more information about ziplists, see https://github.com/redis/redis/blob/unstable/src/ziplist.c

/* We use this function to receive information about a ziplist entry.
 * Note that this is not how the data is actually encoded, is just what we
 * get filled by a function in order to operate more easily. */
typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
} zlentry;