HotFrameLearning (HFL for short) Redis_07: Underlying Storage Data Structures of the string Type
-
1. Overview
```
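1. We all read and write Redis strings every day, but did you know that a string can be stored in 3 different underlying data structures?
2. What? Isn't a string just a string? How can there be 3 data structures underneath? Quite a surprise.
3. Below I walk through the underlying data structures of string, using the redis-6.0.6 source code as the reference;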
```
2. string data structures
2.1 Command operations on Windows
```
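1. In Figure 1, after running a handful of commands, we can see that a string is backed by one of 3 underlying data structures:
- int: a string whose length is at most 20 bytes and that can be converted to a long
- embstr: a string whose length is at most 44 bytes and that cannot be converted to a long
- raw: a string longer than 44 bytes
2. Why can so much be read out of that one figure? To answer that properly we have to start from the source code;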
```
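The three rules above can be sketched as a small standalone C function. This is a minimal sketch and not Redis code: `sketch_encoding`, `sketch_string2l`, `sketch_classify` and `sketch_classify_cstr` are hypothetical names, and `sketch_string2l` is a simplified stand-in for `string2l` that assumes a canonical integer has no whitespace, no plus sign and no leading zeros.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; Redis uses OBJ_ENCODING_* macros. */
typedef enum { ENC_INT, ENC_EMBSTR, ENC_RAW } sketch_encoding;

#define SKETCH_EMBSTR_LIMIT 44 /* mirrors OBJ_ENCODING_EMBSTR_SIZE_LIMIT */

/* Strict decimal parse (assumed behaviour, simplified): returns 1 and
 * stores the value when s[0..len) is a canonical integer fitting a long. */
static int sketch_string2l(const char *s, size_t len, long *out) {
    size_t i = 0;
    int negative = 0;
    unsigned long v = 0, limit;

    if (len == 0) return 0;
    if (s[0] == '-') {
        if (len == 1) return 0;      /* a lone "-" is not a number */
        negative = 1;
        i = 1;
    }
    if (s[i] == '0') {               /* only the exact string "0" may start with 0 */
        if (len == 1) { *out = 0; return 1; }
        return 0;
    }
    limit = negative ? (unsigned long)LONG_MAX + 1UL : (unsigned long)LONG_MAX;
    for (; i < len; i++) {
        unsigned long d;
        if (s[i] < '0' || s[i] > '9') return 0;
        d = (unsigned long)(s[i] - '0');
        if (v > (limit - d) / 10) return 0;  /* next step would exceed the long range */
        v = v * 10 + d;
    }
    if (negative) *out = (v == limit) ? LONG_MIN : -(long)v;
    else *out = (long)v;
    return 1;
}

/* Applies the rules in the same order as the list above. */
sketch_encoding sketch_classify(const char *s, size_t len) {
    long value;
    assert(s != NULL);
    if (len <= 20 && sketch_string2l(s, len, &value)) return ENC_INT;
    if (len <= SKETCH_EMBSTR_LIMIT) return ENC_EMBSTR;
    return ENC_RAW;
}

/* Convenience wrapper for NUL-terminated input. */
sketch_encoding sketch_classify_cstr(const char *s) {
    assert(s != NULL);
    return sketch_classify(s, strlen(s));
}
```

For example, `sketch_classify_cstr("12345")` yields `ENC_INT` and `sketch_classify_cstr("hello")` yields `ENC_EMBSTR`, while any input longer than 44 bytes yields `ENC_RAW`. The order of the checks matters: the integer test runs first, so a short numeric string never reaches the embstr branch.
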
2.2 Source code walkthrough (how the three string encodings are chosen)
Source file: object.c

    /* Try to encode a string object in order to save space */
    robj *tryObjectEncoding(robj *o) {
        long value;
        sds s = o->ptr;
        size_t len;

        /* Make sure this is a string object, the only type we encode
         * in this function. Other types use encoded memory efficient
         * representations but are handled by the commands implementing
         * the type. */
        serverAssertWithInfo(NULL,o,o->type == OBJ_STRING);

        /* We try some specialized encoding only for objects that are
         * RAW or EMBSTR encoded, in other words objects that are still
         * in represented by an actually array of chars. */
        if (!sdsEncodedObject(o)) return o;

        /* It's not safe to encode shared objects: shared objects can be shared
         * everywhere in the "object space" of Redis and may end in places where
         * they are not handled. We handle them only as values in the keyspace. */
        if (o->refcount > 1) return o;

        /* Check if we can represent this string as a long integer.
         * Note that we are sure that a string larger than 20 chars is not
         * representable as a 32 nor 64 bit integer. */
        len = sdslen(s);
        if (len <= 20 && string2l(s,len,&value)) {
            /* This object is encodable as a long. Try to use a shared object.
             * Note that we avoid using shared integers when maxmemory is used
             * because every object needs to have a private LRU field for the LRU
             * algorithm to work well. */
            if ((server.maxmemory == 0 ||
                !(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS)) &&
                value >= 0 &&
                value < OBJ_SHARED_INTEGERS)
            {
                decrRefCount(o);
                incrRefCount(shared.integers[value]);
                return shared.integers[value];
            } else {
                if (o->encoding == OBJ_ENCODING_RAW) {
                    sdsfree(o->ptr);
                    o->encoding = OBJ_ENCODING_INT;
                    o->ptr = (void*) value;
                    return o;
                } else if (o->encoding == OBJ_ENCODING_EMBSTR) {
                    decrRefCount(o);
                    return createStringObjectFromLongLongForValue(value);
                }
            }
        }

        /* If the string is small and is still RAW encoded,
         * try the EMBSTR encoding which is more efficient.
         * In this representation the object and the SDS string are allocated
         * in the same chunk of memory to save space and cache misses. */
        if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) {
            robj *emb;

            if (o->encoding == OBJ_ENCODING_EMBSTR) return o;
            emb = createEmbeddedStringObject(s,sdslen(s));
            decrRefCount(o);
            return emb;
        }

        /* We can't encode the object...
         *
         * Do the last try, and at least optimize the SDS string inside
         * the string object to require little space, in case there
         * is more than 10% of free space at the end of the SDS string.
         *
         * We do that only for relatively large strings as this branch
         * is only entered if the length of the string is greater than
         * OBJ_ENCODING_EMBSTR_SIZE_LIMIT. */
        trimStringObjectIfNeeded(o);

        /* Return the original object. */
        return o;
    }

    /* Optimize the SDS string inside the string object to require little space,
     * in case there is more than 10% of free space at the end of the SDS
     * string. This happens because SDS strings tend to overallocate to avoid
     * wasting too much time in allocations when appending to the string. */
    void trimStringObjectIfNeeded(robj *o) {
        if (o->encoding == OBJ_ENCODING_RAW &&
            sdsavail(o->ptr) > sdslen(o->ptr)/10)
        {
            o->ptr = sdsRemoveFreeSpace(o->ptr);
        }
    }
```
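1. len <= 20 && string2l(s,len,&value): the code first checks that the string is at most 20 bytes long, and then uses string2l to test whether it can be converted to a long;
2. len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT: the constant OBJ_ENCODING_EMBSTR_SIZE_LIMIT has the value 44, so every string of at most 44 bytes that cannot be converted to a long is stored with the embstr encoding;
3. trimStringObjectIfNeeded(o): whatever is left is handled as raw. But this function directly checks whether the encoding equals OBJ_ENCODING_RAW. Does that mean objects are given the raw encoding when they are initialized?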
```
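The trimming rule in point 3 (`sdsavail(o->ptr) > sdslen(o->ptr)/10`) can be illustrated in isolation. The sketch below is a hedged model rather than the SDS implementation: `sketch_str`, `sketch_new`, `sketch_should_trim` and `sketch_trim` are hypothetical names, and a plain `realloc` stands in for `sdsRemoveFreeSpace`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SDS string: used length plus allocated capacity. */
typedef struct {
    size_t len;    /* bytes in use (what sdslen would report) */
    size_t alloc;  /* content capacity, excluding the terminator */
    char *buf;
} sketch_str;

/* Build a string of `len` filler bytes with `spare` unused bytes of headroom. */
sketch_str sketch_new(size_t len, size_t spare) {
    sketch_str s;
    s.len = len;
    s.alloc = len + spare;
    s.buf = malloc(s.alloc + 1);
    assert(s.buf != NULL);
    memset(s.buf, 'x', len);
    s.buf[len] = '\0';
    return s;
}

/* Same comparison as the source: free space strictly greater than len/10
 * (integer division). */
int sketch_should_trim(const sketch_str *s) {
    return (s->alloc - s->len) > s->len / 10;
}

/* Shrinks the buffer to fit when the rule fires. Returns 1 if trimmed. */
int sketch_trim(sketch_str *s) {
    char *p;
    if (!sketch_should_trim(s)) return 0;
    p = realloc(s->buf, s->len + 1);
    if (p == NULL) return 0;  /* on failure keep the original, larger buffer */
    s->buf = p;
    s->alloc = s->len;
    return 1;
}
```

With a 100-byte string, 10 spare bytes are left alone because 10 is not greater than 100/10, while 11 spare bytes trigger a trim. The comparison is strict and uses integer division, so the threshold is slightly coarser than an exact 10%.
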
2.3 Source code walkthrough (string initialization)
Source file: object.c

    robj *createObject(int type, void *ptr) {
        robj *o = zmalloc(sizeof(*o));
        o->type = type;
        o->encoding = OBJ_ENCODING_RAW;
        o->ptr = ptr;
        o->refcount = 1;

        /* Set the LRU to the current lruclock (minutes resolution), or
         * alternatively the LFU counter. */
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
        } else {
            o->lru = LRU_CLOCK();
        }
        return o;
    }

Source file: server.h

    typedef struct redisObject {
        unsigned type:4;
        unsigned encoding:4;
        unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                                * LFU data (least significant 8 bits frequency
                                * and most significant 16 bits access time). */
        int refcount;
        void *ptr;
    } robj;
```
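1. The createObject function shows that every object is given the raw encoding by default at construction time;
2. The object that gets created is a data structure called redisObject:
- type: the object type, one of: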
- #define OBJ_STRING 0 /* String object. */
- #define OBJ_LIST 1 /* List object. */
- #define OBJ_SET 2 /* Set object. */
- #define OBJ_ZSET 3 /* Sorted set object. */
- #define OBJ_HASH 4 /* Hash object. */
- encoding: the underlying encoding, one of:
- #define OBJ_ENCODING_RAW 0 /* Raw representation */
- #define OBJ_ENCODING_INT 1 /* Encoded as integer */
- #define OBJ_ENCODING_HT 2 /* Encoded as hash table */
- #define OBJ_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
- #define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
- #define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
- #define OBJ_ENCODING_INTSET 6 /* Encoded as intset */
- #define OBJ_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
- #define OBJ_ENCODING_EMBSTR 8 /* Embedded sds string encoding */
- #define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
- #define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
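- refcount: the reference count. When it drops to 0, nothing references the object any more and its memory can be freed;
- ptr: a pointer to the object's actual underlying data structure;
3. From this we can see that Redis represents its values with redisObject, which is further subdivided by the type and encoding fields. That gives the 5 basic data types, and string is one of them, identified by type = 0;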
```
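The points above (a tagged object whose encoding defaults to raw and whose lifetime is governed by refcount) can be modelled in a few lines. This is a simplified sketch, not the Redis definition: the `sk_*` names are hypothetical, the `lru` field is omitted, and the payload behind `ptr` is assumed not to be owned by the object.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_TYPE_STRING 0       /* same value as OBJ_STRING */
#define SK_ENCODING_RAW 0      /* same value as OBJ_ENCODING_RAW */
#define SK_ENCODING_INT 1      /* same value as OBJ_ENCODING_INT */
#define SK_ENCODING_EMBSTR 8   /* same value as OBJ_ENCODING_EMBSTR */

/* Reduced model of redisObject: two 4-bit tags, a refcount and a payload. */
typedef struct {
    unsigned type : 4;
    unsigned encoding : 4;
    int refcount;
    void *ptr;
} sk_obj;

/* Like createObject: the encoding always starts out as raw. */
sk_obj *sk_create(unsigned type, void *ptr) {
    sk_obj *o = malloc(sizeof(*o));
    assert(o != NULL);
    o->type = type;
    o->encoding = SK_ENCODING_RAW;
    o->refcount = 1;
    o->ptr = ptr;
    return o;
}

void sk_incr(sk_obj *o) {
    o->refcount++;
}

/* Drops one reference. Frees the object and returns 1 when the last
 * reference goes away, otherwise returns 0. The payload is not freed here. */
int sk_decr(sk_obj *o) {
    assert(o->refcount > 0);
    if (o->refcount == 1) {
        free(o);
        return 1;
    }
    o->refcount--;
    return 0;
}

/* Human-readable name for the three string encodings discussed here. */
const char *sk_encoding_name(const sk_obj *o) {
    switch (o->encoding) {
    case SK_ENCODING_RAW: return "raw";
    case SK_ENCODING_INT: return "int";
    case SK_ENCODING_EMBSTR: return "embstr";
    default: return "unknown";
    }
}
```

A freshly created object reports `raw` until something changes its encoding field, which matches the question raised at the end of section 2.2. The 4-bit width of each tag is enough here because the largest encoding value listed above is 10.
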