- 存储的结构
在 redis 字符串对象 String 的介绍中,我们知道 redis 对于字符串的存储共有 3 种存储形式,其存储的内存结构如以下图片示例:
OBJ_ENCODING_INT: 保存的字符串长度小于 20,并且是可以解析为 long 类型的整数值,那么存储方式就是直接将 redisObject 的 ptr 指针指向这个整数值
OBJ_ENCODING_EMBSTR: 长度小于 44 (OBJ_ENCODING_EMBSTR_SIZE_LIMIT)的字符串将以简单动态字符串(SDS) 的形式存储在 redisObject 中,但是redisObject 对象头会和 SDS 对象连续存在一起
OBJ_ENCODING_RAW: 字符串以简单动态字符串(SDS) 的形式存储,redisObject 对象头和 SDS 对象在内存地址上一般是不连续的两块内存
2. 数据存储源码分析
2.1 数据存储过程
在 Redis 6.0 源码阅读笔记(1)-Redis 服务端启动及命令执行 中我们已经知道客户端保存字符串的 set 命令将会调用到 t_string.c#setCommand() 函数,其源码实现如下:
该方法中有以下两个重点函数被调用,本节主要关注 tryObjectEncoding() 函数
tryObjectEncoding() 将客户端传输过来的需保存的字符串对象尝试进行编码,以节约内存
setGenericCommand() 将 key-value 保存到数据库中
void setCommand(client *c) {
......
c->argv[2] = tryObjectEncoding(c->argv[2]);
setGenericCommand(c,flags,c->argv[1],c->argv[2],expire,unit,NULL,NULL);
}
object.c#tryObjectEncoding() 函数逻辑很清晰,可以看到主要进行了以下几个操作:
当字符串长度小于 20 并且可以被解析为 long 类型数据时,这个数据将以整数形式保存,并以 robj->ptr = (void*) value 这种直接赋值的形式存储
当字符串长度小于等于 OBJ_ENCODING_EMBSTR_SIZE_LIMIT 配置并且还是 raw 编码时,调用 createEmbeddedStringObject() 函数将其转化为 embstr 编码
这个字符串对象已经不能进行转码了,只好调用 trimStringObjectIfNeeded() 函数尝试从字符串对象中移除所有空余空间
robj *tryObjectEncoding(robj *o) {
long value;
sds s = o->ptr;
size_t len;
/* Make sure this is a string object, the only type we encode
* in this function. Other types use encoded memory efficient
* representations but are handled by the commands implementing
* the type. */
serverAssertWithInfo(NULL,o,o->type == OBJ_STRING);
/* We try some specialized encoding only for objects that are
* RAW or EMBSTR encoded, in other words objects that are still
* in represented by an actually array of chars. */
if (!sdsEncodedObject(o)) return o;
/* It's not safe to encode shared objects: shared objects can be shared
* everywhere in the "object space" of Redis and may end in places where
* they are not handled. We handle them only as values in the keyspace. */
if (o->refcount > 1) return o;
/* Check if we can represent this string as a long integer.
* Note that we are sure that a string larger than 20 chars is not
* representable as a 32 nor 64 bit integer. */
len = sdslen(s);
if (len <= 20 && string2l(s,len,&value)) {
/* This object is encodable as a long. Try to use a shared object.
* Note that we avoid using shared integers when maxmemory is used
* because every object needs to have a private LRU field for the LRU
* algorithm to work well. */
if ((server.maxmemory == 0 ||
!(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS)) &&
value >= 0 &&
value < OBJ_SHARED_INTEGERS)
{
decrRefCount(o);
incrRefCount(shared.integers[value]);
return shared.integers[value];
} else {
if (o->encoding == OBJ_ENCODING_RAW) {
sdsfree(o->ptr);
o->encoding = OBJ_ENCODING_INT;
o->ptr = (void*) value;
return o;
} else if (o->encoding == OBJ_ENCODING_EMBSTR) {
decrRefCount(o);
return createStringObjectFromLongLongForValue(value);
}
}
}
/* If the string is small and is still RAW encoded,
* try the EMBSTR encoding which is more efficient.
* In this representation the object and the SDS string are allocated
* in the same chunk of memory to save space and cache misses. */
if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) {
robj *emb;
if (o->encoding == OBJ_ENCODING_EMBSTR) return o;
emb = createEmbeddedStringObject(s,sdslen(s));
decrRefCount(o);
return emb;
}
/* We can't encode the object...
*
* Do the last try, and at least optimize the SDS string inside
* the string object to require little space, in case there
* is more than 10% of free space at the end of the SDS string.
*
* We do that only for relatively large strings as this branch
* is only entered if the length of the string is greater than
* OBJ_ENCODING_EMBSTR_SIZE_LIMIT. */
trimStringObjectIfNeeded(o);
/* Return the original object. */
return o;
}
object.c#createEmbeddedStringObject() 函数实现 embstr 编码也很简单,主要步骤如下:
首先调用 zmalloc() 函数申请内存,可以看到此处不仅申请了需要存储的字符串的内存及 redisObject 的内存,还申请了 SDS 实现结构体之一 sdshdr8 的内存,这也就是上文所说embstr 编码只申请一次内存,并且redisObject 对象头会和 SDS 对象连续存在一起的由来
将 redisObject 对象的 ptr 指针指向 sdshdr8 开始的内存地址
填充 sdshdr8 对象各个属性,包括 len 字符串长度,alloc 字符数组容量,实际存储字符串的 buf 字符数组
robj *createEmbeddedStringObject(const char *ptr, size_t len) {
robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
struct sdshdr8 *sh = (void*)(o+1);
o->type = OBJ_STRING;
o->encoding = OBJ_ENCODING_EMBSTR;
o->ptr = sh+1;
o->refcount = 1;
if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
} else {
o->lru = LRU_CLOCK();
}
sh->len = len;
sh->alloc = len;
sh->flags = SDS_TYPE_8;
if (ptr == SDS_NOINIT)
sh->buf[len] = '\0';
else if (ptr) {
memcpy(sh->buf,ptr,len);
sh->buf[len] = '\0';
} else {
memset(sh->buf,0,len+1);
}
return o;
}
raw 编码字符串的创建可参考 object.c#createRawStringObject() 函数,其涉及到两次内存申请,sds.c#sdsnewlen() 申请内存创建 SDS 对象,object.c#createObject() 申请内存创建 redisObject 对象
robj *createRawStringObject(const char *ptr, size_t len) {
return createObject(OBJ_STRING, sdsnewlen(ptr,len));
}
从检测容量大小的的函数t_string.c#checkStringLength()看,字符串最大长度为 512M,超出该数值将报错
static int checkStringLength(client *c, long long size) {
if (size > 512*1024*1024) {
addReplyError(c,"string exceeds maximum allowed size (512MB)");
return C_ERR;
}
return C_OK;
}
2.2 简单动态字符串 SDS
2.2.1 SDS 结构体
SDS(简单动态字符串) 在 Redis 中是实现字符串存储的工具,本质上依然是字符数组,但它不像C语言字符串那样以‘\0’来标识字符串结束
传统C字符串符合ASCII编码,这种编码的操作的特点就是:遇零则止 。即当读一个字符串时,只要遇到’\0’就认为到达末尾,忽略’\0’以后的所有字符。另外其获得字符串长度的做法是遍历字符串,遇零则止,时间复杂度为O(n),比较低效
SDS 的实现结构定义在 sds.h 中,其定义如下。因为 SDS 判断是否到达字符串末尾的依据是表头的 len 属性,所以能高效计算字符串长度并快速追加数据
sds 结构一共有 5 种 Header 定义,目的是为不同长度的字符串提供不同大小的 Header,以节省内存。以 sdshdr8 为例,其 len 属性为 uint8_t 类型,占用内存大小为 1 字节,则存储的字符串最大长度为256。Header 主要包含以下几个属性:
len: 字符串真正的长度,不包含空终止字符
alloc: 除去表头和终止符的 buf 数组长度,也就是最大容量
flags: 标志 header 的类型
buf: 字符数组,实际存储字符
/* Note: sdshdr5 is never used, we just access the flags byte directly.
* However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
uint8_t len; /* used */
uint8_t alloc; /* excluding the header and null terminator */
unsigned char flags; /* 3 lsb of type, 5 unused bits */
char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
uint16_t len; /* used */
uint16_t alloc; /* excluding the header and null terminator */
unsigned char flags; /* 3 lsb of type, 5 unused bits */
char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
uint32_t len; /* used */
uint32_t alloc; /* excluding the header and null terminator */
unsigned char flags; /* 3 lsb of type, 5 unused bits */
char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
uint64_t len; /* used */
uint64_t alloc; /* excluding the header and null terminator */
unsigned char flags; /* 3 lsb of type, 5 unused bits */
char buf[];
};
2.2.2 SDS 容量调整
SDS 扩容的函数是sds.c#sdsMakeRoomFor(),当字符串长度小于 1M 时,扩容都是加倍现有的空间,如果超过 1M,扩容时一次只会多扩 1M 的空间。以下为源码实现:
字符串在长度小于 SDS_MAX_PREALLOC(1024*1024,也就是1MB,定义在 sds.h) 之前,采用 2 倍扩容,也就是保留 100% 的冗余空间。当长度超过 SDS_MAX_PREALLOC 之后,每次扩容只会多分配 SDS_MAX_PREALLOC 大小的冗余空间,避免加倍扩容后的冗余空间过大导致浪费
sds sdsMakeRoomFor(sds s, size_t addlen) {
void *sh, *newsh;
size_t avail = sdsavail(s);
size_t len, newlen;
char type, oldtype = s[-1] & SDS_TYPE_MASK;
int hdrlen;
/* Return ASAP if there is enough space left. */
if (avail >= addlen) return s;
len = sdslen(s);
sh = (char*)s-sdsHdrSize(oldtype);
newlen = (len+addlen);
if (newlen < SDS_MAX_PREALLOC)
newlen *= 2;
else
newlen += SDS_MAX_PREALLOC;
type = sdsReqType(newlen);
/* Don't use type 5: the user is appending to the string and type 5 is
* not able to remember empty space, so sdsMakeRoomFor() must be called
* at every appending operation. */
if (type == SDS_TYPE_5) type = SDS_TYPE_8;
hdrlen = sdsHdrSize(type);
if (oldtype==type) {
newsh = s_realloc(sh, hdrlen+newlen+1);
if (newsh == NULL) return NULL;
s = (char*)newsh+hdrlen;
} else {
/* Since the header size changes, need to move the string forward,
* and can't use realloc */
newsh = s_malloc(hdrlen+newlen+1);
if (newsh == NULL) return NULL;
memcpy((char*)newsh+hdrlen, s, len+1);
s_free(sh);
s = (char*)newsh+hdrlen;
s[-1] = type;
sdssetlen(s, len);
}
sdssetalloc(s, newlen);
return s;
}
SDS 缩容函数为sds.c#sdsclear(),从源码实现来看,其主要有以下操作,也就是它并不释放实际占用的内存,体现出一种惰性策略
重置 SDS 表头的 len 属性值为 0
将结束符放到 buf 数组最前面,相当于惰性地删除 buf 中的内容
/* Modify an sds string in-place to make it empty (zero length).
* However all the existing buffer is not discarded but set as free space
* so that next append operations will not require allocations up to the
* number of bytes previously available. */
void sdsclear(sds s) {
sdssetlen(s, 0);
s[0] = '\0';
}