Expiration Strategies
Redis achieves high throughput on large data sets because it is memory-based, and memory reads and writes are far faster than disk I/O. In addition, Redis processes commands in a single thread, which avoids context switching and blocking waits; it handles sockets with I/O multiplexing, greatly increasing the number of connections it can serve per second; and it implements a number of highly efficient internal data structures, which make the memory-based design faster still.
A memory-based system, however, has one obvious bottleneck: the amount of memory available. Since that space is limited, every byte saved matters. Redis therefore lets an expiration time be set on a key; once the key expires it becomes invalid, can no longer be used, and its memory can be reclaimed.
Timer-Based Expiration
Timer-based expiration means a timer monitors each key and deletes it the moment it expires; this is active expiration. When many keys carry an expiration time, many timers have to run, which hurts server performance. On the other hand, such aggressive eviction is clearly friendly to memory.
When the timer fires, the key is removed and its value can no longer be looked up. The value itself, though, may be referenced by more than one key, so it is not necessarily freed right away; instead its refcount is decremented by 1.
/* In C, 0 means false and any non-zero value means true. */
int expireIfNeeded(redisDb *db, robj *key) {
    /* If the key has not expired, return 0. */
    if (!keyIsExpired(db,key)) return 0;

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    /* If masterhost is set, this server is a replica, so it does not
     * delete the key itself. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    /* With lazyfree-lazy-expire yes, delete asynchronously: the cost of
     * freeing the value is estimated first, cheap objects are still freed
     * in the main thread, and expensive ones in a background thread. With
     * lazyfree-lazy-expire no, delete synchronously. */
    int retval = server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                               dbSyncDelete(db,key);
    if (retval) signalModifiedKey(NULL,db,key);
    return retval;
}
Lazy Expiration
Lazy expiration means that an expired key is neither scanned for nor deleted eagerly; only on the next access does Redis check whether the key is still usable. This is passive expiration. It is friendly to the CPU but unfriendly to memory: an expired key stays in memory and may not be touched again for a long time, yet it keeps occupying space.
/* Delete a key, value, and associated expiration entry if any, from the DB.
 * If there are enough allocations to free the value object may be put into
 * a lazy free list instead of being freed synchronously. The lazy free list
 * will be reclaimed in a different bio.c thread. */
#define LAZYFREE_THRESHOLD 64
int dbAsyncDelete(redisDb *db, robj *key) {
    /* Deleting an entry from the expires dict will not free the sds of
     * the key, because it is shared with the main dictionary. */
    if (dictSize(db->expires) > 0) dictDelete(db->expires,key->ptr);

    /* If the value is composed of a few allocations, to free in a lazy way
     * is actually just slower... So under a certain limit we just free
     * the object synchronously. */
    dictEntry *de = dictUnlink(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);
        size_t free_effort = lazyfreeGetFreeEffort(val);

        /* If releasing the object is too much work, do it in the background
         * by adding the object to the lazy free list.
         * Note that if the object is shared, to reclaim it now it is not
         * possible. This rarely happens, however sometimes the implementation
         * of parts of the Redis core may call incrRefCount() to protect
         * objects, and then call dbDelete(). In this case we'll fall
         * through and reach the dictFreeUnlinkedEntry() call, that will be
         * equivalent to just calling decrRefCount(). */
        if (free_effort > LAZYFREE_THRESHOLD && val->refcount == 1) {
            atomicIncr(lazyfree_objects,1);
            bioCreateBackgroundJob(BIO_LAZY_FREE,val,NULL,NULL);
            dictSetVal(db->dict,de,NULL);
        }
    }

    /* Release the key-val pair, or just the key if we set the val
     * field to NULL in order to lazy free it later.
     * Note that dictUnlink above already removed the key's mapping, so
     * lookups with this key no longer find the object. */
    if (de) {
        dictFreeUnlinkedEntry(db->dict,de);
        if (server.cluster_enabled) slotToKeyDel(key->ptr);
        return 1;
    } else {
        return 0;
    }
}
Periodic Expiration
The code for periodic expiration is a bit longer, with more execution details. Under this strategy, Redis periodically takes a sample of keys and decides whether they need to be deleted. Three questions capture the specifics:
- How often does a scan run?
- What does each scan look at?
- How are expired keys cleaned up after the scan?
The scan is driven by Redis's time events. The file server.c contains a serverCron() function, and at the bottom of that function sits this line:
return 1000/server.hz;
The redis.conf file contains this comment:
# Redis calls an internal function to perform many background tasks, like
# closing connections of clients in timeout, purging expired keys that are
# never requested, and so forth.
# Not all tasks are performed with the same frequency, but Redis checks for
# tasks to perform according to the specified "hz" value.
# By default "hz" is set to 10. Raising the value will use more CPU when
# Redis is idle, but at the same time will make Redis more responsive when
# there are many keys expiring at the same time, and timeouts may be
# handled with more precision.
# The range is between 1 and 500, however a value over 100 is usually not
# a good idea. Most users should use the default of 10 and raise this up to
# 100 only in environments where very low latency is required.
# In other words, a periodic expire scan runs every 1000/hz milliseconds.
hz 10
So by default a scan runs every 1000/10 = 100 ms, i.e. ten times per second, which answers the first question. With hz set to 100, a scan runs every 10 ms: expired data is scanned far more often and CPU load rises, but more expired keys get cleaned up. Again, a CPU-unfriendly but memory-friendly setting.
Second question: what does a scan look at? Since this is periodic expiration, the scan naturally covers only keys that have an expiration time set.
if ((num = dictSize(db->expires)) == 0) {
    db->avg_ttl = 0;
    break;
}
So the second question is answered as well: the scan covers keys that have an expiration time. A database with no such keys is skipped entirely.
Third question: how is the cleanup done?
The periodic-expiration logic is fairly long, so to keep the thread of the argument, only the core part is quoted here:
/* Loop over the databases; dbs_per_call is configurable, default 16. */
for (j = 0; j < dbs_per_call && timelimit_exit == 0; j++) {
    /* Expired and checked in a single loop. */
    unsigned long expired, sampled;
    redisDb *db = server.db+(current_db % server.dbnum);

    /* Increment the DB now so we are sure if we run out of time
     * in the current DB we'll restart from the next. This allows to
     * distribute the time evenly across DBs. */
    current_db++;

    /* Continue to expire if at the end of the cycle there are still
     * a big percentage of keys to expire, compared to the number of keys
     * we scanned. The percentage, stored in config_cycle_acceptable_stale
     * is not fixed, but depends on the Redis configured "expire effort". */
    do {
        unsigned long num, slots;
        long long now, ttl_sum;
        int ttl_samples;
        iteration++;

        /* If there is nothing to expire try next DB ASAP. */
        /* No keys with an expiration time here: move on to the next DB. */
        if ((num = dictSize(db->expires)) == 0) {
            db->avg_ttl = 0;
            break;
        }
        slots = dictSlots(db->expires);
        now = mstime();

        /* When there are less than 1% filled slots, sampling the key
         * space is expensive, so stop here waiting for better times...
         * The dictionary will be resized asap. */
        if (num && slots > DICT_HT_INITIAL_SIZE &&
            (num*100/slots < 1)) break;

        /* The main collection cycle. Sample random keys among keys
         * with an expire set, checking for expired ones. */
        expired = 0;
        sampled = 0;
        ttl_sum = 0;
        ttl_samples = 0;

        /* Sample at most config_keys_per_loop keys (20 by default). */
        if (num > config_keys_per_loop)
            num = config_keys_per_loop;

        /* Here we access the low level representation of the hash table
         * for speed concerns: this makes this code coupled with dict.c,
         * but it hardly changed in ten years.
         *
         * Note that certain places of the hash table may be empty,
         * so we want also a stop condition about the number of
         * buckets that we scanned. However scanning for free buckets
         * is very fast: we are in the cache line scanning a sequential
         * array of NULL pointers, so we can scan a lot more buckets
         * than keys in the same time. */
        long max_buckets = num*20;
        long checked_buckets = 0;

        /* Stop once enough keys have been sampled, or once more than
         * num*20 (400 by default) buckets have been checked. */
        while (sampled < num && checked_buckets < max_buckets) {
            for (int table = 0; table < 2; table++) {
                if (table == 1 && !dictIsRehashing(db->expires)) break;

                unsigned long idx = db->expires_cursor;
                idx &= db->expires->ht[table].sizemask;
                /* Fetch the hash bucket at this index. */
                dictEntry *de = db->expires->ht[table].table[idx];
                long long ttl;

                /* Scan the current bucket of the current table. */
                checked_buckets++;
                /* Walk every key chained in this bucket. */
                while(de) {
                    /* Get the next entry now since this entry may get
                     * deleted. */
                    dictEntry *e = de;
                    de = de->next;

                    ttl = dictGetSignedIntegerVal(e)-now;
                    if (activeExpireCycleTryExpire(db,e,now)) expired++;
                    if (ttl > 0) {
                        /* We want the average TTL of keys yet
                         * not expired. */
                        ttl_sum += ttl;
                        ttl_samples++;
                    }
                    sampled++;
                }
            }
            db->expires_cursor++;
        }
        total_expired += expired;
        total_sampled += sampled;

        /* Update the average TTL stats for this database. */
        if (ttl_samples) {
            long long avg_ttl = ttl_sum/ttl_samples;

            /* Do a simple running average with a few samples.
             * We just use the current estimate with a weight of 2%
             * and the previous estimate with a weight of 98%. */
            if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
            db->avg_ttl = (db->avg_ttl/50)*49 + (avg_ttl/50);
        }

        /* We can't block forever here even if there are many keys to
         * expire. So after a given amount of milliseconds return to the
         * caller waiting for the other active expire cycle. */
        if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
            elapsed = ustime()-start;
            if (elapsed > timelimit) {
                timelimit_exit = 1;
                server.stat_expired_time_cap_reached_count++;
                break;
            }
        }

        /* We don't repeat the cycle for the current database if there are
         * an acceptable amount of stale keys (logically expired but yet
         * not reclaimed). */
    } while (sampled == 0 ||
             (expired*100/sampled) > config_cycle_acceptable_stale);
}
Up to 16 databases are scanned per call. The following code:
if (num > config_keys_per_loop)
    num = config_keys_per_loop;
caps num at a maximum, and that cap is computed as:
effort = server.active_expire_effort-1, /* Rescale from 0 to 9. */
config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                       ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort,
active_expire_effort is a redis.conf setting that defaults to 1, so effort is 0 and config_keys_per_loop works out to 20: each sampling round takes at most 20 keys.
Redis dictionaries are hash tables, so keys are stored in hash buckets; on a hash collision, entries are chained into a linked list, the same approach HashMap used in JDK 1.7 and earlier. Sampling is likewise done at the granularity of hash buckets.
do { ... } while (sampled == 0 ||
                  (expired*100/sampled) > config_cycle_acceptable_stale);
config_cycle_acceptable_stale is 10, in percent: if more than 10% of the sampled keys turn out to be expired, memory evidently still holds plenty of stale data, so sampling and deletion run again. The loop is not unbounded, though: every 16 iterations the elapsed time is checked against a time limit, and the cycle stops once that limit is exceeded.
if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
    elapsed = ustime()-start;
    if (elapsed > timelimit) {
        timelimit_exit = 1;
        server.stat_expired_time_cap_reached_count++;
        break;
    }
}
A hash collision occurs when two different keys produce the same hash value.
When sampling keys, entire buckets are scanned: if the first bucket holds 15 keys and the second holds 20, 35 keys are sampled. If the first bucket holds 20, only those 20 are taken; if it holds 25, all 25 are taken.
Summary:
By default, Redis runs periodic expiration every 100 milliseconds. Each run pulls a batch of keys out of the hash buckets and checks them for expiration, deleting the expired ones. If more than 10% of the sampled keys were expired, sampling and deletion repeat, with the elapsed time checked against a limit every 16 iterations so the cycle cannot run forever. This does cost CPU, but given how frequently the operation runs, the trade-off is worthwhile.