Memory Eviction Policies
Once maxmemory is set for a Redis instance, Redis can start discarding data whenever its memory usage reaches that threshold, until usage drops back below maxmemory.
The policy Redis uses to decide which stored data to discard in order to free memory is called the memory eviction policy.
Built-in eviction policies
The maximum memory is configured in redis.conf via maxmemory. A value of 0 means no memory limit on 64-bit systems, and an implicit 3 GB limit on 32-bit systems.
# Example maxmemory settings in redis.conf
# A bare number (no unit) is interpreted as bytes
maxmemory 1048576
maxmemory 1048576B
maxmemory 1000KB
maxmemory 100MB
maxmemory 1GB
maxmemory 1000K
maxmemory 100M
maxmemory 1G
When the actual memory in use, used_memory (visible via the INFO command), reaches the configured maxmemory threshold, Redis cleans up memory according to the preset eviction policy. Eight policies are available:
Policy | Meaning
---|---
noeviction | The default. Nothing is evicted; most write commands return an error (a few, such as DEL, still work)
allkeys-lru | Evict data chosen by the LRU algorithm from among all keys
volatile-lru | Evict data chosen by the LRU algorithm from among keys with an expiration set
allkeys-random | Evict randomly chosen data from among all keys
volatile-random | Evict randomly chosen data from among keys with an expiration set
volatile-ttl | Among keys with an expiration set, evict those closest to expiring first
allkeys-lfu | Evict data chosen by the LFU algorithm from among all keys (Redis 4.0 and later)
volatile-lfu | Evict data chosen by the LFU algorithm from among keys with an expiration set (Redis 4.0 and later)
The policy is selected with the maxmemory-policy configuration option; the default is noeviction.
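For a cache-only instance, the limit and the policy are typically set together. A minimal sketch (the values below are only an illustration, not a recommendation):
# redis.conf — example eviction configuration (illustrative values)
maxmemory 100mb
maxmemory-policy allkeys-lru
maxmemory-samples 5
# the same settings can also be changed at runtime, e.g.
# CONFIG SET maxmemory-policy allkeys-lru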
Grouped by algorithm
- ttl: among keys with an expiration set, those with less remaining time are evicted first.
- lru: the least recently used keys are evicted first.
- lfu: the least frequently accessed keys are evicted first.
- random: keys are evicted at random.
- noeviction: nothing is evicted; once memory is full, writes to Redis fail. This is the default policy.
Grouped by scope
- allkeys-xxx: policies starting with allkeys consider every key in Redis a candidate for eviction.
- volatile-xxx: policies starting with volatile only consider keys that have an expiration time set.
The available policies include lru and lfu variants, but these are not the textbook LRU and LFU algorithms; they are approximations implemented by Redis. The approximate LRU algorithm is described in detail below.
Redis's approximate LRU algorithm
The classic LRU algorithm
LRU (Least Recently Used) evicts the data that has gone unused for the longest time. Its core assumption is that data accessed recently is likely to be accessed again soon.
The classic LRU implementation is a hash table plus a doubly linked list: the hash table keeps lookups at O(1), and the doubly linked list keeps node insertion and removal at O(1).
Why a doubly linked list rather than a singly linked one? A singly linked list can insert at the head and delete at the tail in O(1), but moving a middle node c to the head requires knowing its predecessor, and with only a next pointer you would have to traverse the list to find it, which is O(n).
LRU GET: if the node exists, move it to the head of the list and return its value.
LRU PUT: ① if the node does not exist, create it and place it at the head of the list; ② if it exists, update it and move it to the head.
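To make the structure concrete, here is a minimal, self-contained LRU cache sketch in C (int keys and values, a small chained hash table; purely illustrative and unrelated to Redis's internals):
/* Minimal LRU cache: hash table (chaining) + doubly linked list.
 * Head of the list = most recently used, tail = least recently used. */
#include <stdio.h>
#include <stdlib.h>
#define CAPACITY 2     /* evict once more than CAPACITY entries are stored */
#define NBUCKETS 16
typedef struct Node {
    int key, value;
    struct Node *prev, *next;   /* doubly linked list (recency order) */
    struct Node *hnext;         /* hash bucket chain */
} Node;
typedef struct {
    Node *head, *tail;
    Node *buckets[NBUCKETS];
    int size;
} LRU;
static unsigned bucket(int key) { return (unsigned)key % NBUCKETS; }
static Node *find(LRU *c, int key) {
    for (Node *n = c->buckets[bucket(key)]; n; n = n->hnext)
        if (n->key == key) return n;
    return NULL;
}
static void list_unlink(LRU *c, Node *n) {
    if (n->prev) n->prev->next = n->next; else c->head = n->next;
    if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
}
static void list_push_front(LRU *c, Node *n) {
    n->prev = NULL; n->next = c->head;
    if (c->head) c->head->prev = n;
    c->head = n;
    if (!c->tail) c->tail = n;
}
static void hash_remove(LRU *c, Node *n) {
    Node **p = &c->buckets[bucket(n->key)];
    while (*p && *p != n) p = &(*p)->hnext;
    if (*p) *p = n->hnext;
}
/* GET: O(1) lookup via the hash table, then move the node to the head. */
static int lru_get(LRU *c, int key, int *value) {
    Node *n = find(c, key);
    if (!n) return 0;
    list_unlink(c, n);
    list_push_front(c, n);
    *value = n->value;
    return 1;
}
/* PUT: update and move to head if present; otherwise insert at head and
 * evict the tail (least recently used) when over capacity. */
static void lru_put(LRU *c, int key, int value) {
    Node *n = find(c, key);
    if (n) {
        n->value = value;
        list_unlink(c, n);
        list_push_front(c, n);
        return;
    }
    n = malloc(sizeof(*n));
    n->key = key; n->value = value;
    n->hnext = c->buckets[bucket(key)];
    c->buckets[bucket(key)] = n;
    list_push_front(c, n);
    if (++c->size > CAPACITY) {
        Node *old = c->tail;            /* least recently used */
        list_unlink(c, old);
        hash_remove(c, old);
        free(old);
        c->size--;
    }
}
int main(void) {
    LRU c = {0};
    int v;
    lru_put(&c, 1, 10);
    lru_put(&c, 2, 20);
    lru_get(&c, 1, &v);                 /* touch key 1, so key 2 becomes the LRU entry */
    lru_put(&c, 3, 30);                 /* evicts key 2 */
    printf("key 2 present: %d\n", lru_get(&c, 2, &v));            /* prints 0 */
    printf("key 1 value:   %d\n", lru_get(&c, 1, &v) ? v : -1);   /* prints 10 */
    return 0;
}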
To run exact LRU, Redis would have to maintain an extra doubly linked list plus hash table, which is not a good trade-off for Redis:
- the classic LRU algorithm needs a doubly linked list to manage the data, which costs extra memory;
- every access moves data around, which costs performance;
- Redis's existing data structures would have to be reworked.
How the approximate algorithm works
Why is it called approximate? Because Redis picks eviction candidates by random sampling rather than by scanning all of the data.
In Redis, every key's value is wrapped in a redisObject, whose lru:LRU_BITS field records the Redis clock server.lruclock at the key's last access (whenever Redis touches a key it calls lookupKey, which refreshes this field).
server.lruclock is a 24-bit integer: by default it is the Unix timestamp in seconds modulo 2^24, so its resolution is one second (LRU_CLOCK_RESOLUTION is 1000 ms).
typedef struct redisObject {
    unsigned type:4;        // object type
    unsigned encoding:4;    // internal encoding
    unsigned lru:LRU_BITS;  /* LRU time (relative to global lru_clock) or
                             * LFU data (least significant 8 bits frequency
                             * and most significant 16 bits access time). */
    int refcount;           // reference count
    void *ptr;              // pointer to the data structure that stores the actual value;
                            // which structure is used depends on type and encoding.
} robj;
When Redis starts, it registers the global serverCron time event (there is also a file event type, used mainly for network I/O). With the default hz of 10, serverCron runs every 100 ms, and inside it the clock is refreshed via server.lruclock = getLRUClock().
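The clock arithmetic is easy to sketch. The snippet below is a simplified illustration, not the actual getLRUClock/estimateObjectIdleTime source; the constants mirror Redis's defaults, but the helper names are made up (POSIX gettimeofday is used for the wall clock):
/* Simplified sketch of the 24-bit LRU clock and idle-time estimation.
 * Illustrative only; not Redis source. */
#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>
#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1 << LRU_BITS) - 1)   /* maximum value of obj->lru */
#define LRU_CLOCK_RESOLUTION 1000             /* clock resolution in milliseconds */
long long mstime_now(void) {                  /* current Unix time in milliseconds */
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}
/* The global clock: Unix time in seconds, truncated to 24 bits. */
uint32_t lru_clock(void) {
    return (mstime_now() / LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}
/* Approximate idle time of an object stamped with obj_lru at its last access.
 * The second branch handles the 24-bit counter wrapping around (~194 days). */
uint64_t estimate_idle_ms(uint32_t obj_lru) {
    uint32_t now = lru_clock();
    if (now >= obj_lru)
        return (uint64_t)(now - obj_lru) * LRU_CLOCK_RESOLUTION;
    return (uint64_t)(LRU_CLOCK_MAX - obj_lru + now) * LRU_CLOCK_RESOLUTION;
}
int main(void) {
    uint32_t stamp = lru_clock();             /* pretend a key was just accessed */
    printf("idle ms: %llu\n", (unsigned long long)estimate_idle_ms(stamp));
    return 0;
}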
When used_memory > maxmemory, Redis reclaims memory through freeMemoryIfNeeded. For the LRU policies, the core eviction logic lives in evictionPoolPopulate, which fills the pool of eviction candidates.
Implementation logic
- First pass: randomly sample up to N keys and put them into the eviction candidate pool, evictionPoolEntry.
- N is set by maxmemory-samples in redis.conf; the default is 5, and a value of 10 gets very close to true LRU behavior at a higher CPU cost.
- Subsequent passes: randomly sample up to N keys again; a sampled key is added to the pool whenever its lru is smaller than that of at least one entry already in the pool (and always while the pool still has free slots).
- The pool holds at most EVPOOL_SIZE = 16 entries.
- See the comments on evictionPoolPopulate in the source below for details.
- Eviction: the pool entry with the smallest lru is evicted.
To avoid spending too long (or forever) looking for keys to fill the pool, the code (dictGetSomeKeys) hard-caps the number of probes per call at maxsteps = count*10, where count is maxmemory-samples. This also tells us that a single sampling pass may return fewer than maxmemory-samples keys. Furthermore, if the candidate set itself (all keys, or only the keys with an expiration) is smaller than maxmemory-samples, count is simply the number of keys in that set.
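The pool-insertion step itself is easy to sketch. The following is a simplified, self-contained illustration of the idea (entries kept sorted by idle time ascending, the least-idle entry dropped when the pool is full); the entry type here is a hypothetical trimmed-down stand-in, not the real evictionPoolEntry:
/* Sketch of inserting a sampled key into an eviction pool kept sorted by idle
 * time, ascending; the best eviction candidate (largest idle) ends up at the
 * right end. Illustrative only. */
#include <stdio.h>
#include <string.h>
#define EVPOOL_SIZE 16
struct pool_entry {
    unsigned long long idle;   /* estimated idle time of the sampled key */
    char key[64];              /* key name; an empty string marks a free slot here */
};
void pool_try_insert(struct pool_entry *pool, const char *key, unsigned long long idle) {
    int k = 0;
    /* Find the first free slot, or the first slot holding a larger-or-equal idle time. */
    while (k < EVPOOL_SIZE && pool[k].key[0] != '\0' && pool[k].idle < idle) k++;
    if (k == 0 && pool[EVPOOL_SIZE-1].key[0] != '\0')
        return;     /* pool is full and the candidate is less idle than every entry */
    if (k < EVPOOL_SIZE && pool[k].key[0] == '\0') {
        /* Free slot found: insert right here, nothing to shift. */
    } else if (pool[EVPOOL_SIZE-1].key[0] == '\0') {
        /* Free space on the right: shift entries from k one position to the right. */
        memmove(pool+k+1, pool+k, sizeof(pool[0])*(EVPOOL_SIZE-k-1));
    } else {
        /* Pool full: drop the leftmost (least idle) entry to make room. */
        k--;
        memmove(pool, pool+1, sizeof(pool[0])*k);
    }
    pool[k].idle = idle;
    strncpy(pool[k].key, key, sizeof(pool[k].key)-1);
    pool[k].key[sizeof(pool[k].key)-1] = '\0';
}
int main(void) {
    struct pool_entry pool[EVPOOL_SIZE];
    memset(pool, 0, sizeof(pool));
    pool_try_insert(pool, "key:a", 120);
    pool_try_insert(pool, "key:b", 30);
    pool_try_insert(pool, "key:c", 900);
    for (int i = 0; i < EVPOOL_SIZE && pool[i].key[0]; i++)
        printf("%s idle=%llu\n", pool[i].key, pool[i].idle);   /* ascending idle order */
    return 0;
}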
// Source: evict.c
/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
* function if right now there are the conditions to do so safely:
*
* - There must be no script in timeout condition.
* - Nor we are loading data right now.
*
*/
int freeMemoryIfNeededAndSafe(void) {
if (server.lua_timedout || server.loading) return C_OK;
return freeMemoryIfNeeded();
}
// Source: evict.c
int freeMemoryIfNeeded(void) {
    // Core memory-reclaim logic:
    // - compute how much memory is currently in use;
    // - dispatch on the configured eviction policy and handle it accordingly;
    // - for the LRU/LFU/TTL policies, fill the candidate pool by calling
    //   evictionPoolPopulate(i, dict, db->dict, pool) and evict the best entry.
    ...
}
// Filling the eviction candidate pool: evictionPoolPopulate
// Source: evict.c
/* This is an helper function for freeMemoryIfNeeded(), it is used in order
* to populate the evictionPool with a few entries every time we want to
* expire a key. Keys with idle time smaller than one of the current
* keys are added. Keys are always added if there are free entries.
*
* We insert keys on place in ascending order, so keys with the smaller
* idle time are on the left, and keys with the higher idle time on the
* right. */
void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    // sampledict: db->dict when evicting among all keys, or db->expires when
    //             evicting only among keys that have an expiration set;
    // pool      : the eviction candidate pool.
    // Sample at most maxmemory_samples keys for the comparison and eviction below:
    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    ...
}
// The eviction candidate pool: evictionPoolEntry
// Source: evict.c
// EVPOOL_SIZE: number of entries the pool can hold;
// EVPOOL_CACHED_SDS_SIZE: longest key name cached in place; keys longer than 255 bytes get a separately allocated SDS, while keys of 255 bytes or less reuse the memory allocated at initialization;
// evictionPoolEntry is initialized in evictionPoolAlloc(), which is called from initServer().
#define EVPOOL_SIZE 16
#define EVPOOL_CACHED_SDS_SIZE 255
struct evictionPoolEntry {
    unsigned long long idle;    /* Object idle time (inverse frequency for LFU) */
    sds key;                    /* Key name. */
    sds cached;                 /* Cached SDS object for key name. */
    int dbid;                   /* Key DB number. */
};
static struct evictionPoolEntry *EvictionPoolLRU;
Deletion Strategies
Keys stored in Redis commonly have an expiration time set, and Redis has internal mechanisms for clearing these expired keys so as to strike a balance between CPU and memory. These mechanisms are called Redis's deletion strategies.
When storing a key, Redis does not only store the value itself: it also maintains an expires area that maps (a reference to) the key to its expiration time.
When a key expires, it needs to be removed so its memory can be reclaimed. There are three deletion strategies:
- timed deletion (delete immediately)
- lazy deletion
- periodic deletion
In practice the Redis server uses lazy deletion and periodic deletion together: by combining the two, the server strikes a good balance between spending CPU time sensibly and not wasting memory.
(When exactly each of them runs is shown in the source walkthrough below: lazy deletion happens in the key-lookup path via expireIfNeeded, while periodic deletion runs from the serverCron time event via activeExpireCycle, plus a fast cycle in beforeSleep().)
Timed deletion (delete immediately)
When a key's expiration is set, a timer is created; when the expiration time arrives, the time-event handler deletes the key automatically (removing both the key itself and its entry in the expires area).
- Pros: saves memory; the key is deleted the moment it expires, quickly freeing memory that is no longer needed.
- Cons: puts pressure on the CPU; the deletion runs no matter how loaded the CPU already is, which can hurt the server's response time and command throughput.
- In one sentence: trading CPU for memory.
A timer would rely on Redis time events, and time events are currently kept in an unsorted linked list, so looking up an event costs O(n).
ps: an unsorted linked list here means a pointer-based singly linked list, as opposed to an ordered table laid out in contiguous memory.
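To make the O(n) point concrete, here is a toy sketch (not Redis code; the struct and field names are made up) of finding the next deadline in an unsorted event list, which requires walking every node:
#include <stdio.h>
#include <stddef.h>
/* Toy time-event node in an unsorted singly linked list (illustrative only). */
struct time_event {
    long long when_ms;              /* absolute deadline in milliseconds */
    struct time_event *next;
};
/* Finding the event that fires next means walking the whole list: O(n). */
struct time_event *nearest_event(struct time_event *head) {
    struct time_event *best = NULL;
    for (struct time_event *e = head; e != NULL; e = e->next)
        if (best == NULL || e->when_ms < best->when_ms)
            best = e;
    return best;
}
int main(void) {
    struct time_event c = { 300, NULL }, b = { 100, &c }, a = { 250, &b };
    printf("next deadline: %lld\n", nearest_event(&a)->when_ms);   /* prints 100 */
    return 0;
}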
Lazy deletion
The key is not deleted when it expires; deletion is deferred until the next time the key is accessed:
- if the key has not expired, return its value;
- if it is found to have expired, delete it and report the key as non-existent.
- Pros: saves CPU; the deletion is only performed when it can no longer be avoided.
- Cons: expired keys may occupy memory for a long time.
In one sentence: trading memory for CPU.
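A minimal sketch of the idea (hypothetical helper names, nothing to do with the actual Redis API): every read first checks the stored expiration and deletes the entry on the spot if the deadline has passed:
#include <stdio.h>
#include <time.h>
/* Toy key entry with an optional expiration (illustrative only). */
struct entry {
    const char *value;
    long long expire_at_ms;     /* 0 means no expiration */
};
long long now_ms(void) { return (long long)time(NULL) * 1000; }
/* Lazy expiration: the check happens on access, not at the moment the key expires. */
const char *lazy_get(struct entry *e) {
    if (e->value == NULL) return NULL;                 /* key absent */
    if (e->expire_at_ms && now_ms() >= e->expire_at_ms) {
        e->value = NULL;                               /* delete the expired key now */
        return NULL;                                   /* and report it as missing */
    }
    return e->value;
}
int main(void) {
    struct entry e = { "hello", now_ms() - 1 };        /* already expired */
    const char *v = lazy_get(&e);
    printf("%s\n", v ? v : "(nil)");                   /* prints (nil) */
    return 0;
}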
Periodic deletion
When the Redis server initializes, it reads the server.hz setting (default 10), meaning serverCron() runs server.hz times per second; serverCron is a Redis time event.
The flow is roughly as follows:
- activeExpireCycle() checks the expires[*] dictionaries one by one, and each (slow) run is limited to roughly 250ms/server.hz of CPU time.
- When checking one expires[*] dictionary, it randomly samples W keys.
- If a sampled key has expired, it is deleted.
- If the share of expired keys found in a round is still too high (in the source shown below: expired*100/sampled > config_cycle_acceptable_stale), it keeps scanning the same expires dictionary.
- Otherwise it moves on to the next expires dictionary.
- W = config_keys_per_loop
- config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP + ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort
- ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP = 20
In short: it periodically polls the databases' expiring keys, samples them at random, and uses the proportion of expired keys found to regulate how aggressively deletion continues.
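Plugging in the defaults (server.hz = 10 and active_expire_effort = 1, i.e. effort = 0 after the rescaling in the source below) gives a feel for the scale of one slow cycle:
config_keys_per_loop = 20 + 20/4*0 = 20 keys sampled per expires dictionary per inner loop;
timelimit = 25 * 1000000 / 10 / 100 = 25000 µs, i.e. 25 ms of CPU per slow cycle (the 250ms/server.hz figure above);
config_cycle_acceptable_stale = 10 - 0 = 10, so a database keeps being re-scanned while more than 10% of the sampled keys turn out to be expired;
config_cycle_fast_duration = 1000 + 1000/4*0 = 1000 µs caps a fast cycle at 1 ms.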
Source code
https://blog.csdn.net/why444216978/article/details/106959089
Lazy deletion
robj *lookupKeyReadWithFlags(redisDb *db, robj *key, int flags) {
robj *val;
if (expireIfNeeded(db,key) == 1) {
/* Key expired. If we are in the context of a master, expireIfNeeded()
* returns 0 only when the key does not exist at all, so it's safe
* to return NULL ASAP. */
if (server.masterhost == NULL) return NULL;
/* However if we are in the context of a slave, expireIfNeeded() will
* not really try to expire the key, it only returns information
* about the "logical" status of the key: key expiring is up to the
* master in order to have a consistent view of master's data set.
*
* However, if the command caller is not the master, and as additional
* safety measure, the command invoked is a read-only command, we can
* safely return NULL here, and provide a more consistent behavior
* to clients accessign expired values in a read-only fashion, that
* will say the key as non existing.
*
* Notably this covers GETs when slaves are used to scale reads. */
if (server.current_client &&
server.current_client != server.master &&
server.current_client->cmd &&
server.current_client->cmd->flags & CMD_READONLY)
{
return NULL;
}
}
val = lookupKey(db,key,flags);
if (val == NULL)
server.stat_keyspace_misses++;
else
server.stat_keyspace_hits++;
return val;
}
- When a key is accessed, first check whether it exists; if it does, continue, otherwise return nil immediately.
- Check whether the key has expired; if not, return its value directly.
- If it has expired, check whether this node is a master or a replica: a replica simply returns NULL (expiration is driven by the master), while a master runs the reclamation logic below and then returns NULL.
- Update the keyspace hit/miss counters.
- Based on server.lazyfree_lazy_expire: if lazy freeing of expired keys is disabled, the key is deleted synchronously on the spot, and we are done.
- If it is enabled, a background job is created through the bio module, appended to the tail of the bio job list, and a background thread performs the deletion asynchronously.
Periodic deletion
struct redisServer {
...
int active_expire_effort; /* From 1 (default) to 10, active effort. */
double stat_expired_stale_perc; /* Percentage of keys probably expired */
...
}
/* Try to expire a few timed out keys. The algorithm used is adaptive and
* will use few CPU cycles if there are few expiring keys, otherwise
* it will get more aggressive to avoid that too much memory is used by
* keys that can be removed from the keyspace.
*
* Every expire cycle tests multiple databases: the next call will start
* again from the next db, with the exception of exists for time limit: in that
* case we restart again from the last database we were processing. Anyway
* no more than CRON_DBS_PER_CALL databases are tested at every iteration.
*
* The function can perform more or less work, depending on the "type"
* argument. It can execute a "fast cycle" or a "slow cycle". The slow
* cycle is the main way we collect expired cycles: this happens with
* the "server.hz" frequency (usually 10 hertz).
*
* However the slow cycle can exit for timeout, since it used too much time.
* For this reason the function is also invoked to perform a fast cycle
* at every event loop cycle, in the beforeSleep() function. The fast cycle
* will try to perform less work, but will do it much more often.
*
* The following are the details of the two expire cycles and their stop
* conditions:
*
* If type is ACTIVE_EXPIRE_CYCLE_FAST the function will try to run a
* "fast" expire cycle that takes no longer than EXPIRE_FAST_CYCLE_DURATION
* microseconds, and is not repeated again before the same amount of time.
* The cycle will also refuse to run at all if the latest slow cycle did not
* terminate because of a time limit condition.
*
* If type is ACTIVE_EXPIRE_CYCLE_SLOW, that normal expire cycle is
* executed, where the time limit is a percentage of the REDIS_HZ period
* as specified by the ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC define. In the
* fast cycle, the check of every database is interrupted once the number
* of already expired keys in the database is estimated to be lower than
* a given percentage, in order to avoid doing too much work to gain too
* little memory.
*
* The configured expire "effort" will modify the baseline parameters in
* order to do more work in both the fast and slow expire cycles.
*/
#define ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP 20 /* Keys for each DB loop. */
#define ACTIVE_EXPIRE_CYCLE_FAST_DURATION 1000 /* Microseconds. */
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* Max % of CPU to use. */
#define ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE 10 /* % of stale keys after which
we do extra efforts. */
void activeExpireCycle(int type) {
/* Adjust the running parameters according to the configured expire
* effort. The default effort is 1, and the maximum configurable effort
* is 10. */
unsigned long
effort = server.active_expire_effort-1, /* Rescale from 0 to 9. */
config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort,
config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort,
config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
2*effort,
config_cycle_acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE-
effort;
/* This function has some global state in order to continue the work
* incrementally across calls. */
static unsigned int current_db = 0; /* Last DB tested. */
static int timelimit_exit = 0; /* Time limit hit in previous call? */
static long long last_fast_cycle = 0; /* When last fast cycle ran. */
int j, iteration = 0;
int dbs_per_call = CRON_DBS_PER_CALL;
long long start = ustime(), timelimit, elapsed;
/* When clients are paused the dataset should be static not just from the
* POV of clients not being able to write, but also from the POV of
* expires and evictions of keys not being performed. */
if (clientsArePaused()) return;
if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
/* Don't start a fast cycle if the previous cycle did not exit
* for time limit, unless the percentage of estimated stale keys is
* too high. Also never repeat a fast cycle for the same period
* as the fast cycle total duration itself. */
if (!timelimit_exit &&
server.stat_expired_stale_perc < config_cycle_acceptable_stale)
return;
if (start < last_fast_cycle + (long long)config_cycle_fast_duration*2)
return;
last_fast_cycle = start;
}
/* We usually should test CRON_DBS_PER_CALL per iteration, with
* two exceptions:
*
* 1) Don't test more DBs than we have.
* 2) If last time we hit the time limit, we want to scan all DBs
* in this iteration, as there is work to do in some DB and we don't want
* expired keys to use memory for too much time. */
if (dbs_per_call > server.dbnum || timelimit_exit)
dbs_per_call = server.dbnum;
/* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
* time per iteration. Since this function gets called with a frequency of
* server.hz times per second, the following is the max amount of
* microseconds we can spend in this function. */
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
timelimit_exit = 0;
if (timelimit <= 0) timelimit = 1;
if (type == ACTIVE_EXPIRE_CYCLE_FAST)
timelimit = config_cycle_fast_duration; /* in microseconds. */
/* Accumulate some global stats as we expire keys, to have some idea
* about the number of keys that are already logically expired, but still
* existing inside the database. */
long total_sampled = 0;
long total_expired = 0;
for (j = 0; j < dbs_per_call && timelimit_exit == 0; j++) {
/* Expired and checked in a single loop. */
unsigned long expired, sampled;
redisDb *db = server.db+(current_db % server.dbnum);
/* Increment the DB now so we are sure if we run out of time
* in the current DB we'll restart from the next. This allows to
* distribute the time evenly across DBs. */
current_db++;
/* Continue to expire if at the end of the cycle there are still
* a big percentage of keys to expire, compared to the number of keys
* we scanned. The percentage, stored in config_cycle_acceptable_stale
* is not fixed, but depends on the Redis configured "expire effort". */
do {
unsigned long num, slots;
long long now, ttl_sum;
int ttl_samples;
iteration++;
/* If there is nothing to expire try next DB ASAP. */
if ((num = dictSize(db->expires)) == 0) {
db->avg_ttl = 0;
break;
}
slots = dictSlots(db->expires);
now = mstime();
/* When there are less than 1% filled slots, sampling the key
* space is expensive, so stop here waiting for better times...
* The dictionary will be resized asap. */
if (num && slots > DICT_HT_INITIAL_SIZE &&
(num*100/slots < 1)) break;
/* The main collection cycle. Sample random keys among keys
* with an expire set, checking for expired ones. */
expired = 0;
sampled = 0;
ttl_sum = 0;
ttl_samples = 0;
if (num > config_keys_per_loop)
num = config_keys_per_loop;
/* Here we access the low level representation of the hash table
* for speed concerns: this makes this code coupled with dict.c,
* but it hardly changed in ten years.
*
* Note that certain places of the hash table may be empty,
* so we want also a stop condition about the number of
* buckets that we scanned. However scanning for free buckets
* is very fast: we are in the cache line scanning a sequential
* array of NULL pointers, so we can scan a lot more buckets
* than keys in the same time. */
long max_buckets = num*20;
long checked_buckets = 0;
while (sampled < num && checked_buckets < max_buckets) {
for (int table = 0; table < 2; table++) {
if (table == 1 && !dictIsRehashing(db->expires)) break;
unsigned long idx = db->expires_cursor;
idx &= db->expires->ht[table].sizemask;
dictEntry *de = db->expires->ht[table].table[idx];
long long ttl;
/* Scan the current bucket of the current table. */
checked_buckets++;
while(de) {
/* Get the next entry now since this entry may get
* deleted. */
dictEntry *e = de;
de = de->next;
ttl = dictGetSignedIntegerVal(e)-now;
if (activeExpireCycleTryExpire(db,e,now)) expired++;
if (ttl > 0) {
/* We want the average TTL of keys yet
* not expired. */
ttl_sum += ttl;
ttl_samples++;
}
sampled++;
}
}
db->expires_cursor++;
}
total_expired += expired;
total_sampled += sampled;
/* Update the average TTL stats for this database. */
if (ttl_samples) {
long long avg_ttl = ttl_sum/ttl_samples;
/* Do a simple running average with a few samples.
* We just use the current estimate with a weight of 2%
* and the previous estimate with a weight of 98%. */
if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
db->avg_ttl = (db->avg_ttl/50)*49 + (avg_ttl/50);
}
/* We can't block forever here even if there are many keys to
* expire. So after a given amount of milliseconds return to the
* caller waiting for the other active expire cycle. */
if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
elapsed = ustime()-start;
if (elapsed > timelimit) {
timelimit_exit = 1;
server.stat_expired_time_cap_reached_count++;
break;
}
}
/* We don't repeat the cycle for the current database if there are
* an acceptable amount of stale keys (logically expired but yet
* not reclaimed). */
} while (sampled == 0 ||
(expired*100/sampled) > config_cycle_acceptable_stale);
}
elapsed = ustime()-start;
server.stat_expire_cycle_time_used += elapsed;
latencyAddSampleIfNeeded("expire-cycle",elapsed/1000);
/* Update our estimate of keys existing but yet to be expired.
* Running average with this sample accounting for 5%. */
double current_perc;
if (total_sampled) {
current_perc = (double)total_expired/total_sampled;
} else
current_perc = 0;
server.stat_expired_stale_perc = (current_perc*0.05)+
(server.stat_expired_stale_perc*0.95);
}
References
https://blog.csdn.net/qq_42282547/article/details/105656955
https://www.cnblogs.com/liushoudong/p/12679174.html
https://ld246.com/article/1592404555781
https://www.cnblogs.com/itplay/p/10162935.html
https://blog.csdn.net/why444216978/article/details/106959089