To understand how Redis reclaims memory, we first need to look at the redisObject struct:
typedef struct redisObject {
    unsigned type:4;        /* type of the value */
    unsigned encoding:4;    /* internal encoding of the stored data */
    unsigned lru:LRU_BITS;  /* 24 bits. In LRU mode, all 24 bits hold the access time
                             * (relative to the global lru_clock). In LFU mode, the least
                             * significant 8 bits hold the access frequency and the most
                             * significant 16 bits the access time; since 8 bits can only
                             * count up to 255, the frequency is stored logarithmically. */
    int refcount;           /* reference count */
    void *ptr;              /* pointer to the actual data. In newer Redis versions, for
                             * strings <= 44 bytes the sds is allocated together with the
                             * redisObject, so a single allocation suffices (embstr). */
} robj;
In functions such as lazyfreeFreeObject we can see that Redis calls decrRefCount() to decrement an object's reference count, and when o->refcount == 1 it calls the matching freeXXXObject() function to release the memory. In other words, Redis reclaims object memory through reference counting.
void decrRefCount(robj *o) {
    if (o->refcount == 1) {
        switch(o->type) {
        case OBJ_STRING: freeStringObject(o); break;
        case OBJ_LIST: freeListObject(o); break;
        case OBJ_SET: freeSetObject(o); break;
        case OBJ_ZSET: freeZsetObject(o); break;
        case OBJ_HASH: freeHashObject(o); break;
        case OBJ_MODULE: freeModuleObject(o); break;
        case OBJ_STREAM: freeStreamObject(o); break;
        default: serverPanic("Unknown object type"); break;
        }
        zfree(o);
    } else {
        if (o->refcount <= 0) serverPanic("decrRefCount against refcount <= 0");
        if (o->refcount != OBJ_SHARED_REFCOUNT) o->refcount--;
    }
}
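The flow above — decrement, free when the count reaches 1, never touch the shared-object sentinel — can be sketched in Java. This is a simplified model, not Redis code; the class name and the boolean `freed` flag are made up for illustration:

```java
// Simplified model of Redis-style reference counting (illustrative only).
public class RefCountedObject {
    // Redis marks forever-shared objects with a sentinel refcount
    // (OBJ_SHARED_REFCOUNT is INT_MAX in the C source).
    public static final int SHARED = Integer.MAX_VALUE;

    public int refcount = 1;
    public boolean freed = false;

    public void incrRefCount() {
        if (refcount != SHARED) refcount++;
    }

    // Mirrors decrRefCount(): free at 1, panic below 1,
    // and never decrement the shared sentinel.
    public void decrRefCount() {
        if (refcount == 1) {
            freed = true;   // stands in for freeStringObject()/zfree()
            refcount = 0;
        } else {
            if (refcount <= 0)
                throw new IllegalStateException("decrRefCount against refcount <= 0");
            if (refcount != SHARED) refcount--;
        }
    }

    public static void main(String[] args) {
        RefCountedObject o = new RefCountedObject();
        o.incrRefCount();   // refcount = 2
        o.decrRefCount();   // refcount = 1
        o.decrRefCount();   // last reference gone: object is freed
        System.out.println(o.freed);   // true
    }
}
```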
Cache eviction algorithms
When Redis's memory footprint exceeds the physical memory limit, its data starts swapping to disk frequently, which makes performance collapse and defeats the purpose of Redis.
So we configure maxmemory to cap memory usage. When actual usage exceeds maxmemory, Redis offers the following eviction policies (maxmemory-policy) to decide how to make room for reads and writes:
# volatile-lru -> Evict using approximated LRU, only keys with an expire set.
# allkeys-lru -> Evict any key using approximated LRU.
# volatile-lfu -> Evict using approximated LFU, only keys with an expire set.
# allkeys-lfu -> Evict any key using approximated LFU.
# volatile-random -> Remove a random key having an expire set.
# allkeys-random -> Remove a random key, any key.
# volatile-ttl -> Remove the key with the nearest expire time (minor TTL)
# noeviction -> Don't evict anything, just return an error on write operations.
Let's see how the LRU that Redis implements differs from a textbook LRU. First, here is an LRU cache I wrote in Java for a LeetCode exercise:
import java.util.HashMap;

class LRUCache {
    MyCache<Integer, Integer> cache;

    public LRUCache(int capacity) {
        cache = new MyCache<>(capacity);
    }

    public void put(int key, int value) {
        cache.addCache(key, value);
    }

    public int get(int key) {
        Integer integer = cache.get(key);
        return integer == null ? -1 : integer;
    }

    public static class Node<K, V> {
        public K key;
        public V value;
        Node<K, V> last;
        Node<K, V> next;

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    public static class NodeDoubleLinkedList<K, V> {
        public Node<K, V> head;
        public Node<K, V> tail;

        public NodeDoubleLinkedList() {
            head = null;
            tail = null;
        }

        public void add(Node<K, V> newNode) {
            if (newNode == null) return;
            newNode.next = null;   // clear any stale link before appending at the tail
            if (head == null) {
                head = newNode;
                tail = newNode;
            } else {
                tail.next = newNode;
                newNode.last = tail;
                tail = newNode;
            }
        }

        public Node<K, V> removeHead() {
            if (head == null) {
                return null;
            } else {
                Node<K, V> res = head;
                head = head.next;
                if (head == null) tail = null;   // list became empty
                else head.last = null;
                return res;
            }
        }

        // Move the node to the tail; the head is evicted first.
        public void moveNodeToTail(Node<K, V> node) {
            if (tail == node) return;
            if (head == node) {
                head = head.next;
                head.last = null;
            } else {
                node.last.next = node.next;
                node.next.last = node.last;
            }
            add(node);
        }
    }

    public static class MyCache<K, V> {
        // key -> node lookup
        private HashMap<K, Node<K, V>> cache;
        // doubly linked list ordered by recency of use
        private NodeDoubleLinkedList<K, V> nodeList;
        private final int capacity;

        public MyCache(int cap) {
            cache = new HashMap<K, Node<K, V>>(cap);
            nodeList = new NodeDoubleLinkedList<K, V>();
            capacity = cap;
        }

        public void addCache(K key, V value) {
            if (cache.containsKey(key)) {
                Node<K, V> kvNode = cache.get(key);
                kvNode.value = value;
                nodeList.moveNodeToTail(kvNode);
            } else {
                Node<K, V> kvNode = new Node<>(key, value);
                nodeList.add(kvNode);
                cache.put(key, kvNode);
                if (cache.size() > capacity) {
                    cache.remove(nodeList.removeHead().key);
                }
            }
        }

        public V get(K key) {
            Node<K, V> kvNode = cache.get(key);
            if (kvNode == null) {
                return null;
            } else {
                nodeList.moveNodeToTail(kvNode);
                return kvNode.value;
            }
        }
    }
}
/**
* Your LRUCache object will be instantiated and called as such:
* LRUCache obj = new LRUCache(capacity);
* int param_1 = obj.get(key);
* obj.put(key,value);
*/
Essentially, it just rewires linked-list pointers so that the most recently used element sits at the tail and the head is evicted first.
Redis, however, has to stay fast: commands execute on a single thread, and maintaining a full LRU linked list over every key would add per-access bookkeeping and memory overhead. So Redis implements an approximated LRU instead, which works like this:
1. When a write command finds that memory exceeds maxmemory,
2. Redis randomly samples 5 keys (configurable via maxmemory-samples),
3. and evicts the oldest of them according to the unsigned lru:LRU_BITS field of redisObject.
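The sampling step can be sketched in Java. This is a toy model, not Redis code — the map of access clocks, the method name, and the sample count are illustrative (the real implementation in evict.c also keeps a small pool of eviction candidates across rounds):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Approximate LRU: instead of maintaining a full linked list,
// sample a few random keys and evict the one with the oldest access clock.
public class ApproxLru {
    public static String pickEvictionVictim(Map<String, Long> lastAccess,
                                            int samples, Random rnd) {
        List<String> keys = new ArrayList<>(lastAccess.keySet());
        String victim = null;
        for (int i = 0; i < samples && !keys.isEmpty(); i++) {
            String k = keys.get(rnd.nextInt(keys.size()));
            // keep the key with the smallest (oldest) access time
            if (victim == null || lastAccess.get(k) < lastAccess.get(victim)) {
                victim = k;
            }
        }
        return victim;
    }

    public static void main(String[] args) {
        Map<String, Long> lastAccess = new HashMap<>();
        lastAccess.put("a", 1L);   // oldest, should be evicted
        lastAccess.put("b", 2L);
        lastAccess.put("c", 3L);
        System.out.println(pickEvictionVictim(lastAccess, 100, new Random(42)));
    }
}
```

The larger the sample, the closer this gets to true LRU; Redis defaults maxmemory-samples to 5 as a speed/accuracy trade-off.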
Since we keep mentioning redisObject's unsigned lru:LRU_BITS field, it's worth asking where its value comes from.
It is first assigned when the redisObject is created:
/* ===================== Creation and parsing of objects ==================== */
robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;

    /* Set the LRU to the current lruclock (minutes resolution), or
     * alternatively the LFU counter. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    return o;
}
Let's look at the LFUGetTimeInMinutes() function:
unsigned long LFUGetTimeInMinutes(void) {
    return (server.unixtime/60) & 65535;
}
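The & 65535 keeps only the low 16 bits, so this minute-resolution clock wraps around every 65536 minutes (roughly 45.5 days). The same arithmetic in Java, for illustration:

```java
public class LfuClock {
    // Same expression as LFUGetTimeInMinutes():
    // unix seconds -> minutes, keep the low 16 bits.
    public static long minutes16(long unixtimeSeconds) {
        return (unixtimeSeconds / 60) & 65535;
    }

    public static void main(String[] args) {
        System.out.println(minutes16(120));          // 2 minutes after the epoch
        System.out.println(minutes16(65536L * 60));  // 65536 minutes: wraps back to 0
    }
}
```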
It reads the unixtime variable from server, so let's check server.c to see how unixtime is maintained:
/* We take a cached value of the unix time in the global state because with
* virtual memory and aging there is to store the current time in objects at
* every object access, and accuracy is not needed. To access a global var is
* a lot faster than calling time(NULL).
*
* This function should be fast because it is called at every command execution
* in call(), so it is possible to decide if to update the daylight saving
* info or not using the 'update_daylight_info' argument. Normally we update
* such info only when calling this function from serverCron() but not when
* calling it from call(). */
void updateCachedTime(int update_daylight_info) {
    server.ustime = ustime();
    server.mstime = server.ustime / 1000;
    time_t unixtime = server.mstime / 1000;
    atomicSet(server.unixtime, unixtime);

    /* To get information about daylight saving time, we need to call
     * localtime_r and cache the result. However calling localtime_r in this
     * context is safe since we will never fork() while here, in the main
     * thread. The logging function will call a thread safe version of
     * localtime that has no locks. */
    if (update_daylight_info) {
        struct tm tm;
        time_t ut = server.unixtime;
        localtime_r(&ut,&tm);
        server.daylight_active = tm.tm_isdst;
    }
}
As the comment above updateCachedTime() explains, Redis caches the unix time in global state: objects would otherwise have to fetch the current time on every access, full accuracy isn't needed, and reading a global variable is much faster than calling time(NULL). The function has to be fast because call() invokes it on every command execution, so the update_daylight_info argument controls whether the daylight-saving info is refreshed; normally that only happens when serverCron() calls it, not call().
In other words, Redis caches the system time so it doesn't have to make a system call every time it needs it. A system call means switching from user mode to kernel mode, which would run against Redis's single-minded pursuit of speed.
Back to eviction. Next up: the LFU algorithm.
LFU is an eviction policy added in Redis 4.0. The name stands for Least Frequently Used, and the core idea is to evict by recent access frequency: rarely accessed keys go first, frequently accessed keys stay. That way a burst of newly inserted data cannot push the genuinely hot data out.
LFU reflects a key's hotness better than LRU. Under LRU, a key that has gone unaccessed for ages but happens to be touched once right before eviction is treated as hot and kept, while keys that are actually likely to be accessed again may be evicted. LFU avoids this, because a single access does not make a key hot. LFU ranks keys by a counter that grows on each access; a larger counter roughly means more frequent access, and keys with equal counters are ordered by time.
LFU comes in two flavors:
volatile-lfu: apply LFU eviction only to keys with an expire set
allkeys-lfu: apply LFU eviction to all keys
LFU splits the 24-bit internal clock of the key object into two parts: the high 16 bits (ldt) still act as a clock, and the low 8 bits (logc) act as a counter, as noted in the redisObject struct earlier.
logc is 8 bits and stores the access frequency. Since 8 bits max out at 255 — nowhere near enough for a raw count — what is stored is the logarithm of the frequency, and the value also decays over time; once it drops low enough, the key becomes easy to reclaim. To keep newly created objects from being evicted immediately, these 8 bits are initialized to a value greater than zero, LFU_INIT_VAL (5 by default).
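Redis implements this logarithmic counter probabilistically: the higher the counter already is, the lower the chance the next access increments it. A Java sketch modeled on Redis's LFULogIncr (the constant names match redis.conf defaults; treat the exact shape as an approximation of the real code):

```java
import java.util.Random;

public class LfuCounter {
    static final int LFU_INIT_VAL = 5;   // new keys start here so they survive a while
    static final int LOG_FACTOR = 10;    // redis.conf: lfu-log-factor (default 10)

    // Probabilistic increment: P(increment) = 1 / ((counter - INIT) * factor + 1).
    // The counter saturates at 255 because it lives in 8 bits.
    public static int logIncr(int counter, Random rnd) {
        if (counter == 255) return 255;
        double baseval = Math.max(0, counter - LFU_INIT_VAL);
        double p = 1.0 / (baseval * LOG_FACTOR + 1);
        if (rnd.nextDouble() < p) counter++;
        return counter;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int c = LFU_INIT_VAL;
        for (int i = 0; i < 1_000_000; i++) c = logIncr(c, rnd);
        System.out.println(c);   // grows very slowly; never exceeds 255
    }
}
```

Below LFU_INIT_VAL the probability is 1, so brand-new or decayed-to-zero counters climb back quickly; above it, each further step gets progressively harder, which is what makes 8 bits enough.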
ldt is 16 bits and stores the time logc was last updated. With only 16 bits the precision cannot be high: it holds the minute-resolution timestamp modulo 2^16.
ldt also differs from the LRU-mode lru field in when it is updated: not when the object is accessed, but when Redis's eviction logic runs. Eviction is checked before each command executes, but it only kicks in once memory has reached the maxmemory limit. Each eviction round uses a random strategy: it samples some keys, refreshes their "heat", and evicts the key with the lowest heat. Because the sampling is random, a given key's ldt may be refreshed rather slowly when there are many keys — but at minute-level precision there is no need for frequent updates anyway.
Whenever ldt is refreshed, logc is decayed at the same time.
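The decay works off the minutes elapsed since ldt: the counter drops by one for every lfu-decay-time minutes that have passed (default 1). A Java sketch modeled on Redis's LFUDecrAndReturn, which also shows how ldt and logc are unpacked from the 24-bit field:

```java
public class LfuDecay {
    static final int DECAY_TIME = 1;  // redis.conf: lfu-decay-time, in minutes (default 1)

    // The 24-bit field packs ldt (high 16 bits) and logc (low 8 bits),
    // mirroring o->lru = (LFUGetTimeInMinutes() << 8) | counter.
    public static int pack(int ldt, int logc) { return (ldt << 8) | logc; }
    public static int ldt(int lru)  { return lru >> 8; }
    public static int logc(int lru) { return lru & 255; }

    // Subtract one from the counter per DECAY_TIME elapsed minutes, floored at 0.
    public static int decayedCounter(int lru, int nowMinutes) {
        int elapsed = (nowMinutes - ldt(lru)) & 65535;  // handle 16-bit wraparound
        int periods = elapsed / DECAY_TIME;
        int counter = logc(lru);
        return periods > counter ? 0 : counter - periods;
    }

    public static void main(String[] args) {
        int lru = pack(100, 5);                    // last touched at minute 100, counter 5
        System.out.println(decayedCounter(lru, 103));  // 3 minutes later: 5 - 3 = 2
    }
}
```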
Now let's look at the expiration policy and lazy deletion — that is, how Redis handles expired keys.
Redis puts every key that has an expire time into a separate dict and periodically scans that dict to delete the keys that have reached their deadline. On top of the periodic scan it also uses a lazy strategy: when a client accesses a key, Redis checks its expire time and deletes it on the spot if it has expired.
The periodic deletion scans the expires dict ten times per second: each scan grabs 20 random keys and deletes the expired ones among them; if more than 1/4 of the sample was expired, it immediately runs another scan.
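That scan loop can be sketched as follows — sample up to 20 random keys from the expires dict, delete the expired ones, and repeat while more than a quarter of the sample was expired. Names and structure are illustrative; the real cycle in expire.c also enforces a time budget so it cannot hog the event loop:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class ExpireCycle {
    static final int SAMPLE_SIZE = 20;

    // One activeExpireCycle-style pass over the expires dict (key -> deadline).
    // Keeps re-sampling while more than 25% of each sample turns out expired.
    public static int expirePass(Map<String, Long> expires, long now, Random rnd) {
        int totalDeleted = 0;
        int expired;
        do {
            expired = 0;
            List<String> keys = new ArrayList<>(expires.keySet());
            for (int i = 0; i < SAMPLE_SIZE && !keys.isEmpty(); i++) {
                String k = keys.remove(rnd.nextInt(keys.size()));
                if (expires.get(k) <= now) {   // past its deadline: delete it
                    expires.remove(k);
                    expired++;
                }
            }
            totalDeleted += expired;
        } while (expired > SAMPLE_SIZE / 4);   // scan again if >1/4 were expired
        return totalDeleted;
    }

    public static void main(String[] args) {
        Map<String, Long> expires = new HashMap<>();
        for (int i = 0; i < 30; i++) expires.put("k" + i, 0L);  // all already expired
        System.out.println(expirePass(expires, 100, new Random(7)));  // prints 30
    }
}
```

Note how the loop condition is exactly what produces the stall described next: when almost everything is expired, the pass keeps re-scanning until expired keys become sparse.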
Now imagine a large Redis instance in which every key expires at the same moment. What happens?
This is the "cache avalanche" beloved of interview prep, and the standard fix is well known: add a random offset to each expire time. But what exactly goes wrong, and why?
As noted above, once the expired ratio of a sample exceeds 1/4, Redis keeps scanning the expires dict in a loop until expired keys become sparse in it and the loop count drops off. That loop shows up as visible latency spikes on live read/write traffic. A second cause of the same kind of stall is the memory allocator having to reclaim memory pages over and over, which burns CPU as well.
In large production systems Redis usually runs as a cluster, so do replica nodes expire keys the same way as the master?
No. A replica does not run the expiration scan; its handling of expired keys is passive. When a key expires on the master, the master appends a del command to the AOF and propagates it to all replicas, and each replica deletes the expired key by executing that del.
Lazy deletion
The lazy strategy means that when a client accesses a key, Redis checks its expire time; if the key has expired it is deleted on the spot and nothing is returned.
This exists because the periodic scan can miss plenty of keys whose time is already up: such a key lingers in memory until something in your system actually queries it, and only then does Redis delete it.
In short: periodic deletion is batched and centralized; lazy deletion is scattered and on demand.
lazyfree
Here is the relevant section of the Redis config file:
############################# LAZY FREEING ####################################
# Redis has two primitives to delete keys. One is called DEL and is a blocking
# deletion of the object. It means that the server stops processing new commands
# in order to reclaim all the memory associated with an object in a synchronous
# way. If the key deleted is associated with a small object, the time needed
# in order to execute the DEL command is very small and comparable to most other
# O(1) or O(log_N) commands in Redis. However if the key is associated with an
# aggregated value containing millions of elements, the server can block for
# a long time (even seconds) in order to complete the operation.
#
# For the above reasons Redis also offers non blocking deletion primitives
# such as UNLINK (non blocking DEL) and the ASYNC option of FLUSHALL and
# FLUSHDB commands, in order to reclaim memory in background. Those commands
# are executed in constant time. Another thread will incrementally free the
# object in the background as fast as possible.
#
# DEL, UNLINK and ASYNC option of FLUSHALL and FLUSHDB are user-controlled.
# It's up to the design of the application to understand when it is a good
# idea to use one or the other. However the Redis server sometimes has to
# delete keys or flush the whole database as a side effect of other operations.
# Specifically Redis deletes objects independently of a user call in the
# following scenarios:
#
# 1) On eviction, because of the maxmemory and maxmemory policy configurations,
# in order to make room for new data, without going over the specified
# memory limit.
# 2) Because of expire: when a key with an associated time to live (see the
# EXPIRE command) must be deleted from memory.
# 3) Because of a side effect of a command that stores data on a key that may
# already exist. For example the RENAME command may delete the old key
# content when it is replaced with another one. Similarly SUNIONSTORE
# or SORT with STORE option may delete existing keys. The SET command
# itself removes any old content of the specified key in order to replace
# it with the specified string.
# 4) During replication, when a replica performs a full resynchronization with
# its master, the content of the whole database is removed in order to
# load the RDB file just transferred.
#
# In all the above cases the default is to delete objects in a blocking way,
# like if DEL was called. However you can configure each case specifically
# in order to instead release memory in a non-blocking way like if UNLINK
# was called, using the following configuration directives.
# eviction when memory reaches maxmemory
lazyfree-lazy-eviction no
# deletion of expired keys
lazyfree-lazy-expire no
# implicit DEL of the destination key, e.g. by RENAME
lazyfree-lazy-server-del no
# the flush a replica performs after receiving an RDB file
replica-lazy-flush no
Deleting a large key with DEL, or wiping a database that holds many keys with FLUSHDB or FLUSHALL, can block Redis. Likewise, while cleaning up expired data or evicting over-limit data, the server can block if it happens to hit a large key.
To solve this, Redis 4.0 introduced the lazyfree mechanism: the deletion of a key or a database can be handed off to a background thread, avoiding server blocking as much as possible.
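The idea behind lazyfree can be sketched with a queue and a background thread: the caller only unlinks the key from the keyspace — a cheap pointer operation — and hands the object to another thread to release. This is a simplified model of the concept, not the bio.c implementation; all names here are made up:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class LazyFree {
    private final Map<String, Object> keyspace = new ConcurrentHashMap<>();
    private final BlockingQueue<Object> freeQueue = new LinkedBlockingQueue<>();

    public LazyFree() {
        // Background "lazyfree" thread: drains the queue and drops the references
        // (real Redis walks the data structure and frees it incrementally).
        Thread bg = new Thread(() -> {
            try {
                while (true) freeQueue.take();   // the object becomes garbage here
            } catch (InterruptedException e) { /* shutdown */ }
        });
        bg.setDaemon(true);
        bg.start();
    }

    public void put(String key, Object value) { keyspace.put(key, value); }

    public boolean exists(String key) { return keyspace.containsKey(key); }

    // UNLINK-style delete: O(1) for the caller no matter how big the value is.
    public boolean unlink(String key) {
        Object v = keyspace.remove(key);
        if (v == null) return false;
        freeQueue.offer(v);   // the actual freeing happens on the background thread
        return true;
    }
}
```

From the client's point of view the key is gone immediately; only the reclamation of its memory is deferred, which is exactly the observable behavior of UNLINK versus DEL.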