Cache的设计和实现 LRU Cache

最新推荐文章于 2024-07-11 17:44:38 发布

ojshilu

最新推荐文章于 2024-07-11 17:44:38 发布

阅读量2.6k

点赞数

分类专栏： Linux 操作系统 LeetCode 高效计算链式结构问题我爱算法

本文链接：https://blog.csdn.net/ojshilu/article/details/22648729

版权

LeetCode 同时被 3 个专栏收录

151 篇文章 0 订阅

订阅专栏

我爱算法

142 篇文章 20 订阅

订阅专栏

高效计算

51 篇文章 0 订阅

订阅专栏

Cache的应用非常广泛，其实Cache是一种思想，一个广义的词汇，一种在性能和金钱的权衡上的思想。

Cache的思想用在很多地方，使用的载体也不同，都是位于相对高速设备和相对低速设备之间，起到缓存的作用。

1、最常用处：用在内存和CPU之间，以SRAM作为Cache的载体。（内存是DRAM，CPU是寄存器）

2、用在内存和硬盘之间。

3、用在硬盘和网络之间。

Cache的操作方法：

1、外部操作：数据访问，Get和Set。高速设备要找一个数据，先从Cache找，如果有就直接用；如果没有则去低速设备找，并调入到Cache。

2、内部操作：Cache更新。Cache的大小必定是有限的。所以需要一个淘汰算法（使得命中率尽可能高），最常用的是LRU。当Cache满了却又要加入新的数据，就需要淘汰旧的数据。

Cache的典型实现：

LRU的典型实现是 hash map + double linked list

为什么要用两种数据结构？

理由：hash表是用来访问数据的，因为hash的查找速度快，O(1)。双向链表是用来记录最近访问情况的，因为Hash做不到排序。

如果访问键值为key的数据已经在Cache中，那就访问hash中的该数据，同时要将存储该key的结点移到双向链表头部。

如果访问的数据不在Cache中：

若此时Cache还没满，那么我们将新结点插入到链表头部，同时在哈希表添加（键值，结点地址）。

若此时Cache已经满了，我们就将链表中的最后一个结点(注意不是尾结点)的内容替换为新内容，然后移动到头部，更新哈希表中对应的（键值，结点地址）。

查询操作的时间是O(1)，直接通过hash找到节点，而不是通过链表顺序查找的。

对链表的操作插入操作和移动操作也都是O(1)。

LRUCache简单实例（源自于Leetcode）:

实现get(key) 函数：如果key对应的value存在则返回value，否则返回-1。 Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.

实现set(key, value)函数：设置或插入value值。 Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

具体实现方法：

哈希表用map实现，结点存储的是key和对应的双向链表结点指针。

双向链表用list实现，结点存储的是key和value。

这样，由key可以用哈希表快速找到链表结点，进行数据访问和结点位置更新。

class LRUCache{
public:
    LRUCache(int capacity) {
        cap = capacity;
        size = 0;
        hash_table.clear();
        double_list.clear();
    }
    
    int get(int key) {
        map<int, list<pair<int, int> >::iterator>::iterator it = hash_table.find(key);

        if(it != hash_table.end())  //update double_list
        {
            int value = (*( (*it).second ) ).second; 
            double_list.splice(double_list.begin(), double_list, (*it).second);
            return value;
        }
        else //Not in Cache
        {
            return -1;
            //In fact, we should go to another place to get it.
        }
    }
    
    void set(int key, int value) {
        map<int, list<pair<int, int> >::iterator>::iterator it = hash_table.find(key);

        if(it != hash_table.end()) //update double_list and hash_table's value
        {
            double_list.splice(double_list.begin(), double_list, (*it).second);
	        (*( (*it).second ) ).second = value;
        }
        else
        {
            if(size == cap) //invalidate the lru one
            {
                int oldkey = double_list.back().first;
		        list<pair<int, int> >::iterator oldone = double_list.end();
		        oldone--;
                double_list.splice(double_list.begin(), double_list, oldone);
                
                hash_table.erase(oldkey);
		        (*double_list.begin()).first = key;
		        (*double_list.begin()).second = value;
                hash_table[key] = double_list.begin();
            }
            else // insert new one to cache
            {
                double_list.push_front(pair<int, int>(key, value));
                hash_table[key] = double_list.begin();
                size++;
            }
        }
    }
    
private:
        //注意：对双向链表的操作不要使迭代器失效
        map<int, list<pair<int, int> >::iterator> hash_table; //存储：(关键词key , 双链表中的结点指针)
        list<pair<int, int> > double_list; //结点存储：(key, value)
        int size;
        int cap;
};

注意细节：

对双向链表list的修改绝对不可以使原有的迭代器失效，否则hash里存储的内容就失效了。

这里使用的是splice方法来将中间的某个结点移动到首位置。