Problem source: LRU Cache
Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and put.
get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
put(key, value) - Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least recently used item before inserting a new item.
The cache is initialized with a positive capacity.
Least Recently Used (LRU) is a common page-replacement algorithm in the CPU's paged memory management. The idea is this: when a page fault occurs, evict the page that has gone unused for the longest time. It is an effective algorithm grounded in the principle of locality, and its goal is to raise the cache hit rate.
The most important program property that we regularly exploit is locality of references: programs tend to reuse data and instructions they have used recently.
The 90/10 rule comes from an empirical observation: "A program spends 90% of its time in 10% of its code."
There are two different types of locality:
Temporal locality: states that recently accessed items are likely to be accessed in the near future.
Spatial locality: says that items whose addresses are near one another tend to be referenced close together in time.
From my perspective there is one more key ingredient: cache capacity is limited. Whether it is registers, the L1/L2/L3 caches, main memory, or any other storage medium used to speed up access, a simple intuition generally holds: the faster it is, the more expensive it is, and the more expensive it is, the less capacity you get. The "LRU Cache" problem is only a structured description of the algorithm; a real operating system certainly does not apply it in such a simple form. Either way, the heart of the algorithm is this: when the capacity limit is reached, how do we remove the element that was used longest ago?
Method 1: LinkedList + HashMap, O(n) - 241ms
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

public class LRUCache {
    // capacity limit
    int capacity;
    // maintains recency order, most recently used at the head;
    // addFirst/removeLast are O(1), but removing an interior element is O(n)
    LinkedList<Integer> list = new LinkedList<>();
    // key-to-value mapping, O(1) lookup
    Map<Integer, Integer> map = new HashMap<>();

    public LRUCache(int capacity) {
        this.capacity = capacity;
    }

    public int get(int key) {
        if (map.containsKey(key)) {
            moveFirst(key);
            return map.get(key);
        }
        return -1;
    }

    public void put(int key, int value) {
        if (capacity <= 0) {
            return;
        }
        // key already present: overwrite the value and refresh recency
        if (map.containsKey(key)) {
            map.put(key, value);
            moveFirst(key);
            return;
        }
        // key not present: insert, then evict the LRU entry if over capacity
        map.put(key, value);
        list.addFirst(key);
        if (map.size() > capacity) {
            map.remove(list.removeLast());
        }
    }

    private void moveFirst(Integer key) {
        // note: LinkedList has two remove overloads;
        // we need remove(Object o), not remove(int index)
        list.remove(key); // O(n)
        list.addFirst(key);
    }
}
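The pitfall flagged in moveFirst is worth a concrete illustration. This small sketch of mine (not part of the solution) shows how the two remove overloads behave differently on a LinkedList<Integer>:

```java
import java.util.LinkedList;
import java.util.List;

public class RemoveOverloadDemo {
    public static void main(String[] args) {
        LinkedList<Integer> list = new LinkedList<>(List.of(10, 20, 30, 40));
        list.remove(1);                   // remove(int index): drops the element at index 1, i.e. 20
        System.out.println(list);         // [10, 30, 40]
        list.remove(Integer.valueOf(30)); // remove(Object o): drops the element equal to 30
        System.out.println(list);         // [10, 40]
    }
}
```

An unboxed int argument always binds to remove(int index); to remove by value you must pass a boxed Integer, which is exactly what moveFirst relies on by declaring its parameter as Integer.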
Method 2: Hand-rolled doubly linked list with O(1) removal, O(1) - 59ms
Opening the source code makes it clear: java.util.LinkedList implements removal by traversing the list:
public boolean remove(Object o) {
    if (o == null) {
        for (Node<E> x = first; x != null; x = x.next) {
            if (x.item == null) {
                unlink(x);
                return true;
            }
        }
    } else {
        for (Node<E> x = first; x != null; x = x.next) {
            if (o.equals(x.item)) {
                unlink(x);
                return true;
            }
        }
    }
    return false;
}
Since java.util.LinkedList cannot give us O(1) removal, we can implement the doubly linked list ourselves, which leads to the solution below. It is a bit more involved, but anyone familiar with doubly linked lists should be able to read it at a glance.
import java.util.HashMap;
import java.util.Map;

class Node {
    Node prev;
    Node next;
    int key;
    int val;

    Node(int key, int val) {
        this.key = key;
        this.val = val;
    }
}

public class LRUCache {
    int capacity;
    // head sentinel
    Node head = new Node(-1, -1);
    // tail sentinel
    Node tail = new Node(-1, -1);
    Map<Integer, Node> key2Node = new HashMap<>();

    public LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public int get(int key) {
        if (key2Node.containsKey(key)) {
            Node target = key2Node.get(key);
            moveFirst(target);
            return target.val;
        }
        return -1;
    }

    public void put(int key, int value) {
        if (capacity <= 0) {
            return;
        }
        // key already present:
        // overwrite the value and move the node to the head of the list
        if (key2Node.containsKey(key)) {
            Node node = key2Node.get(key);
            moveFirst(node);
            node.val = value;
            return;
        }
        // key not present:
        // 1. if the cache is full, delete the last element
        // 2. add the new node to the head of the list
        if (key2Node.size() >= capacity) {
            Node last = delLast();
            if (last != null) {
                key2Node.remove(last.key);
            }
        }
        Node node = new Node(key, value);
        addFirst(node);
        key2Node.put(key, node);
    }

    private void moveFirst(Node node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }

    private Node delLast() {
        if (tail.prev == head) {
            // empty list
            return null;
        }
        Node last = tail.prev;
        last.prev.next = last.next;
        last.next.prev = last.prev;
        // null out the links to help GC
        last.next = null;
        last.prev = null;
        return last;
    }

    private void addFirst(Node node) {
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }
}
Method 3: LinkedHashMap, O(1) - 57ms
java.util.LinkedHashMap is a HashMap that preserves insertion order. There is already a good article describing its implementation in detail. Note that the Map.Entry implementation inside java.util.LinkedHashMap is very similar to the Node in Method 2: it likewise keeps bidirectional references to its neighbors, before and after, while next is reserved for the collision chain of the underlying HashMap, as shown in the figure. So in theory it should be the ideal Java container for implementing an LRU cache.
Digging deeper, we find that java.util.LinkedHashMap maintains insertion order by default, whereas what we want is one step further: access order, where every read reorders the entries. Does it support that? It does.
public LinkedHashMap(int initialCapacity,
                     float loadFactor,
                     boolean accessOrder) {
    super(initialCapacity, loadFactor);
    // accessOrder selects access order (i.e. LRU order);
    // the default is false (insertion order)
    this.accessOrder = accessOrder;
}
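To see access order in action, here is a minimal sketch of my own (not from the JDK docs): with accessOrder = true, a successful get() moves the entry to the tail of the iteration order, so the least recently used entry always sits at the head.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderDemo {
    public static void main(String[] args) {
        // third constructor argument: accessOrder = true
        Map<Integer, String> m = new LinkedHashMap<>(16, 0.75f, true);
        m.put(1, "a");
        m.put(2, "b");
        m.put(3, "c");
        m.get(1); // touching key 1 makes it the most recently used
        System.out.println(m.keySet()); // prints [2, 3, 1]
    }
}
```

With the default accessOrder = false, the same sequence would print [1, 2, 3]: reads would not reorder anything.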
Everything now seems in place, but while implementing this I found that java.util.LinkedHashMap offers no way to remove the last element of its internal doubly linked list. Removing it by traversing from the front would degrade the complexity back to O(n). I tried using reflection to reach into its internals and grab the tail node:
public int getTailByReflection() {
    try {
        Field tail = map.getClass().getDeclaredField("tail");
        tail.setAccessible(true);
        return ((Map.Entry<Integer, Integer>) tail.get(map)).getKey();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
But that failed: LeetCode apparently restricts reflective access.
That seemed to be a dead end, until I stumbled on the fact that we can customize the eviction policy for the eldest entry. With that, the problem can be solved elegantly.
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCache {
    int capacity;
    // 1. pass accessOrder = true so the map maintains access order
    // 2. override removeEldestEntry: once size() exceeds capacity,
    //    the map automatically evicts the eldest entry
    Map<Integer, Integer> map = new LinkedHashMap<Integer, Integer>(128, 0.75f, true) {
        protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            return size() > capacity;
        }
    };

    public LRUCache(int capacity) {
        this.capacity = capacity;
    }

    public int get(int key) {
        return map.getOrDefault(key, -1);
    }

    public void put(int key, int value) {
        if (capacity <= 0) {
            return;
        }
        map.put(key, value);
    }
}
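As a quick sanity check, the classic sequence from the problem statement can be run against this LinkedHashMap-based cache. The sketch below restates the class in a self-contained form (note that LinkedHashMap.getOrDefault also counts as an access when accessOrder is true, which is what makes the one-line get correct):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCacheDemo {
    static class LRUCache {
        final int capacity;
        final Map<Integer, Integer> map;

        LRUCache(int capacity) {
            this.capacity = capacity;
            this.map = new LinkedHashMap<Integer, Integer>(128, 0.75f, true) {
                protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                    return size() > LRUCache.this.capacity;
                }
            };
        }

        int get(int key) { return map.getOrDefault(key, -1); }

        void put(int key, int value) {
            if (capacity > 0) {
                map.put(key, value);
            }
        }
    }

    public static void main(String[] args) {
        LRUCache cache = new LRUCache(2);
        cache.put(1, 1);
        cache.put(2, 2);
        System.out.println(cache.get(1)); // 1
        cache.put(3, 3);                  // evicts key 2
        System.out.println(cache.get(2)); // -1
        cache.put(4, 4);                  // evicts key 1
        System.out.println(cache.get(1)); // -1
        System.out.println(cache.get(3)); // 3
        System.out.println(cache.get(4)); // 4
    }
}
```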
Summary
Section 17.8.3 of Thinking in Java (4th Edition), in its discussion of LinkedHashMap, also touches on LRU, and describes roughly what we saw above:
It can be configured in the constructor to use an access-based least-recently-used (LRU) algorithm, so that elements which have not been accessed (and can therefore be considered deletable) appear at the front of the queue¹. This makes it easy to write programs that periodically clean up elements to save space.
Evidently, java.util.LinkedHashMap was designed with the LRU algorithm in mind from the start, which is why, in Method 3 above, changing the eviction behavior by overriding removeEldestEntry comes so naturally.
¹ This refers to what the earlier sections called the tail of the queue. ↩︎