[LeetCode] LRU Cache (Java)

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and set.

get(key) – Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
set(key, value) – Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least recently used item before inserting a new item.

Several approaches to implementing an LRU cache are listed below:

1. If an interview question asks how to implement an LRU cache, the interviewer usually expects the doubly linked list + hashtable approach. The principle of the doubly linked list + hashtable implementation:

LRU is built on the heuristic that recently used data is far more likely to be reused than data last used long ago. All slots in the cache are linked into a doubly linked list. When a slot is hit, it is moved to the head of the list by adjusting pointers, and newly inserted entries likewise go straight to the head. After many lookups, recently hit entries therefore drift toward the head of the list while untouched ones drift toward the tail. When a replacement is needed, the tail of the list is the least recently used slot: we simply insert the new entry at the head and evict the entry at the tail.

The doubly linked list is chosen for two reasons. First, cache hits can occur in any order, independent of the order in which entries were loaded, so we need a linked structure whose ordering can be adjusted flexibly. Second, a doubly linked list lets us move a node elsewhere as soon as we know its position, in O(1) time.

Searching a linked list for an element takes O(n), so every hit would cost O(n) for the lookup; without an extra data structure this is the best we can achieve, and the lookup is clearly the bottleneck of the whole algorithm. How can we speed it up? With a hash table: its O(1) lookup time is the very reason it exists as a data structure. To organize the idea: for each cache slot we design a node structure that doubles as a doubly linked list element, where the next and prev fields are the two list pointers, key stores the entry's key, and value stores the cached object itself; a hash table is then used to find the hit node directly. The rest is just code: we use a HashMap as the cache index so lookups go through the map's O(1) retrieval, keep head and last references to the two ends of the list, and provide putEntry() and getEntry() methods to operate on the cache.


(The explanation above is adapted from http://gogole.iteye.com/blog/692103.)
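To make this design concrete, here is a minimal sketch of the doubly linked list + HashMap approach in the LeetCode get/set signature. The Node class and the helper names addToHead() and unlink() are our own illustrative choices, not part of the cited blog:

import java.util.HashMap;

public class LRUCacheList {

    // Doubly linked list node; the head side holds the most recently used entries.
    private static class Node {
        int key, value;
        Node prev, next;
        Node(int key, int value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final HashMap<Integer, Node> map = new HashMap<Integer, Node>(); // key -> node, O(1) lookup
    private final Node head = new Node(0, 0); // sentinel: MRU end
    private final Node tail = new Node(0, 0); // sentinel: LRU end

    public LRUCacheList(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public int get(int key) {
        Node node = map.get(key);
        if (node == null) return -1;
        unlink(node);     // a hit moves the node to the head of the list
        addToHead(node);
        return node.value;
    }

    public void set(int key, int value) {
        Node node = map.get(key);
        if (node != null) {            // existing key: update value, mark as MRU
            node.value = value;
            unlink(node);
            addToHead(node);
            return;
        }
        if (map.size() == capacity) {  // full: evict the tail, i.e. the LRU entry
            Node lru = tail.prev;
            unlink(lru);
            map.remove(lru.key);
        }
        node = new Node(key, value);
        addToHead(node);
        map.put(key, node);
    }

    private void unlink(Node node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void addToHead(Node node) {
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }
}

Both get and set run in O(1): the HashMap locates the node, and the sentinel-bounded list makes unlinking and re-inserting constant time.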


2.  Use the JDK's LinkedHashMap directly. There are two variants, delegation and inheritance; both override the removeEldestEntry method.

First, the delegation variant. The code below comes from http://www.source-code.biz/snippets/java/6.htm

import java.util.LinkedHashMap;
import java.util.Collection;
import java.util.Map;
import java.util.ArrayList;

/**
* An LRU cache, based on <code>LinkedHashMap</code>.
*
* <p>
* This cache has a fixed maximum number of elements (<code>cacheSize</code>).
* If the cache is full and another entry is added, the LRU (least recently used) entry is dropped.
*
* <p>
* This class is thread-safe. All methods of this class are synchronized.
*
* <p>
* Author: Christian d'Heureuse, Inventec Informatik AG, Zurich, Switzerland<br>
* Multi-licensed: EPL / LGPL / GPL / AL / BSD.
*/
public class LRUCache<K,V> {

private static final float   hashTableLoadFactor = 0.75f;

private LinkedHashMap<K,V>   map;
private int                  cacheSize;

/**
* Creates a new LRU cache.
* @param cacheSize the maximum number of entries that will be kept in this cache.
*/
public LRUCache (int cacheSize) {
   this.cacheSize = cacheSize;
   int hashTableCapacity = (int)Math.ceil(cacheSize / hashTableLoadFactor) + 1;
   map = new LinkedHashMap<K,V>(hashTableCapacity, hashTableLoadFactor, true) {
      // (an anonymous inner class)
      private static final long serialVersionUID = 1;
      @Override protected boolean removeEldestEntry (Map.Entry<K,V> eldest) {
         return size() > LRUCache.this.cacheSize; }}; }

/**
* Retrieves an entry from the cache.<br>
* The retrieved entry becomes the MRU (most recently used) entry.
* @param key the key whose associated value is to be returned.
* @return    the value associated to this key, or null if no value with this key exists in the cache.
*/
public synchronized V get (K key) {
   return map.get(key); }

/**
* Adds an entry to this cache.
* The new entry becomes the MRU (most recently used) entry.
* If an entry with the specified key already exists in the cache, it is replaced by the new entry.
* If the cache is full, the LRU (least recently used) entry is removed from the cache.
* @param key    the key with which the specified value is to be associated.
* @param value  a value to be associated with the specified key.
*/
public synchronized void put (K key, V value) {
   map.put (key, value); }

/**
* Clears the cache.
*/
public synchronized void clear() {
   map.clear(); }

/**
* Returns the number of used entries in the cache.
* @return the number of entries currently in the cache.
*/
public synchronized int usedEntries() {
   return map.size(); }

/**
* Returns a <code>Collection</code> that contains a copy of all cache entries.
* @return a <code>Collection</code> with a copy of the cache content.
*/
public synchronized Collection<Map.Entry<K,V>> getAll() {
   return new ArrayList<Map.Entry<K,V>>(map.entrySet()); }

// Test routine for the LRUCache class.
public static void main (String[] args) {
   LRUCache<String,String> c = new LRUCache<String, String>(3);
   c.put ("1", "one");                           // 1
   c.put ("2", "two");                           // 2 1
   c.put ("3", "three");                         // 3 2 1
   c.put ("4", "four");                          // 4 3 2
   if (c.get("2") == null) throw new Error();    // 2 4 3
   c.put ("5", "five");                          // 5 2 4
   c.put ("4", "second four");                   // 4 5 2
   // Verify cache content.
   if (c.usedEntries() != 3)              throw new Error();
   if (!c.get("4").equals("second four")) throw new Error();
   if (!c.get("5").equals("five"))        throw new Error();
   if (!c.get("2").equals("two"))         throw new Error();
   // List cache content.
   for (Map.Entry<String, String> e : c.getAll())
      System.out.println (e.getKey() + " : " + e.getValue()); }

} // end class LRUCache


Next is the inheritance variant, extending LinkedHashMap. It is very simple to implement; note the accessOrder flag.

A cache is a mechanism by which future requests for previously retrieved data are served faster and/or at a lower cost. This article describes a data structure to hold the cache data and a Java implementation to service the cache requests.

Requirements

  1. Fixed size: The cache needs to have some bounds to limit memory usage.
  2. Fast access: The cache insert and lookup operations need to be fast, preferably O(1) time.
  3. Entry replacement algorithm: When the cache is full, the least useful entries are purged from the cache. The algorithm used to select these entries is Least Recently Used (LRU): the cache entries that have not been accessed recently are replaced.

Design discussion

Since the lookup and insert operations need to be fast, a HashMap would be a good candidate. The HashMap accepts an initial capacity parameter, but it resizes itself if more entries are inserted. So we need to override the put() operation and remove (or purge) an entry before inserting.

How do we select the entry to be purged? One approach is to maintain a timestamp of when each entry was last accessed and select the entry with the oldest timestamp. But that search would be linear, taking O(N) time.

So we need the entries to be maintained in a list sorted by the order in which they were accessed. One way to achieve this is to maintain the entries in a doubly linked list: every time an entry is accessed (a cache lookup), it is also moved to the end of the list, and when entries must be purged, they are removed from the front of the list. In an ArrayList, removing an element forces the remaining entries to shift by one to fill the gap; a doubly linked list has no such issue.

We have come up with a design that meets our requirements and guarantees O(1) insert and O(1) lookup operations and also has a configurable limit on the number of entries. Let's begin the implementation.

Luckily for us, the JDK already provides a class that is very well suited to our purpose: LinkedHashMap. This class keeps the entries in a HashMap for fast lookup while also maintaining a doubly linked list of the entries, in either access order or insertion order. This is configurable, so we set accessOrder to true. It also has a method removeEldestEntry() which we can override to return true when the cache size exceeds the specified capacity (upper limit). So here is the implementation. Enjoy.

import java.util.LinkedHashMap;
import java.util.Map.Entry;

public class LRUCache<K, V> extends LinkedHashMap<K, V> {

	private static final long serialVersionUID = 1L;

	private final int capacity; // Maximum number of entries in the cache.

	public LRUCache(int capacity) {
		super(capacity + 1, 1.0f, true); // 'true' selects access order.
		this.capacity = capacity;
	}

	@Override
	protected boolean removeEldestEntry(Entry<K, V> eldest) {
		return size() > this.capacity;
	}
}
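A quick usage sketch (our own test, not part of the original article) showing how access order drives eviction in this subclass:

public static void main(String[] args) {
    LRUCache<Integer, String> cache = new LRUCache<Integer, String>(2);
    cache.put(1, "one");
    cache.put(2, "two");
    cache.get(1);          // touching 1 makes it the most recently used entry
    cache.put(3, "three"); // evicts 2, now the least recently used entry
    System.out.println(cache.keySet()); // prints [1, 3]
}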

3.  The Android source tree contains a complete, industrial-strength implementation that also addresses thread safety; it is worth studying.

There are three kinds of cache misses. In brief:
Compulsory misses: misses caused by the very first read or write of a block.
Capacity misses: the cache is already full, i.e. the working set exceeds the cache's own capacity, so accessing data that has not yet been brought into the cache causes a miss. This is the most common kind.
Conflict misses: avoidable misses, caused mainly by a poorly chosen cache replacement policy.

The implementation below deals with capacity misses:

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * A cache that holds strong references to a limited number of values. Each time
 * a value is accessed, it is moved to the head of a queue. When a value is
 * added to a full cache, the value at the end of that queue is evicted and may
 * become eligible for garbage collection.
 *
 * <p>If your cached values hold resources that need to be explicitly released,
 * override {@link #entryRemoved}.
 *
 * <p>If a cache miss should be computed on demand for the corresponding keys,
 * override {@link #create}. This simplifies the calling code, allowing it to
 * assume a value will always be returned, even when there's a cache miss.
 *
 * <p>By default, the cache size is measured in the number of entries. Override
 * {@link #sizeOf} to size the cache in different units. For example, this cache
 * is limited to 4MiB of bitmaps:
 * <pre>   {@code
 *   int cacheSize = 4 * 1024 * 1024; // 4MiB
 *   LruCache<String, Bitmap> bitmapCache = new LruCache<String, Bitmap>(cacheSize) {
 *       protected int sizeOf(String key, Bitmap value) {
 *           return value.getByteCount();
 *       }
 *   }}</pre>
 *
 * <p>This class is thread-safe. Perform multiple cache operations atomically by
 * synchronizing on the cache: <pre>   {@code
 *   synchronized (cache) {
 *     if (cache.get(key) == null) {
 *         cache.put(key, value);
 *     }
 *   }}</pre>
 *
 * <p>This class does not allow null to be used as a key or value. A return
 * value of null from {@link #get}, {@link #put} or {@link #remove} is
 * unambiguous: the key was not in the cache.
 */
public class LruCache<K, V> {
    private final LinkedHashMap<K, V> map;

    /** Size of this cache in units. Not necessarily the number of elements. */
    private int size;
    private int maxSize;

    private int putCount;
    private int createCount;
    private int evictionCount;
    private int hitCount;
    private int missCount;

    /**
     * @param maxSize for caches that do not override {@link #sizeOf}, this is
     *     the maximum number of entries in the cache. For all other caches,
     *     this is the maximum sum of the sizes of the entries in this cache.
     */
    public LruCache(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize <= 0");
        }
        this.maxSize = maxSize;
        this.map = new LinkedHashMap<K, V>(0, 0.75f, true);
    }

    /**
     * Returns the value for {@code key} if it exists in the cache or can be
     * created by {@code #create}. If a value was returned, it is moved to the
     * head of the queue. This returns null if a value is not cached and cannot
     * be created.
     */
    public final V get(K key) {
        if (key == null) {
            throw new NullPointerException("key == null");
        }

        V mapValue;
        synchronized (this) {
            mapValue = map.get(key);
            if (mapValue != null) {
                hitCount++;
                return mapValue;
            }
            missCount++;
        }

        /*
         * Attempt to create a value. This may take a long time, and the map
         * may be different when create() returns. If a conflicting value was
         * added to the map while create() was working, we leave that value in
         * the map and release the created value.
         */

        V createdValue = create(key);
        if (createdValue == null) {
            return null;
        }

        synchronized (this) {
            createCount++;
            mapValue = map.put(key, createdValue);

            if (mapValue != null) {
                // There was a conflict so undo that last put
                map.put(key, mapValue);
            } else {
                size += safeSizeOf(key, createdValue);
            }
        }

        if (mapValue != null) {
            entryRemoved(false, key, createdValue, mapValue);
            return mapValue;
        } else {
            trimToSize(maxSize);
            return createdValue;
        }
    }

    /**
     * Caches {@code value} for {@code key}. The value is moved to the head of
     * the queue.
     *
     * @return the previous value mapped by {@code key}.
     */
    public final V put(K key, V value) {
        if (key == null || value == null) {
            throw new NullPointerException("key == null || value == null");
        }

        V previous;
        synchronized (this) {
            putCount++;
            size += safeSizeOf(key, value);
            previous = map.put(key, value);
            if (previous != null) {
                size -= safeSizeOf(key, previous);
            }
        }

        if (previous != null) {
            entryRemoved(false, key, previous, value);
        }

        trimToSize(maxSize);
        return previous;
    }

    /**
     * @param maxSize the maximum size of the cache before returning. May be -1
     *     to evict even 0-sized elements.
     */
    private void trimToSize(int maxSize) {
        while (true) {
            K key;
            V value;
            synchronized (this) {
                if (size < 0 || (map.isEmpty() && size != 0)) {
                    throw new IllegalStateException(getClass().getName()
                            + ".sizeOf() is reporting inconsistent results!");
                }

                if (size <= maxSize) {
                    break;
                }

                Map.Entry<K, V> toEvict = map.eldest();
                if (toEvict == null) {
                    break;
                }

                key = toEvict.getKey();
                value = toEvict.getValue();
                map.remove(key);
                size -= safeSizeOf(key, value);
                evictionCount++;
            }

            entryRemoved(true, key, value, null);
        }
    }

    /**
     * Removes the entry for {@code key} if it exists.
     *
     * @return the previous value mapped by {@code key}.
     */
    public final V remove(K key) {
        if (key == null) {
            throw new NullPointerException("key == null");
        }

        V previous;
        synchronized (this) {
            previous = map.remove(key);
            if (previous != null) {
                size -= safeSizeOf(key, previous);
            }
        }

        if (previous != null) {
            entryRemoved(false, key, previous, null);
        }

        return previous;
    }

    /**
     * Called for entries that have been evicted or removed. This method is
     * invoked when a value is evicted to make space, removed by a call to
     * {@link #remove}, or replaced by a call to {@link #put}. The default
     * implementation does nothing.
     *
     * <p>The method is called without synchronization: other threads may
     * access the cache while this method is executing.
     *
     * @param evicted true if the entry is being removed to make space, false
     *     if the removal was caused by a {@link #put} or {@link #remove}.
     * @param newValue the new value for {@code key}, if it exists. If non-null,
     *     this removal was caused by a {@link #put}. Otherwise it was caused by
     *     an eviction or a {@link #remove}.
     */
    protected void entryRemoved(boolean evicted, K key, V oldValue, V newValue) {}

    /**
     * Called after a cache miss to compute a value for the corresponding key.
     * Returns the computed value or null if no value can be computed. The
     * default implementation returns null.
     *
     * <p>The method is called without synchronization: other threads may
     * access the cache while this method is executing.
     *
     * <p>If a value for {@code key} exists in the cache when this method
     * returns, the created value will be released with {@link #entryRemoved}
     * and discarded. This can occur when multiple threads request the same key
     * at the same time (causing multiple values to be created), or when one
     * thread calls {@link #put} while another is creating a value for the same
     * key.
     */
    protected V create(K key) {
        return null;
    }

    private int safeSizeOf(K key, V value) {
        int result = sizeOf(key, value);
        if (result < 0) {
            throw new IllegalStateException("Negative size: " + key + "=" + value);
        }
        return result;
    }

    /**
     * Returns the size of the entry for {@code key} and {@code value} in
     * user-defined units.  The default implementation returns 1 so that size
     * is the number of entries and max size is the maximum number of entries.
     *
     * <p>An entry's size must not change while it is in the cache.
     */
    protected int sizeOf(K key, V value) {
        return 1;
    }

    /**
     * Clear the cache, calling {@link #entryRemoved} on each removed entry.
     */
    public final void evictAll() {
        trimToSize(-1); // -1 will evict 0-sized elements
    }

    /**
     * For caches that do not override {@link #sizeOf}, this returns the number
     * of entries in the cache. For all other caches, this returns the sum of
     * the sizes of the entries in this cache.
     */
    public synchronized final int size() {
        return size;
    }

    /**
     * For caches that do not override {@link #sizeOf}, this returns the maximum
     * number of entries in the cache. For all other caches, this returns the
     * maximum sum of the sizes of the entries in this cache.
     */
    public synchronized final int maxSize() {
        return maxSize;
    }

    /**
     * Returns the number of times {@link #get} returned a value that was
     * already present in the cache.
     */
    public synchronized final int hitCount() {
        return hitCount;
    }

    /**
     * Returns the number of times {@link #get} returned null or required a new
     * value to be created.
     */
    public synchronized final int missCount() {
        return missCount;
    }

    /**
     * Returns the number of times {@link #create(Object)} returned a value.
     */
    public synchronized final int createCount() {
        return createCount;
    }

    /**
     * Returns the number of times {@link #put} was called.
     */
    public synchronized final int putCount() {
        return putCount;
    }

    /**
     * Returns the number of values that have been evicted.
     */
    public synchronized final int evictionCount() {
        return evictionCount;
    }

    /**
     * Returns a copy of the current contents of the cache, ordered from least
     * recently accessed to most recently accessed.
     */
    public synchronized final Map<K, V> snapshot() {
        return new LinkedHashMap<K, V>(map);
    }

    @Override public synchronized final String toString() {
        int accesses = hitCount + missCount;
        int hitPercent = accesses != 0 ? (100 * hitCount / accesses) : 0;
        return String.format("LruCache[maxSize=%d,hits=%d,misses=%d,hitRate=%d%%]",
                maxSize, hitCount, missCount, hitPercent);
    }
}
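As a usage sketch (our own, based on the javadoc above; the sizing and create() logic are illustrative assumptions), a cache that sizes entries by string length and computes missing values on demand:

public static void main(String[] args) {
    LruCache<String, String> cache = new LruCache<String, String>(1024) {
        @Override protected int sizeOf(String key, String value) {
            return value.length(); // size in characters rather than in entries
        }
        @Override protected String create(String key) {
            return key.toUpperCase(); // computed on a cache miss
        }
    };
    String v = cache.get("hello"); // miss -> create() -> "HELLO" is cached and returned
    System.out.println(v);
}

Because create() never returns null here, callers can assume get() always yields a value, exactly as the class javadoc describes.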

Supplementary material:


Common page replacement algorithms in virtual memory systems include the following:
  (1) The random algorithm, i.e. the RAND algorithm (Random algorithm). A software or hardware random number generator picks the page in main memory to be replaced. This algorithm is the simplest and the easiest to implement, but it makes no use of the history of page scheduling in main memory and does not reflect program locality, so its hit rate is low.
  (2) The first-in first-out algorithm, i.e. the FIFO algorithm (First-In First-Out algorithm). This algorithm replaces the page that was loaded into main memory earliest. Its advantage is that it is fairly easy to implement and it uses the history of page scheduling in main memory, but it does not reflect program locality: the page loaded earliest may well be a page that is used frequently.
  (3) The least frequently used algorithm, i.e. the LFU algorithm (Least Frequently Used algorithm). This algorithm replaces the page that has been accessed least often recently. On its face this is very reasonable: the page used least so far is likely also the page that will be accessed least in the future. The algorithm makes full use of the scheduling history in main memory and correctly reflects program locality. It is, however, very hard to implement: every page needs a long counter driven by a fixed clock, and choosing a victim requires scanning all counters for the largest count. For that reason the following, comparatively simple, method is usually adopted instead.
  (4) The least recently used algorithm, i.e. the LRU algorithm (Least Recently Used algorithm). This algorithm replaces the page that has gone unaccessed for the longest time. It reduces LFU's quantitative bookkeeping of "more" versus "less" to a simple judgment of "used" versus "unused", and is therefore fairly easy to implement.
  (5) The optimal replacement algorithm, i.e. the OPT algorithm (OPTimal replacement algorithm). The algorithms above are based mainly on the history of page scheduling in main memory; they assume that future page scheduling will resemble the recent past. That assumption does not always hold. The best algorithm would replace the page that will not be accessed for the longest time in the future; such an algorithm necessarily achieves the highest hit rate, and it is the optimal replacement algorithm.
  The only way to implement OPT is to run the program once beforehand and record the actual page address trace; only from that trace can the victim page be determined. This is clearly impractical, so OPT is merely an idealized algorithm. It is nonetheless very useful: it routinely serves as the yardstick for evaluating other page replacement algorithms. All else being equal, the page replacement algorithm whose hit rate is closest to OPT's is the better one.
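As a small illustration (our own sketch, not from the original text), LRU page replacement can be simulated on a page reference string with an access-ordered LinkedHashMap, counting hits:

import java.util.LinkedHashMap;
import java.util.Map;

public class LruPageSim {
    public static void main(String[] args) {
        final int frames = 3;
        int[] refs = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3}; // page reference string
        // Access-ordered map: iteration order runs from least to most recently used.
        LinkedHashMap<Integer, Boolean> memory =
                new LinkedHashMap<Integer, Boolean>(frames, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > frames; // evict the LRU page when the frames overflow
            }
        };
        int hits = 0;
        for (int page : refs) {
            if (memory.get(page) != null) hits++;  // get() also refreshes recency
            else memory.put(page, Boolean.TRUE);   // page fault: load the page
        }
        System.out.println("hits = " + hits + " / " + refs.length);
    }
}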

