6.1 （番外）深入源码理解HashMap、LinkedHashMap，DiskLruCache

最新推荐文章于 2023-06-07 17:27:41 发布

Mirs_sir

最新推荐文章于 2023-06-07 17:27:41 发布

阅读量741

点赞数

分类专栏： Android 源码阅读 OkHttp源码分析文章标签：源码 hashmap 算法缓存 DiskLru

本文链接：https://blog.csdn.net/a109340/article/details/74357242

版权

Android 同时被 3 个专栏收录

21 篇文章 0 订阅

订阅专栏

源码阅读

11 篇文章 0 订阅

订阅专栏

OkHttp源码分析

8 篇文章 4 订阅

订阅专栏

本文深入探讨HashMap和LinkedHashMap的源码，包括它们的put和get操作。HashMap的散列算法是关键，而LinkedHashMap在HashMap基础上增加了访问顺序或插入顺序的维护。接着，文章介绍了DiskLruCache的工作原理，它是基于LinkedHashMap实现的一种磁盘缓存，通过编辑器进行写入操作，并在特定条件下触发垃圾清理。

摘要由CSDN通过智能技术生成

6.1 （番外）深入源码理解HashMap、LinkedHashMap，DiskLruCache

我们看OkHttp的源码可以知道，他的缓存算法主要是用LruCache算法实现的，Lru的一个典型的实现就是LinedkHashMap，LinkedHashMap又是基于HashMap实现的，所以要探究他的原理，我们要从HashMap开始说起了

(有什么问题的话可以进群交流群号 579508560,会有视频讲解，一般人我不告诉他)

（前排出售香烟啤酒火腿肠。。。）

（本文用的是java7，java8 HashMap有部分改动）

HashMap

HashMap继承自AbstactMap，实现的是Map接口,Map的实现类有一下几种

Map
├Hashtable
├HashMap
└WeakHashMap

get知识点

HashMap 和 HashTable 的区别是什么

自行搜索去，我不说，有本事你们打我��

我们要看HashMap的话首先要看他的父类，我们点到AbstractMap\

HashMap()   
HashMap(int initialCapacity)
HashMap(int initialCapacity, float loadFactor)
HashMap(Map<? extends K, ? extends V> m)

上面的构造函数，前两个都是调用的第三个，默认都是传的默认值，我们直接看第三个

/**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * 构造具有指定的初始容量和负载因子的空HashMap。
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);


        //如果是用的这个构造函数，那么初始大小和加载因子需要指明
        this.loadFactor = loadFactor;
        threshold = initialCapacity;
        init();
    }

     void init() {
    }

这边就是对于默认的几个数值做了一下过滤与判断，init方法是个空方法，接着我们看常用的put和get方法

put（key,value）

   /**
     * An empty table instance to share when the table is not inflated.
     */
    static final Entry<?,?>[] EMPTY_TABLE = {};

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

 (省略 、、、、、、)

 public V put(K key, V value) {
        //如果table等于一个空table的话，初始化table，大小如果没有指定的话就是默认的大小
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        //如果是null键的话 ，添加null的字符串
        if (key == null)
            return putForNullKey(value);
        //获取key的hash值进行两次hash计算  
        int hash = hash(key);
        //根据hash的值进行寻址，返回的是下标
        int i = indexFor(hash, table.length);
        //如果第table里面的的entry不为空，然后根据下标和引用开始向散列表里放值
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            //如果添加成功则返回
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                //每当一个条目中的值被对于已经在HashMap中的密钥k调用put（k，v）覆盖时，这个方法被调用。例如linkedHashMap
                e.recordAccess(this);
                return oldValue;
            }
        }

        /**这里当他是当前索引的第一个，调用addEntry添加**/

        //模数加一
        modCount++;
        //添加到table上，如果大小不够的话会进行扩容，倍数是2的整数倍
        addEntry(hash, key, value, i);
        return null;
    }

把以上的步骤总结来说就是这样

如果table是空的话初始化table的大小为默认值
如果key是null的话处理null值的value，hash值为0，index也是0，添加之后return
如果key不为null的话，获取他的hash值，获得再次进行两次hash的hash值，与table的长度进行&操作获取下标
遍历table，如果当前的下边的entry不为null，并且key和hash相同，就进行覆盖操作
如果不是的话新建一个entry，添加到table下标为i的地方，原来下标上的entry做为entry的下个节点

整个步骤拿图来讲的话是这样

流程.png

如果是null的键的话他永远在table的第0个下标，如果存在就是覆盖掉原来的value, 这就是他put 操作，核心就在他的散列算法，如何让各个不同的键均匀的分布，不然就会造成一个index下的entry有很长，这样效率会被降低。

他的散列就是Hash算法

final int hash(Object k) {
        int h = hashSeed;
        //hashSeed只有在hash表的大与等于 Integer.MAX_VALUE或者是设置的系统变量jdk.map.althashing.threshold， 才不为0(好多博客对这块描述不多或者理解有误，以为key是String 直接就进来了)
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        //此功能可确保每个位位置上不同倍数不同的hashCode具有有界数量的冲突（默认加载因子约为8）。
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

里面的hashSeed正常情况下都是0，非0的情况是你的table size大于等于设置的系统变量jdk.map.althashing.threshold 或者 Integer.MAX_VALU，一般都不会碰到这种情况

后面的hash看起来就关系就简单多了。获取key的hashCode，因为是native方法看不到。h默认是0，进行的是一个^（异或）运算，然后下面的就是对h一顿无符号位移操作，进行了两次，确保他的不同。

接着的一个算法是寻找合适的下标的算法，indexFor

    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

这不用多讲，一个与运算，基本put方法重要且难理解的都在这边讲解了，下面讲get方法

get

get的代吗还是比较少比较好理解的

public V get(Object key) {
        //null key 获取null key的值 table的第0位 寻找key为null的
        if (key == null)
            return getForNullKey();
        //和上面的套路差不多，通过调用hash 和 indexFor的方法找到他的下标，在下标里面遍历寻找到他的值
        Entry<K,V> entry = getEntry(key);
        //null的话返回null
        return null == entry ? null : entry.getValue();
    }

    -----  我是一个漂亮的分割线  ------
    //上面方法用用到的方法 

    //找null key
     private V getForNullKey() {
        if (size == 0) {
            return null;
        }
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }
    //找非null key
     final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

其他几个方法我就不起一一分析了，基本上就是根据getEntry（）来进行判断的，没办法谁让我这么懒。。

HashMap看完之后我们来看LinkedHashMap，就轻松多了

LinkedHashMap

LinkedHashMap继承自HashMap，我们按照一样的套路去看LinkedHashMap，发现他的构造函数和HashMap的构造函数是一样的，不同的地方是多了一个accessOrder，acessOrder上面的描述是

该链接哈希映射的迭代排序方法：对于访问顺序为true，对于插入顺序为false。

这样的操作像不像LRU，不过默认是按照插入顺序排序的，也就是默认值是false

init方法里，初始化了header，是个Entry，继承自HashMap的Entry，是个双向链表，里面包含他的前一个节点和后一个节点

private static class Entry<K,V> extends HashMap.Entry<K,V> {
        // These fields comprise the doubly linked list used for iteration.
        Entry<K,V> before, after;

        Entry(int hash, K key, V value, HashMap.Entry<K,V> next) {
            super(hash, key, value, next);
        }

        (省略 添加和删除部分。。。。)

        //这个方法在HashMap中put操作的时候有被调用过，我们发现LinkedHashMap没有重写put方法，所以我们就分析这个方法，这个方法就是在插入的时候记录他的插入顺序(心机boy，藏得这么深，搞的这么巧妙)
         void recordAccess(HashMap<K,V> m) {
            LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
            if (lm.accessOrder) {
                lm.modCount++;
                remove();
                addBefore(lm.header);
            }
        }
}

用图片表示的话就是这样

他这边记录了两份，一份是在Hash表里的，一份是双向链表，如果accessOrder为true的话，每次都会把最近调用的放到链表的最后方，他调用的地方有两处

get方法
HashMap中put的方法（覆盖）

我们来运行个例子


        LinkedHashMap linedHashMap = new LinkedHashMap(16,0.75f,true);
        linedHashMap.put("key1","test1");
        linedHashMap.put("key2","test2");
        linedHashMap.put("key3","test3");
        linedHashMap.put("key4","test4");
        linedHashMap.put("key5","test5");

        for (Iterator<String> iterator = linedHashMap.values().iterator(); iterator
                   .hasNext();) {
                   String name = (String) iterator.next();
                   System.out.println(name);
        }

当我们进行一些操作的时候


        LinkedHashMap linedHashMap = new LinkedHashMap(16,0.75f,true);
        linedHashMap.put("key1","test1");
        linedHashMap.put("key2","test2");
        linedHashMap.put("key3","test3");
        linedHashMap.put("key4","test4");
        linedHashMap.put("key5","test5");

        linedHashMap.get("key1");
        linedHashMap.get("key3");

        for (Iterator<String> iterator = linedHashMap.values().iterator(); iterator
                   .hasNext();) {
                   String name = (String) iterator.next();
                   System.out.println(name);
        }

我们发现，最近使用的就排到了这个链表的最后面，nice，知道了这个原理之后，感觉自己都能写一个LRUcache算法了是不是

有没有很开心很激动？

别着急，，接着看

DiskLruCache

Jake Wharton的微笑镇楼

DisLruCache 这边我们拿JW大神的来讲，OkHttp里面的是被改造过的，会有点小的不同，不过问题不大，还有就是，我们是按照大体的流程来梳理一下，具体的细节就要考大家去自己看源码了。

我们的测试程序如下

public class TestMain {

    private static final long cacheSize = 1024 * 1024 * 20;//缓存文件最大限制大小20M
    private static String cachedirectory = "/Users/Mirsfang/Desktop" + "/caches";  //设置缓存文件路径,自己设置


    public static void main(String[] args) {
        testWirteDiskLruCache();
    }

    //写入
    private static void testWirteDiskLruCache() {

        DiskLruCache diskCache = null;
        try {
            diskCache = DiskLruCache.open(new File(cachedirectory), 10010, 1, cacheSize);

            String urlKey = "test_request1_1";
            String testJson = "{\"result\":{\"errmsg\":\"令牌失效\",\"errcode\":\"101\"}}";

            try {
                DiskLruCache.Editor editor = diskCache.edit(urlKey);

                OutputStream outPutStream = editor.newOutputStream(0);
                OutputStreamWriter steamWriter = new OutputStreamWriter(outPutStream);
                steamWriter.write(testJson);
                steamWriter.close();
                outPutStream.flush();
                outPutStream.close();

                editor.commit();

                diskCache.flush();
                diskCache.close();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {

            }


        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    //读取
    private static void testGetDiskLruCache() {
           DiskLruCache diskCache = null;
        StringBuilder sb = new StringBuilder();
        try {
            diskCache = DiskLruCache.open(new File(cachedirectory), 10010, 1, cacheSize);

            String urlKey = "test_request1_1";

            try {
                DiskLruCache.Snapshot snapshot = diskCache.get(urlKey);

                String inputSteam = snapshot.getString(0);


                diskCache.flush();
                diskCache.close();

                System.out.print("得到文本: "+inputSteam);
            } catch (IOException e) {
                e.printStackTrace();
            } finally {

            }


        } catch (IOException e) {
            e.printStackTrace();
        }
    }


}

首先还是从构建方法搞起，构建方法是不对外公开的只能通过open去构建

/**
* 传入的参数
* @param directory  可写入的目录
* @param valueCount 每个缓存条目的值的数量。 必须是正的。
* @param maxSize 该缓存应用于存储的最大字节数
* @throws IOException if reading or writing the cache directory fails
*
**/

 public static DiskLruCache open(File directory, int appVersion, int valueCount, long maxSize)
      throws IOException {
    if (maxSize <= 0) {
      throw new IllegalArgumentException("maxSize <= 0");
    }
    if (valueCount <= 0) {
      throw new IllegalArgumentException("valueCount <= 0");
    }

    // If a bkp file exists, use it instead.如果存在bkp文件，请改用它。
    File backupFile = new File(directory, JOURNAL_FILE_BACKUP);
    if (backupFile.exists()) {
      File journalFile = new File(directory, JOURNAL_FILE);
      // If journal file also exists just delete backup file.
      // 如果日记文件也存在只是删除备份文件。
      if (journalFile.exists()) {
        backupFile.delete();
      } else {
        renameTo(backupFile, journalFile, false);
      }
    }

    // 初始化DisLruCache ，如果日志文件存在，，那么就用原来的
    DiskLruCache cache = new DiskLruCache(directory, appVersion, valueCount, maxSize);
    if (cache.journalFile.exists()) {
      try {
        cache.readJournal();
        cache.processJournal();
        return cache;
      } catch (IOException journalIsCorrupt) {
        System.out
            .println("DiskLruCache "
                + directory
                + " is corrupt: "
                + journalIsCorrupt.getMessage()
                + ", removing");
        cache.delete();
      }
    }

    //否则的话重新创建
    directory.mkdirs();
    cache = new DiskLruCache(directory, appVersion, valueCount, maxSize);
    //构建日志文件，写入日志的头
    cache.rebuildJournal();
    return cache;
  }

然后是edit（key）方法，构建一个DiskLruCache.Editor

public Editor edit(String key) throws IOException {
    return edit(key, ANY_SEQUENCE_NUMBER);
  }

  private synchronized Editor edit(String key, long expectedSequenceNumber) throws IOException {
    //判断journalWriter是否关闭
    checkNotClosed();
    //验证key是否合法(正则验证)
    validateKey(key);
    //从linkedHashMap中检查一下是否存在Entry
    Entry entry = lruEntries.get(key);

    //如果缓存是无效的，返回null
    if (expectedSequenceNumber != ANY_SEQUENCE_NUMBER && (entry == null
        || entry.sequenceNumber != expectedSequenceNumber)) {
      return null; // Snapshot is stale.
    }

    //如果entry为空，新new一个entry，添加到linkedHashMap中
    if (entry == null) {
      entry = new Entry(key);
      lruEntries.put(key, entry);
    } else if (entry.currentEditor != null) {
      return null; // Another edit is in progress.
    }

    //新建一个editor对象
    Editor editor = new Editor(entry);
    entry.currentEditor = editor;

    // 写入日志文件记录
    journalWriter.write(DIRTY + ' ' + key + '\n');
    journalWriter.flush();
    return editor;
  }

我们发现，基本的写入操作都是由Editor的newOutputStream的这个Steam完成的，我们去看那个方法

 public OutputStream newOutputStream(int index) throws IOException {
      //进行下标判断，是否合法
      if (index < 0 || index >= valueCount) {
        throw new IllegalArgumentException("Expected index " + index + " to "
                + "be greater than 0 and less than the maximum value count "
                + "of " + valueCount);
      }
      //为避免线程问题
      synchronized (DiskLruCache.this) {
        //看获取的entry的editor和当前的是否一直
        if (entry.currentEditor != this) {
          throw new IllegalStateException();
        }
        //如果readable是fasle，written[]这个数组index改为true,这个是为了兼容一个key对应多个的问题
        if (!entry.readable) {
          written[index] = true;
        }
        //根据index和entry，获取对应的缓存文件
        File dirtyFile = entry.getDirtyFile(index);
        FileOutputStream outputStream;
        try {
          //打开输入管道
          outputStream = new FileOutputStream(dirtyFile);
        } catch (FileNotFoundException e) {
          // Attempt to recreate the cache directory.
          //遇见异常失败的话尝试重新创建缓存目录。
          directory.mkdirs();
          try {
            outputStream = new FileOutputStream(dirtyFile);
          } catch (FileNotFoundException e2) {
            // We are unable to recover. Silently eat the writes.
            return NULL_OUTPUT_STREAM;
          }
        }
        //返回输出管道
        return new FaultHidingOutputStream(outputStream);
      }
    }

这样通过outputStream就写入到文件里了FaultHidingOutputStream是一个继承FilterOutputStream的包装类，对一些错误进行了标记，这个大概就是写入的这个流程，其实DiskLruCache主要的是get方法，看如何计数操作

get方法

public synchronized Snapshot get(String key) throws IOException {
    //检查文件流是否关闭
    checkNotClosed();
    //检查key的有效性
    validateKey(key);
    //通过LinkedHashMap中获取Entry
    Entry entry = lruEntries.get(key);

    //判断entry的合法性
    if (entry == null) {
      return null;
    }

    if (!entry.readable) {
      return null;
    }

    // Open all streams eagerly to guarantee that we see a single published
    // snapshot. If we opened streams lazily then the streams could come
    // from different edits.
    //积极地打开所有流，以保证我们看到一个已发布的snapshot。 如果我们懒加载的方式打开流，那么流可以来自不同的edits。
    //
    InputStream[] ins = new InputStream[valueCount];
    try {
      for (int i = 0; i < valueCount; i++) {
        ins[i] = new FileInputStream(entry.getCleanFile(i));
      }
    } catch (FileNotFoundException e) {
      // A file must have been deleted manually!
      for (int i = 0; i < valueCount; i++) {
        if (ins[i] != null) {
          Util.closeQuietly(ins[i]);
        } else {
          break;
        }
      }
      return null;
    }

    redundantOpCount++;
    journalWriter.append(READ + ' ' + key + '\n');
    //检测是否到达清理标准
    if (journalRebuildRequired()) {
      //开启线程进行清理
      executorService.submit(cleanupCallable);
    }

    return new Snapshot(key, entry.sequenceNumber, ins, entry.lengths);
  }

这里我们看到他是从LinkedHashMap中获取Entry的，后面，然后进行计数和日志的写入，最后的这个

if (journalRebuildRequired()) {
      //开启线程进行清理
      executorService.submit(cleanupCallable);
    }

是重点，他负责打开清理线程，这是DiskLruCache的核心，为什么DiskLruCache会清理，我们来看这个线程

  private final Callable<Void> cleanupCallable = new Callable<Void>() {
    public Void call() throws Exception {
      synchronized (DiskLruCache.this) {
        //如果日志写入为空，说明现在的状态是异常的
        if (journalWriter == null) {
          return null; // Closed.
        }
        //清理垃圾缓存
        trimToSize();
        if (journalRebuildRequired()) {
        //我们只能重新建立这个日志的时候，它会缩小日志的大小，并且至少消除2000个记录。
          rebuildJournal();
          redundantOpCount = 0;
        }
      }
      return null;
    }
  };

  //垃圾清除,当size比最大的size大的时候，从不活跃的开始清除
  private void trimToSize() throws IOException {
    while (size > maxSize) {
      Map.Entry<String, Entry> toEvict = lruEntries.entrySet().iterator().next();
      remove(toEvict.getKey());
    }
  }