A Brief Analysis of the Java HashMap Source Code

雩山贡水

Article link: https://blog.csdn.net/xingxjhui/article/details/24602195

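At its core, Java's HashMap is an array of linked lists: the array provides storage and fast addressing, and the linked lists resolve collisions. That is the concept; the source code fills in the details. Some key points from the source follow: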

Member variables:

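    /**
     * Default initial capacity; must be a power of two (for performance reasons covered later)
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * Maximum capacity
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * Default load factor
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Array of Entry objects (the bucket table)
    transient Entry[] table;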

The Entry class:

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
        ……
}

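The next field is what makes the linked list work. Each chain in HashMap is a singly linked list: the most recently inserted entry sits at the head, and the earliest inserted entry sits at the tail.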

The get method:

public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

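Note that get uses indexFor to find the array index. indexFor takes the array length and the result of hashing the key's hashCode a second time; once the index is found, get walks the linked list in that bucket looking for the entry.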

The hash method:

static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

The indexFor method:

static int indexFor(int h, int length) {
        return h & (length-1);
    }

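There are many ways to map a key to an array index, for example the familiar modulo approach (the key's hashCode modulo the array length). The source instead hashes the key's hashCode a second time and then ANDs the result with the array length minus 1.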

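Why hash the key's hashCode again? The javadoc in the source says it defends against poor-quality hashCode implementations. I think it is tied to how indexFor addresses the array by ANDing with length - 1. Suppose the array length is 16 (so length - 1 is 1111 in binary) and my keys have the hashCodes (1111,1111,1111), (1110,1111,1111) and (1101,1111,1111). Only the low four bits survive the AND, so all three keys map to the same array index. That is roughly why the second hash exists; for more detail on the principle, see http://www.365doit.com/all/news/hashmapdeep.html.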
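The three-key example can be checked directly. The sketch below copies the hash and indexFor methods quoted above into a standalone class; the class name SupplementalHashDemo is made up for illustration, and the three hashCodes are written in hex (0xFFF, 0xEFF, 0xDFF).

```java
class SupplementalHashDemo {
    // Same body as the hash method quoted above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Same body as the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // The three hashCodes from the example; they differ only above the low 4 bits.
        int[] hashCodes = {0xFFF, 0xEFF, 0xDFF};
        for (int hc : hashCodes) {
            int rawIndex = indexFor(hc, length);          // no second hash
            int mixedIndex = indexFor(hash(hc), length);  // with second hash
            System.out.printf("hashCode=0x%X raw=%d mixed=%d%n", hc, rawIndex, mixedIndex);
        }
    }
}
```

With a table length of 16, the raw hashCodes all land in index 15, while the rehashed values land in indexes 15, 13 and 11, because the shifts fold the higher bits down into the low four bits that the mask keeps.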

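So why AND instead of modulo? Because a bitwise AND is generally cheaper than a modulo operation. And why AND with length - 1? Because when the array length is a power of two, the low bits of length - 1 are all ones, so the AND gives the same result as the modulo for non-negative hashes and every slot in the array is reachable. With any other length, some bits of the mask would be zero, some slots could never be used, and collisions would increase. For more detail, see http://www.iteye.com/topic/539465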
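A small experiment illustrates the power-of-two point. This is only a sketch: the class MaskVersusModuloDemo and the helper reachableIndexes are hypothetical names, and the range of 0 to 999 for the test hashes is an arbitrary choice.

```java
import java.util.TreeSet;

class MaskVersusModuloDemo {
    // Same body as the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collects every index that indexFor produces for the hashes 0..999.
    static TreeSet<Integer> reachableIndexes(int length) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            seen.add(indexFor(h, length));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Power-of-two length: the mask agrees with the modulo for non-negative h.
        for (int h = 0; h < 1000; h++) {
            if (indexFor(h, 16) != h % 16) {
                throw new AssertionError("mask and modulo differ at " + h);
            }
        }
        // Length 16 gives mask 1111; length 15 gives mask 1110, which drops the lowest bit.
        System.out.println("length 16 reaches " + reachableIndexes(16));
        System.out.println("length 15 reaches " + reachableIndexes(15));
    }
}
```

With length 16 all sixteen slots are reachable. With length 15 the mask is 1110, so only the eight even indexes ever appear and the odd slots would sit empty.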

The put method:

public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }
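The return values in put above (the old value when the key already exists, null otherwise) can be observed through the public java.util.HashMap API. A minimal sketch, where the class PutOverwriteDemo and the helper putTwice are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class PutOverwriteDemo {
    // Puts the key "a" twice and returns what each put call returned.
    static Integer[] putTwice(Map<String, Integer> map) {
        Integer first = map.put("a", 1);   // key absent: put returns null
        Integer second = map.put("a", 2);  // key present: put returns the old value
        return new Integer[] {first, second};
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Integer[] returned = putTwice(map);
        System.out.println("first put returned " + returned[0]);
        System.out.println("second put returned " + returned[1]);
        System.out.println("value now " + map.get("a") + ", size " + map.size());
    }
}
```

The second put replaces the value in place, so the map still holds a single entry for "a".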

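The put source makes one thing very clear: if the key of the inserted key-value pair is already present, the new value overwrites the old one. If the key is not present (the bucket is empty, or a collision means it holds only other keys), addEntry adds the element to the HashMap. Here is the source of addEntry: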

void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

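It fetches the Entry already stored in that array slot, passes it as an argument when constructing a new Entry, and stores the new Entry back in the array. What does that Entry constructor look like?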

Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

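That settles it: the previous Entry becomes the next reference of the newly inserted Entry. In other words, on a collision the new element goes to the head of the list and the older elements move toward the tail.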
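The head insertion can be reproduced in isolation. The sketch below is not the HashMap code itself: Node is a simplified, hypothetical stand-in for Entry that keeps only a key and the next link, and addFirst mimics the step in addEntry where the old head is passed to the constructor.

```java
class HeadInsertionDemo {
    // Simplified stand-in for HashMap.Entry: only the key and the next link.
    static class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Mirrors the addEntry step: the old head becomes the new node's next.
    static Node addFirst(Node head, String key) {
        return new Node(key, head);
    }

    // Renders the chain from head to tail.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node bucket = null; // an empty bucket
        bucket = addFirst(bucket, "first");
        bucket = addFirst(bucket, "second");
        bucket = addFirst(bucket, "third");
        System.out.println(describe(bucket)); // third -> second -> first
    }
}
```

Inserting at the head costs the same no matter how long the chain is, since no traversal to the tail is needed.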
