[Java Collections] HashMap Source Code Analysis

中华雪碧

Category: Java basics. Tags: java, hashmap, source code, data structures

Article link: https://blog.csdn.net/gagewang1/article/details/78472741

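HashMap is an implementation of the hash table data structure and one of the most commonly used collections in Java. Its characteristics are summarized below:

Characteristic	Value
Ordered storage	No (unordered)
Duplicates allowed	Keys: no; values: yes
Null allowed	Yes
Thread-safe	No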
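The traits in the table can be checked against the public `java.util.HashMap` API. The following is a minimal sketch; the class name `HashMapTraitsDemo` and the sample keys are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapTraitsDemo {
    // Builds a small map that exercises the traits listed in the table.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 1);      // duplicate values are fine
        map.put("a", 2);      // same key again: the old value is replaced
        map.put(null, 0);     // a null key is allowed
        map.put("c", null);   // null values are allowed too
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println(map.size());     // 4 distinct keys: a, b, null, c
        System.out.println(map.get("a"));   // 2, the value from the second put
        System.out.println(map.get(null));  // 0
    }
}
```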

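HashMap fields

First, here are HashMap's main fields, to make the rest of this article easier to follow:

Field	Description
table	The most important field. HashMap is built on an array plus linked lists, and this is the array.
size	Total number of stored entries
threshold	Resize threshold; once the number of entries reaches threshold, the table is expanded
loadFactor	Controls how often resizing happens. The default is 0.75, which is fine for most uses.
modCount	Number of modifications; used to detect whether the map was changed during iteration
hashSeed	Helps generate hash values; normally 0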
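The effect of modCount is visible from the outside as fail-fast iteration. The sketch below removes an entry through the map while iterating over its key set, and the iterator reports the change on its next step. The class and method names are hypothetical; only the `java.util` calls are real API.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    // Returns true if modifying the map during iteration was detected.
    static boolean modificationDetected() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Removing through the map (not the iterator) bumps modCount,
                // so the iterator's next() call notices the mismatch.
                map.remove(key);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected());
    }
}
```

When entries have to be removed while iterating, `Iterator.remove()` is the supported way to do it, because it keeps the iterator's expected count in sync.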

Creating a HashMap

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

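These are the two most commonly used constructors. The only difference between them is whether the initial capacity is specified. If you know roughly how much data the map will hold, use the second constructor, which can make the program more efficient. By default the initial capacity of a HashMap is 16. When the number of stored entries exceeds capacity * loadFactor, the map resizes automatically; this is analyzed in detail below.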
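As a rough illustration of choosing the initial capacity, the sketch below derives a capacity from an expected entry count so that capacity * loadFactor is not exceeded. The helper `capacityFor` is hypothetical (it is not part of HashMap), and the default load factor of 0.75 is assumed.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    static final float LOAD_FACTOR = 0.75f; // the default load factor

    // An initial capacity large enough that capacity * loadFactor >= expectedSize.
    // HashMap itself rounds the requested capacity up to a power of two.
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / (double) LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expected = 1000;
        Map<Integer, String> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }
        System.out.println(capacityFor(expected) + " requested, " + map.size() + " stored");
    }
}
```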

The put method

Let's start with HashMap's put(K key, V value) method:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

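Lines 2-3 check whether table is empty and, if so, initialize it, which mainly means setting capacity, threshold and table.
Line 7 computes the hash value for the key. Here is the source: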

    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

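The code differs between JDK versions, but the principle is essentially the same: keep the hash values evenly distributed and minimize collisions, which improves efficiency. This version shifts the hash right by several distances and XORs the results together, so that the high-order bits also influence the low-order bits.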
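To see why the mixing matters, consider two hash codes that differ only in their high-order bits. The sketch below copies the shift-and-XOR step from the hash method above, assuming hashSeed is 0, and compares bucket indexes in a 16-slot table with and without it. The class name and sample values are illustrative only.

```java
class HashSpreadDemo {
    // The bit-mixing step from the hash() method, with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000; // differs from a only above bit 15
        // Without mixing, both land in bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));
        // With mixing, the high bits reach the low bits and the buckets differ.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16));
    }
}
```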

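Line 8 computes the index in the array from the hash value. The implementation is a single line: return h & (length-1); simple, blunt and efficient. The mask works because the table length is always a power of two.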
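The expression `h & (length-1)` gives the same result as a non-negative modulo only when length is a power of two. A small sketch that checks this on a few sample hash values (the samples and the helper `maskMatchesModulo` are made up for illustration):

```java
class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True when the mask agrees with the non-negative modulo for every sampled hash.
    static boolean maskMatchesModulo(int length) {
        int[] samples = {0, 1, 15, 16, 17, 12345, -1, -7,
                         Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : samples) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(maskMatchesModulo(16)); // power of two
        System.out.println(maskMatchesModulo(10)); // not a power of two
    }
}
```

Unlike the `%` operator, the mask never produces a negative index for a negative hash value.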

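Lines 9-17 walk the entries stored at the index computed above. If the key already exists, its value is updated and the old value is returned.
Line 20 adds a new entry if the key has not been put before.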

    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

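First it checks whether the threshold has been reached. If size is greater than or equal to threshold (and the target bucket is already occupied), the table is resized to twice its length. The resizing itself is done by the resize method called on line 3: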

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

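The resize method does the following:
1. Creates a new array to hold the data; as mentioned above, its capacity is twice the previous one
2. Iterates over the data in the old array, recomputes each entry's index in the new array, and places it there
3. Sets the new threshold
A single resize is therefore fairly expensive, so setting a suitable capacity at initialization helps efficiency.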
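To get a feel for that cost, the doubling policy can be simulated without a real map. The sketch below counts how many resizes occur and how many entries are re-indexed in total while inserting a given number of entries. It is a simplification under stated assumptions: a resize fires whenever size >= threshold, ignoring the additional occupied-bucket condition in addEntry, and the load factor is 0.75. The class and method names are hypothetical.

```java
class ResizeCostDemo {
    static final float LOAD_FACTOR = 0.75f;

    // Returns {number of resizes, total entries re-indexed}.
    // Assumption: a resize happens whenever size >= threshold before an insert.
    static long[] simulate(int entries, int initialCapacity) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * LOAD_FACTOR);
        long resizes = 0;
        long moved = 0;
        for (int size = 0; size < entries; size++) {
            if (size >= threshold) {
                moved += size;                       // step 2: every entry is re-indexed
                capacity *= 2;                       // step 1: new array, twice as large
                threshold = (int) (capacity * LOAD_FACTOR); // step 3: new threshold
                resizes++;
            }
        }
        return new long[] {resizes, moved};
    }

    // Step 2 for a single entry: the index is recomputed against the new length.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        long[] byDefault = simulate(1000, 16);
        long[] presized = simulate(1000, 2048);
        System.out.println(byDefault[0] + " resizes, " + byDefault[1] + " entries moved");
        System.out.println(presized[0] + " resizes, " + presized[1] + " entries moved");
        // The same hash can land in a different bucket after doubling.
        System.out.println(indexFor(21, 16) + " -> " + indexFor(21, 32));
    }
}
```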
Back in addEntry, the last line calls the createEntry method:

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

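It takes the Entry currently at that index and makes the new Entry the head of the list, with the old one as its next.

Under the hood, HashMap uses an array plus linked lists: the key's hash value determines the array index, and entries that share an index are chained together in a linked list.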
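The head-insertion behaviour of createEntry can be sketched for a single bucket. The `Node` class below is a hypothetical stand-in for HashMap's Entry that keeps only the key and the next pointer; it is not the JDK's class.

```java
import java.util.ArrayList;
import java.util.List;

class BucketChainDemo {
    // Hypothetical stand-in for HashMap's Entry: a key plus a link to the next node.
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // createEntry-style insertion: the new node becomes the head and points at the old head.
    static Node addAtHead(Node head, String key) {
        return new Node(key, head);
    }

    // Walks the chain from head to tail and collects the keys.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) {
            out.add(n.key);
        }
        return out;
    }

    public static void main(String[] args) {
        Node head = null;
        head = addAtHead(head, "a");
        head = addAtHead(head, "b");
        head = addAtHead(head, "c");
        System.out.println(keys(head)); // [c, b, a]: newest entry first
    }
}
```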

The get method

Compared with put, get is much simpler:

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

It first checks whether key is null; if so, the value is looked up at array index 0.
If key != null, execution reaches line 4, which calls getEntry:

    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

It first computes the index for the key, then walks the linked list stored at that index until it finds an equal key or returns null.
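One consequence of this lookup is that get returns null both for a missing key and for a key that is mapped to null. A short sketch using the public API (the class name and sample keys are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

class GetDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("present", "value");
        map.put("nullValue", null); // key exists, value is null
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.get("present"));
        System.out.println(map.get("nullValue"));          // null: mapped to null
        System.out.println(map.get("missing"));            // null: no such key
        System.out.println(map.containsKey("nullValue"));  // true
        System.out.println(map.containsKey("missing"));    // false
    }
}
```

When the distinction matters, containsKey tells the two cases apart.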

The remove method

    public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }

The remove method is similar to get: it finds the entry for the key and unlinks it from the linked list.
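The source of removeEntryForKey is not shown here, so the following is only a sketch of the general technique of unlinking a node from a singly linked bucket chain, not the JDK's code. `Node`, `chain` and `render` are hypothetical helpers.

```java
class ChainRemoveDemo {
    // Hypothetical stand-in for an entry in one bucket.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain whose order matches the argument order.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Unlinks the first node with the given key and returns the (possibly new) head.
    static Node remove(Node head, String key) {
        Node prev = null;
        for (Node e = head; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev == null) {
                    return e.next;      // removing the head: the bucket now starts at its successor
                }
                prev.next = e.next;     // bypass the removed node
                return head;
            }
        }
        return head;                    // key not found: chain unchanged
    }

    // Renders the chain as "a->b->c" for printing.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("a", "b", "c"), "b")));
        System.out.println(render(remove(chain("a", "b", "c"), "a")));
    }
}
```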

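Summary

1. Under the hood, HashMap uses an array plus linked lists (JDK 1.8 also introduces red-black trees)
2. HashMap is not thread-safe
3. Setting a suitable capacity at initialization helps efficiency; of course, if you really don't know what to set, you can use the default.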
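Regarding point 2: when several threads have to update the same map, one option that this article does not cover is `java.util.concurrent.ConcurrentHashMap`. The sketch below is an illustration under that assumption, with hypothetical class and key names; it increments a shared counter from several threads using the atomic `merge` method.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedCounterDemo {
    // Runs `threads` workers that each increment the same key `perThread` times.
    static int count(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(count(4, 1000));
    }
}
```

With a plain HashMap in place of ConcurrentHashMap, the same loop would be a data race and the final count would not be reliable.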
