HashMap Source Code Analysis and Why It Is Not Thread-Safe

木叶明

Category: Java Basics. Tags: thread safety, hashmap, source code, java

Article link: https://blog.csdn.net/ExecMing/article/details/53241475

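HashMap comes up constantly in everyday project work and in interviews; it is one of the most frequently used data structures. Most developers are comfortable using HashMap, yet many are hazy about how it is implemented and what its source code actually does. This article walks through the HashMap source code and explains why HashMap is not thread-safe in multithreaded applications.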

Please respect the author's work and cite the source when reposting.

I. How HashMap Is Implemented

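HashMap is built from an array combined with linked lists, which gives it both the fast addressing of an array and the cheap insertion and deletion of a linked list. Each array slot holds the head node of one list. As the accompanying diagram shows, a HashMap is essentially an Entry array. Entry is a static nested class of HashMap with four fields: key, value, hash and next. The next field tells us that an Entry is a node in a singly linked list. (The code discussed in this article is the JDK 7 implementation.) So how does HashMap store our data, and how does it find that data again?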

Roughly, the process looks like this:

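			// On put (simplified pseudocode)
			int hash = hash(key);                   // spread the bits of key.hashCode()
			int index = hash & (table.length - 1);  // acts as a modulo because table.length is a power of two
			table[index] = new Entry<>(hash, key, value, table[index]);  // the newest entry becomes the head of the list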

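At this point you may wonder: if two key-value pairs produce the same array index, is one of them in danger of being overwritten? To handle such hash collisions, HashMap uses separate chaining. If two key-value pairs map to the same index of the Entry array, both are stored in the same linked list, and the one stored later becomes the head. For example, the first pair A arrives and its index is computed as 1, so Entry[1] = A. Then pair B arrives and its index is also 1, so B.next = A and Entry[1] = B.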
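The chaining behaviour described above can be sketched with a deliberately tiny table. This is an illustration only, not the JDK code: the class name TinyChainedMap, the fixed table size of 4 and the helper bucketKeys are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical, simplified chained hash table; not the JDK implementation. */
public class TinyChainedMap<K, V> {
    /** One node of a bucket's singly linked list. */
    static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed size (no resizing in this sketch); must be a power of two.
    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[4];

    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    /** Inserts or overwrites; returns the previous value, or null if the key was absent. */
    public V put(K key, V value) {
        int hash = Objects.hashCode(key); // 0 for a null key
        int i = indexFor(hash);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // Head insertion: the new entry points at the old head and replaces it.
        table[i] = new Entry<>(hash, key, value, table[i]);
        return null;
    }

    public V get(Object key) {
        int hash = Objects.hashCode(key);
        for (Entry<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    /** Keys of one bucket, listed from the head of its list. */
    public List<K> bucketKeys(int index) {
        List<K> keys = new ArrayList<>();
        for (Entry<K, V> e = table[index]; e != null; e = e.next) {
            keys.add(e.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<>();
        map.put(1, "A"); // 1 & 3 == 1
        map.put(5, "B"); // 5 & 3 == 1, collides with key 1
        System.out.println(map.bucketKeys(1)); // [5, 1]
    }
}
```

Key 5 is listed before key 1 because the entry stored later sits at the head of the bucket, exactly as in the A and B example.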

By now we have a rough picture of how HashMap works. Next, we turn to the HashMap source code for a closer look.

II. HashMap Source Code Analysis

1. The put method

		    public V put(K key, V value) {
		        if (key == null)
		            return putForNullKey(value);  // the mapping for a null key always lives in the first bucket, table[0]
		        int hash = hash(key);
		        int i = indexFor(hash, table.length); // bucket index: hash & (table.length - 1)
		        // Walk the bucket's list; if the key is already present, overwrite its value and return the old one
		        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
		            Object k;
		            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
		                V oldValue = e.value;
		                e.value = value;
		                e.recordAccess(this);
		                return oldValue;
		            }
		        }
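		        // The key is not in the HashMap yet: bump the modification count and store the new mapping
		        modCount++;
		        addEntry(hash, key, value, i);  // not thread-safe: two threads putting into the same bucket i can overwrite each other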
		        return null;
		    }	
		    
		    void addEntry(int hash, K key, V value, int bucketIndex) {
		        if ((size >= threshold) && (null != table[bucketIndex])) {
		            resize(2 * table.length);
		            hash = (null != key) ? hash(key) : 0;
		            bucketIndex = indexFor(hash, table.length);
		        }

		        createEntry(hash, key, value, bucketIndex);
		    }
		    
		    /**
		     * Offloaded version of put for null keys
		     */
		    private V putForNullKey(V value) {
		        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
		            if (e.key == null) {
		                V oldValue = e.value;
		                e.value = value;
		                e.recordAccess(this);
		                return oldValue;
		            }
		        }
		        modCount++;
		        addEntry(0, null, value, 0);
		        return null;
		    }

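The put source shows that the mapping for a null key always lives in the first bucket, table[0]. It also shows that put is not atomic, and therefore not thread-safe. Suppose two threads put entries into the same map at the same time and both keys map to the same bucket index. Both threads can reach createEntry, read the same head from table[bucketIndex], and each link its new Entry in front of that head. Whichever thread writes table[bucketIndex] last overwrites the other's write, and one entry is silently lost. The size++ update is an unsynchronized read-modify-write as well, so the count can drift. Resizing makes things worse: if both threads see size >= threshold, each allocates its own new Entry array and transfers the existing entries into it, and the last assignment to table wins, discarding whatever the other thread had moved or added. In this JDK 7 code, two concurrent transfer calls can even leave a bucket's list circular, after which a lookup in that bucket never terminates.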
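A real race is timing-dependent and hard to reproduce on demand, so the sketch below replays the problematic interleaving of createEntry step by step on a single thread. Everything in it (the class LostPutDemo, the simplified Entry, bucket index 1) is an assumption made for illustration; it only mirrors the two statements of createEntry.

```java
/** Deterministic, single-threaded replay of the createEntry race; all names are hypothetical. */
public class LostPutDemo {
    /** Simplified list node: only a key and a next pointer. */
    static final class Entry {
        final String key;
        final Entry next;

        Entry(String key, Entry next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Counts the entries reachable from a bucket head. */
    static int length(Entry head) {
        int n = 0;
        for (Entry e = head; e != null; e = e.next) {
            n++;
        }
        return n;
    }

    /** Replays "both threads read the head, then both write it" and returns the final head. */
    static Entry replayRace() {
        Entry[] table = new Entry[16];
        int i = 1; // assume both keys map to bucket 1

        Entry seenByT1 = table[i]; // thread 1: Entry e = table[bucketIndex]
        Entry seenByT2 = table[i]; // thread 2 reads the same (still empty) head

        table[i] = new Entry("A", seenByT1); // thread 1 publishes A
        table[i] = new Entry("B", seenByT2); // thread 2 publishes B; A is no longer reachable

        return table[i];
    }

    public static void main(String[] args) {
        Entry head = replayRace();
        System.out.println(head.key + ", entries in bucket: " + length(head)); // B, entries in bucket: 1
    }
}
```

Two puts were performed, but only one entry is reachable from the bucket afterwards.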

Let's take a brief look at how resize is implemented:

		    void resize(int newCapacity) {
		        Entry[] oldTable = table;
		        int oldCapacity = oldTable.length;
		        if (oldCapacity == MAXIMUM_CAPACITY) {
		            threshold = Integer.MAX_VALUE;
		            return;
		        }

		        Entry[] newTable = new Entry[newCapacity];
		        boolean oldAltHashing = useAltHashing;
		        useAltHashing |= sun.misc.VM.isBooted() &&
		                (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
		        boolean rehash = oldAltHashing ^ useAltHashing;
		        transfer(newTable, rehash);
		        table = newTable;
		        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
		    }

		    /**
		     * Transfers all entries from current table to newTable.
		     */
		    void transfer(Entry[] newTable, boolean rehash) {
		        int newCapacity = newTable.length;
		        for (Entry<K,V> e : table) {
		            while(null != e) {
		                Entry<K,V> next = e.next;
		                if (rehash) {
		                    e.hash = null == e.key ? 0 : hash(e.key);
		                }
		                int i = indexFor(e.hash, newCapacity);
		                e.next = newTable[i];
		                newTable[i] = e;
		                e = next;
		            }
		        }
		    }
		    
		    void createEntry(int hash, K key, V value, int bucketIndex) {
		        Entry<K,V> e = table[bucketIndex];
		        table[bucketIndex] = new Entry<>(hash, key, value, e);
		        size++;
		    }

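The comparison between size and threshold happens in addEntry, and resize is called only when

(size >= threshold) && (null != table[bucketIndex])

holds. Here size is the number of key-value mappings in the map, not the number of non-empty buckets, so the table is doubled only when the number of mappings has reached threshold and the target bucket is already occupied. createEntry also shows that size is incremented only when the key being put is not already present in the HashMap.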
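To make the numbers concrete, here is a small sketch of the two calculations involved: the bucket index and the threshold. The class name ResizeMath is made up for the example; the formulas follow the indexFor bit trick and the threshold line at the end of resize, with HashMap's default capacity of 16 and load factor of 0.75 used as inputs.

```java
/** Worked arithmetic for bucket index and resize threshold; the class name is hypothetical. */
public class ResizeMath {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Bucket index via bit masking; valid because length is always a power of two. */
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    /** Mirrors the threshold computation at the end of resize. */
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24

        int h = 21; // binary 10101
        System.out.println(indexFor(h, 16)); // 5
        System.out.println(indexFor(h, 32)); // 21
    }
}
```

With the defaults, a resize can first be triggered by the put that arrives when 12 mappings are already stored. After doubling from 16 to 32 buckets, the hash 21 moves from bucket 5 to bucket 21, that is, its old index plus the old capacity, which is why transfer has to relink every entry.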

2. The get method

		    public V get(Object key) {
		        if (key == null)
		            return getForNullKey();
		        Entry<K,V> entry = getEntry(key);

		        return null == entry ? null : entry.getValue();
		    }

		    private V getForNullKey() {
		    	// This method also confirms that the null key is always stored in the list at table[0]
		        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
		            if (e.key == null)
		                return e.value;
		        }
		        return null;
		    }

		    final Entry<K,V> getEntry(Object key) {
		        int hash = (key == null) ? 0 : hash(key);
		        for (Entry<K,V> e = table[indexFor(hash, table.length)];
		             e != null;
		             e = e.next) {
		            Object k;
		            if (e.hash == hash &&
		                ((k = e.key) == key || (key != null && key.equals(k))))
		                return e;
		        }
		        return null;
		    }

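If a HashMap is fully populated first and is then only read, multiple threads can call get concurrently without problems: get does not modify the HashMap, so there is nothing to race on. This assumes the finished map is handed to the reader threads safely (for example, it is built before the threads are started) and that no thread calls put or remove afterwards.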
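A minimal sketch of this read-only sharing pattern follows. The class and method names are illustrative assumptions; the map is built completely on one thread, wrapped with Collections.unmodifiableMap so that readers cannot modify it through the shared reference, and only then handed to the worker threads.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Build once, then read from many threads; names are hypothetical. */
public class ReadOnlySharing {
    /** Returns how many lookups across all threads found the expected value. */
    static int readConcurrently(int threads, int keys) {
        Map<Integer, String> building = new HashMap<>();
        for (int k = 0; k < keys; k++) {
            building.put(k, "v" + k);
        }
        // Readers get an unmodifiable view, so they cannot call put or remove through it.
        final Map<Integer, String> shared = Collections.unmodifiableMap(building);

        AtomicInteger hits = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keys; k++) {
                    if (("v" + k).equals(shared.get(k))) {
                        hits.incrementAndGet();
                    }
                }
            });
            // Thread.start() happens-before the worker's first action,
            // so every worker sees the fully built map.
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for readers", e);
        }
        return hits.get();
    }

    public static void main(String[] args) {
        System.out.println(readConcurrently(4, 1000)); // 4000
    }
}
```

Every lookup succeeds because nothing writes to the map once the readers exist. As soon as any thread starts calling put or remove, this guarantee is gone.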

3. The remove method

    public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }

    /**
     * Removes and returns the entry associated with the specified key
     * in the HashMap.  Returns null if the HashMap contains no mapping
     * for this key.
     */
    final Entry<K,V> removeEntryForKey(Object key) {
        int hash = (key == null) ? 0 : hash(key);
        int i = indexFor(hash, table.length);
        Entry<K,V> prev = table[i];
        Entry<K,V> e = prev;

        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++;
                size--;
                if (prev == e)
                    table[i] = next;
                else
                    prev.next = next;
                e.recordRemoval(this);
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }

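Concurrent remove calls can leave an entry in the map even though a thread believes it has removed it: unlinking is an unsynchronized read-then-write of table[i] or prev.next, so one thread can undo or bypass the other's update, and the size-- decrement can likewise be lost.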
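As with put, the failure depends on timing, so the sketch below replays one bad interleaving of two removeEntryForKey calls on a single thread. The class LostRemoveDemo, the simplified Entry and the bucket contents A -> B -> C are assumptions for illustration only.

```java
/** Deterministic, single-threaded replay of two overlapping removals; all names are hypothetical. */
public class LostRemoveDemo {
    /** Simplified list node: only a key and a next pointer. */
    static final class Entry {
        final String key;
        Entry next;

        Entry(String key, Entry next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Returns true if the key is still reachable from the bucket head. */
    static boolean contains(Entry head, String key) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return true;
            }
        }
        return false;
    }

    /** Replays the race on the bucket A -> B -> C and returns the final head. */
    static Entry replayRace() {
        Entry c = new Entry("C", null);
        Entry b = new Entry("B", c);
        Entry a = new Entry("A", b);
        Entry[] table = { a };

        // Thread 1 removes B: it has walked the list and holds prev = A, next = C.
        Entry prev = a;
        Entry next = b.next;

        // Thread 2 removes A in the meantime: prev == e, so it sets table[i] = next, which is B.
        table[0] = a.next;

        // Thread 1 resumes and unlinks B from A, but A is no longer part of the table.
        prev.next = next;

        return table[0];
    }

    public static void main(String[] args) {
        Entry head = replayRace();
        System.out.println(contains(head, "A")); // false
        System.out.println(contains(head, "B")); // true
    }
}
```

Thread 1 returns as if B had been removed, yet B is still the head of the bucket: the removal was applied to a node that another thread had already detached.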
