HashMap, Hashtable, and ConcurrentHashMap: A Short Summary

猫吻鱼

Article link: https://blog.csdn.net/qq_36882793/article/details/103068390

Contents

I. Preface
II. HashMap
III. Hashtable: thread safety through locking
IV. ConcurrentHashMap
V. Summary

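I. Preface

These are my own summary notes written after reading the source, so the organization is loose and some descriptions are not as clear as they could be.
Recommended reading: https://blog.csdn.net/u012403290/article/details/68488562, which explains the topic more clearly than I do. This post is only a personal record.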

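II. HashMap

1. Node

HashMap is backed by an array of an inner class: transient Node<K,V>[] table;
Node is a static nested class, shown below. Each Node holds a reference to the next one, so a Node is essentially an element of a singly linked list.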

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        // ... other code omitted
    }

The four fields mean the following:
hash: the hash value of the key
key: the key of the node
value: the value of the node
next: reference to the next Node in the same bucket

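The structure of a HashMap is shown in the figure below (hand-drawn, so a little rough).

[Figure: hand-drawn diagram of the HashMap structure]
For convenience, I call each array slot together with the linked list or tree hanging off it a bucket. For example, the structure formed by Node A0, Node A1 and Node A2 is one bucket, and Node A0, Node A1 and so on are the nodes of that bucket.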
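
To make the bucket idea concrete, here is a minimal sketch of how a key could be mapped to an array slot. It mirrors the spreading step of hash(key) and the (n - 1) & hash mask that appear in the JDK 8 source quoted later in this post. The class name BucketIndexDemo and its method names are made up for illustration and are not part of the JDK.

```java
final class BucketIndexDemo {
    // Same spreading idea as HashMap.hash(): fold the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // tableLength is assumed to be a power of two, so (tableLength - 1) works as a bit mask.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"A", "B", "Q"}) {
            System.out.println(key + " -> bucket " + indexFor(key, 16));
        }
    }
}
```

With a table of length 16, "A" (hash code 65) and "Q" (hash code 81) both land in slot 1, so they would be chained in the same bucket, while "B" lands in slot 2.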

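2. The put method

The comments in the code are fairly detailed. I tried several times to come up with a good worked example and did not manage one.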

    // The Node array that holds the data
   transient Node<K,V>[] table;

  /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // 1. If the Node array has not been initialized yet, initialize it via resize()
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // 2. If the slot this hash maps to is still empty (table[i] == null, i.e. the bucket has no Node yet),
        //    create a new node and make it the first node of the bucket
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            // 3. If the first node of the bucket has the same key, remember it (e = p) and handle it below
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // 4. If the bucket has already been converted to a red-black tree, insert using the tree rules
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // 5. Otherwise the bucket exists (it has at least one Node) and is still a linked list
                for (int binCount = 0; ; ++binCount) {
                    // 6. If p is the last node (p.next == null), the key is not in this bucket:
                    //    create a new node and append it after p
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
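                        // 7. binCount >= TREEIFY_THRESHOLD - 1 (that is, >= 7) means the bucket already held
                        //    at least 8 nodes before this append, so treeifyBin is called. treeifyBin resizes the
                        //    table instead of building a tree while the table has fewer than 64 slots.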
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // 8. Found a node whose hash and key match: leave the loop; the value is assigned below
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    // 9. Advance p to e and examine the next node
                    p = e;
                }
            }
            // 10. e != null means a node with this key already exists in the bucket
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // 11. Assign the new value and return the old one; put() always passes onlyIfAbsent = false
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // 12. Count the structural modification; iterators use modCount to fail fast
        ++modCount;
        // 13. If the size after insertion exceeds the threshold, resize the table
        if (++size > threshold)
            resize();
        // 14. Post-insertion hook for subclasses (for example LinkedHashMap)
        afterNodeInsertion(evict);
        return null;
    }
	
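    // Whether a tree bucket is turned back into a list is decided in remove -> removeNode -> removeTreeNode,
    // and when a tree bucket is split during resize().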
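
As a small usage illustration of the return-value rules described in the Javadoc above, the following sketch calls put and putIfAbsent on a real java.util.HashMap. The class name PutReturnDemo is made up for this post; only the HashMap calls themselves come from the JDK.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class PutReturnDemo {
    // Returns what each call handed back, followed by the final value and size.
    static Object[] run() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("A", "1");         // no previous mapping
        String second = map.put("A", "2");        // replaces "1" and returns it
        String third = map.putIfAbsent("A", "3"); // onlyIfAbsent = true path: keeps "2"
        return new Object[] {first, second, third, map.get("A"), map.size()};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [null, 1, 2, 2, 1]
    }
}
```

The second put goes through steps 10 and 11 of putVal (existing node found, value replaced), while putIfAbsent takes the same path but leaves the existing non-null value alone.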

3. The get method

get is simpler than put.

  /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    /**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // If the table is not empty and the bucket for this hash is not empty
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // If the first node matches the hash and the key, return it
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // Otherwise, if there is a next node, keep searching
            if ((e = first.next) != null) {
                // If the bucket is a red-black tree, search it using the tree lookup
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                // Otherwise walk down the linked list
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
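
The Javadoc above points out that a null result from get is ambiguous. A minimal sketch of that ambiguity, using a real java.util.HashMap (the class name GetNullDemo is made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

final class GetNullDemo {
    // Returns {get("present") == null, get("missing") == null,
    //          containsKey("present"), containsKey("missing")}.
    static boolean[] run() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null); // HashMap allows null values
        return new boolean[] {
            map.get("present") == null,
            map.get("missing") == null,
            map.containsKey("present"),
            map.containsKey("missing")
        };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("get looks identical: " + (r[0] && r[1]));
        System.out.println("containsKey differs: " + (r[2] != r[3]));
    }
}
```

Both get calls return null, and only containsKey tells the two cases apart.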

4. Iterating with entrySet

One common way to iterate over a HashMap is:

        HashMap<String, String> hashMap = new HashMap<>();
        hashMap.put("A", "1");
        hashMap.put("B", "1");
        Set<Map.Entry<String, String>> set = hashMap.entrySet();
        for (Map.Entry<String, String> entry : set) {
            System.out.println(entry.getKey());
            System.out.println(entry.getValue());
        }

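This is a very common way to iterate. Stepping into the entrySet method shows the code below.
We know that all data in a HashMap lives in the Node[] array, so how does entrySet manage to traverse the whole map?
The entrySet method only lazily initializes the entrySet field, and the EntrySet class itself holds no data of its own.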

    transient Set<Map.Entry<K,V>> entrySet;
    
    public Set<Map.Entry<K,V>> entrySet() {
        Set<Map.Entry<K,V>> es;
        return (es = entrySet) == null ? (entrySet = new EntrySet()) : es;
    }
final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public final int size()                 { return size; }
    public final void clear()               { HashMap.this.clear(); }
    public final Iterator<Map.Entry<K,V>> iterator() {
        return new EntryIterator();
    }
    public final boolean contains(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry<?,?> e = (Map.Entry<?,?>) o;
        Object key = e.getKey();
        Node<K,V> candidate = getNode(hash(key), key);
        return candidate != null && candidate.equals(e);
    }
    public final boolean remove(Object o) {
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            Object key = e.getKey();
            Object value = e.getValue();
            return removeNode(hash(key), key, value, true, true) != null;
        }
        return false;
    }
    public final Spliterator<Map.Entry<K,V>> spliterator() {
        return new EntrySpliterator<>(HashMap.this, 0, -1, 0, 0);
    }
    public final void forEach(Consumer<? super Map.Entry<K,V>> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next)
                    action.accept(e);
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }
}

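Note that **the enhanced for loop (for-each) is only syntactic sugar; it is implemented with an iterator, which is what you see in the decompiled code.** So after compilation our loop is effectively the following, and it calls the iterator() method.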

        Iterator<Map.Entry<String, String>> iterator = set.iterator();
        while (iterator.hasNext()){
            Map.Entry<String, String> next = iterator.next();
            System.out.println(next.getKey());
            System.out.println(next.getValue());
        }

So we step into EntrySet.iterator(), which only creates an EntryIterator, another inner class of HashMap. EntryIterator extends HashIterator, so the chain is EntrySet -> EntryIterator -> HashIterator, all of them inner classes of HashMap.
The source of EntryIterator is shown below. Its next method calls nextNode, inherited from the parent class, i.e. HashIterator.nextNode.

    final class EntryIterator extends HashIterator
        implements Iterator<Map.Entry<K,V>> {
        public final Map.Entry<K,V> next() { return nextNode(); }
    }

The code of HashIterator follows, and it makes everything clear. The comments are in the code.

    abstract class HashIterator {
        Node<K,V> next;        // next entry to return
        Node<K,V> current;     // current entry
        int expectedModCount;  // for fast-fail
        int index;             // current slot

        HashIterator() {
            expectedModCount = modCount;
            // On construction, keep a reference to table in t
            Node<K,V>[] t = table;
            current = next = null;
            // Start scanning from slot 0
            index = 0;
            if (t != null && size > 0) { // advance to first entry
                do {} while (index < t.length && (next = t[index++]) == null);
            }
        }

        public final boolean hasNext() {
            return next != null;
        }
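        // Called by next(): returns the current node and advances to the next one,
        // moving on to the next non-empty slot once the current bucket is exhausted.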
        final Node<K,V> nextNode() {
            Node<K,V>[] t;
            Node<K,V> e = next;
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            if (e == null)
                throw new NoSuchElementException();
            if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
            return e;
        }

        public final void remove() {
            Node<K,V> p = current;
            if (p == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            current = null;
            K key = p.key;
            removeNode(hash(key), key, null, false, false);
            expectedModCount = modCount;
        }
    }
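
The modCount / expectedModCount check in nextNode is what makes these iterators fail fast. The sketch below, assuming a small three-entry map, contrasts removing through the map (which the iterator detects) with removing through the iterator itself (which resynchronizes expectedModCount). The class name FailFastDemo and its helper methods are made up for illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    private static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("A", 1);
        m.put("B", 2);
        m.put("C", 3);
        return m;
    }

    // Structural modification through the map while iterating: the next call to next() fails fast.
    static boolean removeThroughMapThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // bumps modCount behind the iterator's back
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Removal through the iterator: remove() sets expectedModCount = modCount, so iteration continues.
    static int removeOddValuesThroughIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove();
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("map.remove during iteration throws: " + removeThroughMapThrows());
        System.out.println("entries left after iterator.remove: " + removeOddValuesThroughIterator());
    }
}
```

The fail-fast check is a best-effort bug detector rather than a guarantee, so it should not be relied on for correctness.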

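III. Hashtable: thread safety through locking

Hashtable achieves thread safety mainly through locking: its public methods are synchronized, so every operation locks the entire table. Apart from that, the implementation is basically the same as HashMap. Note that Hashtable does not use red-black trees; every bucket is a linked list.

1. The Entry class

Entry plays the same role as Node in HashMap; only the name differs.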

2. The put method

This is even simpler than HashMap's put.

public synchronized V put(K key, V value) {
        // Null check: Hashtable does not allow null values (a null key fails at key.hashCode() below)
        if (value == null) {
            throw new NullPointerException();
        }

        // Check whether the key is already stored in the table
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> entry = (Entry<K,V>)tab[index];
        // Walk the bucket; if the key is found, replace the value and return the old one
        for(; entry != null ; entry = entry.next) {
            if ((entry.hash == hash) && entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        // Otherwise add a new entry
        addEntry(hash, key, value, index);
        return null;
    }

3. The get method

get is simpler still.

    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        // Compute the index
        int index = (hash & 0x7FFFFFFF) % tab.length;
        // Find the bucket and walk its entries
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }
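
Two details of the Hashtable code above are easy to check from outside: null keys and values are rejected, and the index is computed as (hash & 0x7FFFFFFF) % length rather than with a bit mask. A minimal sketch follows; the class name NullHandlingDemo and its helpers are made up for illustration, and the table length 11 is simply Hashtable's default initial capacity.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

final class NullHandlingDemo {
    // True if map.put(key, value) throws NullPointerException.
    static boolean rejects(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Same formula as in Hashtable: clear the sign bit, then take the remainder.
    static int hashtableIndex(Object key, int tableLength) {
        return (key.hashCode() & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        System.out.println("Hashtable rejects null key:   " + rejects(new Hashtable<>(), null, "v"));
        System.out.println("Hashtable rejects null value: " + rejects(new Hashtable<>(), "k", null));
        System.out.println("HashMap rejects null key:     " + rejects(new HashMap<>(), null, "v"));
        System.out.println("index of \"A\" in 11 slots:     " + hashtableIndex("A", 11));
    }
}
```

Clearing the sign bit first keeps the index non-negative even when hashCode() is negative.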

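IV. ConcurrentHashMap

Because Hashtable takes one global lock, it performs poorly under concurrency; ConcurrentHashMap is much more efficient while still being thread-safe. Its idea is to lock only part of the table at a time rather than the whole table as Hashtable does. The implementation differs between JDK 1.7 and JDK 1.8. This post focuses on JDK 1.8 and only touches on JDK 1.7 briefly.

In JDK 1.7: ConcurrentHashMap introduces the concept of a Segment. Each Segment guards several buckets with its own lock, and the Segment locks do not interfere with one another. A write first acquires the lock of the Segment that owns the target bucket.
In JDK 1.8: the locking is finer still, effectively one lock per bucket. Segment is no longer used. Inserting the first node into an empty bucket is done with CAS and no lock; other updates synchronize on the head node of the bucket.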

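1. CAS (Compare And Swap)

1.1 Concept

CAS stands for compare-and-swap. It is an atomic CPU primitive: it checks whether the value at a memory location equals an expected value and, if so, replaces it with a new value, all as one atomic step. In Java it surfaces as the methods of the sun.misc.Unsafe class; when one of the CAS methods of Unsafe is called, the JVM emits the corresponding CAS instruction. The atomicity comes entirely from the hardware. Because the compare and the swap execute as a single uninterruptible instruction, no other thread can modify the value in between, so the operation cannot leave the data in an inconsistent state.

1.2 The core class Unsafe

Unsafe is the core class behind CAS. Java code cannot access the underlying system directly and has to go through native methods; Unsafe acts as a back door through which specific memory locations can be manipulated directly. It lives in the sun.misc package, and its methods operate on memory much like pointers in C. CAS operations in Java are carried out through the methods of Unsafe.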
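
Application code normally does not call Unsafe directly. As a rough illustration of the same compare-and-swap idea through a public API, the sketch below uses java.util.concurrent.atomic.AtomicInteger.compareAndSet in a retry loop. The class name CasCounterDemo and its methods are made up for this post.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasCounterDemo {
    private final AtomicInteger value = new AtomicInteger();

    // Classic CAS retry loop: read, compute, try to swap, retry if another thread got in first.
    int increment() {
        for (;;) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value.get();
    }

    // Runs several threads that all increment one shared counter and returns the final value.
    static int runConcurrently(int threads, int perThread) {
        CasCounterDemo counter = new CasCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 10_000)); // prints 40000
    }
}
```

No lock is taken anywhere; a thread that loses the race simply re-reads the value and tries again.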

2. ConcurrentHashMap

2.1 The three atomic table-access methods of ConcurrentHashMap

	// Volatile read: returns the most recent value of element i of the table.
	static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
	    return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
	}
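	// The four arguments are: target object, offset, expected value, new value.
	// CAS: read element i of tab and compare it with c; if they are the same, replace it with v. The operation is atomic.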
	static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
	                                    Node<K,V> c, Node<K,V> v) {
	    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
	}
	// Volatile write: sets element i of tab so that the change is immediately visible to other threads.
	static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
	    U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
	}
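
The three helpers above depend on the internal Unsafe class. As a rough public-API analogue, and not what ConcurrentHashMap itself uses, java.util.concurrent.atomic.AtomicReferenceArray offers get (volatile read), compareAndSet and set (volatile write) on array slots. The class name SlotCasDemo is made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

final class SlotCasDemo {
    // Returns {first CAS succeeded, second CAS succeeded, slot 1 holds "head-1", slot 2 holds "forced"}.
    static boolean[] run() {
        AtomicReferenceArray<String> tab = new AtomicReferenceArray<>(4);
        boolean first = tab.compareAndSet(1, null, "head-1");  // slot is empty: succeeds, like casTabAt
        boolean second = tab.compareAndSet(1, null, "head-2"); // slot is taken: fails, nothing changes
        tab.set(2, "forced");                                  // volatile write, like setTabAt
        return new boolean[] {
            first,
            second,
            "head-1".equals(tab.get(1)),                       // volatile read, like tabAt
            "forced".equals(tab.get(2))
        };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // prints true false true true
    }
}
```

The failed second CAS is exactly the situation in which a thread has to retry because another thread filled the empty slot first.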

2.2 The put method

    /**
     * Maps the specified key to the specified value in this table.
     * Neither the key nor the value can be null.
     *
     * <p>The value can be retrieved by calling the {@code get} method
     * with a key that is equal to the original key.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with {@code key}, or
     *         {@code null} if there was no mapping for {@code key}
     * @throws NullPointerException if the specified key or value is null
     */
    public V put(K key, V value) {
        return putVal(key, value, false);
    }

    /** Implementation for put and putIfAbsent */
    final V putVal(K key, V value, boolean onlyIfAbsent) {
        // Null check: neither key nor value may be null
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            // Initialize the Node array lazily
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            // Read the head of the bin; if the bin is empty, try to insert a new node
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                // Publish the new node into the empty bin with CAS
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            // If the table is being resized (the head is a forwarding node), help with the transfer first
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                // Lock only the head node of this bin; Hashtable locks the whole table for every operation
                synchronized (f) {
                    // Re-check that f is still the head of the bin after acquiring the lock
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                // Found a node with the same key: replace the value (unless onlyIfAbsent)
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                // Reached the end of the bin without finding the key: append a new node
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        // If the bin is a red-black tree (TreeBin), insert using the tree rules
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                // Decide whether the bin should be converted to a tree
                if (binCount != 0) {
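                    // binCount >= 8 (TREEIFY_THRESHOLD): treeifyBin converts the bin to a tree,
                    // or resizes the table instead while it has fewer than 64 slots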
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
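
To isolate the two key moves of putVal (CAS into an empty bin, synchronized on the head node otherwise), here is a deliberately reduced sketch. It is a hypothetical teaching class, not JDK code: the table has a fixed length of 16, there is no resizing, no tree bins, no removal and no size counter, and AtomicReferenceArray stands in for the Unsafe-based tabAt and casTabAt helpers. All names in it (TinyBinLockMap, Node, spread) are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical teaching class: fixed-size table, no resize, no trees, no removal.
final class TinyBinLockMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;

        Node(int hash, K key, V val) {
            this.hash = hash;
            this.key = key;
            this.val = val;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    private static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;
    }

    V put(K key, V value) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int i = (table.length() - 1) & hash;
        for (;;) {
            Node<K, V> head = table.get(i);          // volatile read of the bin head
            if (head == null) {
                // Empty bin: publish the first node with CAS, no lock needed.
                if (table.compareAndSet(i, null, new Node<>(hash, key, value))) return null;
                continue;                             // lost the race: retry from the top
            }
            synchronized (head) {                     // lock only this bin
                if (table.get(i) != head) continue;   // re-check the head after locking
                for (Node<K, V> e = head; ; e = e.next) {
                    if (e.hash == hash && e.key.equals(key)) {
                        V old = e.val;
                        e.val = value;                // existing key: replace the value
                        return old;
                    }
                    if (e.next == null) {
                        e.next = new Node<>(hash, key, value); // end of bin: append
                        return null;
                    }
                }
            }
        }
    }

    // Lock-free read: walks the bin using volatile reads only.
    V get(Object key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) return e.val;
        }
        return null;
    }
}
```

For example, with String keys "A" and "Q" (which share a bin in a 16-slot table), the first put of "A" goes through the CAS branch and the put of "Q" goes through the synchronized branch. Writers to different bins never contend for the same lock, which is the point of the design.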

2.3 The get method

    public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode());
        // If the bin for this hash is not empty
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
            // If the head node has the same hash and an equal key, return its value
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
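            // A negative hash marks a special node (for example a forwarding node during a resize,
            // or a TreeBin); delegate the lookup to its find method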
            else if (eh < 0)
                return (p = e.find(h, key)) != null ? p.val : null;
            // Otherwise walk the rest of the list
            while ((e = e.next) != null) {
                if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }

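V. Summary

HashMap and ConcurrentHashMap (JDK 1.8) store data in an array plus linked lists and red-black trees; Hashtable uses an array plus linked lists only.
Hashtable achieves thread safety by marking its methods synchronized, which also makes it slow under contention.
ConcurrentHashMap (JDK 1.8) achieves thread safety with CAS plus synchronized: CAS inserts the first node into an empty bucket, and synchronized locks the head node of a non-empty bucket, which keeps other writers out of that bucket. The lock granularity is finer, so throughput is higher.
JDK 1.7 locked by Segment, with one Segment guarding several buckets, which is already finer-grained than Hashtable. JDK 1.8 dropped Segment and uses CAS and synchronized to lock per bucket, which is finer still and more efficient.
In both HashMap and ConcurrentHashMap, TREEIFY_THRESHOLD is 8 and UNTREEIFY_THRESHOLD is 6: a list bucket is converted to a tree when a node is added to a bucket that already holds at least 8 nodes (provided the table has at least 64 slots; otherwise the table is resized instead), and a tree bucket that shrinks to 6 nodes or fewer during a resize is converted back to a list. Hashtable does not use red-black trees.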
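
The thread-safety points above can be exercised with a small experiment: several threads increment one shared key through Map.merge. This is only a sketch; the class name SharedCounterDemo and the key "hits" are made up for illustration. It is run against Hashtable and ConcurrentHashMap, both of which perform merge atomically. A plain HashMap is left out on purpose, because its behavior under unsynchronized concurrent writes is undefined.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedCounterDemo {
    // Starts `threads` workers that each add 1 to the same key `perThread` times.
    static int countWith(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum); // atomic per key in both implementations
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        System.out.println("Hashtable:         " + countWith(new Hashtable<>(), 4, 5_000));
        System.out.println("ConcurrentHashMap: " + countWith(new ConcurrentHashMap<>(), 4, 5_000));
    }
}
```

Both maps end at 20000. The difference between them is not correctness but contention: Hashtable serializes every call on one lock, while ConcurrentHashMap only blocks writers that hit the same bucket.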

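The above is partly my own summary.
If anything here infringes on your rights, contact me and I will remove it. This post is only a personal study record. Corrections are welcome.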
