A Summary of HashMap and Related Collection Classes

Latest recommended article published on 2022-08-13 16:29:46

大川里的小川人


Category: Concurrency

Article link: https://blog.csdn.net/whandwho/article/details/88061686


ConcurrentHashMap

1. Key parameters (JDK 1.8 drops the Segment array):

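There are five constructors: ( ) / ( initialCapacity ) / ( initialCapacity, loadFactor ) / ( initialCapacity, loadFactor, concurrencyLevel ) / ( an object implementing the Map interface )
A note on Java's shift operators: >> is the signed right shift (vacated high bits are filled with the sign bit), >>> is the unsigned right shift (vacated high bits are filled with 0), and << is the left shift (low bits are filled with 0). Java has no <<< operator.
initialCapacity: constructor argument giving the initial capacity of the whole map. In JDK 1.7 it is divided among the segments: each segment's HashEntry[] gets initialCapacity / ssize slots, rounded up to a power of two (minimum 2). The default is 16 and the maximum is 1 << 30.
loadFactor: default 0.75. It is the load factor of each segment and works as in HashMap, because every segment maintains its own HashEntry[]. When the number of entries in a segment exceeds capacity x loadFactor (the resize threshold), that segment's array is resized.
concurrencyLevel: default 16. The expected maximum number of concurrently updating threads, i.e. how many elements the segment array is expected to have.
ssize: the length of the segment array. It is derived from concurrencyLevel as the smallest power of two greater than or equal to concurrencyLevel.
sshift: the number of positions 1 must be shifted left to reach ssize. If ssize = 16, then sshift = 4 and segmentShift = 32 - sshift = 28.
segmentMask: plays a role similar to a subnet mask. When locating an index in the segment array, it is ANDed with the re-hashed value. segmentMask = ssize - 1, which guarantees that its low bits are all 1. The segment array itself can never be resized.
segmentShift: the number of bits the hash value is shifted right after hashing. segmentShift = 32 - sshift.
((hash >>> segmentShift) & segmentMask) locates the index in the segment array, so the high sshift bits of the hash select the segment. segmentShift + sshift = 32 bits.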
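
The arithmetic above can be tried out with a small sketch. The class and method names here (SegmentIndexDemo, ssize, sshift, segmentIndex) are made up for illustration and are not JDK code; the sketch only reproduces the calculation described in the text and assumes the hash passed in has already been re-hashed.

```java
// Illustrative sketch of the JDK 1.7 segment sizing arithmetic (not JDK source).
final class SegmentIndexDemo {
    /** Smallest power of two >= concurrencyLevel: the segment array length. */
    static int ssize(int concurrencyLevel) {
        int ssize = 1;
        while (ssize < concurrencyLevel) ssize <<= 1;
        return ssize;
    }

    /** Number of left shifts of 1 needed to reach ssize. */
    static int sshift(int concurrencyLevel) {
        int sshift = 0;
        for (int s = 1; s < concurrencyLevel; s <<= 1) sshift++;
        return sshift;
    }

    /** Segment index: the top sshift bits of the (already re-hashed) hash. */
    static int segmentIndex(int hash, int concurrencyLevel) {
        int segmentShift = 32 - sshift(concurrencyLevel);
        int segmentMask = ssize(concurrencyLevel) - 1;
        return (hash >>> segmentShift) & segmentMask;
    }

    public static void main(String[] args) {
        System.out.println(ssize(16));                    // 16
        System.out.println(sshift(16));                   // 4
        System.out.println(segmentIndex(0xA0000000, 16)); // top 4 bits are 1010, so 10
    }
}
```

With concurrencyLevel = 17 the same code yields ssize = 32 and sshift = 5, which shows the rounding up to the next power of two.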

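In JDK 1.8:
The constructors above are unchanged, but with the segment array gone the overall structure closely resembles HashMap: a single array Node<K,V>[] table. Locking is per bin: the head node of table[i] serves as the lock for that bin (an empty bin is filled with CAS and needs no lock), which raises the achievable concurrency further. The list hanging off a head node is converted to a red-black tree when it grows past a threshold and back to a list when it shrinks.

sizeCtl:
1. Negative: the table is being initialized or resized. -1 means initialization.
2. Any other negative value means a resize is in progress. The field's source comment describes it as -(1 + the number of active resizing threads). In the JDK 1.8 code the high 16 bits actually hold a resize stamp and the low 16 bits hold 1 + the number of resizing threads (threads that notice the resize during put help with it).
3. Zero or positive: before the table is initialized, the initial table size to use (0 means the default). After initialization, the element count that triggers the next resize, 0.75 x capacity. A resize doubles the table.

static final int MOVED = -1; // hash for forwarding nodes. If the head node of table[i] has hash MOVED, a resize is in progress and this bin has already been moved to the new table. Other bins may not have been moved yet.

static final int TREEBIN = -2; // hash for roots of trees. If the head node of table[i] has hash TREEBIN, the head is a TreeBin node, which holds the root of a red-black tree of TreeNode objects.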
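
To see why sizeCtl is negative during a resize, the following sketch recomputes the value the initiating thread stores. The resizeStamp formula and the two constants follow the JDK 1.8 source; the class SizeCtlDemo and the helper names initialResizeSizeCtl and resizingThreads are hypothetical and exist only for this illustration.

```java
// Illustrative sketch: how sizeCtl is encoded while a resize is running (JDK 1.8 layout).
final class SizeCtlDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    /** Same formula as resizeStamp in JDK 1.8: bit 15 is always set. */
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    /** Value stored in sizeCtl by the thread that starts a resize of a table of length n. */
    static int initialResizeSizeCtl(int n) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2;
    }

    /** Low 16 bits hold 1 + the number of resizing threads. */
    static int resizingThreads(int sizeCtl) {
        return (sizeCtl & 0xFFFF) - 1;
    }

    public static void main(String[] args) {
        int sc = initialResizeSizeCtl(16);
        System.out.println(sc < 0);              // true: the stamp's top bit becomes the sign bit
        System.out.println(resizingThreads(sc)); // 1
    }
}
```

Because the stamp always has bit 15 set, shifting it left by 16 sets the sign bit, so any in-progress resize leaves sizeCtl negative regardless of the table length.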

2. The get, put, and transfer operations (JDK 1.8)

2.1 get

    public V get(Object key) {
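        Node<K,V>[] tab; // local reference to table, the array of bins
        Node<K,V> e, p; // two temporary nodes
        int n, eh;
        K ek;
        int h = spread(key.hashCode()); // re-hash the key's hashCode
        /*
        tabAt reads the slot through Unsafe.getObjectVolatile. Declaring table volatile only makes the
        array reference visible; the volatile read of the element is what lets get see the most recently
        published head of tab[i]. get takes no lock; a write by another thread to the same bin
        becomes visible to get once it has been published.
         */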
        if ((tab = table) != null && (n = tab.length) > 0 &&
                (e = tabAt(tab, (n - 1) & h)) != null) {
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
            else if (eh < 0)
                return (p = e.find(h, key)) != null ? p.val : null;
            while ((e = e.next) != null) {
                if (e.hash == h &&
                        ((ek = e.key) == key || (ek != null && key.equals(ek))))  // walk the list looking for a node with matching hash and key
                    return e.val;
            }
        }
        return null;
    }
    static final int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }
    static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
        return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE); // ultimately calls into Unsafe
    }
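
A small sketch of what spread contributes to the bin index. The spread body and HASH_BITS value are copied from the listing above; the class SpreadDemo and the helper index are illustrative names, not part of the JDK.

```java
// Illustrative sketch: spread() folds the high 16 bits into the low 16 bits
// so that small tables still see differences in the high bits.
final class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // clears the sign bit; negative hashes are reserved

    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    /** Bin index for a table whose length is a power of two. */
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int h1 = 0x00010000, h2 = 0x00020000; // differ only in the high bits
        System.out.println(index(h1, 16) + " " + index(h2, 16));                 // 0 0
        System.out.println(index(spread(h1), 16) + " " + index(spread(h2), 16)); // 1 2
        System.out.println(spread(-1) >= 0);                                     // true
    }
}
```

Without spread the two hash codes collide in bin 0 of a 16-slot table; after spread they land in different bins. The final mask keeps the result non-negative, which matters because negative hash values such as MOVED and TREEBIN mark special nodes.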

2.2 put

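transient volatile Node<K,V>[] table;  // volatile makes the array reference visible; slots are read with volatile semantics and written with CAS, and each bin is locked on its head node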

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException(); // null keys and values are rejected; HashMap allows a null key, which hashes to 0 and lands in index 0
        int hash = spread(key.hashCode());  // compute the hash from hashCode
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {   // the table array of Node bins
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)  // lazily initialize the table
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {  // volatile read through Unsafe
                if (casTabAt(tab, i, null,
                        new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin: tab[i] was null, so a CAS suffices; if the CAS fails the loop retries
            }
            else if ((fh = f.hash) == MOVED)  // a resize is in progress: this thread helps move bins from the old table to the new one; only the initiating thread creates the new table
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {   // lock the head node of tab[i]
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                        ((ek = e.key) == key ||
                                                (ek != null && key.equals(ek)))) {
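                                    oldVal = e.val;   // key already present; overwrite unless onlyIfAbsent
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;  // remember the predecessor; a new node is appended when the end of the list is reached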
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                            value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {   // head is a TreeBin, so this bin is a red-black tree
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                    value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
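                    if (binCount >= TREEIFY_THRESHOLD) // the bin has reached the threshold: convert it to a red-black tree (treeifyBin resizes instead while the table is shorter than MIN_TREEIFY_CAPACITY)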
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
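
The return values and the null check in putVal are visible through the public API. The sketch below uses only ConcurrentHashMap.put, putIfAbsent, and get; the class ChmPutDemo and its two helper methods are hypothetical names for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: observable behaviour of putVal through the public API.
final class ChmPutDemo {
    /** Returns the results of put, put, putIfAbsent, and a final get on the same key. */
    static Integer[] putSequence() {
        ConcurrentHashMap<String, Integer> m = new ConcurrentHashMap<>();
        Integer first = m.put("a", 1);         // null: no previous mapping
        Integer second = m.put("a", 2);        // 1: old value returned, value overwritten
        Integer third = m.putIfAbsent("a", 3); // 2: onlyIfAbsent, so the value is kept
        return new Integer[] { first, second, third, m.get("a") };
    }

    /** True if a null value is rejected with NullPointerException. */
    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, Integer> m = new ConcurrentHashMap<>();
        try {
            m.put("k", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(putSequence())); // [null, 1, 2, 2]
        System.out.println(rejectsNullValue());                       // true
    }
}
```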

2.3 transfer (resizing)

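When is a resize triggered? transfer() is called from addCount(), helpTransfer(), and tryPresize(). tryPresize() is called from treeifyBin() (when the table is still shorter than MIN_TREEIFY_CAPACITY) and from putAll() to presize the table. The threshold check happens in addCount(), which putVal calls after inserting: it adds to baseCount with CAS, falls back to a CounterCell[] array under contention (the same idea as LongAdder), and starts or joins a transfer when the summed count reaches sizeCtl.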
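
The counting scheme can be observed from outside with a rough sketch. It does not reproduce addCount; it uses java.util.concurrent.atomic.LongAdder as a stand-in for the baseCount plus CounterCell[] idea and compares its sum with ConcurrentHashMap.mappingCount(). The class StripedCountDemo and the method run are names invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch: a striped counter (LongAdder) next to the map's own count.
final class StripedCountDemo {
    /** Each thread inserts perThread distinct keys; returns {adder sum, map count}. */
    static long[] run(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // disjoint key range per thread
            workers[t] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    map.put(base + j, j);
                    adder.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { adder.sum(), map.mappingCount() };
    }

    public static void main(String[] args) {
        long[] r = run(4, 10_000);
        System.out.println(r[0] + " " + r[1]); // both 40000 once all threads have finished
    }
}
```

Both numbers are exact only after the writers have finished; while updates are still running, a sum over striped cells is an estimate.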

2.3.1 ForwardingNode<K,V>

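Each ForwardingNode holds a reference to nextTable. During a resize a bin is handled in one of two ways:
1. The bin is not empty: a thread locks the head node, moves the whole bin to the new table, and then replaces table[i] with a ForwardingNode. The next thread that reaches this bin sees the ForwardingNode and skips it.
2. The bin is empty: the thread installs the ForwardingNode in table[i] directly with CAS.

In both cases the ForwardingNode also carries the nextTable reference, so all threads helping with the resize operate on the same nextTable, and readers that hit a forwarded bin continue the lookup there through find().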

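/*
    ForwardingNode extends Node. It is used while several threads cooperate on a resize
    (moving the existing data into the new table).
    A bin whose head is a ForwardingNode has already been processed,
    so other threads do not need to process it. Thread safety: only one thread at a time works on a given bin,
    because the bin is moved as a whole while its head is locked. The ForwardingNode is stored in tab[i] before the lock
    is released, so other threads can tell from the node type that the bin has already been moved to the new table.
     */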
    static final class ForwardingNode<K,V> extends Node<K,V> {
        final Node<K,V>[] nextTable;
        ForwardingNode(Node<K,V>[] tab) {
            super(MOVED, null, null, null);
            this.nextTable = tab;
        }

        Node<K,V> find(int h, Object k) {
            // loop to avoid arbitrarily deep recursion on forwarding nodes
            outer: for (Node<K,V>[] tab = nextTable;;) {
                Node<K,V> e; int n;
                if (k == null || tab == null || (n = tab.length) == 0 ||
                        (e = tabAt(tab, (n - 1) & h)) == null)
                    return null;
                for (;;) {
                    int eh; K ek;
                    if ((eh = e.hash) == h &&
                            ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                    if (eh < 0) {
                        if (e instanceof ForwardingNode) {  // another ForwardingNode: follow it to the next table
                            tab = ((ForwardingNode<K,V>)e).nextTable;
                            continue outer;
                        }
                        else
                            return e.find(h, k);
                    }
                    if ((e = e.next) == null)
                        return null;
                }
            }
        }
    }

2.3.2 Node<K,V>[] tab and Node<K,V>[] nextTable

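tab is the table before the resize and nextTable is the table after it. nextTable is only an intermediate: it is non-null only while a resize is in progress and is set back to null once it has been published as table.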

2.3.3 The transfer function

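/*
    The thread that wins the CAS on sizeCtl starts the resize: it calls transfer with nextTab == null and creates nextTable.
    Helping threads call transfer with the existing nextTable; every thread claims ranges of bins by CAS on transferIndex.
     */
    private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {   // the resize routine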
        int n = tab.length, stride;
        if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
            stride = MIN_TRANSFER_STRIDE; // subdivide range
        if (nextTab == null) {            // initiating: only the thread that started the resize gets here; the preceding CAS on sizeCtl guarantees this
            try {
                @SuppressWarnings("unchecked")
                Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];   // allocate a table with twice the capacity
                nextTab = nt; // nextTab now refers to the new array
            } catch (Throwable ex) {      // try to cope with OOME
                sizeCtl = Integer.MAX_VALUE;   // allocation failed: set sizeCtl to the largest int so that no further resize is attempted
                return;
            }
            nextTable = nextTab;
            transferIndex = n;
        }
        int nextn = nextTab.length;
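        // create one ForwardingNode that refers to nextTab; it is shared by all threads cooperating on the resize
        ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
        boolean advance = true;  // true while the next bin index still has to be claimed
        boolean finishing = false; // to ensure sweep before committing nextTab
        for (int i = 0, bound = 0;;) {  // loops until this thread's work is done
            Node<K,V> f; int fh;
            while (advance) { // claim the next index, or the next stride of bins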
                int nextIndex, nextBound;
                if (--i >= bound || finishing)
                    advance = false;
                else if ((nextIndex = transferIndex) <= 0) {
                    i = -1;
                    advance = false;
                }
                // atomically claim the range [nextBound, nextIndex) by CAS on transferIndex
                else if (U.compareAndSwapInt
                        (this, TRANSFERINDEX, nextIndex,
                                nextBound = (nextIndex > stride ?
                                        nextIndex - stride : 0))) {
                    bound = nextBound;
                    i = nextIndex - 1;
                    advance = false;
                }
            }
            if (i < 0 || i >= n || i + n >= nextn) {
                int sc;
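                // the whole resize has finished
                if (finishing) {
                    nextTable = null; // clear nextTable
                    table = nextTab;// publish the new table
                    sizeCtl = (n << 1) - (n >>> 1); // new threshold: 2n - n/2 = 1.5n, i.e. 0.75 x the new capacity 2n
                    return;
                }
                // this thread has no more bins to claim: atomically decrement the resizer count in sizeCtl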
                if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                    if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                        return;
                    finishing = advance = true;
                    i = n; // recheck before commit
                }
            }
            else if ((f = tabAt(tab, i)) == null)  // empty bin: install the ForwardingNode with CAS
                advance = casTabAt(tab, i, null, fwd);
            else if ((fh = f.hash) == MOVED)   // hash is MOVED: the bin has already been transferred
                advance = true; // already processed
            else {
                // lock the head node of the bin at tab[i]
                synchronized (f) {
                    if (tabAt(tab, i) == f) { // recheck that f is still the head after acquiring the lock
                        Node<K,V> ln, hn;
                        if (fh >= 0) {
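                            // Linked list. The table doubles, so the index mask (n - 1) gains one more 1 bit, the bit with value n.
                            // The list therefore splits into two smaller lists that go into the new table:
                            // nodes with (hash & n) == 0 stay at index i, the others move to index i + n.
                            // The two halves of one old bin never collide with the halves of any other old bin.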
                            int runBit = fh & n;
                            Node<K,V> lastRun = f;
                            for (Node<K,V> p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            }
                            else {
                                hn = lastRun;
                                ln = null;
                            }
                            for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                int ph = p.hash; K pk = p.key; V pv = p.val;
                                if ((ph & n) == 0)
                                    ln = new Node<K,V>(ph, pk, pv, ln);
                                else
                                    hn = new Node<K,V>(ph, pk, pv, hn);
                            }
                            setTabAt(nextTab, i, ln); // i is the low list's slot in nextTab
                            setTabAt(nextTab, i + n, hn);// i + old n is the slot of the other (high) list
                            setTabAt(tab, i, fwd);// replace tab[i] in the old table with the ForwardingNode
                            advance = true;
                        }
                        else if (f instanceof TreeBin) {  // tree bin
                            TreeBin<K,V> t = (TreeBin<K,V>)f;
                            // likewise split into a low and a high part
                            TreeNode<K,V> lo = null, loTail = null;
                            TreeNode<K,V> hi = null, hiTail = null;
                            int lc = 0, hc = 0;
                            for (Node<K,V> e = t.first; e != null; e = e.next) {
                                int h = e.hash;
                                TreeNode<K,V> p = new TreeNode<K,V>
                                        (h, e.key, e.val, null, null);
                                if ((h & n) == 0) {
                                    if ((p.prev = loTail) == null)
                                        lo = p;
                                    else
                                        loTail.next = p;
                                    loTail = p;
                                    ++lc;
                                }
                                else {
                                    if ((p.prev = hiTail) == null)
                                        hi = p;
                                    else
                                        hiTail.next = p;
                                    hiTail = p;
                                    ++hc;
                                }
                            }
                            // a part with at most UNTREEIFY_THRESHOLD nodes is converted back to a linked list
                            ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                    (hc != 0) ? new TreeBin<K,V>(lo) : t;
                            hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                    (lc != 0) ? new TreeBin<K,V>(hi) : t;
                            setTabAt(nextTab, i, ln);  // store the list head or the TreeBin in nextTab[i]
                            setTabAt(nextTab, i + n, hn); // and in nextTab[i + n]
                            setTabAt(tab, i, fwd); // tab[i] becomes the ForwardingNode
                            advance = true;
                        }
                    }
                }
            }
        }
    }
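
The split rule used in transfer, (hash & n) == 0 stays at i and everything else goes to i + n, can be checked in isolation. This is a minimal sketch with hypothetical names (SplitDemo, newIndex); it assumes n is a power of two, as the table length always is.

```java
// Illustrative sketch: where a node lands after the table grows from n to 2n.
final class SplitDemo {
    /** New index after doubling, computed the way transfer decides it. */
    static int newIndex(int hash, int n) {
        int i = hash & (n - 1);                 // index in the old table
        return (hash & n) == 0 ? i : i + n;     // low list stays, high list moves by n
    }

    public static void main(String[] args) {
        int n = 16;
        // 5, 21, 37, 53 all share index 5 in a 16-slot table.
        for (int h : new int[] { 5, 21, 37, 53 }) {
            System.out.println(h + " -> " + newIndex(h, n)); // 5 -> 5, 21 -> 21, 37 -> 5, 53 -> 21
        }
    }
}
```

The result is always the same as recomputing hash & (2n - 1), which is why no node has to be re-hashed during a resize and why the nodes of one old bin can only end up in two new bins.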

HashMap

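Parameters such as initialCapacity and loadFactor have the same meaning as above. Each array slot can be followed by either a linked list or a red-black tree.
HashMap is not thread-safe: it uses neither locks nor lock-free (CAS) techniques.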

2. Three functions: put, get, resize

2.1 get

    public V get(Object key) {  // look up by the hash of key; HashMap allows null keys and values, and putting an existing key overwrites its value
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
                (first = tab[(n - 1) & hash]) != null) {  // locate the bin in table and check its head node first
            if (first.hash == hash && // always check first node
                    ((k = first.key) == key || (key != null && key.equals(k))))// same reference, or equal by equals()
                return first;
            if ((e = first.next) != null) {  // there are more nodes after the head of table[i]; check whether the head is a plain Node or a TreeNode
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key); // TreeNode: search the red-black tree
                do { // walk the linked list
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k)))) // same reference, or equal by equals()
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null; // no node with a matching hash and key was found
    }

2.2 put

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
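     * // putIfAbsent() passes true: an existing non-null value is kept instead of being overwritten
     * @param evict if false, the table is in creation mode.  // not used by HashMap itself; it is passed to the afterNodeInsertion hook (used by LinkedHashMap)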
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0) // table not allocated yet: the constructors do not create it, resize() does on first use
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)// empty bin: create the node and store it directly
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))// head node has the same hash and key: handled in the if (e != null) block below
                e = p; // remember the matching node; its value is replaced below
            else if (p instanceof TreeNode) // tree bin
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {// head exists but its key does not match: walk the list
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {// reached the tail: append the new node
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st; convert the bin to a tree
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
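                            ((k = e.key) == key || (key != null && key.equals(k))))  // same hash and key: leave the loop, e is handled below
                        break;
                    p = e; // advance to the next node
                }
            }
            // a node with the same hash and key exists
            if (e != null) { // existing mapping for key
                // e is non-null only in the matching cases above: the head node matched, or a later node in the bin (list or tree) matched
                V oldValue = e.value;  // read the old value
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value; // replace the value in place (with onlyIfAbsent, only when the old value is null); the node itself is kept
                afterNodeAccess(e);
                return oldValue; // return the previous value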
            }
        }
        ++modCount;
        if (++size > threshold)
            resize(); // after adding a node, resize if size now exceeds the threshold
        afterNodeInsertion(evict);
        return null;
    }

    public V putIfAbsent(K key, V value) {
        return putVal(hash(key), key, value, true, true);
    }
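
The onlyIfAbsent branch and the null handling can be seen through the public API. A short sketch using only HashMap.put, putIfAbsent, and get follows; the class HashMapPutDemo and the method run are names chosen for this example.

```java
import java.util.HashMap;

// Illustrative sketch: null keys, null values, and putIfAbsent in HashMap.
final class HashMapPutDemo {
    static Object[] run() {
        HashMap<String, Integer> m = new HashMap<>();
        Object r1 = m.put(null, 1);         // null: a null key is allowed, no previous mapping
        Object r2 = m.put(null, 2);         // 1: same key, value overwritten
        Object r3 = m.putIfAbsent(null, 3); // 2: existing non-null value is kept
        m.put("k", null);                   // a null value is allowed too
        Object r4 = m.putIfAbsent("k", 4);  // null: the old value was null, so 4 is stored
        return new Object[] { r1, r2, r3, r4, m.get(null), m.get("k") };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run())); // [null, 1, 2, null, 2, 4]
    }
}
```

The fourth call shows the oldValue == null half of the condition in putVal: putIfAbsent treats a key mapped to null like an absent key.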

2.3 resize()

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table; // the old table
        int oldCap = (oldTab == null) ? 0 : oldTab.length; // previous capacity, i.e. the table length
        int oldThr = threshold; // previous threshold
        int newCap, newThr = 0; // new capacity and threshold
        if (oldCap > 0) {  // table already allocated
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                    oldCap >= DEFAULT_INITIAL_CAPACITY)  // double both the table size and the threshold
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold // first put after HashMap(int initialCapacity, ...)
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults // first put after HashMap(): use the defaults
            newCap = DEFAULT_INITIAL_CAPACITY; // table size
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // threshold
        }
        if (newThr == 0) {  // threshold not set above: compute it from newCap and loadFactor
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                    (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr; // store the threshold in the field
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];  // allocate the new table, both for a resize and for first-time initialization
        table = newTab;
        if (oldTab != null) {  // there is an old table to migrate
            for (int j = 0; j < oldCap; ++j) {  // walk the old table
                Node<K,V> e;
                if ((e = oldTab[j]) != null) { // oldTab[j] is not empty
                    oldTab[j] = null;
                    if (e.next == null) // the bin holds a single node
                        newTab[e.hash & (newCap - 1)] = e; // place it directly in newTab
                    else if (e instanceof TreeNode) // tree bin
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // split the old list into a lo list and a hi list
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do { // walk the old list
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead; // terminate the lo list and store its head at the same index
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead; // capacity doubled: the hi list goes to the old index plus the old capacity
                        }
                    }
                }
            }
        }
        return newTab;
    }
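
The capacity and threshold bookkeeping at the top of resize() can be followed with a simplified model. The sketch assumes a map created with the no-argument constructor, distinct keys, small element counts (no MAXIMUM_CAPACITY handling), and no resize triggered by treeifyBin. ResizeMathDemo and capacityAfter are illustrative names, not JDK code.

```java
// Illustrative sketch: table length of a default HashMap after a number of distinct inserts.
final class ResizeMathDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    static int capacityAfter(int inserts) {
        if (inserts <= 0) return 0; // the table is only allocated on the first put
        int cap = DEFAULT_INITIAL_CAPACITY;
        int thr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // 12
        for (int size = 1; size <= inserts; size++) {
            if (size > thr) { // mirrors "if (++size > threshold) resize()"
                cap <<= 1;    // double the capacity
                thr <<= 1;    // double the threshold
            }
        }
        return cap;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12)); // 16
        System.out.println(capacityAfter(13)); // 32: the 13th insert exceeds threshold 12
        System.out.println(capacityAfter(25)); // 64: the 25th insert exceeds threshold 24
    }
}
```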
