HashMap: How Does It Keep Its Capacity a Power of Two?

非理性地界生物

Last modified: 2022-01-22 15:32:44

First published: 2022-01-22 15:18:20

Article link: https://blog.csdn.net/zhy1379/article/details/122637815

First, look at how HashMap defines the container that holds its elements, the member variable table:

    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K, V>[] table;

The key sentence is "When allocated, length is always a power of two": once the array has been allocated, its length is always a power of two.

HashMap's capacity must therefore be a power of two, yet HashMap also offers constructors that let the caller specify a capacity, for example:

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param initialCapacity the initial capacity
     * @param loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *                                  or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                    initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                    loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

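What if the caller passes a value that is not a power of two, such as 10? How does HashMap still end up with a power-of-two capacity?
Let's first test what capacity a HashMap has after being initialized with a value that is not a power of two: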

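    // Requires java.lang.reflect.Field, java.util.HashMap and a JUnit @Test annotation on the classpath.
    @Test
    public void testCapacity() {
        HashMap<String, String> map = new HashMap<>(5);
        // The table is allocated lazily when the first element is added, so put one entry first;
        // otherwise the table field is still null and reading its length throws a NullPointerException.
        map.put("", "");
        System.out.println("HashMap capacity obtained via reflection: " + getHashMapCapacity(map));
    }


    /**
     * Reads the capacity of a HashMap (the length of its internal table) via reflection.
     * Note: on recent JDKs (16 and later) setAccessible fails for java.util internals unless the
     * JVM is started with --add-opens java.base/java.util=ALL-UNNAMED.
     */
    public static int getHashMapCapacity(HashMap<?, ?> map) {
        try {
            Field field = HashMap.class.getDeclaredField("table");
            field.setAccessible(true);
            Object[] objects = (Object[]) field.get(map);
            return objects.length;
        } catch (NoSuchFieldException | IllegalAccessException e) {
            e.printStackTrace();
            return -1;
        }
    }

Output:

    HashMap capacity obtained via reflection: 8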

Passing 5 gives an initial capacity of 8; passing 10 gives 16, and so on.
Let's walk through how this happens.
The code above uses this constructor:

    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

This constructor delegates to another one that also takes a load factor (a pattern that shows up often when reading the JDK source):

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                    initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                    loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
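
The argument checks in this constructor can be observed from the outside without reading HashMap internals. The following is a minimal sketch; the class name `ConstructorChecksDemo` and the helper `rejects` are made up for illustration and are not part of the JDK.

```java
import java.util.HashMap;

final class ConstructorChecksDemo {
    // Hypothetical helper: returns true if HashMap's two-argument constructor
    // rejects the given arguments with an IllegalArgumentException.
    static boolean rejects(int initialCapacity, float loadFactor) {
        try {
            new HashMap<String, String>(initialCapacity, loadFactor);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(-1, 0.75f));                // true: negative capacity
        System.out.println(rejects(16, 0f));                   // true: load factor must be > 0
        System.out.println(rejects(16, Float.NaN));            // true: NaN load factor
        System.out.println(rejects(10, 0.75f));                // false: valid arguments
        // Oversized capacities are clamped to MAXIMUM_CAPACITY rather than rejected,
        // and no table is allocated until the first put.
        System.out.println(rejects(Integer.MAX_VALUE, 0.75f)); // false
    }
}
```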

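The constructor first validates the bounds of the capacity and the load factor. That logic is simple and not the focus of this article, so we set it aside and look at the line that matters: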

        this.threshold = tableSizeFor(initialCapacity);

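This line calls the tableSizeFor method with the initial capacity. The result is parked in threshold for now and is used as the table length when the table is first allocated, so let's look at that method: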

    /**
     * Returns a power of two size for the given target capacity.
     * That is, the smallest power of two that is greater than or equal to the argument.
     *      First let n = cap - 1.
     *      Suppose n in binary is 0{x}1(0|1){m}: x zeros, then a 1, then m arbitrary bits.
     *      In this regex-like notation the method turns   0{x}1(0|1){m}
     *                                          into       0{x}1{m+1}
     *      and then returns n + 1:                        0{x-1}10{m+1}
     *      (x-1 zeros, then a 1, then m+1 zeros).
     *
     *      Example: cap = 10, n = 9 = 0000 0000 0000 1001
     *      n has to become 0000 0000 0000 1111
     *      and the result is n + 1 = 0000 0000 0001 0000 = 16
     *
     *      n starts at cap - 1 so that when cap is already a power of two,
     *      the result is cap itself rather than 2 * cap.
     *
     *      How it works:
     *      0|0=0; 0|1=1; 1|1=1; 1|0=1;   n >>> k shifts n right by k bits, filling with 0 on the left and dropping the bits pushed off the right.
     *      Suppose n = cap - 1 = 0...1*... with m bits after the highest 1 (n has at most 32 bits).
     *      n |= n >>> 1       n = 0...11*...            (m-1 undetermined bits left)
     *      n |= n >>> 2       n = 0...1111*...          (m-3 undetermined bits left)
     *      n |= n >>> 4       n = 0...1111 1111*...     (m-7 undetermined bits left)
     *      n |= n >>> 8       n = 0...(16 ones)*...     (m-15 undetermined bits left)
     *      n |= n >>> 16      n = 0...(32 ones)*...     (m-31 undetermined bits left)
     *      Note: n is an int (4 bytes, 32 bits). When the remaining count m - k drops below 0,
     *      the surplus bits have simply been shifted off the right end and can be ignored.
     *      Finally n + 1 = 0..10... has a single 1 followed by m+1 zeros, which is a power of two.
     */

    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
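
To try the method outside the JDK, it can be copied into a standalone class. The sketch below assumes the same constant value as HashMap's MAXIMUM_CAPACITY (1 << 30); the class name `TableSizeDemo` and the variant `tableSizeForWithoutDecrement` are hypothetical and exist only to show what the initial `cap - 1` is for.

```java
final class TableSizeDemo {
    // Same value as HashMap.MAXIMUM_CAPACITY, copied because the original is not public.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Sets every bit below the highest 1 bit of n.
    static int smear(int n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n;
    }

    // Standalone copy of the logic shown above.
    static int tableSizeFor(int cap) {
        int n = smear(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Hypothetical variant that skips the decrement, for comparison only.
    static int tableSizeForWithoutDecrement(int cap) {
        int n = smear(cap);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] requested = {0, 1, 5, 10, 16, 17};
        for (int cap : requested) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
        // Without the decrement, an exact power of two would be doubled.
        System.out.println("16 without decrement -> " + tableSizeForWithoutDecrement(16));
    }
}
```

With the decrement, 16 maps to 16; without it, 16 would map to 32, which is why n starts at cap - 1.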

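This method returns the smallest power of two that is greater than or equal to its argument, that is, a number whose binary representation has exactly one bit set to 1.
Given cap, n holds cap minus one. The operations below turn n into the form 2^k - 1, where 2^k is the smallest power of two greater than n (all 0s on the left, all 1s on the right); adding one to n then yields a power of two. The operations are: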

        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
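
The effect of these five statements can be checked in isolation. This is a small sketch, not JDK code; the class `SmearDemo` and its method names are invented for the example.

```java
final class SmearDemo {
    // The five shift-and-or steps from tableSizeFor, applied to an arbitrary n.
    static int smear(int n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n;
    }

    // True when n has the form 0...01...1 (zeros followed only by ones).
    static boolean isZerosThenOnes(int n) {
        return n >= 0 && (n & (n + 1)) == 0;
    }

    public static void main(String[] args) {
        int[] samples = {9, 20, 100, 1000};
        for (int n : samples) {
            int s = smear(n);
            System.out.println(Integer.toBinaryString(n) + " -> " + Integer.toBinaryString(s)
                    + ", zeros then ones: " + isZerosThenOnes(s)
                    + ", bits set in s + 1: " + Integer.bitCount(s + 1));
        }
    }
}
```

For each sample the result is all ones below the highest set bit, and adding 1 leaves exactly one bit set.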

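Consider n in its normal range.
Write n (32 bits) as 0{32-x-1}1(0|1){x}, where {k} means the preceding symbol repeats k times. The expression describes a run of 0s, then the highest 1, then x further bits, 32 bits in total: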

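In the table below, `x-m` is the number of still-undetermined bits after a cumulative shift of `m`. Bits that may be 0 or 1 are pushed off the right end first, followed by bits already known to be 1; once `x-m` is negative, `m-x` of the known 1 bits have been pushed off as well.

| Operation | Value of n after the operation |
| --- | --- |
| initial value | 0{32-x-1}1(0\|1){x} |
| n \|= n >>> 1 | 0{32-x-1}11(0\|1){x-1} |
| n \|= n >>> 2 | 0{32-x-1}1111(0\|1){x-3} |
| n \|= n >>> 4 | 0{32-x-1}1111 1111(0\|1){x-7} |
| n \|= n >>> 8 | 0{32-x-1}1111 1111 1111 1111(0\|1){x-15} |
| n \|= n >>> 16 | 0{32-x-1}1111 1111 1111 1111 1111 1111 1111 1111(0\|1){x-31} |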

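The shifts add up to 31 bits (1 + 2 + 4 + 8 + 16), so 31 bits have to be discarded, and the count of undetermined bits ends at x-31, which is not positive.

The x undetermined bits go first, which leaves 31-x bits known to be 1 that must also be discarded.
The number of 1 bits that remain is therefore 32 - (31 - x) = x + 1.
Conclusion: n ends up as 0{32-x-1}1{x+1}.
Adding 1 to n then gives 0{32-x-2}10{x+1}.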
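
The conclusion can be checked by brute force. The sketch below derives x from the position of the highest 1 bit using `Integer.numberOfLeadingZeros` and compares the smeared value with x + 1 ones, written as (1 << (x + 1)) - 1. The class `ClosedFormCheck` and its methods are hypothetical names, and the check covers only n from 1 up to a chosen limit.

```java
final class ClosedFormCheck {
    // The five shift-and-or steps from tableSizeFor.
    static int smear(int n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n;
    }

    // For n > 0: x is the number of bits below the highest 1 bit,
    // and the expected result is x + 1 ones.
    static int expected(int n) {
        int x = 31 - Integer.numberOfLeadingZeros(n);
        return (1 << (x + 1)) - 1;
    }

    // Returns true if smear(n) matches the closed form for every n in [1, limit].
    static boolean holdsUpTo(int limit) {
        for (int n = 1; n <= limit; n++) {
            if (smear(n) != expected(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("closed form holds up to 2^20: " + holdsUpTo(1 << 20));
    }
}
```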

For example, passing 21 gives 20 after subtracting one (0000 0000 0000 0000 0000 0000 0001 0100):

| Operation | Value of n after the operation | Bits shifted so far |
| --- | --- | --- |
| initial value | 0000 0000 0000 0000 0000 0000 0001 0100 | 0 |
| n \|= n >>> 1 | 0000 0000 0000 0000 0000 0000 0001 1110 | 1 |
| n \|= n >>> 2 | 0000 0000 0000 0000 0000 0000 0001 1111 | 3 |
| n \|= n >>> 4 | 0000 0000 0000 0000 0000 0000 0001 1111 | 7 |
| n \|= n >>> 8 | 0000 0000 0000 0000 0000 0000 0001 1111 | 15 |
| n \|= n >>> 16 | 0000 0000 0000 0000 0000 0000 0001 1111 | 31 |
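
The intermediate values for this example can be reproduced with a short trace. This is an illustrative sketch; the class `SmearTrace` and its methods are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;

final class SmearTrace {
    // Records n = cap - 1 and its value after each of the five shift-and-or steps.
    static List<Integer> trace(int cap) {
        List<Integer> steps = new ArrayList<>();
        int n = cap - 1;
        steps.add(n);
        for (int shift = 1; shift <= 16; shift <<= 1) {
            n |= n >>> shift;
            steps.add(n);
        }
        return steps;
    }

    // Formats a value as 32 binary digits, padded with leading zeros.
    static String toBinary32(int value) {
        return String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    public static void main(String[] args) {
        for (int value : trace(21)) {
            System.out.println(toBinary32(value) + " (" + value + ")");
        }
    }
}
```

For cap = 21 the recorded values are 20, 30, 31, 31, 31, 31. Adding 1 to the final value gives 32, the capacity chosen for a requested capacity of 21.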