什么是HashMap？不同版本的HashMap有什么不同？浅谈HashMap

最新推荐文章于 2022-04-20 13:56:08 发布

奋进的loser

最新推荐文章于 2022-04-20 13:56:08 发布

阅读量1.1k

点赞数

文章标签：哈希算法 java 散列表

本文链接：https://blog.csdn.net/weixin_42387745/article/details/122783569

版权

当我们探究什么是HashMap时，应该带着如下的问题去探讨？

1.什么是Hashmap？

中文名哈希映射,HashMap是一个用于存储Key-Value键值对的集合,每一个键值对也叫做Entry。这些个键值对(Entry)分散存储在一个数组当中,这个数组就是HashMap的主干。

2.HashMap的组成？不同jdk版本的HashMap有什么不一样？

在jdk1.7中是由：数组+链表
在jdk1.8中是由：数据+链表+红黑树

3.什么是hash冲突？

hash冲突是是在存入hashmap时，首先map在放入的key值的时候，会先计算当前key值的hashcode值，然后当前数组大小进行取模运算，放入对应的下标，当下标相同并且key值不同时，就会产生冲突，这个时候产生的冲突就叫做hash冲突，也叫hash碰撞。

4.hashmap在jdk1.7中的源码表示（以put 和get方法作为示例）

1.首先当我们进入hashmap中


    /** 初始化数组大小为16
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**		负载因子0.75
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

我们可以看到负载因子为0.75和初始化数组大小为16
	问题1.当我们定义数据大小为 new HashMap(10);此时的默认数组大小为多少？
		答：16，因为HashMap的初始化函数中规定容量大小要是2的指数倍，即2，4，8，16，所以当指定容量为10时，实际容量为16。

4.1 对于put方法

public V put(K key, V value) {
	//  首先判断当前数组是否为空，若为空就进行初始化数组
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        //判断当前传入的key值是否为空，若为空就放入第一位
        if (key == null)
            return putForNullKey(value);
        //计算当前的hash值
        int hash = hash(key);
        //获取当前hash值的数组下标
        int i = indexFor(hash, table.length);
        //首先遍历到对应的数组下标
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            //判断当前数组下标的hash值是否相同，若相同在判断对应的key至是否相同，若相通则替换原来的值
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
		//hahsmap结构被修改的次数
        modCount++;
        //添加对应的数据
        addEntry(hash, key, value, i);
        return null;
    }

当前需要明确的有

如果两个对象的hash值相同，则两个对象不一定相同。
若过两个对象相同，则他们的hash值一定相同。

首先看 inflateTable（） 这个初始化函数

//当前方法是初始化hashmap的
	private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);
		//这个位置是用来判断当前hamap是否需要进行扩容的
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }

当前函数主要是为了计算hashcode值
	final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
    
这个方法是计算出hashcode值的索引下标
	static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
  ` }

疑问：这个位置使用 & 和 % 取模运算结果相同，为什么使用& 而不使用%？（引用哔哩哔哩原话）

提升计算效率：因为2的指数倍的二进制都是只有一个1，而2的指数倍-1的二进制就都是左全0右全1。那么跟（2^n -做按位与运算的话，得到的值就一定在【0,（2^n - 1）】区间内，这样的数就刚合适可以用来作为哈希表的容量大小，因为往哈希表里插入数据，就是要对其容量大小取余，从而得到下标。所以用2^n做为容量大小的话，就可以用按位与操作替代取余操作，提升计算效率。
便于动态扩容后的重新计算哈希位置时能均匀分布元素：因为动态扩容仍然是按照2的指数倍，所以按位与操作的值的变化就是二进制高位+1，比如16扩容到32，二进制变化就是从0000 1111（即15）到0001 1111（即31），那么这种变化就会使得需要扩容的元素的哈希值重新按位与操作之后所得的下标值要么不变，要么+16（即挪动扩容后容量的一半的位置），这样就能使得原本在同一个链表上的元素均匀（相隔扩容后的容量的一半）分布到新的哈希表中。

 添加相应的对象
void addEntry(int hash, K key, V value, int bucketIndex) {
//首先判断当前数组大小是否需要进行扩容，如果需要就以原数组大小的二倍进行扩容
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }
//创建新的对象进行头差法
        createEntry(hash, key, value, bucketIndex);
    }
进行头插法
void createEntry(int hash, K key, V value, int bucketIndex) {
//首先创建一个新的对象，然后对应的数组下标上进行头插法
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

疑问：为什么使用头插法？

	链表的特性：插入快，查找慢
	所以在jdk1.7中，官方使用头插法以提高效率。

4.2 对于get方法

	public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }
    
    final Entry<K,V> getEntry(Object key) {
    //若素组大小为0就返回null
        if (size == 0) {
            return null;
        }
//判断他的key值是否为null，若不为null就返回对应的hash值
        int hash = (key == null) ? 0 : hash(key);
        //根据hashcode值获取对应的数组下标
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            //判断他的hash值和他的key值是否相同
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

	get方法就相对简单，首先获取对应的key值，计算其对应的数组下标，再根据对应的数组下标查找，查到就返回对应的Value值。

5.hashmap在jdk1.8中的与jdk1.7中最大的不同

1.首先加入了红黑树

//记录链表的最大的转换长度，若链表长度大于等于8，就将链表转换成红黑树
	static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
//记录链表的最小的转换长度，若链表长度小于等于6，就将红黑树转换成链表
    static final int UNTREEIFY_THRESHOLD = 6;
    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
//记录当数组大小大于等于64的时候才会将链表的长度进行转换，否则哪怕链表的长度大于8也不进行转换
    static final int MIN_TREEIFY_CAPACITY = 64;

2.HashMap在jdk1.8中使用尾插法
3.比1.7多了判断（看代码的注释）

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        //判断当前数组是否为空，为空就先进行出事话
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        //判断当前节点是否存在值，如果不存在就直接进行添加
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            //判断当前插入的key值是否存在，若存在则进行替换
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            //当前节点已经存在值，判断其是否是链表，若是链表就进行尾插法
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
            //这个地方处理的是红黑树，左子节点<中间节点<右子节点
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            //值存在就进行替换
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        //将数据进行尾插
        afterNodeInsertion(evict);
        return null;
    }