HashMap Source Code Analysis: Core Interview Questions Explained (Part 2)


龙小虬


Article link: https://blog.csdn.net/weixin_43911969/article/details/115363773


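Interview questions

Is HashMap's underlying storage ordered?
How does LinkedHashMap keep its entries in order?
How does HashMap reduce the probability of hash collisions?
Why isn't the key's hashCode used directly as the hash, instead of being XORed with its own high 16 bits?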

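1. Why HashMap's underlying storage is unordered
As mentioned in the previous article, HashMap Source Code Analysis: Core Interview Questions Explained (Part 1), HashMap decides which array slot a key goes into from the key's hashCode.
So HashMap stores its entries scattered (hashed) across the array, in no particular order.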
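
To make the "unordered" claim concrete, here is a minimal sketch of how a key ends up in a bucket. The helper names `spread` and `bucketIndex` are made up for this illustration; they mirror the two expressions from HashMap's source that section 2 of this article discusses, and the table length of 16 assumes the default initial capacity.

```java
import java.util.HashMap;
import java.util.Map;

class BucketOrderDemo {
    // Same mixing step as in HashMap's source: XOR the high 16 bits into the low 16.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n is assumed to be a power of two).
    static int bucketIndex(Object key, int n) {
        return (n - 1) & spread(key);
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        // Insert in reverse alphabetical order.
        for (String k : new String[] {"d", "c", "b", "a"}) {
            map.put(k, k);
            System.out.println(k + " -> bucket " + bucketIndex(k, 16));
        }
        // Iteration walks the buckets, so it reflects bucket positions
        // rather than the order in which the keys were inserted.
        System.out.println(map.keySet());
    }
}
```

With 16 buckets, "a" through "d" map to buckets 1 through 4, regardless of the order in which they were inserted. The position of an entry depends only on its hash, which is why HashMap gives no ordering guarantee.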

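Is there a map that does keep its entries in order? Yes: LinkedHashMap, which additionally threads every entry onto a doubly linked list. Let's first look at how it is used and print the stored data.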

package com.hashmap;

import java.util.LinkedHashMap;

/**
 * @author 龙小虬
 * @date 2021/3/31 14:53
 */
public class Main {
    public static void main(String[] args) {
        LinkedHashMap<String,String> linkedHashMap = new LinkedHashMap<>();
        linkedHashMap.put("a","a");
        linkedHashMap.put("c","c");
        linkedHashMap.put("b","b");
        linkedHashMap.put("d","d");
        // forEach visits the entries in insertion order
        linkedHashMap.forEach((k,v)->{
            System.out.println(k+","+v);
        });
    }
}

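Output (the screenshot from the original post is missing; the program prints):
a,a
c,c
b,b
d,d
The order is a, c, b, d (insertion order), not a, b, c, d. In a HashMap, entries whose keys have different hash codes have no relationship to one another; here the entries are clearly linked in some way.
So how does LinkedHashMap implement this ordering?
Let's dive into the source code.
Start with the most basic key-value object, Entry.
LinkedHashMap.Entry extends HashMap.Node and adds two extra references of its own: Entry<K,V> before, after.
By comparison, HashMap's Node defines only a single reference: Node<K,V> next.
So HashMap is an array plus singly linked lists (one per bucket, for colliding keys). LinkedHashMap keeps that same bucket structure and additionally uses before and after to chain all entries into one doubly linked list that records their order.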
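Looking a little further up in the class, we can see two fields (screenshot missing): a head and a tail.

So there is a head and a tail. How do they get wired into the insertion logic?

There is a method (screenshot missing; presumably linkNodeLast() in JDK 8) that refers to both the head and the tail. So where is it used?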
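IDEA's Find Usages shows that this method is called from newNode().
newNode() in turn is called from putVal().

That one should look familiar: it is the method behind HashMap's put().
The LinkedHashMap version of newNode() runs here because LinkedHashMap overrides it.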
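
The linking step can be sketched as follows. This is a simplified stand-in, not the JDK code: the class `LinkSketch` and its `keysInOrder` helper are hypothetical, and the hash table part is left out entirely so that only the head/tail bookkeeping remains.

```java
import java.util.ArrayList;
import java.util.List;

class LinkSketch<K, V> {
    // Simplified stand-in for LinkedHashMap.Entry: only the ordering links are kept.
    static class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> before, after;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    Entry<K, V> head; // eldest entry
    Entry<K, V> tail; // youngest entry

    // Append p to the end of the doubly linked list.
    void linkNodeLast(Entry<K, V> p) {
        Entry<K, V> last = tail;
        tail = p;
        if (last == null) {
            head = p;          // first entry: it is both head and tail
        } else {
            p.before = last;   // link the new entry back to the old tail
            last.after = p;    // and the old tail forward to the new entry
        }
    }

    // Walk from head to tail and collect the keys in link order.
    List<K> keysInOrder() {
        List<K> keys = new ArrayList<>();
        for (Entry<K, V> e = head; e != null; e = e.after) {
            keys.add(e.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        LinkSketch<String, String> list = new LinkSketch<>();
        for (String k : new String[] {"a", "c", "b", "d"}) {
            list.linkNodeLast(new Entry<>(k, k));
        }
        System.out.println(list.keysInOrder()); // [a, c, b, d]
    }
}
```

Because every new entry is appended at the tail, walking the list from head to tail reproduces insertion order, independent of which bucket each entry lives in.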
The code alone may not make it obvious how the order is maintained, so let's illustrate it with an example.

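Put some data:

put("a","a")
put(97,97)
put("c","c")
put("b","b")

(The diagram from the original post is missing.) Note that "a".hashCode() is 97, the same as the hash code of the Integer 97, so these two keys collide in the same bucket, while the before/after links still record the order a, 97, c, b.
That is how the entries are stored internally. We can also iterate over the map to confirm it.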
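
The four puts above can be checked with a short program. This is a sketch: `bucketIndex` is a hypothetical helper that repeats HashMap's index computation for an assumed table length of 16, and `insertAndListKeys` is just a convenience wrapper for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CollisionOrderDemo {
    // Hypothetical helper mirroring HashMap's index computation for a table of length n.
    static int bucketIndex(Object key, int n) {
        int h = key.hashCode();
        return (n - 1) & (h ^ (h >>> 16));
    }

    // Performs the four puts from the example and returns the keys in iteration order.
    static List<Object> insertAndListKeys() {
        Map<Object, Object> map = new LinkedHashMap<>();
        map.put("a", "a");
        map.put(97, 97);
        map.put("c", "c");
        map.put("b", "b");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // "a" and 97 report the same bucket, yet iteration still follows insertion order.
        for (Object k : insertAndListKeys()) {
            System.out.println(k + " -> bucket " + bucketIndex(k, 16));
        }
    }
}
```

The keys "a" and 97 share a bucket because their hash codes are equal, but they remain separate entries because equals() distinguishes them. The iteration order comes from the linked list, not from the buckets.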

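The result matches, right? But if we already have HashMap, what is LinkedHashMap for? Its main use is as a building block for cache eviction. Common cache eviction algorithms include the following; LinkedHashMap's two iteration orders map directly onto LRU (access order) and FIFO (insertion order):
LRU (Least Recently Used)
LFU (Least Frequently Used)
ARC (Adaptive Replacement Cache)
FIFO (First In, First Out)
MRU (Most Recently Used)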

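Implementing LRU
Look at this constructor (screenshot missing), which takes an accessOrder parameter.
We can set accessOrder ourselves. Every other constructor leaves it at false (insertion order); passing true switches the map to access order, which is the basis of LRU. LRU means that when the cache is full, the least recently used entry is evicted; in LinkedHashMap the eviction itself is done by overriding removeEldestEntry(). Let's compare the iteration results with accessOrder set to true and to false.
Test code: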

package com.hashmap;

import java.util.LinkedHashMap;

/**
 * @author 龙小虬
 * @date 2021/3/31 14:53
 */
public class Main {
    public static void main(String[] args) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        LinkedHashMap<Object,String> linkedHashMap1 = new LinkedHashMap<Object,String>(3,0.75f,true);
        linkedHashMap1.put("a","a");
        linkedHashMap1.put("b","b");
        linkedHashMap1.put("c","c");
        linkedHashMap1.get("a"); // moves "a" to the end of the list
        linkedHashMap1.forEach((k,v)->{
            System.out.println(k+","+v);
        });
        System.out.println("-----------------");
        // accessOrder = false (the default): entries stay in insertion order
        LinkedHashMap<Object,String> linkedHashMap2 = new LinkedHashMap<Object,String>(3,0.75f,false);
        linkedHashMap2.put("a","a");
        linkedHashMap2.put("b","b");
        linkedHashMap2.put("c","c");
        linkedHashMap2.get("a"); // has no effect on the order
        linkedHashMap2.forEach((k,v)->{
            System.out.println(k+","+v);
        });

    }
}

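Output (the screenshot from the original post is missing; the program prints):
b,b
c,c
a,a
-----------------
a,a
b,b
c,c
In linkedHashMap1 (access order), calling get() changes the order of the list: "a" moves to the end. In linkedHashMap2 (insertion order), get() does not change the order.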
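
Access order alone only reorders entries; nothing is evicted yet. A minimal LRU cache sketch combines access order with LinkedHashMap's removeEldestEntry() hook. The class name `LruCache` and the capacity of 2 are illustrative choices, not something from the original post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: least recently used entry comes first
        this.capacity = capacity;
    }

    // Called by put() after each insertion; returning true removes the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "a");
        cache.put("b", "b");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "c"); // over capacity: evicts "b", the least recently used
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

Constructing the same class with accessOrder = false would evict in pure insertion order instead, which gives FIFO behavior.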

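2. HashMap's put() method
There are two key expressions behind HashMap's put() method:
1. (h = key.hashCode()) ^ (h >>> 16)
2. tab[i = (n - 1) & hash]
Why do they matter? Why doesn't HashMap use the key's hashCode directly as the hash, and instead mixes in a shifted copy of it before computing the array index?
Let's first look at h >>> 16 and why a shift is used.
Here is the official explanation from the source.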

/**
  * Computes key.hashCode() and spreads (XORs) higher bits of hash
  * to lower.  Because the table uses power-of-two masking, sets of
  * hashes that vary only in bits above the current mask will
  * always collide. (Among known examples are sets of Float keys
  * holding consecutive whole numbers in small tables.)  So we
  * apply a transform that spreads the impact of higher bits
  * downward. There is a tradeoff between speed, utility, and
  * quality of bit-spreading. Because many common sets of hashes
  * are already reasonably distributed (so don't benefit from
  * spreading), and because we use trees to handle large sets of
  * collisions in bins, we just XOR some shifted bits in the
  * cheapest possible way to reduce systematic lossage, as well as
  * to incorporate impact of the highest bits that would otherwise
  * never be used in index calculations because of table bounds.
  */

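Roughly, it says:
The table length is always a power of two, so the index is taken from only the low bits of the hash. Hash codes that differ only in the bits above the mask therefore always land in the same bucket, which causes collisions.
That explanation is reasonable: we want to avoid hash collisions as much as possible. But then why XOR?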

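XORing h with h >>> 16 lets the high 16 bits take part in the index calculation, which reduces the probability of collisions. The upper half of the result keeps the original high bits unchanged, while the lower half now also carries information from the upper half, which makes the low bits more varied (more random), not less. XOR is preferred over & or | because it is unbiased: its result is 0 when the two bits are equal and 1 when they differ, so 0 and 1 are equally likely, whereas & would skew the low bits toward 0 and | toward 1.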
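
A small experiment shows the effect. The hash codes 0x10000, 0x20000 and 0x30000 are hand-picked values that differ only above bit 15, and the table length of 16 is an assumption for the demo; `rawIndex` and `spreadIndex` are hypothetical helper names.

```java
class SpreadDemo {
    // Index computed from the raw hashCode, without mixing in the high bits.
    static int rawIndex(int hashCode, int n) {
        return (n - 1) & hashCode;
    }

    // Index computed the way HashMap does: XOR the high 16 bits down first.
    static int spreadIndex(int hashCode, int n) {
        return (n - 1) & (hashCode ^ (hashCode >>> 16));
    }

    public static void main(String[] args) {
        int n = 16;
        for (int h : new int[] {0x10000, 0x20000, 0x30000}) {
            System.out.println(Integer.toHexString(h)
                    + ": raw index = " + rawIndex(h, n)
                    + ", spread index = " + spreadIndex(h, n));
        }
    }
}
```

Without spreading, all three hash codes fall into bucket 0 because their low four bits are identical. After spreading they land in buckets 1, 2 and 3.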

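And why the & operation? A result bit of & is 1 only when both input bits are 1. Because the table length n is always a power of two (resizing doubles it), n - 1 is a mask with all 1s in its low bits, so (n - 1) & hash keeps just those low bits. The result always lies between 0 and n - 1, so it can never go out of bounds, and it is equivalent to hash mod n while being cheaper to compute.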
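
The equivalence with the modulo operation can be checked directly. In this sketch, `maskIndex` is a hypothetical helper and the sample hash values are arbitrary; Math.floorMod is used as the reference because Java's % operator can return negative results.

```java
class MaskDemo {
    // Index via bit mask; only valid as a modulo replacement when n is a power of two.
    static int maskIndex(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {0, 17, -1, Integer.MIN_VALUE, 123456789};
        for (int h : hashes) {
            System.out.println(h
                    + ": mask = " + maskIndex(h, n)
                    + ", floorMod = " + Math.floorMod(h, n)
                    + ", % = " + (h % n));
        }
        // With a length that is not a power of two, the mask skips indices.
        System.out.println("n = 10, hash = 2: mask = " + maskIndex(2, 10)
                + ", floorMod = " + Math.floorMod(2, 10));
    }
}
```

For n = 16 the mask and floorMod columns agree for every sample, including negative hashes where % would produce a negative, out-of-bounds index. For n = 10 the mask is 1001 in binary, so only the indices 0, 1, 8 and 9 are reachable, which is why the table length has to be a power of two.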