The hash() Method in HashMap (JDK 1.8)

sunshineing~~~

Published 2022-12-11 17:45:15

Tags: java

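Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.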

Article link: https://blog.csdn.net/weixin_44049210/article/details/128276781

Implementation of the hash function in JDK 1.8:

/**

* Computes key.hashCode() and spreads (XORs) higher bits of hash

* to lower. Because the table uses power-of-two masking, sets of

* hashes that vary only in bits above the current mask will

* always collide. (Among known examples are sets of Float keys

* holding consecutive whole numbers in small tables.) So we

* apply a transform that spreads the impact of higher bits

* downward. There is a tradeoff between speed, utility, and

* quality of bit-spreading. Because many common sets of hashes

* are already reasonably distributed (so don't benefit from

* spreading), and because we use trees to handle large sets of

* collisions in bins, we just XOR some shifted bits in the

* cheapest possible way to reduce systematic lossage, as well as

* to incorporate impact of the highest bits that would otherwise

* never be used in index calculations because of table bounds.

*/

In short: compute the key's hashCode, then use XOR to spread the high bits of the hash into the low bits.

static final int hash(Object key) {

int h;

return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);

}

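Purpose: spread keys more evenly across the array positions (slots, or buckets), reducing hash collisions (different keys producing the same array index) and improving read and write performance.

How it works: call the key's hashCode() method to get its hash code, shift that value right by 16 bits (unsigned), and XOR the two values together.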

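Step-by-step analysis:

1. h = key.hashCode()

hashCode() returns an int, which occupies 4 bytes (32 bits).

2. h >>> 16

Shift the hashCode right by 16 bits without sign extension. For example, "张三".hashCode() is 774889:

Binary:           00000000 00001011 11010010 11101001

Shifted right 16: 00000000 00000000 00000000 00001011

3. hash = h ^ (h >>> 16)

The high 16 bits are unchanged. The low 16 bits are XORed with the high 16 bits, so the high bits now influence the low bits and the result is better mixed.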

00000000 00001011 11010010 11101001

^ 00000000 00000000 00000000 00001011

-------------------------------------

= 00000000 00001011 11010010 11100010
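Steps 1 to 3 can be reproduced with a small program. This is only a sketch: the class name HashSpreadDemo and the helpers spread and bits are made up for illustration, and spread simply copies the expression from HashMap.hash() shown earlier.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash() in JDK 1.8 (helper name is hypothetical):
    // XOR the high 16 bits of the hashCode into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Formats an int as 32 binary digits grouped into bytes,
    // so the output can be compared with the tables in the text.
    static String bits(int value) {
        String s = String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
        return s.substring(0, 8) + " " + s.substring(8, 16) + " "
                + s.substring(16, 24) + " " + s.substring(24);
    }

    public static void main(String[] args) {
        String key = "\u5f20\u4e09"; // the string "张三", written with Unicode escapes
        int h = key.hashCode();      // 774889
        System.out.println("h        = " + bits(h));           // 00000000 00001011 11010010 11101001
        System.out.println("h >>> 16 = " + bits(h >>> 16));    // 00000000 00000000 00000000 00001011
        System.out.println("hash     = " + bits(spread(key))); // 00000000 00001011 11010010 11100010
    }
}
```

The null check is kept so that spread(null) returns 0, matching the original method.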

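4. (n - 1) & hash

This computes the array index (the bucket, or slot) when a value is stored. n is the length of HashMap's internal array; when no initial capacity is specified, n defaults to 2^4 = 16.

(n - 1) = 16 - 1 = 15

That leaves one question: why n - 1?

With the default length of 16 (2^4), the valid array indices are 0 to 15.

Direct calculation: hash % (2^4), that is, the remainder after dividing by the length.

Equivalent calculation: hash & (2^4 - 1). The two agree for non-negative hash values when the length is a power of two, and the bitwise AND is cheaper. (In Java, % can return a negative result for a negative hash, whereas the mask always yields an index from 0 to 15.)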

hash          00000000 00001011 11010010 11100010

& (2^4 - 1)   00000000 00000000 00000000 00001111

----------------------------------------------

=             00000000 00000000 00000000 00000010

Decimal = 2

So "张三" ends up in slot 2.
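The index calculation can be checked the same way. The sketch below uses hypothetical names (IndexForDemo, spread, indexFor) and assumes the table length n is a power of two. It also compares the mask with Java's % operator and Math.floorMod for a negative hash, since % alone can return a negative value.

```java
class IndexForDemo {
    // Spreading step from HashMap.hash(), applied to an int hashCode (hypothetical helper).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is assumed to be a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // default table length
        int hash = spread("\u5f20\u4e09".hashCode()); // "张三"
        System.out.println(indexFor(hash, n)); // 2

        // For a power-of-two n, the mask matches the non-negative remainder,
        // even when the hash is negative.
        int negative = -7;
        System.out.println(indexFor(negative, n));      // 9
        System.out.println(Math.floorMod(negative, n)); // 9
        System.out.println(negative % n);               // -7, not a valid index
    }
}
```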

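In this index calculation only the low four bits of the value take part; all the other bits have no effect. If the raw hashCode were used directly, keys whose hash codes share the same low bits but differ in the high bits would all map to the same slot, greatly increasing the chance of collisions.

With h ^ (h >>> 16), the high bits are folded into the low bits, so such keys are far more likely to land in different slots.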
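A small experiment makes the difference visible. The hash codes below are chosen for illustration (they differ only above bit 15), and the class and method names are hypothetical. Without spreading they all map to slot 0 in a 16-slot table; with spreading they are separated.

```java
class SpreadCollisionDemo {
    // Spreading step from HashMap.hash() (hypothetical helper name).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index when the raw hashCode is masked directly.
    static int indexRaw(int h, int n) {
        return (n - 1) & h;
    }

    // Index when the hashCode is spread first, as HashMap does.
    static int indexSpread(int h, int n) {
        return (n - 1) & spread(h);
    }

    public static void main(String[] args) {
        int n = 16;
        // Illustrative hash codes: identical low 16 bits, different high bits.
        int[] hashCodes = {0x10000, 0x20000, 0x30000, 0x40000};
        for (int h : hashCodes) {
            System.out.println("0x" + Integer.toHexString(h)
                    + " raw=" + indexRaw(h, n)
                    + " spread=" + indexSpread(h, n));
        }
        // raw is 0 for every value; spread gives 1, 2, 3, 4.
    }
}
```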

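According to the source code, the table length is always a power of two, both at initialization and after every resize. For such a length, the mask (length - 1) consists entirely of 1 bits in its low positions, so (length - 1) & hash can produce every index in the table. What would happen if the default length n were 17 instead of 16 (2^4)?

With (n - 1) & hash, the calculation would be: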

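hash               00000000 00001011 11010010 11100010

& (17 - 1) = 16    00000000 00000000 00000000 00010000

------------------------------------------------------------

=                  00000000 00000000 00000000 00000000

Decimal = 0

Because 16 in binary is 00010000, only one bit takes part in the AND; every other bit is masked to 0. The computed index can therefore only be 0 or 16, so all entries would be stored in those two slots and the other indices would never be hit. For a HashMap this would be disastrous: the more entries it holds, the worse its read and write performance becomes.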
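The claim can be verified by enumerating which indices the mask formula is able to produce. This is a sketch with hypothetical names (MaskReachabilityDemo, reachable); the sample of 1000 consecutive hash values is arbitrary.

```java
import java.util.TreeSet;

class MaskReachabilityDemo {
    // Collects every index that (n - 1) & hash produces for hash = 0 .. sampleSize - 1.
    static TreeSet<Integer> reachable(int n, int sampleSize) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int hash = 0; hash < sampleSize; hash++) {
            seen.add((n - 1) & hash);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Power-of-two length: all 16 slots are reachable.
        System.out.println(reachable(16, 1000)); // [0, 1, 2, ..., 15]
        // Length 17: the mask is 16 (binary 10000), so only two slots are reachable.
        System.out.println(reachable(17, 1000)); // [0, 16]
    }
}
```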
