/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
This comment sits above the hash() method of HashMap.
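The method calls hashCode() to obtain the key's hash, then XORs that value with a copy of itself shifted right (unsigned) by 16 bits, so the upper 16 bits are folded into the lower 16 (the hash is a 32-bit int). A null key hashes to 0.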
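As a minimal sketch of the step described above, the following walks one value through the shift and XOR. The class name HashSpreadWalkthrough and the sample key 0x12345678 are chosen here purely for illustration; only the hash() expression comes from the quoted source.

```java
public class HashSpreadWalkthrough {
    // Same expression as the hash() method quoted above.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        Integer key = 0x12345678;   // illustrative key; Integer.hashCode() returns the int value itself
        int h = key.hashCode();     // 0x12345678
        int high = h >>> 16;        // 0x00001234: the upper 16 bits moved into the lower half
        int spread = h ^ high;      // upper half unchanged, lower half = 0x5678 ^ 0x1234

        System.out.println(Integer.toHexString(spread)); // prints 1234444c
        System.out.println(hash(key) == spread);         // prints true
        System.out.println(hash(null));                  // prints 0
    }
}
```

Note that the upper 16 bits of the result are the same as in the original hash code; only the lower half changes.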
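The shift is needed because the table size is always a power of two (not a factorial). Without mixing the high bits into the low bits, sets of hashes that differ only in bits above the current mask would always collide.
(The reason is that the bucket index is obtained by masking the hash with the table length minus one. When the table is small, only the low bits take part in that calculation, so the high bits of the hash code have no effect on placement and many collisions result.)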
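The effect of masking can be sketched as follows. The helper names spread and indexFor, the table length of 16, and the two sample hash values are assumptions made for this illustration; the index expression (n - 1) & hash mirrors how HashMap selects a bucket.

```java
public class MaskCollisionDemo {
    // Fold the upper 16 bits into the lower 16, as hash() does.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Power-of-two masking: keeps only the low bits of the hash.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;        // illustrative table length (a power of two)
        int a = 0x10000;   // these two hashes differ only above bit 15
        int b = 0x20000;

        // Without spreading, both land in bucket 0.
        System.out.println(indexFor(a, n) + " " + indexFor(b, n));                 // prints 0 0

        // With spreading, the high bits reach the masked range.
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n)); // prints 1 2
    }
}
```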
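A known example is a set of Float keys holding consecutive whole numbers, which collide persistently in a small table. The method therefore applies a transform that spreads the influence of the high bits downward. The choice of transform is a tradeoff between speed, utility, and quality of bit-spreading.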
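The Float case can be checked with a rough sketch like the one below. The class name FloatKeyDemo, the key range 1 to 100, and the table length of 256 are assumptions picked for illustration; how much the spreading helps depends on the table length and the keys. Float.hashCode(float) returns the IEEE 754 bit pattern, whose low 16 bits are all zero for small whole numbers.

```java
import java.util.HashSet;
import java.util.Set;

public class FloatKeyDemo {
    // Counts how many distinct buckets the floats 1.0 .. count occupy
    // in a table of the given power-of-two length.
    static int distinctBuckets(int count, int tableLength, boolean spread) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 1; i <= count; i++) {
            int h = Float.hashCode((float) i);   // same value as Float.floatToIntBits
            if (spread) {
                h ^= (h >>> 16);                 // the transform used by hash()
            }
            buckets.add(h & (tableLength - 1));  // power-of-two masking
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        // Without spreading, every key falls into the same bucket.
        System.out.println("raw:    " + distinctBuckets(100, 256, false));
        // With spreading, the keys are distributed over more than one bucket.
        System.out.println("spread: " + distinctBuckets(100, 256, true));
    }
}
```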
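Because many common sets of hashes are already reasonably distributed, and because bins with large numbers of collisions are handled with trees, a single shift and XOR is enough: it costs very little, reduces systematic collisions, and lets the highest bits, which would otherwise never take part in index calculations for a small table, influence the bucket position.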