A First Look at the HashMap Source Code

Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

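In other words, HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional Map operations and permits both a null key and null values. Apart from being unsynchronized and permitting nulls, HashMap is roughly (not exactly) equivalent to Hashtable. The class makes no guarantee about the order of the map, and in particular does not guarantee that the order stays the same over time.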

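The passage above tells us that HashMap permits a null key and null values, and that it is unsynchronized: it is not thread-safe, so concurrent modification without external synchronization can lead to overwritten or lost entries.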
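The difference in null handling is easy to check directly. The following is a minimal sketch; the class name `NullKeyDemo` and the helper `acceptsNulls` are made up for illustration and are not part of the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    /** Hypothetical helper: true if the map accepts a null key mapped to a null value. */
    public static boolean acceptsNulls(Map<String, String> map) {
        try {
            map.put(null, null);
            return true;
        } catch (NullPointerException e) {
            // Hashtable rejects null keys and null values
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        System.out.println("HashMap accepts nulls:   " + acceptsNulls(hashMap));
        System.out.println("Hashtable accepts nulls: " + acceptsNulls(new Hashtable<>()));
        // get(null) returns null both for "absent" and for "mapped to null",
        // so containsKey is the way to tell the two cases apart.
        System.out.println("containsKey(null) = " + hashMap.containsKey(null));
    }
}
```

Because a key may be mapped to null, a null result from get does not by itself mean the key is missing.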

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the “capacity” of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

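Assuming the hash function disperses the elements evenly among the buckets, get and put run in constant time. Iterating over a collection view of a HashMap takes time proportional to its capacity (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, the initial capacity should not be set too high, nor the load factor too low.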

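The passage above says that get and put take constant time when the keys are well distributed, and that when iteration performance matters, a HashMap should not be created with an overly large capacity or an overly small load factor.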

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

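Two parameters of a HashMap instance affect its performance: the initial capacity and the load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the table is created. The load factor measures how full the table is allowed to get before it grows; it is a threshold, not the current fill ratio. For example, with a capacity of 16 and the default load factor of 0.75, the table grows once it holds more than 16 × 0.75 = 12 entries. When the number of entries exceeds the product of the load factor and the current capacity, the table is rehashed, that is, rebuilt with approximately twice the number of buckets. In the OpenJDK implementation the capacity is always a power of two, so a requested initial capacity of 14 is rounded up to 16, which becomes 32 after the first resize.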

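The passage above says that HashMap performance is governed by two parameters, the load factor and the initial capacity, and it describes how the table grows.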
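The growth rule can be modelled in a few lines. This is a simplified sketch rather than the JDK code: `ResizeSketch` and its methods are hypothetical names, the model assumes power-of-two capacities and doubling on resize, and it ignores the maximum-capacity limit.

```java
public class ResizeSketch {
    /** Smallest power of two that is >= requested (assumes requested >= 1). */
    public static int roundUpToPowerOfTwo(int requested) {
        int highest = Integer.highestOneBit(requested);
        return highest == requested ? requested : highest << 1;
    }

    /** Entry count above which a table of the given capacity is resized. */
    public static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    /** Capacity after inserting `entries` distinct keys one by one. */
    public static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = roundUpToPowerOfTwo(initialCapacity);
        for (int size = 1; size <= entries; size++) {
            if (size > threshold(capacity, loadFactor)) {
                capacity <<= 1; // double the number of buckets
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(14));       // 16
        System.out.println(threshold(16, 0.75f));          // 12
        System.out.println(capacityAfter(12, 14, 0.75f));  // 16
        System.out.println(capacityAfter(13, 14, 0.75f));  // 32
    }
}
```

In this model the 13th insertion is the first one that pushes the size past the threshold of 12, which is why the capacity jumps from 16 to 32 at that point.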

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

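As a general rule, the default load factor of 0.75 is a good tradeoff between time and space. A higher load factor lets the buckets fill up further, which improves space utilization but increases lookup cost (reflected in most operations, including get and put). The expected number of entries and the load factor should therefore both be considered when choosing the initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur. For example, with a load factor of 0.75 and at most 750 entries, an initial capacity of 750 / 0.75 = 1000 or more avoids rehashing entirely.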

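The passage above explains that 0.75 was chosen as a tradeoff between space and time. A higher load factor makes get slower because more entries share the table before it grows, so hash collisions become more likely. Colliding entries are stored in a linked list or a red-black tree within the same bucket, and a lookup must then search that structure instead of finding the entry directly.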
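The sizing rule quoted above (initial capacity relative to the maximum number of entries divided by the load factor) turns into a one-line calculation. A minimal sketch, where `PresizeSketch` and `initialCapacityFor` are hypothetical names introduced only for this example:

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeSketch {
    /** Initial capacity to request so that expectedEntries mappings fit without a rehash. */
    public static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    public static void main(String[] args) {
        int expected = 750;
        int capacity = initialCapacityFor(expected, 0.75f);

        // Pass both values to the two-argument constructor.
        Map<Integer, String> map = new HashMap<>(capacity, 0.75f);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }
        System.out.println("requested capacity = " + capacity + ", size = " + map.size());
    }
}
```

On JDK 19 and later, the static factory `HashMap.newHashMap(int numMappings)` performs a similar calculation for the default load factor.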

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same hashCode() is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.

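If many mappings are to be stored in a HashMap, creating it with a sufficiently large capacity stores them more efficiently than letting the table grow through automatic rehashing as needed. Using many keys that share the same hashCode() value is a sure way to slow down any hash table. To lessen the impact, when the keys are Comparable, HashMap may use the comparison order among keys to help break ties.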

A sufficiently large initial capacity is more efficient because no rehash is needed, that is, entries never have to be moved from the old table to a new one.
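The remark about keys that share a hashCode() can be illustrated with a deliberately bad key type. In this sketch `BadKey` is a hypothetical class whose hash is constant, so every entry collides; it implements Comparable so that the map is free to use the comparison order to break ties, as the Javadoc describes. The example only shows that lookups remain correct and makes no claim about timings.

```java
import java.util.HashMap;
import java.util.Map;

public class CollidingKeysDemo {
    /** Hypothetical key type: every instance has the same hash code. */
    public static final class BadKey implements Comparable<BadKey> {
        final int id;

        public BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // constant on purpose

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Builds a map with n colliding keys, each mapped to its own id. */
    public static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(1000);
        System.out.println("size = " + map.size());
        System.out.println("get(777) = " + map.get(new BadKey(777)));
    }
}
```

All 1000 keys end up in the same bucket, yet each one is still found through equals.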
