How HashMap Is Implemented

pmdream

Reference: https://www.cnblogs.com/chengxiao/p/6059914.html

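Background:

Arrays: an array stores data in contiguous memory cells. Looking up an element by index is O(1), but searching for a given value means comparing elements one by one, which is O(n). A sorted array can use binary search, which brings the search down to O(log n). Ordinary insertion and deletion involve shifting elements, so their average complexity is also O(n).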
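The three kinds of array lookup described above can be compared in a small sketch. The class name `ArrayLookupDemo` and the helper `linearSearch` are made up for illustration; `Arrays.binarySearch` is the standard library method and requires sorted input.

```java
import java.util.Arrays;

class ArrayLookupDemo {
    // Linear scan: compares elements one by one, O(n) in the worst case.
    static int linearSearch(int[] data, int target) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] sorted = {3, 8, 15, 23, 42, 57};
        System.out.println(sorted[4]);                       // access by index, O(1)
        System.out.println(linearSearch(sorted, 23));        // search by value, O(n)
        System.out.println(Arrays.binarySearch(sorted, 23)); // sorted input only, O(log n)
    }
}
```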

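Linked lists: insertion and deletion only need to update the references between nodes, so they are O(1) once the position is known. Lookup still has to compare node by node, so it is O(n).

Binary trees: for a reasonably balanced binary tree, insertion, lookup and deletion all have an average time complexity of O(log n).

Hash tables: insertion, deletion and lookup are all very fast, O(1) when hash collisions are ignored.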

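A data structure has only two physical storage layouts: sequential storage and linked storage. (All data structures, such as stacks, trees and graphs, map onto one of these two physical layouts in memory.)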

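The backbone of a hash table is an array.

To insert or look up an element: storage location = f(key)

f is the hash function, and how well it is designed directly determines how good the hash table is.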
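As a rough illustration of "storage location = f(key)", the sketch below uses a deliberately simple f: the key's `hashCode()` reduced modulo the array length. The names `BucketIndexDemo` and `indexOf` are hypothetical, and this is not the function HashMap itself uses.

```java
class BucketIndexDemo {
    // Hypothetical f: maps any non-null key to a slot in [0, capacity).
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int indexOf(Object key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        String[] slots = new String[8];
        String key = "apple";
        slots[indexOf(key, slots.length)] = "red";             // insert: compute the slot, write once
        System.out.println(slots[indexOf(key, slots.length)]); // lookup: the same f gives the same slot
    }
}
```

Insertion and lookup both cost one call to f plus one array access, which is where the O(1) figure comes from as long as two keys do not land in the same slot.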

How are hash collisions resolved?

1. How is HashMap implemented?

The backbone of HashMap is an Entry array. Entry is HashMap's basic building block, and each Entry holds one key-value pair.

The length of the backbone array is always a power of two.

transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
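One reason a power-of-two length is convenient: the slot can be computed with a bit mask instead of a division. The sketch below borrows the name `indexFor` from the put code quoted in this post; the class name and the sample hash values are made up for illustration.

```java
class PowerOfTwoIndexDemo {
    // Mask-based index. Only valid when length is a power of two,
    // because then (length - 1) is a run of low-order 1 bits.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        int[] hashes = {0, 5, 17, 31, -1, 123456789};
        for (int h : hashes) {
            System.out.println(h + " -> " + indexFor(h, length)
                    + " (floorMod gives " + Math.floorMod(h, length) + ")");
        }
    }
}
```

For a power-of-two length the mask and the non-negative modulo agree, including for negative hash values.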

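Side question: what does the transient keyword do?

What is serialization?

Java serialization represents an object as a sequence of bytes.

These bytes contain the object's type and data. A serialized object can be written to a database or a file, and it can also be sent over the network.

When using a cache or making remote procedure calls (RPC; for concrete RPC examples, read the source of projects on GitHub), classes often need to implement the Serializable interface.

What is the purpose of serialization? So that the bytes can later be deserialized and restored into a Java object.

So what does the transient keyword do?

It excludes the fields it marks from serialization. But why would you exclude a field, and in which situations?

1. The field's value can be derived from other fields. For example, a rectangle class with three fields, length, width and area (only an example; you would not normally design it this way), has no need to serialize the area.

2. Anything else the business logic requires: whichever fields you do not want serialized.

For example, in HashMap: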
transient int modCount;
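Why keep a field out of serialization? Mainly to save storage space. I do not see many other benefits, and there can be a downside (some fields may have to be recomputed or re-initialized after deserialization), but overall the benefits outweigh the costs. For modCount in particular, the value is only bookkeeping for iterators over the live map, so there is nothing useful to restore.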
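The rectangle example can be tried out directly. The sketch below is illustrative only: the class `SerializableRectangle` and its `roundTrip` helper are hypothetical names, and the round trip goes through an in-memory byte array rather than a file or the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializableRectangle implements Serializable {
    private static final long serialVersionUID = 1L;

    final int length;
    final int width;
    transient int area; // derivable from length and width, so it is not written out

    SerializableRectangle(int length, int width) {
        this.length = length;
        this.width = width;
        this.area = length * width;
    }

    // Serializes the object to bytes in memory and reads it back.
    static SerializableRectangle roundTrip(SerializableRectangle r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableRectangle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableRectangle copy = roundTrip(new SerializableRectangle(3, 4));
        // length and width survive; the transient field comes back as its default value
        System.out.println(copy.length + " x " + copy.width + ", area field = " + copy.area);
    }
}
```

After deserialization the transient field holds the default value for its type, so a real class would recompute it, for example in a custom readObject method.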

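Related topic: the Java keywords transient and volatile

Entry is a static nested class of HashMap (it is named Node in the JDK source shown here). The code is as follows: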

/**
     * Basic hash bin node, used for most entries.  (See below for
     * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;   // hash of the key, cached so it is not recomputed
        final K key;
        V value;
        Node<K,V> next;   // reference to the next node in the same bucket (singly linked list)

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

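// Node implements the Map.Entry interface and carries the generic type parameters K and V

Nodes form a singly linked list. The result of applying the hash function to the key's hashCode is stored in the node, so it does not have to be recomputed.

HashMap is made up of an array plus linked lists. The array is the main body, and the linked lists exist mainly to resolve hash collisions. If the located array slot holds no chain (the entry's next is null), lookup and insertion are fast and need only one addressing step. If the slot holds a chain, insertion is O(n) in the length of the chain: the chain is traversed first, an existing key is overwritten, and otherwise a new entry is added. Lookup also has to walk the chain, comparing keys one by one with the key object's equals method. So for performance, the fewer chains a HashMap contains, the better.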
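To make the array-plus-chains layout concrete, here is a minimal sketch of a separately chained map. It is not the JDK's code: `ChainedMap` is a hypothetical class, the table is fixed at 16 buckets and never resized, and new nodes are simply placed at the head of the chain.

```java
import java.util.Objects;

class ChainedMap<K, V> {
    // One node per mapping; nodes in the same bucket are linked through next.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16]; // fixed size, no resizing
    private int size;

    // Mixes the high bits into the low bits before masking (an assumption of this sketch).
    private static int hash(Object key) {
        int h = Objects.hashCode(key);
        return h ^ (h >>> 16);
    }

    public V put(K key, V value) {
        int h = hash(key);
        int i = h & (table.length - 1);
        // Walk the chain: overwrite if the key already exists.
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(h, key, value, table[i]); // otherwise link a new node at the head
        size++;
        return null;
    }

    public V get(Object key) {
        int h = hash(key);
        for (Node<K, V> e = table[h & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedMap<Integer, String> m = new ChainedMap<>();
        m.put(1, "one");
        m.put(17, "seventeen"); // 1 and 17 share bucket 1 in a 16-slot table
        System.out.println(m.get(1) + ", " + m.get(17) + ", size = " + m.size());
    }
}
```

Because the table never grows here, chains get longer as entries are added, which is exactly the cost the paragraph above warns about.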

Question: how can HashMap performance be improved, that is, how can the number of chains be kept as small as possible?

Other important fields

// The number of key-value pairs actually stored
   /**
     * The number of key-value mappings contained in this map.
     */
transient int size;
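// Resize threshold. While table == {} it holds the initial capacity (16 by default); once table has been filled, i.e. memory has been allocated for it, threshold is normally capacity * loadFactor. HashMap consults threshold when resizing, which is discussed in more detail later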
int threshold;
// Load factor: how full the table is allowed to get; the default is 0.75
final float loadFactor;
// Used for fail-fast behaviour. HashMap is not thread-safe, so if the map's structure changes while it is being iterated (for example another thread calls put or remove), ConcurrentModificationException has to be thrown
 /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
transient int modCount;

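Note that HashMap is not thread-safe. If another thread changes the map's structure while it is being iterated (for example through put or remove), ConcurrentModificationException is thrown.

So modCount records how many times the structure has been modified. An iterator remembers the value it saw when it was created and throws if the count has changed since.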
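The fail-fast check does not need a second thread to show up. The sketch below, with the hypothetical class `FailFastDemo`, triggers it from a single thread by modifying the map directly inside a for-each loop, then shows the supported alternative of removing through the iterator. The API documentation describes fail-fast behaviour as best-effort, so treat this as a demonstration rather than a guarantee to rely on.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    private static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Returns true if changing the map behind the iterator's back was detected.
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.put("new-" + key, 0); // structural change that bypasses the iterator
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Removing through the iterator keeps its bookkeeping in sync, so no exception is thrown.
    static int removeOddValuesViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove();
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("detected: " + modifyWhileIterating());
        System.out.println("entries left: " + removeOddValuesViaIterator());
    }
}
```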

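HashMap resizing

HashMap has 4 constructors. When the caller does not pass initialCapacity and loadFactor, the constructor falls back to the default values.

initialCapacity defaults to 16 and loadFactor defaults to 0.75.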

/**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }

    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }
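The two-argument constructor above passes initialCapacity through tableSizeFor, which yields a power-of-two size, and the fields section describes threshold as capacity * loadFactor. The sketch below illustrates both calculations. It is not the JDK's implementation: `CapacityDemo`, `roundUpToPowerOfTwo` and `threshold` are hypothetical names, and the rounding is written with `Integer.highestOneBit` for readability.

```java
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= cap, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOfTwo(int cap) {
        if (cap <= 1) {
            return 1;
        }
        if (cap >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        // highestOneBit(cap - 1) is the largest power of two <= cap - 1; doubling it reaches >= cap
        return Integer.highestOneBit(cap - 1) << 1;
    }

    // Number of entries at which a table of the given capacity would be resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {1, 10, 16, 17, 1000}) {
            int capacity = roundUpToPowerOfTwo(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + threshold(capacity, 0.75f));
        }
    }
}
```

With the defaults (capacity 16, load factor 0.75) the threshold works out to 12 entries.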

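I am not sure why, but the code above, taken from my own JDK source, differs from the code in the original blog post. Most likely the two come from different JDK versions: names such as Entry, EMPTY_TABLE, init() and inflateTable match JDK 7, while Node and tableSizeFor match JDK 8.

I will still treat my own source as the primary reference.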

public HashMap(int initialCapacity, float loadFactor) {
        // Validate the requested initial capacity; it may not exceed MAXIMUM_CAPACITY = 1 << 30 (2^30)
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        this.loadFactor = loadFactor;
        threshold = initialCapacity;

        init(); // init() has no real implementation in HashMap, but subclasses such as LinkedHashMap provide one
    }
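// The code above is the original blogger's version

As this code shows, the regular constructors do not allocate memory for the table array (the constructor that takes another Map as its argument is the exception). The table array is only actually built when put is called.

So how is put implemented?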

 public V put(K key, V value) {
        // If table is still the empty array {}, inflate it (allocate the actual memory for table).
        // The argument is threshold, which at this point holds initialCapacity, by default 1 << 4 (2^4 = 16)
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // A null key is stored in table[0] or on the collision chain of table[0]
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key); // further mixes the key's hashCode so that entries spread evenly
        int i = indexFor(hash, table.length); // the actual slot in table
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            // If a mapping for this key already exists, overwrite it: replace the old value and return it
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++; // counts structural changes so that iteration fails fast if the map is modified meanwhile
        addEntry(hash, key, value, i); // add a new entry
        return null;
    }
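The behaviour visible to callers of put can be checked against the real java.util.HashMap: a new key returns null, an existing key returns the value that was replaced, and a null key is accepted. The class `PutSemanticsDemo` and its helper `putResults` are made up for this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutSemanticsDemo {
    // Collects what each call returns so the results can be inspected together.
    static List<Integer> putResults() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("k", 1);      // no previous mapping for "k"
        Integer second = map.put("k", 2);     // overwrites; the old value is returned
        Integer nullKey = map.put(null, 0);   // HashMap permits one null key
        Integer lookedUp = map.get(null);
        return Arrays.asList(first, second, nullKey, lookedUp, map.size());
    }

    public static void main(String[] args) {
        System.out.println(putResults());
    }
}
```

Overwriting an existing key does not change the size, which matches the put code above returning before modCount++ and addEntry are reached.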