HashMap Source Code Analysis

Latest recommended article published on 2024-01-14 18:43:13

长腿欧巴的痘痘

Category: android技术开发 (Android development)
Tags: hashmap, source code, data structures, linked list, interview

Article link: https://blog.csdn.net/DZ266912/article/details/78233465

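HashMap comes up in a lot of interviews, and I used to wonder why, out of all the data structures, interviewers keep picking HashMap. Only after reading the source did I realize how many concepts are packed into it; any single one of them is enough for an interviewer to grill you until you question your life choices.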

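HashMap data structure

First, two common data structures and their trade-offs:

Array

An array is stored contiguously in memory, so traversal and index-based access are fast. Inserting into an array, however, requires shifting every element after the insertion point one slot back, so insertion is comparatively slow.

Linked list

HashMap uses a singly linked list, which is made up of nodes (Node). A list holds any number of nodes, and they are not stored contiguously; they are scattered across memory. Each node stores its own data plus a reference to the next node, and that reference is the only way to reach the next node.

Every operation on a linked list starts, directly or indirectly, from the head, so reaching an element means walking the list, which is slower than indexing into an array. Inserting or deleting, on the other hand, is faster than with an array once you are at the right position, because only the next pointers of the neighboring nodes have to change.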

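ArrayList is backed by an array and LinkedList by a linked list, so random access to an element is faster with ArrayList than with LinkedList, while adding or removing elements at the front is cheaper with LinkedList.
If you mostly append to the end of the list rather than insert at the front or in the middle, and you need random access, ArrayList performs better. If you mostly add or remove at the front or in the middle and read the elements sequentially, LinkedList is the one to use.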
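To make the trade-off concrete, here is a minimal sketch that inserts at the front of both list types. The class name ListChoiceDemo and the helper prepend are made up for this illustration; both lists end up with the same contents, only the work done per insertion differs.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListChoiceDemo {
    // Inserts the values 0..n-1 at the front of the given list and returns it.
    static List<Integer> prepend(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            // ArrayList shifts all existing elements; LinkedList only relinks the head.
            list.add(0, i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> array = prepend(new ArrayList<>(), 5);
        List<Integer> linked = prepend(new LinkedList<>(), 5);
        System.out.println(array);        // [4, 3, 2, 1, 0]
        System.out.println(linked);       // [4, 3, 2, 1, 0]
        // Index-based read: constant time for ArrayList, a walk from one end for LinkedList.
        System.out.println(array.get(2)); // 2
    }
}
```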

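Array + linked list

HashMap combines the two: an array plus linked lists. It uses an array of HashMapEntry to represent this structure, where each array slot holds the head of a linked list. See the figure:
[Figure: HashMap's array + linked list structure]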

HashMap usage

Using HashMap is simple:

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
HashMap<String, String> hashMap = new HashMap<>();
hashMap.put("1", "123");
hashMap.put(null, "null");
// put returns the previous value for the key, here the String "null"
String aNull = hashMap.put(null, "not null");
hashMap.put("1", "1");
hashMap.put("2", "2");
hashMap.put("3", "3");
hashMap.put("4", "4");
// Iterate over all key-value pairs
Iterator<Map.Entry<String, String>> iterator = hashMap.entrySet().iterator();
while (iterator.hasNext()) {
    Map.Entry<String, String> next = iterator.next();
    String key = next.getKey();
    String value = next.getValue();
}
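As a quick check of what put returns, the sketch below replays the first few calls from the listing against java.util.HashMap and collects the return values. The class PutReturnDemo and the method replay are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutReturnDemo {
    // Replays the first put calls from the listing and collects what each one returns.
    static List<String> replay(Map<String, String> map) {
        List<String> returned = new ArrayList<>();
        returned.add(map.put("1", "123"));       // new key: returns null
        returned.add(map.put(null, "null"));     // new null key: returns null
        returned.add(map.put(null, "not null")); // existing key: returns the old value, the String "null"
        returned.add(map.put("1", "1"));         // existing key: returns the old value "123"
        return returned;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        List<String> returned = replay(map);
        System.out.println(returned.get(0) == null); // true
        System.out.println(returned.get(2));         // null (the String, not a null reference)
        System.out.println(returned.get(3));         // 123
        System.out.println(map.size());              // 2
        System.out.println(map.get(null));           // not null
    }
}
```

Replacing a value does not change the size of the map: after four put calls there are still only two entries.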

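Source code analysis

The HashMap source defines the following two constants:

static final int DEFAULT_INITIAL_CAPACITY = 16; // default initial capacity, must be a power of two; the value may differ between JDK versions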
static final float DEFAULT_LOAD_FACTOR = 0.75f;

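DEFAULT_INITIAL_CAPACITY is HashMap's default capacity; a different capacity can be passed in through the constructor.
DEFAULT_LOAD_FACTOR is HashMap's load factor (the usual translation). As the structure figure above shows, HashMap wraps each key and value in an Entry and stores it in a linked list.
When the HashMap holds a lot of data, the lists inevitably grow long and lookups in them slow down; to speed lookups up, the lists have to be made shorter. So once the number of elements reaches capacity * load factor (16 * 0.75 = 12 with the defaults), the array is enlarged so that the entries spread over more buckets and each list gets shorter, which speeds lookups up again (quite clever).
(PS: personally I think "resize factor" would be a more fitting name for DEFAULT_LOAD_FACTOR.)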
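The arithmetic behind the resize threshold is easy to tabulate. The following is only a sketch: ThresholdDemo and threshold are hypothetical names, and the formula assumed is threshold = capacity * loadFactor truncated to an int.

```java
class ThresholdDemo {
    // Assumed formula: threshold = capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        // Each resize doubles the capacity, so the threshold doubles as well.
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println(capacity + " -> " + threshold(capacity, loadFactor));
        }
        // Prints: 16 -> 12, 32 -> 24, 64 -> 48, 128 -> 96 (one pair per line)
    }
}
```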

HashMap<String, String> hashMap = new HashMap<>();
hashMap.put("1", "123");
hashMap.put(null, "null");
String aNull = hashMap.put(null, "not null");

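HashMap has four constructors; they all end up in the two-argument constructor, which is simple as well: it mainly checks that the configuration arguments passed in are valid.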

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY) {
        initialCapacity = MAXIMUM_CAPACITY;
    } else if (initialCapacity < DEFAULT_INITIAL_CAPACITY) {
        initialCapacity = DEFAULT_INITIAL_CAPACITY;
    }
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    // Android-Note: We always use the default load factor of 0.75f.

    // This might appear wrong but it's just awkward design. We always call
    // inflateTable() when table == EMPTY_TABLE. That method will take "threshold"
    // to mean "capacity" and then replace it with the real threshold (i.e, multiplied with
    // the load factor).
    threshold = initialCapacity;
    init();
}
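The argument checks in the constructor can be observed from the outside. Below is a small sketch against java.util.HashMap; ConstructorDemo and rejects are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class ConstructorDemo {
    // Returns true if running the action throws IllegalArgumentException.
    static boolean rejects(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Valid: an explicit initial capacity, load factor left at its default.
        Map<String, String> sized = new HashMap<>(64);
        sized.put("k", "v");
        System.out.println(sized.size()); // 1

        // Invalid arguments are rejected by the checks shown above.
        System.out.println(rejects(() -> new HashMap<String, String>(-1)));            // true: negative capacity
        System.out.println(rejects(() -> new HashMap<String, String>(16, 0f)));        // true: load factor <= 0
        System.out.println(rejects(() -> new HashMap<String, String>(16, Float.NaN))); // true: load factor is NaN
    }
}
```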

Next, the put method:

public V put(K key, V value) {
    // If the hash table is still empty, initialize it first from the requested capacity
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    // A null key is handled separately
    if (key == null)
        return putForNullKey(value);
    // Compute the hash of the key
    int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
    int i = indexFor(hash, table.length);
    // Walk the HashMapEntry list in the bucket at index i
    for (HashMapEntry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // Same key if the hashes match and the keys are the same object or equal per equals()
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            // Key already present in the list: replace the old value with the new one
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}
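The put flow above can be condensed into a toy map that keeps only the essentials: pick a bucket, scan its list for the key, otherwise insert at the head. This is a sketch, not the real class. MiniChainedMap and Node are hypothetical names, the hash is plain hashCode() instead of the Wang/Jenkins hash used in the listing, the capacity is assumed to be a power of two, and resizing is left out.

```java
class MiniChainedMap<K, V> {
    // One node of a bucket's singly linked list (stands in for HashMapEntry).
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    MiniChainedMap(int capacity) { // capacity is assumed to be a power of two
        table = (Node<K, V>[]) new Node[capacity];
    }

    // Simplified hash: null keys map to 0, everything else uses hashCode().
    private static int hashOf(Object key) {
        return (key == null) ? 0 : key.hashCode();
    }

    private static boolean sameKey(Object a, Object b) {
        return a == b || (a != null && a.equals(b));
    }

    V put(K key, V value) {
        int hash = hashOf(key);
        int i = hash & (table.length - 1);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key)) {
                V old = e.value; // key found: replace the value, return the old one
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(hash, key, value, table[i]); // key absent: insert at the head
        size++;
        return null;
    }

    V get(K key) {
        int hash = hashOf(key);
        for (Node<K, V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        MiniChainedMap<String, String> map = new MiniChainedMap<>(16);
        System.out.println(map.put("1", "123")); // null
        System.out.println(map.put("1", "1"));   // 123
        System.out.println(map.get("1"));        // 1
        System.out.println(map.size());          // 1
    }
}
```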

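From the data structure analysis above we know that HashMap declares an array of HashMapEntry to represent the array + linked list structure. The put method places the key-value pair into this array.
Before inserting, it first checks whether the table is still empty (a good habit). If it is, it calls inflateTable to create the HashMapEntry array: the array length is the requested capacity rounded up to a power of two, and the resize threshold is set to capacity * loadFactor, which confirms the analysis above.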

    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);

        // Android-changed: Replace usage of Math.min() here because this method is
        // called from the <clinit> of runtime, at which point the native libraries
        // needed by Float.* might not be loaded.
        float thresholdFloat = capacity * loadFactor;
        if (thresholdFloat > MAXIMUM_CAPACITY + 1) {
            thresholdFloat = MAXIMUM_CAPACITY + 1;
        }

        threshold = (int) thresholdFloat;
        table = new HashMapEntry[capacity];
    }
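The listing calls roundUpToPowerOf2 without showing it. One way to implement such a helper is sketched below; treat the body as an assumption rather than the exact source, and note that the MAXIMUM_CAPACITY value of 1 << 30 is also assumed here, since the article never shows it.

```java
class RoundUpDemo {
    // Assumed value; the article references MAXIMUM_CAPACITY but does not show it.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of two >= number, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        // (number - 1) << 1 has its highest bit at the next power of two,
        // unless number already is a power of two, in which case it stays there.
        return (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(10)); // 16
        System.out.println(roundUpToPowerOf2(16)); // 16
        System.out.println(roundUpToPowerOf2(17)); // 32
        // The threshold then follows from the rounded capacity, not from the request.
        System.out.println((int) (roundUpToPowerOf2(10) * 0.75f)); // 12
    }
}
```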

When the key is null, putForNullKey(value) is called, which stores the entry in slot 0 of the array.

    private V putForNullKey(V value) {
        for (HashMapEntry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }

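The next step turns the key into its hash value and then calls indexFor() to find which array slot the key belongs to.

// Maps a hash value to an array index. Because length is a power of two, h & (length - 1) gives the same result as a modulo for non-negative h (for example, hash 10 with length 16 puts the Entry at index 10)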
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}
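A few concrete values show how the mask behaves, including for negative hashes, where the Java % operator would give a negative result. IndexForDemo is a hypothetical wrapper class; indexFor is copied from the listing.

```java
class IndexForDemo {
    // Same expression as in the listing; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(10, 16));      // 10
        System.out.println(indexFor(26, 16));      // 10, because 26 = 16 + 10
        System.out.println(indexFor(-1, 16));      // 15, always a valid index
        System.out.println(-1 % 16);               // -1, plain % can go negative
        System.out.println(Math.floorMod(-1, 16)); // 15, matches the mask
    }
}
```

The mask only works because the table length is a power of two, which is why the capacity is always rounded up to one.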

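Once the array slot for the key is found, put walks the HashMapEntry list at that slot to see whether the key already exists. It first compares the hash values; if they are equal, it checks whether the two keys are the same object with == and, failing that, calls equals to decide whether they are equal.

== compares two references, that is, whether they point to the same object. The default equals inherited from Object does the same. String, however, overrides equals so that it compares the contents of the two strings.
For classes that do not override equals, == and equals both check whether the two references point to the same object. Classes such as String and Integer override it to compare values.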
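The difference matters for map keys. In the sketch below, IdentityKey and ValueKey are hypothetical key classes written for this example: the first inherits equals and hashCode from Object, the second overrides both.

```java
import java.util.HashMap;
import java.util.Map;

class EqualsDemo {
    // Hypothetical key that inherits equals/hashCode from Object (reference identity).
    static final class IdentityKey {
        final String id;
        IdentityKey(String id) { this.id = id; }
    }

    // Hypothetical key that overrides equals/hashCode to compare by content.
    static final class ValueKey {
        final String id;
        ValueKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueKey && ((ValueKey) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    // Puts both keys into a fresh map and reports how many entries result.
    static int entriesAfterPut(Object first, Object second) {
        Map<Object, String> map = new HashMap<>();
        map.put(first, "x");
        map.put(second, "y");
        return map.size();
    }

    public static void main(String[] args) {
        String a = new String("key");
        String b = new String("key");
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same characters

        System.out.println(entriesAfterPut(new IdentityKey("1"), new IdentityKey("1"))); // 2
        System.out.println(entriesAfterPut(new ValueKey("1"), new ValueKey("1")));       // 1
    }
}
```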

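addEntry adds the Entry to the corresponding HashMapEntry list. According to the analysis above, once the size of the HashMap reaches 16 * 0.75 the array is enlarged; the code below confirms this, with the extra condition that the target bucket must already be occupied.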

void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        // Resize: double the table, then recompute the bucket index for the new length
        resize(2 * table.length);
        hash = (null != key) ? sun.misc.Hashing.singleWordWangJenkinsHash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    // Create the entry and link it into the bucket
    createEntry(hash, key, value, bucketIndex);
}
void resize(int newCapacity) {
    HashMapEntry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    HashMapEntry[] newTable = new HashMapEntry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
void transfer(HashMapEntry[] newTable) {
    int newCapacity = newTable.length;
    for (HashMapEntry<K,V> e : table) {
        while(null != e) {
            HashMapEntry<K,V> next = e.next;
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

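Resizing creates a new array, walks every entry in the old array, recomputes each entry's position in the new array from its stored hash, and inserts it there. This is clearly a fairly expensive process. If we already know roughly how much data the HashMap will hold, we can pass a suitable initial capacity to the constructor to reduce the number of resizes and improve efficiency.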
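Two things can be illustrated with a few lines: how entries that shared a bucket are redistributed when the table doubles, and how to pick an initial capacity for a known number of entries. This is a sketch under assumptions: ResizeDemo and capacityFor are hypothetical names, and the load factor is taken to be 0.75.

```java
class ResizeDemo {
    // Same mask as indexFor in the listing; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // A requested capacity whose threshold (capacity * 0.75) is at least `expected`.
    static int capacityFor(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    public static void main(String[] args) {
        // Three hashes that all land in bucket 10 of a 16-slot table.
        int[] hashes = {10, 26, 42};
        for (int h : hashes) {
            System.out.println(h + ": " + indexFor(h, 16) + " -> " + indexFor(h, 32));
        }
        // Prints 10: 10 -> 10, then 26: 10 -> 26, then 42: 10 -> 10

        System.out.println(capacityFor(12));   // 16
        System.out.println(capacityFor(1000)); // 1334, which the map rounds up to 2048
    }
}
```

With a value from capacityFor passed to the constructor, the map should not need to resize while the expected number of entries is inserted.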
Finally, createEntry wraps the key and value in an Entry object and inserts it into the linked list.

// Create the Entry and insert it at the head of the bucket's HashMapEntry list
void createEntry(int hash, K key, V value, int bucketIndex) {
    // Take the current head; it becomes the next of the new Entry
    HashMapEntry<K,V> e = table[bucketIndex];
    // Put the newly created Entry at the head
    table[bucketIndex] = new HashMapEntry<>(hash, key, value, e);
    size++; // the HashMap's size grows by one
}
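Head insertion means the most recently added entry is found first when a bucket is walked. The sketch below isolates just that step; HeadInsertDemo, Entry and insertAll are hypothetical names, and the Entry here carries only a key.

```java
class HeadInsertDemo {
    // Stripped-down stand-in for HashMapEntry: a key and a link to the next entry.
    static final class Entry {
        final String key;
        final Entry next;

        Entry(String key, Entry next) {
            this.key = key;
            this.next = next;
        }
    }

    // Inserts every key at the head, like createEntry, and returns the resulting chain.
    static String insertAll(String... keys) {
        Entry head = null;
        for (String key : keys) {
            head = new Entry(key, head); // the old head becomes the new entry's next
        }
        StringBuilder chain = new StringBuilder();
        for (Entry e = head; e != null; e = e.next) {
            if (chain.length() > 0) {
                chain.append(" -> ");
            }
            chain.append(e.key);
        }
        return chain.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertAll("a", "b", "c")); // c -> b -> a
    }
}
```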

That completes the put process of HashMap.
The next article analyzes how data is read back:
http://blog.csdn.net/dz266912/article/details/78233777