Reading the HashMap Source Code

Latest recommended article published on 2023-04-12 15:35:40

一世留恋510

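The implementation of HashMap comes up often in interviews, and HashMap is also used constantly in day-to-day work, so its importance goes without saying. Below I start with the parts I consider important and read them in the order that matches my own understanding.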

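I. Start with the fields defined in HashMap. The source (the JDK 7 implementation is shown here) tells us that the default initial array size is 16, that the size is always a power of two (2^n), and that each resize doubles it, which lets the bucket index be computed with a neat bitwise operation. The default load factor is 0.75, so the map becomes due for a resize once the number of entries reaches array size * 0.75.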

/**
     * The default initial capacity is 16.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity is 2^30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * An empty table instance to share when the table is not inflated.
     */
    static final Entry<?,?>[] EMPTY_TABLE = {};

    /**
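     * The backing array. Its default capacity is 16 and its length is always a power of two. The transient modifier keeps the field out of default serialization.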
     */
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The threshold at which the map resizes (capacity * load factor).
     * @serial
     */
    int threshold;

    /**
     * The load factor for the hash table.
     */
    final float loadFactor;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * The default threshold of map capacity above which alternative hashing is
     * used for String keys. Alternative hashing reduces the incidence of
     * collisions due to weak hash code calculation for String keys.
     */
    static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;
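
To make the numbers above concrete, here is a minimal sketch of how a requested capacity maps to a power of two and how the threshold follows from the load factor. The class name `CapacitySketch` and the helper `roundUpToPowerOfTwo` are my own illustrative names, not copied from the JDK; only the threshold formula mirrors the `resize` method shown later in this article.

```java
public class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper: smallest power of two >= requested, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        if (requested <= 1) {
            return 1;
        }
        // (requested - 1) << 1 has its highest bit at the target power of two.
        return Integer.highestOneBit((requested - 1) << 1);
    }

    // Same formula as in resize(): capacity * loadFactor, capped just above the maximum.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
        System.out.println(threshold(16, 0.75f));    // 12
        System.out.println(threshold(32, 0.75f));    // 24
    }
}
```

With the defaults, the first resize becomes possible at 12 entries, and after doubling to 32 buckets the next threshold is 24.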

II. HashMap's put(K key, V value).

public V put(K key, V value) {
	if (table == EMPTY_TABLE) {
		inflateTable(threshold);
	}
	// A null key is allowed
	if (key == null)
		return putForNullKey(value);
	// Compute the hash from the key's hashCode
	int hash = hash(key);
	// Compute the bucket index from the hash
	int i = indexFor(hash, table.length);
	// Walk the chain looking for an existing mapping
	for (Entry<K,V> e = table[i]; e != null; e = e.next) {
		Object k;
		// Same hash, and the key is either the same object or equal according to equals
		if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
			V oldValue = e.value;
			e.value = value;
			e.recordAccess(this);
			return oldValue;
		}
	}
	// Record the structural modification
	modCount++;
	// Add the new entry at the head of the chain
	addEntry(hash, key, value, i);
	return null;
}

As the code shows, both the key and the value may be null in HashMap; a null key always goes into bucket 0:

private V putForNullKey(V value) {
	for (Entry<K,V> e = table[0]; e != null; e = e.next) {
		if (e.key == null) {
			V oldValue = e.value;
			e.value = value;
			e.recordAccess(this);
			return oldValue;
		}
	}
	modCount++;
	addEntry(0, null, value, 0);
	return null;
}
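
The behaviour of `put` and `putForNullKey` can be checked from the outside with the public `java.util.HashMap` API. The sketch below is only a usage example (the class name `NullKeyDemo` and the method `demo` are mine); it shows that `put` returns the previous value, that a null key is accepted, and that a null value is different from a missing key.

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    // Builds a small map that uses a null key and a null value.
    static Map<String, String> demo() {
        Map<String, String> map = new HashMap<>();
        String first = map.put(null, "a");  // no previous mapping, returns null
        String second = map.put(null, "b"); // replaces "a", returns "a"
        System.out.println(first);          // null
        System.out.println(second);         // a
        map.put("k", null);                 // a null value is allowed as well
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = demo();
        System.out.println(map.get(null));        // b
        System.out.println(map.get("k"));         // null, although the key exists
        System.out.println(map.containsKey("k")); // true
        System.out.println(map.size());           // 2
    }
}
```

Because `get` returns null both for a missing key and for a key mapped to null, `containsKey` is the way to tell the two cases apart.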

A bitwise operation replaces the modulo operation:

static int indexFor(int h, int length) {
	// The length is always a power of two, so the remainder can be computed with a bitwise AND
	return h & (length-1);
}
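
A quick way to convince yourself that the AND trick really is a modulo is to compare it with `Math.floorMod` for a power-of-two length. This is a small sketch; `IndexForDemo` is an illustrative class name, and `indexFor` is copied from the snippet above.

```java
public class IndexForDemo {
    // Same body as HashMap.indexFor shown above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two
        int[] hashes = {0, 5, 16, 21, 12345, -7, Integer.MIN_VALUE};
        for (int h : hashes) {
            // Both columns agree for every hash, including negative ones.
            System.out.println(h + " -> " + indexFor(h, length)
                    + " (floorMod: " + Math.floorMod(h, length) + ")");
        }
    }
}
```

Note that the plain `%` operator would give a negative result for a negative hash, while the AND form always yields a valid index. The trick only works for power-of-two lengths: with a length of 10 the mask would be 9 (binary 1001), and only the indexes 0, 1, 8 and 9 could ever be produced.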

III. HashMap's get(key).

public V get(Object key) {
	// The null-key case
	if (key == null)
		return getForNullKey();
	Entry<K,V> entry = getEntry(key);
	// Return the value if an entry exists, otherwise null
	return null == entry ? null : entry.getValue();
}

private V getForNullKey() {
	// Return null when the map has no entries at all
	if (size == 0) {
		return null;
	}
	// Return the value stored under the null key
	for (Entry<K,V> e = table[0]; e != null; e = e.next) {
		if (e.key == null)
			return e.value;
	}
	return null;
}

final Entry<K,V> getEntry(Object key) {
	// Return null when the map has no entries at all
	if (size == 0) {
		return null;
	}
	// Compute the hash from the key's hashCode
	int hash = (key == null) ? 0 : hash(key);
	// Walk the matching chain from head to tail
	for (Entry<K,V> e = table[indexFor(hash, table.length)];
		 e != null;
		 e = e.next) {
		Object k;
		// Key point: different objects can share a hash, so the hash must match and equals must also return true.
		// A class that does not override equals compares references; String's equals compares contents.
		if (e.hash == hash &&
			((k = e.key) == key || (key != null && key.equals(k))))
			return e;
	}
	return null;
}

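Here is an article I found helpful: http://www.importnew.com/7099.html. Its main lesson is that keys should preferably be immutable classes such as String and Integer, so that a lookup can always find the object we stored. We can of course also use our own class as a key, provided we override both its hashCode and equals methods.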
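
The following sketch illustrates both halves of that advice. `Point` is a hypothetical key class I made up for the example: it overrides `equals` and `hashCode`, so an equal but distinct object finds the entry, yet its fields are deliberately mutable, so changing the key after insertion makes the entry unreachable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyDemo {
    // Hypothetical key class; the fields are mutable on purpose.
    static final class Point {
        int x;
        int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Map<Point, String> map = new HashMap<>();
        Point key = new Point(1, 2);
        map.put(key, "value");

        // An equal but distinct object finds the entry.
        System.out.println(map.get(new Point(1, 2))); // value

        // Mutating the key changes its hashCode; the stored hash is now stale.
        key.x = 100;
        System.out.println(map.get(key));             // null
        System.out.println(map.get(new Point(1, 2))); // null, equals no longer matches
    }
}
```

The entry is still in the map (its size is still 1), but neither the mutated key nor a copy of the original key can reach it, which is why immutable keys are the safer choice.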

IV. Resizing HashMap

void addEntry(int hash, K key, V value, int bucketIndex) {
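	// Before adding an entry, check whether size has reached the threshold derived from the load factor (and the target bucket is already occupied); if so, resize to twice the current capacity.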
	if ((size >= threshold) && (null != table[bucketIndex])) {
		resize(2 * table.length);
		hash = (null != key) ? hash(key) : 0;
		bucketIndex = indexFor(hash, table.length);
	}

	createEntry(hash, key, value, bucketIndex);
}

void resize(int newCapacity) {
	Entry[] oldTable = table;
	int oldCapacity = oldTable.length;
	if (oldCapacity == MAXIMUM_CAPACITY) {
		threshold = Integer.MAX_VALUE;
		return;
	}

	Entry[] newTable = new Entry[newCapacity];
	transfer(newTable, initHashSeedAsNeeded(newCapacity));
	table = newTable;
	threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

void transfer(Entry[] newTable, boolean rehash) {
	int newCapacity = newTable.length;
	// Walk every bucket
	for (Entry<K,V> e : table) {
		// Move the chain entry by entry, linking each Entry at the head of its new bucket
		while(null != e) {
			Entry<K,V> next = e.next;
			if (rehash) {
				e.hash = null == e.key ? 0 : hash(e.key);
			}
			// The index must be recomputed for the new capacity
			int i = indexFor(e.hash, newCapacity);
			e.next = newTable[i];
			newTable[i] = e;
			e = next;
		}
	}
}
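
One consequence of linking each Entry at the head of the new bucket is that entries that stay together end up in reverse order. The sketch below isolates the inner `while` loop of `transfer` for a single chain, under the simplifying assumption that every entry lands in the same new bucket. `Node`, `transferChain` and `render` are illustrative names, not JDK code.

```java
public class HeadInsertDemo {
    // Minimal stand-in for HashMap.Entry: just a key and a next pointer.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Same pointer moves as the inner loop of transfer, for one target bucket.
    static Node transferChain(Node e) {
        Node newHead = null;
        while (e != null) {
            Node next = e.next;
            e.next = newHead; // link the entry in front of the new chain
            newHead = e;
            e = next;
        }
        return newHead;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.key);
            if (n.next != null) {
                sb.append(" -> ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        System.out.println(render(chain));                // A -> B -> C
        System.out.println(render(transferChain(chain))); // C -> B -> A
    }
}
```

This reversal is harmless in a single thread, but it is exactly what makes the concurrent case discussed in the next section dangerous.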

V. Thread-safety problems in HashMap.

Related link: http://www.cnblogs.com/andy-zhou/p/5402984.html

1. When two threads insert an Entry into the same chain at the same time, one of the entries can be lost.

void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    // If two threads create a node here at the same time, one of them is lost
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
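
A real race is hard to reproduce on demand, so the sketch below replays the unlucky interleaving in a single thread: both "threads" read the bucket head before either writes it back. The names (`LostInsertDemo`, `Node`, `interleavedInsert`) are hypothetical and the local variable `bucket` stands in for `table[bucketIndex]`.

```java
public class LostInsertDemo {
    // Minimal stand-in for HashMap.Entry.
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    static int chainLength(Node head) {
        int n = 0;
        for (Node e = head; e != null; e = e.next) {
            n++;
        }
        return n;
    }

    // Replays the interleaving: read, read, write, write.
    static Node interleavedInsert() {
        Node bucket = new Node("existing", null);

        Node seenByT1 = bucket; // thread 1 executes: e = table[bucketIndex]
        Node seenByT2 = bucket; // thread 2 executes the same read

        bucket = new Node("fromT1", seenByT1); // thread 1 writes its new head
        bucket = new Node("fromT2", seenByT2); // thread 2 overwrites it

        return bucket;
    }

    public static void main(String[] args) {
        Node head = interleavedInsert();
        System.out.println(head.key);          // fromT2
        System.out.println(chainLength(head)); // 2, not 3: "fromT1" is unreachable
    }
}
```

The `size++` in `createEntry` is not atomic either, so even the entry count can drift under concurrent writes.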

2. Under concurrent access, a chain can end up with a cycle during a rehash.

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while(null != e) {
            Entry<K,V> next = e.next; // --> suppose thread 1 is suspended here
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // This is where it goes wrong: if another thread has already reversed the chain, reversing it again here creates a cycle
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
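
The cycle can also be replayed deterministically in a single thread. The sketch below assumes a chain A -> B whose entries both land in the same new bucket: "thread 1" reads `e` and `next` and pauses, "thread 2" finishes its whole transfer, and then thread 1 continues with its stale local variables. All names here (`CycleDemo`, `Node`, `simulate`, `hasCycle`) are illustrative, not JDK code.

```java
public class CycleDemo {
    // Minimal stand-in for HashMap.Entry.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Replays the interleaving and returns the head of thread 1's new chain.
    static Node simulate() {
        Node b = new Node("B", null);
        Node a = new Node("A", b);

        // Thread 1: e = A, next = B, then it is suspended.
        Node e = a;
        Node next = e.next;

        // Thread 2: completes its own head-insertion transfer, giving B -> A.
        Node head2 = null;
        for (Node cur = a; cur != null; ) {
            Node n = cur.next;
            cur.next = head2;
            head2 = cur;
            cur = n;
        }

        // Thread 1: resumes with stale e and next, into its own new table.
        Node head1 = null;
        while (true) {
            e.next = head1;
            head1 = e;
            e = next;
            if (e == null) {
                break;
            }
            next = e.next; // for B this is now A, because thread 2 reversed the chain
        }
        return head1;
    }

    // Floyd's cycle detection: slow and fast pointers meet only if there is a cycle.
    static boolean hasCycle(Node head) {
        Node slow = head;
        Node fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(simulate())); // true: A.next == B and B.next == A
    }
}
```

Once such a cycle exists, any lookup that walks this bucket for a key that is not present never reaches a null `next` and spins forever.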

VI. Fixing HashMap's thread-safety problems.

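1. Use Hashtable instead of HashMap.

2. Use ConcurrentHashMap instead of HashMap. In JDK 7 it is built on segment locking: only the affected segment is locked, so it performs better than Hashtable.

3. Wrap the HashMap with Collections.synchronizedMap.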
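
As a rough usage sketch of options 2 and 3, the code below lets several threads insert distinct keys into a thread-safe map and then checks that nothing was lost. The class `SafeMapDemo` and the helper `fill` are my own names; the thread and key counts are arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeMapDemo {
    // Starts `threads` workers that each insert `perThread` distinct keys,
    // waits for them to finish, and returns the final map size.
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", ex);
        }
        return map.size();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<Integer, Integer>();
        Map<Integer, Integer> synced =
                Collections.synchronizedMap(new HashMap<Integer, Integer>());

        System.out.println(fill(concurrent, 4, 10_000)); // 40000
        System.out.println(fill(synced, 4, 10_000));     // 40000
    }
}
```

Passing a plain `new HashMap<>()` to `fill` instead may report fewer entries or behave unpredictably, which is the problem described in the previous section. Keep in mind that a synchronized wrapper only protects individual calls; iterating over it still requires synchronizing on the map manually.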