Digging Into Knowledge Series: HashMap

Latest recommended article published 2024-04-20 15:02:44

Vilochen_

Category: Java Digging Into Knowledge Series

Article link: https://blog.csdn.net/wyc199273/article/details/53887271

Effective Java advises that whenever you override equals, you must also override hashCode. By default, both methods use the implementations inherited from Object:

public native int hashCode();
public boolean equals(Object obj) {  
    return (this == obj);  
}

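hashCode is declared with the native keyword, which means it is implemented at a lower level, in C or C++, inside the JVM. It is often described as returning the object's memory address; more precisely, it returns an identity-based value that is typically different for different objects. The default equals treats two references as equal only when they refer to the same object. If you override equals() without redefining hashCode(), a HashMap lookup will generally fail unless the key you pass is the very same object you used when storing the entry, because an equal but distinct key almost always has a different hash code and is looked up in a different bucket.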
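To make the point above concrete, here is a minimal sketch. BrokenKey and GoodKey are hypothetical classes invented for this illustration; they are not from the quoted article. The lookup with an equal but distinct BrokenKey is expected to miss, although in principle two identity hash codes could coincide, so that result is printed rather than relied on.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EqualsHashCodeDemo {
    // Hypothetical key: overrides equals() but keeps Object.hashCode().
    static final class BrokenKey {
        final String name;
        BrokenKey(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof BrokenKey && ((BrokenKey) o).name.equals(name);
        }
    }

    // Hypothetical key: hashCode() is consistent with equals().
    static final class GoodKey {
        final String name;
        GoodKey(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).name.equals(name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    // Store under one GoodKey, look up with an equal but distinct GoodKey.
    static Integer lookupWithEqualGoodKey() {
        Map<GoodKey, Integer> map = new HashMap<>();
        map.put(new GoodKey("a"), 1);
        return map.get(new GoodKey("a"));
    }

    // Store and look up with the very same BrokenKey reference.
    static Integer lookupWithSameBrokenKey() {
        Map<BrokenKey, Integer> map = new HashMap<>();
        BrokenKey key = new BrokenKey("a");
        map.put(key, 1);
        return map.get(key);
    }

    // Store under one BrokenKey, look up with an equal but distinct one.
    static Integer lookupWithEqualBrokenKey() {
        Map<BrokenKey, Integer> map = new HashMap<>();
        map.put(new BrokenKey("a"), 1);
        return map.get(new BrokenKey("a"));
    }

    public static void main(String[] args) {
        System.out.println("GoodKey, equal key:     " + lookupWithEqualGoodKey());
        System.out.println("BrokenKey, same object: " + lookupWithSameBrokenKey());
        // Expected to be null in practice, since the identity hashes differ.
        System.out.println("BrokenKey, equal key:   " + lookupWithEqualBrokenKey());
    }
}
```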

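On the other hand, you should avoid using "mutable" classes as HashMap keys wherever possible. If you store an object as a key in a HashMap and later change its state, the HashMap gets confused, and the value you stored may become unreachable (although iterating over the collection may still find it).

(My understanding of "mutable" here: if the overridden hashCode can change depending on the object's state, the object counts as mutable, and a later get will not be able to find the value stored earlier.)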
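A small sketch of that failure mode, under the assumption that the key's hashCode is derived directly from a mutable field. MutableKey is a hypothetical class made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {
    // Hypothetical key whose hashCode() depends on a mutable field.
    static final class MutableKey {
        int id;
        MutableKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).id == id;
        }
        @Override public int hashCode() { return id; }
    }

    // Put with id = 1, mutate the key to id = 2, then look it up again.
    static String lookupAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        key.id = 2;            // the entry keeps the hash computed from id = 1
        return map.get(key);   // the lookup now uses the hash computed from id = 2
    }

    // The entry is still in the table, so a full scan can find the value.
    static boolean stillFoundByScan() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        key.id = 2;
        return map.containsValue("value");
    }

    public static void main(String[] args) {
        System.out.println("get after mutation: " + lookupAfterMutation());
        System.out.println("found by scanning:  " + stillFoundByScan());
    }
}
```

The entry is not deleted; it is simply stranded under its old hash, which is why traversal still reaches it while get does not.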

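A HashMap is essentially a combination of an array and linked lists. The array acts as a row of buckets so that keys with different hashCode values can be stored and retrieved quickly; for different keys that share the same hashCode, equals is called to pick out the value that belongs to the requested key.

---- Excerpted from http://www.nowamagic.net/librarys/veda/detail/1202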

Question 1. While reading the JDK 1.7 HashMap source, I came across this piece of code.

public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

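indexFor computes the table index from the key's hash and the length of the map's table.

The code below it then performs the actual insertion. My question was whether two keys could have the same index i but different hashes, in which case addEntry() might overwrite the value of the earlier key and cause trouble.

After some testing, I found that indexFor does the following: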

static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
 }

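The index is the bitwise AND of hash and length - 1. HashMap requires length to be a power of two, so length - 1 is a mask of low-order one bits and the result always falls within 0 to length - 1. Different hashes can still produce the same index, though: with length 16, the hashes 1 and 17 both map to index 1. This does not cause an overwrite. The loop in put only replaces a value when e.hash == hash and the keys are equal; otherwise addEntry() creates a new Entry and links it into the same bucket, leaving the existing entries untouched.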
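The masking arithmetic can be checked with a few lines. This is only a sketch that copies the one-line body of indexFor shown above; the hash values 1, 17, and 33 are picked for illustration and are not the output of HashMap's own hash() function.

```java
class IndexForDemo {
    // Same expression as JDK 1.7's indexFor: valid when length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // length - 1 == 0b1111, so only the low 4 bits survive
        for (int h : new int[] {1, 17, 33}) {
            System.out.println("hash " + h + " -> index " + indexFor(h, length));
        }
        // After doubling the table, one more bit of the hash takes part.
        System.out.println("hash 17, length 32 -> index " + indexFor(17, 32));
    }
}
```

With length 16 all three hashes land in bucket 1, which is exactly the case where the e.hash == hash check and the key comparison in put are needed to tell the entries apart.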

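Question 2. When two hash values are the same, the resulting index is also the same.

If the keys are equal, the existing value is simply overwritten.

If the two keys are not equal, the new element is added at the head of the bucket and the previous element becomes its next. This relies on how the Entry data structure is designed.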

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

       // ... rest omitted
    }

In the case described above, the following runs: modCount++; addEntry(hash, key, value, i). modCount is covered later; here the focus is on addEntry.

void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

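The if branch handles resizing: the table is reallocated at twice its length, and the hash and bucket index are then recomputed. For this question, the part that matters is createEntry.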

void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
 }

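Here the current head of the bucket, e, is fetched, and a new element is created at the head of the bucket with new Entry<>(hash, key, value, e). The previous head e is passed in and stored in the next field, so the new element is linked in at the front of the bucket's list, which completes the insertion.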
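The head insertion described above can be sketched in isolation. The Entry class below is a stripped-down stand-in written for this example (key and next only), not the JDK class.

```java
import java.util.ArrayList;
import java.util.List;

class HeadInsertDemo {
    // Simplified stand-in for HashMap.Entry: only the key and the next pointer.
    static final class Entry {
        final String key;
        final Entry next;
        Entry(String key, Entry next) { this.key = key; this.next = next; }
    }

    // Inserts keys one by one at the head of a single bucket,
    // then returns the keys in the order a traversal would see them.
    static List<String> insertAll(String... keys) {
        Entry head = null;
        for (String k : keys) {
            head = new Entry(k, head); // same shape as createEntry: old head becomes next
        }
        List<String> order = new ArrayList<>();
        for (Entry e = head; e != null; e = e.next) {
            order.add(e.key);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(insertAll("a", "b", "c")); // prints [c, b, a]
    }
}
```

The most recently inserted key ends up first in the chain, which also matters for the resize behavior discussed under Question 5.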

Question 3. How is the value retrieved for a key whose hash is the same as another key's?

public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

The special case key == null is not covered here. Let's look at getEntry.

final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

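Here the hash is computed from the key and used to find the table index. The entries in that bucket are then traversed, comparing the stored hash and checking key == k || key.equals(k) to find the matching entry, after which get returns entry.getValue().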
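To see this lookup path work with keys that really do collide, the sketch below uses the strings "Aa" and "BB", which have the same String.hashCode() and therefore always share a bucket. It uses the public java.util.HashMap API only; the internal chain is not inspected.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionLookupDemo {
    // Builds a map whose two keys have identical hash codes.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() + " " + "BB".hashCode()); // prints 2112 2112
        Map<String, Integer> map = build();
        // equals() is what separates the two entries inside the shared bucket.
        System.out.println(map.get("Aa") + " " + map.get("BB"));     // prints 1 2
    }
}
```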

Question 4. The role of modCount.

In operations that modify the map, such as put, remove, and clear, we keep finding this line:

modCount++;

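modCount is an instance variable of HashMap. Its purpose is to detect modification of the map while an iteration is in progress. If the map is structurally modified during iteration by anything other than the iterator's own remove, the iterator throws ConcurrentModificationException. This is the fail-fast strategy.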

private abstract class HashIterator<E> implements Iterator<E> {
        Entry<K,V> next;        // next entry to return
        int expectedModCount;   // For fast-fail
        int index;              // current slot
        Entry<K,V> current;     // current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Entry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
            current = e;
            return e;
        }

        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }
    }

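This is where the iterator uses it.

When the iterator is constructed, it sets expectedModCount = modCount. If the map is modified during the iteration, then on the next step modCount != expectedModCount and the exception is thrown.

This mechanism is used to surface the problem early instead of letting it go unnoticed.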
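The behavior can be reproduced through the public API. This is a small sketch, assuming a three-entry map so that the iterator still has elements left when the modification happens; if the modification happened on the last element, hasNext() would simply return false and no exception would be raised.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Adds a new key through the map while a for-each loop is running.
    static boolean modifyDuringIterationThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String k : map.keySet()) {
                map.put("d", 4); // structural change the iterator does not know about
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes odd values through the iterator, which resyncs expectedModCount.
    static int sizeAfterIteratorRemove() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove();
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("put during iteration throws: " + modifyDuringIterationThrows());
        System.out.println("size after iterator.remove(): " + sizeAfterIteratorRemove());
    }
}
```

Removing through the iterator is safe because, as in the remove() method quoted above, it sets expectedModCount = modCount after the removal.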

Question 5. The HashMap infinite loop problem.

There are some very good blog posts on this, so I won't repeat the explanation myself.

http://coolshell.cn/articles/9606.html

http://blog.csdn.net/chenxuegui1234/article/details/39646041

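Below is my own understanding of why the final state described in those posts comes about; corrections are welcome.

First, why the order ends up reversed. Look at the code: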

void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }

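As the code shows, when entries are moved from the old table to the new one, they are taken out in their existing order, so 3, 7, 5 are removed in that order. On insertion, however, each entry is added at the head of the list, so at index 3 of the new table the order becomes 7, 3.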
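The reversal can be reproduced single-threaded with a cut-down copy of the transfer loop. The sketch below makes two assumptions for the sake of illustration: the Entry class is a simplified stand-in, and each key is used directly as its own hash (as in the linked posts' example), with an old table of length 2 and a new table of length 4.

```java
import java.util.ArrayList;
import java.util.List;

class TransferOrderDemo {
    // Simplified stand-in for HashMap.Entry; the key doubles as the hash (assumption).
    static final class Entry {
        final int key;
        Entry next;
        Entry(int key, Entry next) { this.key = key; this.next = next; }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Same loop shape as JDK 1.7's transfer(): walk each old chain in order,
    // insert each entry at the head of its new bucket.
    static Entry[] transfer(Entry[] oldTable, int newCapacity) {
        Entry[] newTable = new Entry[newCapacity];
        for (Entry e : oldTable) {
            while (e != null) {
                Entry next = e.next;
                int i = indexFor(e.key, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static List<Integer> keys(Entry head) {
        List<Integer> out = new ArrayList<>();
        for (Entry e = head; e != null; e = e.next) {
            out.add(e.key);
        }
        return out;
    }

    // Old table of length 2 with the chain 3 -> 7 -> 5 in bucket 1, resized to 4.
    static Entry[] resizeExample() {
        Entry[] oldTable = new Entry[2];
        oldTable[1] = new Entry(3, new Entry(7, new Entry(5, null)));
        return transfer(oldTable, 4);
    }

    public static void main(String[] args) {
        Entry[] t = resizeExample();
        System.out.println("bucket 1: " + keys(t[1])); // prints bucket 1: [5]
        System.out.println("bucket 3: " + keys(t[3])); // prints bucket 3: [7, 3]
    }
}
```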

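Next, why thread 1's e and next end up pointing the way they do after thread 2 has finished its transfer.

The local variables e and next belong to each thread and do not affect one another, but they are only references, and the Entry objects they point to are shared by both threads. While thread 1 is suspended, thread 2 rewires those objects so that next.next points to e (7.next is now 3). When thread 1 resumes, it works with objects that have already been changed, and that is how the circular list comes about.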
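The interleaving can be replayed deterministically in a single thread. The following is a sketch under stated assumptions, not a real concurrent test: the Entry class is a simplified stand-in, keys double as hashes, and the "threads" are just two sets of local variables executed in the order described in the linked posts (thread 1 stops right after reading next, thread 2 runs its whole transfer, thread 1 resumes).

```java
class ResizeCycleDemo {
    // Simplified stand-in for HashMap.Entry; the key doubles as the hash (assumption).
    static final class Entry {
        final int key;
        Entry next;
        Entry(int key, Entry next) { this.key = key; this.next = next; }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Floyd's tortoise-and-hare check for a loop in a singly linked chain.
    static boolean hasCycle(Entry head) {
        Entry slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;
        }
        return false;
    }

    // Returns the head of bucket 3 in "thread 1"'s new table.
    static Entry simulate() {
        Entry e5 = new Entry(5, null);
        Entry e7 = new Entry(7, e5);
        Entry e3 = new Entry(3, e7);   // shared old chain: 3 -> 7 -> 5

        // Thread 1: reads e and next, then is suspended.
        Entry e = e3;
        Entry next = e.next;           // 7

        // Thread 2: runs the whole transfer on the same Entry objects.
        Entry[] table2 = new Entry[4];
        for (Entry cur = e3; cur != null; ) {
            Entry n = cur.next;
            int i = indexFor(cur.key, 4);
            cur.next = table2[i];
            table2[i] = cur;
            cur = n;
        }                              // now 7.next == 3 and 3.next == null

        // Thread 1: resumes with its stale e (3) and next (7).
        Entry[] table1 = new Entry[4];
        while (e != null) {
            int i = indexFor(e.key, 4);
            e.next = table1[i];
            table1[i] = e;
            e = next;
            if (e != null) next = e.next; // start of the following loop iteration
        }
        return table1[3];
    }

    public static void main(String[] args) {
        Entry head = simulate();
        System.out.println(head.key + " -> " + head.next.key + " -> " + head.next.next.key);
        System.out.println("cycle: " + hasCycle(head));
    }
}
```

In this replay thread 1's own loop still terminates; the damage shows up later, when any traversal of that bucket for a key that is not present keeps following next and never reaches null.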