HashMap Reading Notes

aiba5303

Tags: data structures and algorithms

Original link: http://www.cnblogs.com/aprz512/p/5333631.html

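HashMap stores its entries in an array of singly linked lists, as shown in the figure below (a borrowed figure):

Start with the constructors: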

    /**
     * Constructs a new empty {@code HashMap} instance.
     */
    @SuppressWarnings("unchecked")
    public HashMap() {
        table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        threshold = -1; // Forces first put invocation to replace EMPTY_TABLE
    }

    /**
     * Constructs a new {@code HashMap} instance with the specified capacity.
     *
     * @param capacity
     *            the initial capacity of this hash map.
     * @throws IllegalArgumentException
     *                when the capacity is less than zero.
     */
    public HashMap(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity: " + capacity);
        }

        if (capacity == 0) {
            @SuppressWarnings("unchecked")
            HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            table = tab;
            threshold = -1; // Forces first put() to replace EMPTY_TABLE
            return;
        }

        if (capacity < MINIMUM_CAPACITY) {
            capacity = MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            capacity = MAXIMUM_CAPACITY;
        } else {
            capacity = Collections.roundUpToPowerOfTwo(capacity);
        }
        makeTable(capacity);
    }

    /**
     * Constructs a new {@code HashMap} instance with the specified capacity and
     * load factor.
     *
     * @param capacity
     *            the initial capacity of this hash map.
     * @param loadFactor
     *            the initial load factor.
     * @throws IllegalArgumentException
     *                when the capacity is less than zero or the load factor is
     *                less or equal to zero or NaN.
     */
    public HashMap(int capacity, float loadFactor) {
        this(capacity);

        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor: " + loadFactor);
        }

        /*
         * Note that this implementation ignores loadFactor; it always uses
         * a load factor of 3/4. This simplifies the code and generally
         * improves performance.
         */
    }

    /**
     * Constructs a new {@code HashMap} instance containing the mappings from
     * the specified map.
     *
     * @param map
     *            the mappings to add.
     */
    public HashMap(Map<? extends K, ? extends V> map) {
        this(capacityForInitSize(map.size()));
        constructorPutAll(map);
    }

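There are several overloaded constructors, which accept an initial capacity and a load factor. Note the comment in the third constructor, though: this implementation validates loadFactor and then ignores it, always using 3/4.

Load factor: once the number of stored entries exceeds capacity * load factor (the threshold), the capacity is doubled. Here the factor is fixed at 0.75.

Initial capacity: it is automatically corrected to a power of two (the smallest power of two greater than or equal to the requested value, clamped between MINIMUM_CAPACITY and MAXIMUM_CAPACITY) ---- capacity = Collections.roundUpToPowerOfTwo(capacity);

The reason is that with a power-of-two capacity the bucket index can be computed with a cheap bit mask, hash & (tab.length - 1), as put() does, instead of a modulo, and the result is always a valid index even for negative hash values. Together with the secondary hash (Collections.secondaryHash), which is meant to spread the hash bits, this helps keep collisions low. It is worth studying on your own if you are interested.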
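
To make the rounding and the bit mask concrete, here is a minimal sketch. The class PowerOfTwoDemo and both of its methods are hypothetical names written for this note; in particular roundUpToPowerOfTwo below is a stand-in, not the source of Collections.roundUpToPowerOfTwo, and it only assumes inputs between 1 and 2^30.

```java
class PowerOfTwoDemo {
    // Hypothetical stand-in: smallest power of two >= n, for 1 <= n <= 2^30.
    static int roundUpToPowerOfTwo(int n) {
        int highest = Integer.highestOneBit(n);
        return (highest == n) ? n : highest << 1;
    }

    // Bucket index via bit mask; only meaningful when length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10));          // 16
        System.out.println(roundUpToPowerOfTwo(16));          // 16
        int hash = 0x7A3F;
        // For non-negative hashes the mask agrees with the modulo.
        System.out.println(indexFor(hash, 16) == hash % 16);  // true
        // For negative hashes % would give a negative result; the mask does not.
        System.out.println(indexFor(-7, 16));                 // 9
    }
}
```

The mask simply keeps the low bits of the hash, which is why the table length has to be a power of two for every bucket to be reachable.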

Next, the method used when iterating over a HashMap.

    /**
     * Returns a set containing all of the mappings in this map. Each mapping is
     * an instance of {@link Map.Entry}. As the set is backed by this map,
     * changes in one will be reflected in the other.
     *
     * @return a set of the mappings.
     */
    public Set<Entry<K, V>> entrySet() {
        Set<Entry<K, V>> es = entrySet;
        return (es != null) ? es : (entrySet = new EntrySet());
    }

It returns a Set of mappings; each mapping is a Map.Entry, and a Map.Entry is simply a key-value pair.
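
As a usage illustration, the following sketch iterates a map through entrySet(). It uses the standard java.util.HashMap on a JVM rather than the implementation quoted in these notes, and EntrySetDemo with its two helper methods is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

class EntrySetDemo {
    // Read every mapping through the entry set.
    static int sumValues(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            sum += e.getValue();
        }
        return sum;
    }

    // The entry set is a view backed by the map: setValue writes through.
    static void doubleAll(Map<String, Integer> map) {
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            e.setValue(e.getValue() * 2);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(sumValues(map)); // 3
        doubleAll(map);
        System.out.println(map.get("a"));   // 2
    }
}
```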

Here is what Map.Entry looks like:

    /**
     * {@code Map.Entry} is a key/value mapping contained in a {@code Map}.
     */
    public static interface Entry<K,V> {
        /**
         * Compares the specified object to this {@code Map.Entry} and returns if they
         * are equal. To be equal, the object must be an instance of {@code Map.Entry} and have the
         * same key and value.
         *
         * @param object
         *            the {@code Object} to compare with this {@code Object}.
         * @return {@code true} if the specified {@code Object} is equal to this
         *         {@code Map.Entry}, {@code false} otherwise.
         * @see #hashCode()
         */
        public boolean equals(Object object);

        /**
         * Returns the key.
         *
         * @return the key
         */
        public K getKey();

        /**
         * Returns the value.
         *
         * @return the value
         */
        public V getValue();

        /**
         * Returns an integer hash code for the receiver. {@code Object} which are
         * equal return the same value for this method.
         *
         * @return the receiver's hash code.
         * @see #equals(Object)
         */
        public int hashCode();

        /**
         * Sets the value of this entry to the specified value, replacing any
         * existing value.
         *
         * @param object
         *            the new value to set.
         * @return object the replaced value of this entry.
         */
        public V setValue(V object);
    };

It is just an interface; what matters is its implementing class.

Notice that in entrySet(), when the entrySet field is null, it is assigned new EntrySet(). Here is the EntrySet class:

    private final class EntrySet extends AbstractSet<Entry<K, V>> {
        public Iterator<Entry<K, V>> iterator() {
            return newEntryIterator();
        }
        public boolean contains(Object o) {
            if (!(o instanceof Entry))
                return false;
            Entry<?, ?> e = (Entry<?, ?>) o;
            return containsMapping(e.getKey(), e.getValue());
        }
        public boolean remove(Object o) {
            if (!(o instanceof Entry))
                return false;
            Entry<?, ?> e = (Entry<?, ?>)o;
            return removeMapping(e.getKey(), e.getValue());
        }
        public int size() {
            return size;
        }
        public boolean isEmpty() {
            return size == 0;
        }
        public void clear() {
            HashMap.this.clear();
        }
    }

When we iterate over a HashMap, we first get the entrySet and then obtain its iterator.

    public Iterator<Entry<K, V>> iterator() {
        return newEntryIterator();
    }

iterator() calls newEntryIterator():

    Iterator<Entry<K, V>> newEntryIterator() { return new EntryIterator(); }

It returns an EntryIterator object.

    private final class EntryIterator extends HashIterator
            implements Iterator<Entry<K, V>> {
        public Entry<K, V> next() { return nextEntry(); }
    }

When we call next() on the iterator, we are really calling nextEntry(), which is defined in HashIterator, the parent class of EntryIterator.

    private abstract class HashIterator {
        int nextIndex;
        HashMapEntry<K, V> nextEntry = entryForNullKey;
        HashMapEntry<K, V> lastEntryReturned;
        int expectedModCount = modCount;

        HashIterator() {
            if (nextEntry == null) {
                HashMapEntry<K, V>[] tab = table;
                HashMapEntry<K, V> next = null;
                while (next == null && nextIndex < tab.length) {
                    next = tab[nextIndex++];
                }
                nextEntry = next;
            }
        }

        public boolean hasNext() {
            return nextEntry != null;
        }

        HashMapEntry<K, V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            if (nextEntry == null)
                throw new NoSuchElementException();

            HashMapEntry<K, V> entryToReturn = nextEntry;
            HashMapEntry<K, V>[] tab = table;
            HashMapEntry<K, V> next = entryToReturn.next;
            while (next == null && nextIndex < tab.length) {
                next = tab[nextIndex++];
            }
            nextEntry = next;
            return lastEntryReturned = entryToReturn;
        }

        public void remove() {
            if (lastEntryReturned == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            HashMap.this.remove(lastEntryReturned.key);
            lastEntryReturned = null;
            expectedModCount = modCount;
        }
    }

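nextEntry() and remove() may throw ConcurrentModificationException. The iterator records modCount when it is created, and any structural change made to the map other than through the iterator's own remove() makes the two counts differ. This can happen within a single thread as well as across threads, so be careful when the map is modified during iteration.

In HashIterator, the nextEntry field holds the next Entry to be returned. Note that nextEntry() skips empty (null) buckets while looking for the entry that follows.

The constructor sets nextEntry to the first non-null Entry: the null-key entry if there is one, otherwise the head of the first non-empty bucket.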
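
The modCount check can be observed from the outside. The sketch below uses the standard java.util.HashMap on a JVM, whose iterators are documented as fail-fast on a best-effort basis; FailFastDemo and its methods are hypothetical names. The first method changes the map behind the iterator's back, the second removes through the iterator, which keeps expectedModCount in sync.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Structural change made directly on the map while an iterator is open.
    static boolean putDuringIterationThrows() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("d", 4); // modCount changes, the iterator is not told
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Removal through the iterator itself is the safe way.
    static Map<String, Integer> removeEvenValues() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) {
                it.remove();
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(putDuringIterationThrows());   // true
        System.out.println(removeEvenValues().size());    // 2
    }
}
```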

Now for the methods that operate on the HashMap.

First, put():

    /**
     * Maps the specified key to the specified value.
     *
     * @param key
     *            the key.
     * @param value
     *            the value.
     * @return the value of any previous mapping with the specified key or
     *         {@code null} if there was no such mapping.
     */
    @Override public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }

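If the key is not null, a hash is computed from the key, and from that hash the position of the key-value pair in table.

The field is declared as transient HashMapEntry<K, V>[] table, so table is an array. table[index] yields a HashMapEntry, and HashMapEntry is a node of a singly linked list, which makes the figure above much easier to understand.

HashMap is an array of singly linked lists; this structure exists to deal with hash collisions.

Suppose several key-value pairs are stored, and the hash computed for key1 equals the hash computed for key2 (while equals() says the keys are different). Both must then be stored at the same position, so a linked list is used to hold the colliding entries. The same happens when two different hashes map to the same index after masking.

Take the linked list at the position given by the hash and check whether it has a head, that is, whether table[index] holds a value.

If table[index] holds a value, walk the list looking for the key; if the key is found, replace its value.

If table[index] is empty, or the key is not in the list, the table is doubled first when size has exceeded threshold, and then addNewEntry(key, value, hash, index) is called;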

    /**
     * Creates a new entry for the given key, value, hash, and index and
     * inserts it into the hash table. This method is called by put
     * (and indirectly, putAll), and overridden by LinkedHashMap. The hash
     * must incorporate the secondary hash function.
     */
    void addNewEntry(K key, V value, int hash, int index) {
        table[index] = new HashMapEntry<K, V>(key, value, hash, table[index]);
    }

The newest entry is placed at table[index], and the entries that were already there hang off it as the rest of the linked list.
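
The lookup-then-head-insert flow can be reproduced in a few lines. This is a minimal sketch under stated assumptions, not the code above: TinyChainedMap, Node and chainOf are hypothetical names, the table is fixed at 8 buckets and never resized, null keys are not handled, and key.hashCode() is used directly where the real code first applies a secondary hash. The strings "Aa" and "BB" are used because they have the same hashCode() and therefore always collide.

```java
class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        final int hash;
        Node<K, V> next;

        Node(K key, V value, int hash, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[8];

    V put(K key, V value) {
        int hash = key.hashCode();
        int index = hash & (table.length - 1);
        // Walk the chain: replace the value if the key is already present.
        for (Node<K, V> e = table[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // Head insertion: the new node points at the old head of the chain.
        table[index] = new Node<>(key, value, hash, table[index]);
        return null;
    }

    V get(Object key) {
        int hash = key.hashCode();
        for (Node<K, V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                return e.value;
            }
        }
        return null;
    }

    // Keys of the bucket that 'key' maps to, in chain order.
    String chainOf(K key) {
        StringBuilder sb = new StringBuilder();
        for (Node<K, V> e = table[key.hashCode() & (table.length - 1)]; e != null; e = e.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", so same bucket
        System.out.println(map.chainOf("Aa")); // BB->Aa
        System.out.println(map.get("Aa"));     // 1
    }
}
```

The printed chain shows the later key in front, which is what inserting at the head implies.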

Here is the HashMapEntry class:

    static class HashMapEntry<K, V> implements Entry<K, V> {
        final K key;
        V value;
        final int hash;
        HashMapEntry<K, V> next;

        HashMapEntry(K key, V value, int hash, HashMapEntry<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }

        public final K getKey() {
            return key;
        }

        public final V getValue() {
            return value;
        }

        public final V setValue(V value) {
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }

        @Override public final boolean equals(Object o) {
            if (!(o instanceof Entry)) {
                return false;
            }
            Entry<?, ?> e = (Entry<?, ?>) o;
            return Objects.equal(e.getKey(), key)
                    && Objects.equal(e.getValue(), value);
        }

        @Override public final int hashCode() {
            return (key == null ? 0 : key.hashCode()) ^
                    (value == null ? 0 : value.hashCode());
        }

        @Override public final String toString() {
            return key + "=" + value;
        }
    }

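The most important part is the next field: seeing it tells you this is a linked list. In effect the class is just a wrapper around a key-value pair.

If the key of the pair being stored is null, putValueForNullKey is used (which shows that HashMap supports null values and a null key):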

    private V putValueForNullKey(V value) {
        HashMapEntry<K, V> entry = entryForNullKey;
        if (entry == null) {
            addNewEntryForNullKey(value);
            size++;
            modCount++;
            return null;
        } else {
            preModify(entry);
            V oldValue = entry.value;
            entry.value = value;
            return oldValue;
        }
    }

If entryForNullKey is null, no Entry with a null key has been stored yet. If one exists its value is replaced, otherwise a new one is created. Quite simple.
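
The null handling is easy to check from calling code. The sketch below uses the standard java.util.HashMap on a JVM; NullKeyDemo is a hypothetical name. It also shows a consequence of allowing null values: get() returning null does not by itself tell you whether the key is absent, so containsKey() is needed to distinguish the two cases.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second"); // replaces the value stored under the null key
        map.put("k", null);      // a null value is allowed as well
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.size());                 // 2
        System.out.println(map.get(null));              // second
        System.out.println(map.get("k"));               // null
        System.out.println(map.containsKey("k"));       // true
        System.out.println(map.containsKey("missing")); // false
    }
}
```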

One last thing to note: the addNewEntry and preModify methods are overridden by LinkedHashMap.

Now the get() method:

    /**
     * Returns the value of the mapping with the specified key.
     *
     * @param key
     *            the key.
     * @return the value of the mapping with the specified key, or {@code null}
     *         if no mapping for the specified key is found.
     */
    public V get(Object key) {
        if (key == null) {
            HashMapEntry<K, V> e = entryForNullKey;
            return e == null ? null : e.value;
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        for (HashMapEntry<K, V> e = tab[hash & (tab.length - 1)];
                e != null; e = e.next) {
            K eKey = e.key;
            if (eKey == key || (e.hash == hash && key.equals(eKey))) {
                return e.value;
            }
        }
        return null;
    }

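If key is null, the value held by entryForNullKey is returned directly (or null if there is no such entry).

As in put(), the storage position is computed, the linked list there is traversed, and when a matching key is found its value is returned.

A question to think about: HashSet is implemented internally with a HashMap. When HashSet adds an element, is it stored in the HashMap as a key or as a value?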
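
One way to reason about the question: a set must reject duplicates, and in a map only the keys are unique, so the elements have to go in as keys. The following is a minimal sketch of that idea, not the source of any particular HashSet; TinySet and the PRESENT placeholder are hypothetical names, and the value stored alongside each key is just a dummy object.

```java
import java.util.HashMap;

class TinySet<E> {
    // Dummy value shared by every mapping; only the keys carry information.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinySet<String> set = new TinySet<>();
        System.out.println(set.add("x"));      // true
        System.out.println(set.add("x"));      // false, duplicate key
        System.out.println(set.size());        // 1
        System.out.println(set.contains("x")); // true
    }
}
```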

Now the remove() method:

    /**
     * Removes the mapping with the specified key from this map.
     *
     * @param key
     *            the key of the mapping to remove.
     * @return the value of the removed mapping or {@code null} if no mapping
     *         for the specified key was found.
     */
    @Override public V remove(Object key) {
        if (key == null) {
            return removeNullKey();
        }
        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index], prev = null;
                e != null; prev = e, e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                if (prev == null) {
                    tab[index] = e.next;
                } else {
                    prev.next = e.next;
                }
                modCount++;
                size--;
                postRemove(e);
                return e.value;
            }
        }
        return null;
    }

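Again this is just linked-list manipulation: the matching entry is unlinked by pointing its predecessor (or the bucket head, if it has no predecessor) at its successor.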
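
The unlinking step on its own, outside any hash table, might look like the sketch below. UnlinkDemo, Node, remove and render are hypothetical names; the loop mirrors the prev/e pattern of the remove() method quoted above but works on a bare list and returns the possibly changed head.

```java
class UnlinkDemo {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Removes the first node with the given key and returns the new head.
    static Node remove(Node head, String key) {
        for (Node e = head, prev = null; e != null; prev = e, e = e.next) {
            if (key.equals(e.key)) {
                if (prev == null) {
                    return e.next;      // removing the head: successor becomes head
                }
                prev.next = e.next;     // bypass the removed node
                return head;
            }
        }
        return head;                    // key not found, list unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = new Node("a", new Node("b", new Node("c", null)));
        head = remove(head, "b");
        System.out.println(render(head)); // a->c
        head = remove(head, "a");
        System.out.println(render(head)); // c
    }
}
```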

That is roughly it; browse the source yourself for the remaining methods.

Reposted from: https://www.cnblogs.com/aprz512/p/5333631.html