Hashtable Source Code Analysis

Burgess_Lee

Article link: https://blog.csdn.net/Burgess_Lee/article/details/89081417

This analysis is based on JDK 1.8. The Hashtable class is likewise implemented with an array plus linked lists.

Inheritance structure

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable

As shown above, Hashtable extends the Dictionary class and implements Map, Cloneable and Serializable.

Member fields

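    // Backing array of buckets for the Hashtable; the default capacity is 11.
    private transient Entry<?,?>[] table;

    // Number of entries currently stored in the table.
    private transient int count;

    // Resize threshold: when count reaches threshold (count >= threshold),
    // the table array is enlarged by rehash().
    private int threshold;

    // Load factor; the default is 0.75f.
    // threshold equals the capacity of table multiplied by loadFactor.
    private float loadFactor;

    // Number of structural modifications.
    private transient int modCount = 0;

    // Used for serialization.
    private static final long serialVersionUID = 1421746759512286392L;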

Constructors

    public Hashtable(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: "+loadFactor);

        if (initialCapacity==0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        table = new Entry<?,?>[initialCapacity];
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    /**
     * Constructs a new, empty hashtable with the specified initial capacity
     * and default load factor (0.75).
     *
     * @param     initialCapacity   the initial capacity of the hashtable.
     * @exception IllegalArgumentException if the initial capacity is less
     *              than zero.
     */
    public Hashtable(int initialCapacity) {
        this(initialCapacity, 0.75f);
    }

    /**
     * Constructs a new, empty hashtable with a default initial capacity (11)
     * and load factor (0.75).
     */
    public Hashtable() {
        this(11, 0.75f);
    }
    public Hashtable(Map<? extends K, ? extends V> t) {
        this(Math.max(2*t.size(), 11), 0.75f);
        putAll(t);
    }

As shown, all the other constructors delegate to the first one. Here is that constructor again, with comments.

    // Constructor
    // initialCapacity is the initial capacity; the default is 11.
    // loadFactor is the load factor; the default is 0.75f. Roughly speaking, once
    // the number of entries reaches 75% of the capacity, the table grows automatically.
    public Hashtable(int initialCapacity, float loadFactor) {
        // Validate the argument
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        // Validate the argument
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: "+loadFactor);

        // If 0 is passed as the initial capacity, build a table of length 1
        if (initialCapacity==0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        table = new Entry<?,?>[initialCapacity];
        // Compute the resize threshold
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }
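
To make the threshold arithmetic concrete, here is a minimal sketch that repeats the formula from the constructor outside of Hashtable. The class name ThresholdSketch and the method threshold are made up for illustration; MAX_ARRAY_SIZE is redeclared with the value the JDK 8 source uses (Integer.MAX_VALUE - 8), because the real constant is private.

```java
// Illustrative sketch only: mirrors the threshold formula shown above.
class ThresholdSketch {
    // Same value as the private constant in java.util.Hashtable (JDK 8).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // capacity * loadFactor is a float; the cast truncates toward zero.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(11, 0.75f)); // 8.25 truncated to 8
        System.out.println(threshold(1, 0.75f));  // 0.75 truncated to 0
    }
}
```

With the default capacity of 11 the threshold is 8. With a capacity of 1 the threshold is 0, so the very first insertion already satisfies count >= threshold and triggers a rehash.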

Member methods

put(K key, V value)

    public synchronized V put(K key, V value) {
        // Make sure the value is not null
        // A null value throws an exception, so null values cannot be stored
        if (value == null) {
            throw new NullPointerException();
        }

        // Makes sure the key is not already in the hashtable.
        // Compute the bucket index from the key
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> entry = (Entry<K,V>)tab[index];
        // Walk the chain; if the key already exists, update its value
        for(; entry != null ; entry = entry.next) {
            if ((entry.hash == hash) && entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        // No matching key was found, so add a new entry
        addEntry(hash, key, value, index);
        return null;
    }

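Note first that the method is synchronized, which is how Hashtable provides thread safety. The overall logic is as follows: compute the key's hashCode, then derive from it the bucket index where the entry belongs. If a chain already exists at that index, walk it looking for an entry with the same key; if one is found, overwrite its value and return the old value. Otherwise call addEntry to insert a new entry into that bucket and return null. Because the code calls key.hashCode() directly and rejects a null value explicitly, neither a null key nor a null value can be stored. The addEntry method is examined next.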
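
The behaviour described above can be observed from the outside with a small sketch. PutSketch and tryPut are hypothetical names used only for this illustration; the Hashtable calls are the standard java.util API.

```java
import java.util.Hashtable;

// Illustrative sketch: observes the return value and null handling of put().
class PutSketch {
    // Describes what put() did for the given key and value.
    static String tryPut(Hashtable<String, String> table, String key, String value) {
        try {
            String old = table.put(key, value);
            return old == null ? "inserted" : "replaced " + old;
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = new Hashtable<>();
        System.out.println(tryPut(table, "k", "v1")); // inserted
        System.out.println(tryPut(table, "k", "v2")); // replaced v1
        System.out.println(tryPut(table, "k", null)); // NullPointerException (explicit value check)
        System.out.println(tryPut(table, null, "v")); // NullPointerException (key.hashCode())
    }
}
```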

addEntry(int hash, K key, V value, int index)

     private void addEntry(int hash, K key, V value, int index) {
        // Record the structural modification
        modCount++;

        // Local reference to the current bucket array
        Entry<?,?> tab[] = table;
        // count has reached the threshold
        if (count >= threshold) {
            // Rehash the table if the threshold is exceeded
            // Grow and rehash
            rehash();

            // After growing, recompute the key's hash and the target bucket index
            tab = table;
            hash = key.hashCode();
            index = (hash & 0x7FFFFFFF) % tab.length;
        }

        // Creates the new entry.
        @SuppressWarnings("unchecked")
        // The key does not exist yet: create a new entry at the head of the chain
        Entry<K,V> e = (Entry<K,V>) tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        // Increment the entry count
        count++;
    }

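The overall logic is as follows: first check whether the number of stored entries has reached the resize threshold (count >= threshold). If so, rehash() enlarges the table, and because the array length has changed, the bucket index is recomputed from the key's hashCode. Finally the new entry is stored in table[index]. Note that if table[index] already holds entries, the entries in that bucket form a linked list: the newest entry becomes the head of the chain and the oldest sits at the tail. The rehash method is covered next.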
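
The index formula and the head insertion can be sketched with a stripped-down chain. This is not the JDK code: ChainSketch, Node, indexFor, add and chainAt are hypothetical names, and the sketch leaves out values, resizing and synchronization.

```java
// Illustrative sketch: bucket index computation and head insertion only.
class ChainSketch {
    static final class Node {
        final int hash;
        final Object key;
        final Node next;

        Node(int hash, Object key, Node next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    final Node[] buckets = new Node[11];

    // Clearing the sign bit keeps the index non-negative even for negative hash codes.
    static int indexFor(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    // The new node points at the old head and then becomes the head itself.
    void add(Object key) {
        int hash = key.hashCode();
        int index = indexFor(hash, buckets.length);
        buckets[index] = new Node(hash, key, buckets[index]);
    }

    // Renders one chain from head to tail, for example "12->1".
    String chainAt(int index) {
        StringBuilder sb = new StringBuilder();
        for (Node n = buckets[index]; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainSketch sketch = new ChainSketch();
        sketch.add(1);   // Integer hash 1, bucket 1
        sketch.add(12);  // Integer hash 12, 12 % 11 = 1, same bucket
        System.out.println(sketch.chainAt(1)); // 12->1
    }
}
```

The keys 1 and 12 collide in a table of length 11, and the later key ends up in front, matching the "newest at the head" behaviour of addEntry.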

rehash()

    protected void rehash() {
        // Current capacity
        int oldCapacity = table.length;
        // Keep a reference to the old bucket array
        Entry<?,?>[] oldMap = table;

        // overflow-conscious code
        // The new capacity is oldCapacity * 2 + 1
        int newCapacity = (oldCapacity << 1) + 1;
        // Cap the new capacity at MAX_ARRAY_SIZE
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE)
                // Keep running with MAX_ARRAY_SIZE buckets
                return;
            newCapacity = MAX_ARRAY_SIZE;
        }
        // Allocate the new bucket array
        Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

        // Record the structural modification
        modCount++;
        // Recompute the threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
        // Point table at the new array
        table = newMap;

        // Re-link every old entry into its new bucket
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
                Entry<K,V> e = old;
                old = old.next;

                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = (Entry<K,V>)newMap[index];
                newMap[index] = e;
            }
        }
    }

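The logic is as follows: the new capacity is the old capacity times 2 plus 1, capped at MAX_ARRAY_SIZE. A new array of length newCapacity is allocated, and every entry of the old table is re-linked into it at an index recomputed from its stored hash, because the index depends on the array length.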
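
The capacity calculation, including its overflow handling, can be isolated in a few lines. GrowthSketch and nextCapacity are hypothetical names; the body follows the first half of rehash() above, with MAX_ARRAY_SIZE redeclared using the JDK 8 value because the real constant is private.

```java
// Illustrative sketch: the capacity step of rehash() on its own.
class GrowthSketch {
    // Same value as the private constant in java.util.Hashtable (JDK 8).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int nextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        // The subtraction form also catches the case where the shift overflowed
        // and newCapacity went negative.
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE) {
                return oldCapacity; // already at the cap, no growth
            }
            newCapacity = MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 11;
        for (int i = 0; i < 4; i++) {
            System.out.print(capacity + " "); // 11 23 47 95
            capacity = nextCapacity(capacity);
        }
        System.out.println();
    }
}
```

Starting from the default capacity the table grows 11, 23, 47, 95, and so on, so the length stays odd.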

The nested class Entry

    private static class Entry<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Entry<K,V> next;

        protected Entry(int hash, K key, V value, Entry<K,V> next) {
            this.hash = hash;
            this.key =  key;
            this.value = value;
            this.next = next;
        }

        @SuppressWarnings("unchecked")
        protected Object clone() {
            return new Entry<>(hash, key, value,
                                  (next==null ? null : (Entry<K,V>) next.clone()));
        }

        // Map.Entry Ops

        public K getKey() {
            return key;
        }

        public V getValue() {
            return value;
        }

        public V setValue(V value) {
            if (value == null)
                throw new NullPointerException();

            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }

        public boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;

            return (key==null ? e.getKey()==null : key.equals(e.getKey())) &&
               (value==null ? e.getValue()==null : value.equals(e.getValue()));
        }

        public int hashCode() {
            return hash ^ Objects.hashCode(value);
        }

        public String toString() {
            return key.toString()+"="+value.toString();
        }
    }

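Entry is a static nested class of Hashtable. Each Entry keeps a next reference, so the entries of one bucket form a singly linked list; this is the linked-list half of Hashtable's array-plus-linked-list structure.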

get(Object key)

    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }

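As the source shows, the logic is as follows: compute the key's hashCode, derive the bucket index from it, then look in table[index]. Because several entries may share a bucket and form a linked list, each entry in the chain is compared on both hash and key; if nothing matches, null is returned.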
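
A short usage sketch of the three possible outcomes of get(). GetSketch and lookup are hypothetical names for this illustration; a null key fails because get() calls key.hashCode() without a null check.

```java
import java.util.Hashtable;

// Illustrative sketch: hit, miss and null-key behaviour of get().
class GetSketch {
    static String lookup(Hashtable<String, Integer> table, String key) {
        try {
            Integer value = table.get(key);
            return value == null ? "absent" : value.toString();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(lookup(table, "a"));  // 1
        System.out.println(lookup(table, "b"));  // absent
        System.out.println(lookup(table, null)); // NullPointerException
    }
}
```

Since null values can never be stored, a null result from get() always means the key is absent.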

remove(Object key)

This method removes the key-value pair for a given key.

    public synchronized V remove(Object key) {
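        // Local reference to the current bucket array
        Entry<?,?> tab[] = table;
        // Compute the bucket index from the key
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> e = (Entry<K,V>)tab[index];
        // Classic singly-linked-list removal: track the previous node and unlink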
        for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                modCount++;
                if (prev != null) {
                    prev.next = e.next;
                } else {
                    tab[index] = e.next;
                }
                count--;
                V oldValue = e.value;
                e.value = null;
                return oldValue;
            }
        }
        return null;
    }
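
A brief usage sketch of remove(), showing that it returns the old value on success and null when the key is not present. RemoveSketch and describeRemove are hypothetical names used only here.

```java
import java.util.Hashtable;

// Illustrative sketch: return values of remove().
class RemoveSketch {
    static String describeRemove(Hashtable<String, Integer> table, String key) {
        Integer old = table.remove(key);
        return old == null ? "absent" : "removed " + old;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(describeRemove(table, "a")); // removed 1
        System.out.println(describeRemove(table, "a")); // absent
        System.out.println(table.size());               // 0
    }
}
```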

Other methods such as size, isEmpty, contains, containsValue, containsKey, clear and clone are not reproduced here.

Iterators

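Hashtable itself does not implement the Iterable interface, so a Hashtable object cannot be used directly in a for-each loop (its keySet(), entrySet() and values() views can). Because Hashtable has existed since JDK 1.0, before the collections framework, it also offers the older Enumeration style: keys() returns an Enumeration of the keys, while values() returns a Collection of the values. The two methods are implemented as follows: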

    public synchronized Enumeration<K> keys() {
        return this.<K>getEnumeration(KEYS);
    }
    public Collection<V> values() {
        if (values==null)
            values = Collections.synchronizedCollection(new ValueCollection(),
                                                        this);
        return values;
    }

Enumeration is an interface similar to Iterator and can be used to traverse the elements. Next, the source of the getEnumeration method:

     private <T> Enumeration<T> getEnumeration(int type) {
        if (count == 0) {
            return Collections.emptyEnumeration();
        } else {
            return new Enumerator<>(type, false);
        }
    }
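
As a usage sketch, the two traversal styles look like this from the caller's side. TraversalSketch and its two methods are hypothetical names; keys() and entrySet() are the standard Hashtable API. For an empty table keys() returns an empty Enumeration, so the loop body simply never runs.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

// Illustrative sketch: Enumeration-style and for-each-style traversal.
class TraversalSketch {
    // Legacy style: walk the Enumeration returned by keys().
    static List<String> keysViaEnumeration(Hashtable<String, Integer> table) {
        List<String> keys = new ArrayList<>();
        for (Enumeration<String> e = table.keys(); e.hasMoreElements(); ) {
            keys.add(e.nextElement());
        }
        return keys;
    }

    // Collections style: for-each works on the entrySet() view.
    static int sumViaEntrySet(Hashtable<String, Integer> table) {
        int sum = 0;
        for (Map.Entry<String, Integer> entry : table.entrySet()) {
            sum += entry.getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        System.out.println(keysViaEnumeration(table).size()); // 0
        table.put("a", 1);
        table.put("b", 2);
        System.out.println(keysViaEnumeration(table).size()); // 2
        System.out.println(sumViaEntrySet(table));            // 3
    }
}
```

The order in which keys come back depends on the bucket layout, so the sketch only reports counts and sums.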

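In getEnumeration, a non-empty hashtable yields an Enumerator object. This class implements both the Enumeration interface and the Iterator interface, and a constructor argument states whether it serves as an Iterator. Its source is as follows: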

    private class Enumerator<T> implements Enumeration<T>, Iterator<T> {
        Entry<?,?>[] table = Hashtable.this.table;
        int index = table.length;
        Entry<?,?> entry;
        Entry<?,?> lastReturned;
        int type;

        /**
         * Indicates whether this Enumerator is serving as an Iterator
         * or an Enumeration.  (true -> Iterator).
         */
        boolean iterator;

        /**
         * The modCount value that the iterator believes that the backing
         * Hashtable should have.  If this expectation is violated, the iterator
         * has detected concurrent modification.
         */
        protected int expectedModCount = modCount;

        Enumerator(int type, boolean iterator) {
            this.type = type;
            this.iterator = iterator;
        }

        public boolean hasMoreElements() {
            Entry<?,?> e = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (e == null && i > 0) {
                e = t[--i];
            }
            entry = e;
            index = i;
            return e != null;
        }

        @SuppressWarnings("unchecked")
        public T nextElement() {
            Entry<?,?> et = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (et == null && i > 0) {
                et = t[--i];
            }
            entry = et;
            index = i;
            if (et != null) {
                Entry<?,?> e = lastReturned = entry;
                entry = e.next;
                return type == KEYS ? (T)e.key : (type == VALUES ? (T)e.value : (T)e);
            }
            throw new NoSuchElementException("Hashtable Enumerator");
        }

        // Iterator methods
        public boolean hasNext() {
            return hasMoreElements();
        }

        public T next() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            return nextElement();
        }

        public void remove() {
            if (!iterator)
                throw new UnsupportedOperationException();
            if (lastReturned == null)
                throw new IllegalStateException("Hashtable Enumerator");
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();

            synchronized(Hashtable.this) {
                Entry<?,?>[] tab = Hashtable.this.table;
                int index = (lastReturned.hash & 0x7FFFFFFF) % tab.length;

                @SuppressWarnings("unchecked")
                Entry<K,V> e = (Entry<K,V>)tab[index];
                for(Entry<K,V> prev = null; e != null; prev = e, e = e.next) {
                    if (e == lastReturned) {
                        modCount++;
                        expectedModCount++;
                        if (prev == null)
                            tab[index] = e.next;
                        else
                            prev.next = e.next;
                        count--;
                        lastReturned = null;
                        return;
                    }
                }
                throw new ConcurrentModificationException();
            }
        }
    }

The Enumeration interface declares the following methods:

    boolean hasMoreElements();
    E nextElement();

The Iterator interface is declared as follows:

    boolean hasNext();
    
    E next();

    default void remove() {
        throw new UnsupportedOperationException("remove");
    }

    default void forEachRemaining(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        while (hasNext())
            action.accept(next());
    }

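The last two methods can be ignored for now. Both are default methods as of JDK 1.8: remove() existed earlier and only gained a default implementation, while forEachRemaining is new and marked @since 1.8.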

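The two interfaces are essentially equivalent. In the Enumerator implementation, hasNext() simply delegates to hasMoreElements(), and next() delegates to nextElement() after checking modCount, so only the Iterator path detects concurrent modification. remove() works only when the iterator flag is true; otherwise it throws UnsupportedOperationException. keys() passes false for this flag, whereas the iterator obtained from values() is created with true. Since values() returns a Collection, it must support for-each traversal, and because Hashtable is thread-safe, values() wraps the ValueCollection with Collections.synchronizedCollection() to synchronize it. Here is the source of values() again: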

    public Collection<V> values() {
        if (values==null)
            values = Collections.synchronizedCollection(new ValueCollection(),
                                                        this);
        return values;
    }

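When the values field is still null, the view is created by calling Collections.synchronizedCollection() with a new ValueCollection and this Hashtable as the second argument, and is then cached in the field. ValueCollection is defined as follows: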

     private class ValueCollection extends AbstractCollection<V> {
        public Iterator<V> iterator() {
            return getIterator(VALUES);
        }
        public int size() {
            return count;
        }
        public boolean contains(Object o) {
            return containsValue(o);
        }
        public void clear() {
            Hashtable.this.clear();
        }
    }

The iterator method delegates to the private getIterator method, whose source is as follows:

    private <T> Iterator<T> getIterator(int type) {
        if (count == 0) {
            return Collections.emptyIterator();
        } else {
            return new Enumerator<>(type, true);
        }
    }

Here the second argument passed to the Enumerator constructor is true.
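
The practical difference between the two modes of Enumerator can be checked with a small sketch. IteratorModeSketch and its methods are hypothetical names; values(), elements() and the iterator calls are the standard API. The sketch assumes a non-empty table, since an empty one returns a shared empty iterator or enumeration instead of an Enumerator.

```java
import java.util.ConcurrentModificationException;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Iterator;

// Illustrative sketch: Iterator mode supports remove() and is fail-fast,
// Enumeration mode performs no modCount check.
class IteratorModeSketch {
    // Removes all negative values through the values() iterator.
    static int removeNegatives(Hashtable<String, Integer> table) {
        int removed = 0;
        for (Iterator<Integer> it = table.values().iterator(); it.hasNext(); ) {
            if (it.next() < 0) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Modifies the table after creating the iterator, then calls next().
    static boolean iteratorFailsFast() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        Iterator<Integer> it = table.values().iterator();
        table.put("b", 2); // structural change, modCount is incremented
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Same scenario with the Enumeration returned by elements().
    static boolean enumerationFailsFast() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        Enumeration<Integer> e = table.elements();
        table.put("b", 2);
        try {
            e.nextElement();
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iteratorFailsFast());    // true
        System.out.println(enumerationFailsFast()); // false
    }
}
```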

That concludes this walkthrough of the Hashtable source. Some parts were not covered or not fully analyzed; they may be added later when there is a chance.
