Reading the Source: How ArrayList, LinkedList, HashMap, and HashSet Store Data Internally

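With some free time recently, I decided to look into how the commonly used Java collections store their data internally. Why bother? Because each storage scheme suits a different usage scenario. Linked storage can grow or shrink freely and makes insertion and deletion easy, but every read has to walk the chain node by node from the head, so lookups are slow. Linear (array) storage allows fast random access, but an insertion or deletion may have to shift other elements, so those operations are slow. Everyday code constantly reads from and writes to collections, so using them efficiently requires understanding how they store their data internally.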

  • ArrayList
    The name alone suggests that its underlying storage is an array, that is, linear storage. Here is the ArrayList source:
    public class ArrayList<E> extends AbstractList<E> implements Cloneable, Serializable, RandomAccess {
        /**
         * The minimum amount by which the capacity of an ArrayList will increase.
         * This tuning parameter controls a time-space tradeoff. This value (12)
         * gives empirically good results and is arguably consistent with the
         * RI's specified default initial capacity of 10: instead of 10, we start
         * with 0 (sans allocation) and jump to 12.
         */
        private static final int MIN_CAPACITY_INCREMENT = 12;
    
        /**
         * The number of elements in this list.
         */
        int size;
    
        /**
         * The elements in this list, followed by nulls.
         */
        transient Object[] array;
    
        /**
         * Constructs a new instance of {@code ArrayList} with the specified
         * initial capacity.
         *
         * @param capacity
         *            the initial capacity of this {@code ArrayList}.
         */
        public ArrayList(int capacity) {
            if (capacity < 0) {
                throw new IllegalArgumentException("capacity < 0: " + capacity);
            }
            array = (capacity == 0 ? EmptyArray.OBJECT : new Object[capacity]);
        }
    
        /**
         * Constructs a new {@code ArrayList} instance with zero initial capacity.
         */
        public ArrayList() {
            array = EmptyArray.OBJECT;
        }
    
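    The default no-argument constructor of ArrayList contains a single statement, and array has type Object[], which is how Java declares an array; an array is one form of linear storage. A defining property of an array is that its length must be fixed when it is created and cannot change afterwards. So the guess above was right: ArrayList is backed by an array, which makes it a good fit for scenarios with frequent reads.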
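
Because an array cannot change length, an array-backed list has to allocate a larger array and copy the old elements over when it runs out of room. The following is only a minimal sketch of that idea: the class name SimpleArrayList and its doubling growth policy are assumptions made for illustration, not the growth logic of the ArrayList source quoted above.

```java
import java.util.Arrays;

// Hypothetical illustration class, not the real java.util.ArrayList.
final class SimpleArrayList<E> {
    private Object[] array = new Object[0]; // backing storage; each array has a fixed length
    private int size;                       // number of elements actually stored

    void add(E element) {
        if (size == array.length) {
            // Assumed growth policy for this sketch: double, with a minimum of 4.
            int newCapacity = Math.max(4, array.length * 2);
            array = Arrays.copyOf(array, newCapacity); // new array, old elements copied
        }
        array[size++] = element;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return (E) array[index]; // direct index into the array: constant time
    }

    E remove(int index) {
        E removed = get(index);
        // Shift the tail one slot to the left: linear time in the worst case.
        System.arraycopy(array, index + 1, array, index, size - index - 1);
        array[--size] = null; // clear the vacated slot
        return removed;
    }

    int size() { return size; }

    int capacity() { return array.length; }
}
```

Reading by index touches a single array slot, while remove has to move every element behind the removed one, which matches the trade-off described in the introduction.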

  • LinkedList
    "Link" means a link between nodes, and linking nodes together is a form of linked storage, so LinkedList uses linked storage.

     /**
         * Constructs a new empty instance of {@code LinkedList}.
         */
        public LinkedList() {
            voidLink = new Link<E>(null, null, null);
            voidLink.previous = voidLink;
            voidLink.next = voidLink;
        }
    private static final class Link<ET> {
            ET data;
    
            Link<ET> previous, next;
    
            Link(ET o, Link<ET> p, Link<ET> n) {
                data = o;
                previous = p;
                next = n;
            }
        }


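    Above are a LinkedList constructor and the Link constructor. Each element of a LinkedList is held in a Link, which stores the element's data together with references to the previous and next links. The empty list consists of a single voidLink whose previous and next both point back to itself.
    LinkedList suits scenarios with frequent insertions and deletions.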
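
To make the pointer manipulation concrete, here is a minimal sketch of a circular doubly linked list built around a voidLink sentinel, in the same shape as the constructor above. The class name SimpleLinkedList and the methods addFirst, addLast, get, and removeFirst are written for this illustration and are not copied from the LinkedList source.

```java
import java.util.NoSuchElementException;

// Hypothetical illustration class, not the real java.util.LinkedList.
final class SimpleLinkedList<E> {
    private static final class Link<T> {
        T data;
        Link<T> previous, next;

        Link(T data, Link<T> previous, Link<T> next) {
            this.data = data;
            this.previous = previous;
            this.next = next;
        }
    }

    // Sentinel link: in an empty list it points to itself in both directions.
    private final Link<E> voidLink = new Link<>(null, null, null);
    private int size;

    SimpleLinkedList() {
        voidLink.previous = voidLink;
        voidLink.next = voidLink;
    }

    void addLast(E element) {
        Link<E> oldLast = voidLink.previous;
        Link<E> newLink = new Link<>(element, oldLast, voidLink);
        oldLast.next = newLink;      // only two references change; nothing is shifted
        voidLink.previous = newLink;
        size++;
    }

    void addFirst(E element) {
        Link<E> oldFirst = voidLink.next;
        Link<E> newLink = new Link<>(element, voidLink, oldFirst);
        voidLink.next = newLink;
        oldFirst.previous = newLink;
        size++;
    }

    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        Link<E> link = voidLink.next;
        for (int i = 0; i < index; i++) {
            link = link.next; // reads walk the chain one link at a time
        }
        return link.data;
    }

    E removeFirst() {
        Link<E> first = voidLink.next;
        if (first == voidLink) {
            throw new NoSuchElementException("list is empty");
        }
        voidLink.next = first.next;
        first.next.previous = voidLink;
        size--;
        return first.data;
    }

    int size() { return size; }
}
```

Inserting or removing at either end rewires a constant number of references, whereas get has to follow next pointers from the head, which is why this layout favors insertion and deletion over indexed reads.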

  • HashMap
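    Here the name does not make the storage type obvious. It suggests a collection of key-value pairs stored by hash value, but how is that collection laid out underneath? Look at the code: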
       /**
         * Constructs a new empty {@code HashMap} instance.
         */
        @SuppressWarnings("unchecked")
        public HashMap() {
            table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            threshold = -1; // Forces first put invocation to replace EMPTY_TABLE
        }
    /**
         * The hash table. If this hash map contains a mapping for null, it is
         * not represented this hash table.
         */
        transient HashMapEntry<K, V>[] table;


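    The constructor shows that HashMap holds a table, and table is an array, so HashMap's underlying storage is a linear array. Because every entry carries a hash value, its position in the array is determined by that hash rather than by insertion order. As noted above, the length of an array cannot change, so how does HashMap add entries once it holds more than its capacity allows?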
    /**
         * Maps the specified key to the specified value.
         *
         * @param key
         *            the key.
         * @param value
         *            the value.
         * @return the value of any previous mapping with the specified key or
         *         {@code null} if there was no such mapping.
         */
        @Override public V put(K key, V value) {
            if (key == null) {
                return putValueForNullKey(value);
            }
    
            int hash = Collections.secondaryHash(key);
            HashMapEntry<K, V>[] tab = table;
            int index = hash & (tab.length - 1);
            for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
                if (e.hash == hash && key.equals(e.key)) {
                    preModify(e);
                    V oldValue = e.value;
                    e.value = value;
                    return oldValue;
                }
            }
    
            // No entry for (non-null) key is present; create one
            modCount++;
            if (size++ > threshold) {
                tab = doubleCapacity();
                index = hash & (tab.length - 1);
            }
            addNewEntry(key, value, hash, index);
            return null;
        }
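    In put, the bucket index is hash & (tab.length - 1), and entries that land in the same bucket are chained through e.next. Once size exceeds threshold, put calls tab = doubleCapacity(), so let's look at that method: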
    /**
         * Doubles the capacity of the hash table. Existing entries are placed in
         * the correct bucket on the enlarged table. If the current capacity is,
         * MAXIMUM_CAPACITY, this method is a no-op. Returns the table, which
         * will be new unless we were already at MAXIMUM_CAPACITY.
         */
        private HashMapEntry<K, V>[] doubleCapacity() {
            HashMapEntry<K, V>[] oldTable = table;
            int oldCapacity = oldTable.length;
            if (oldCapacity == MAXIMUM_CAPACITY) {
                return oldTable;
            }
            int newCapacity = oldCapacity * 2;
            HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
            if (size == 0) {
                return newTable;
            }
    
            for (int j = 0; j < oldCapacity; j++) {
                /*
                 * Rehash the bucket using the minimum number of field writes.
                 * This is the most subtle and delicate code in the class.
                 */
                HashMapEntry<K, V> e = oldTable[j];
                if (e == null) {
                    continue;
                }
                int highBit = e.hash & oldCapacity;
                HashMapEntry<K, V> broken = null;
                newTable[j | highBit] = e;
                for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
                    int nextHighBit = n.hash & oldCapacity;
                    if (nextHighBit != highBit) {
                        if (broken == null)
                            newTable[j | nextHighBit] = n;
                        else
                            broken.next = n;
                        broken = e;
                        highBit = nextHighBit;
                    }
                }
                if (broken != null)
                    broken.next = null;
            }
            return newTable;
        }
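    When an insertion pushes the size past the threshold, HashMap creates a new array twice as long as the old one and then moves the existing entries into the new array.
    HashMap is likewise suited to scenarios with frequent reads.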
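
The bit arithmetic in put and doubleCapacity is easier to follow on concrete numbers. The sketch below assumes the table length is always a power of two, which is what the mask hash & (tab.length - 1) relies on. The class BucketIndexDemo and its two methods are hypothetical helpers written for this illustration.

```java
// Hypothetical helper class illustrating the index arithmetic only.
final class BucketIndexDemo {
    // Bucket index for a hash in a table whose capacity is a power of two.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // Where an entry ends up after the table doubles: it either stays at its
    // old index j or moves to j | oldCapacity, depending on one extra hash bit.
    static int indexAfterDoubling(int hash, int oldCapacity) {
        int oldIndex = hash & (oldCapacity - 1);
        int highBit = hash & oldCapacity; // 0 or oldCapacity
        return oldIndex | highBit;
    }
}
```

For example, with hash 21 (binary 10101) and capacity 16, the index is 5. After doubling to 32, the bit for 16 is set in the hash, so the entry moves to index 5 | 16 = 21. A hash of 5 would stay at index 5. This is why doubleCapacity only needs to test e.hash & oldCapacity for each entry instead of recomputing the index from scratch.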

  • HashSet
    A set based on hash values. How does it store its elements? See below:
    public class HashSet<E> extends AbstractSet<E> implements Set<E>, Cloneable,
            Serializable {
    
        private static final long serialVersionUID = -5024744406713321676L;
    
        transient HashMap<E, HashSet<E>> backingMap;
    
        /**
         * Constructs a new empty instance of {@code HashSet}.
         */
        public HashSet() {
            this(new HashMap<E, HashSet<E>>());
        }
    HashSet(HashMap<E, HashSet<E>> backingMap) {
            this.backingMap = backingMap;
        }
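    It simply creates a HashMap. So that is what HashSet looks like inside! The elements of a HashSet are stored as the keys of a backing HashMap, so HashSet is also suited to scenarios with frequent reads.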
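
The idea of a set that delegates to a map can be sketched in a few lines. This is only an illustration built on java.util.HashMap: the class name MapBackedSet and the shared PRESENT placeholder value are assumptions of this sketch, and the excerpt above does not show how the real HashSet fills in the map's values.

```java
import java.util.HashMap;

// Hypothetical illustration class, not the real java.util.HashSet.
final class MapBackedSet<E> {
    // Assumed placeholder stored as the value for every key in this sketch.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> backingMap = new HashMap<>();

    // Returns true only if the element was not already in the set.
    boolean add(E element) {
        return backingMap.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return backingMap.containsKey(element);
    }

    // Returns true only if the element was present and has been removed.
    boolean remove(Object element) {
        return backingMap.remove(element) != null;
    }

    int size() {
        return backingMap.size();
    }
}
```

Uniqueness comes for free here: a map holds each key at most once, so adding the same element twice leaves the set unchanged.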


