Java Collections

H'Complicated

Last modified: 2023-08-15 13:37:45

Tags: java, data structures

First published: 2023-05-11 23:36:53

Article link: https://blog.csdn.net/weixin_44369859/article/details/130630815

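This article describes the implementation classes of the Collection and Map interfaces in the Java collections framework, such as ArrayList, LinkedList, Vector, HashSet, and HashMap. It covers their underlying implementations, characteristics, thread safety, and performance differences, including growth strategies and how elements are added and removed.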

Introduction to Collections

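Java collections fall into two main groups. Implementations of the Collection interface are single-column collections (each entry holds a single element); implementations of the Map interface are two-column collections (each entry holds a key-value pair).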

1. The Collection interface

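When add is called with a primitive value, the value is autoboxed.
remove has two forms: on a List, remove(int index) returns the removed element, while remove(Object o), declared on Collection, returns whether an element was removed.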
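
The two remove forms are easy to mix up on a list of integers, so here is a minimal sketch. The class name RemoveOverloadDemo and its helper methods are hypothetical and exist only for illustration; only standard java.util calls are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class RemoveOverloadDemo {
    // remove(int index): removes the element at that index and returns it.
    static Integer removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        return list.remove(1); // index 1 holds 20
    }

    // remove(Object o): removes the first equal element and reports success.
    static boolean removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        return list.remove(Integer.valueOf(30)); // the boxed argument selects remove(Object)
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(10); // the int literal is autoboxed to Integer
        System.out.println(list);
        System.out.println(removeByIndex());
        System.out.println(removeByValue());
    }
}
```

Passing a plain int always selects the index form, so removing the value 30 from a List<Integer> needs Integer.valueOf(30) or a cast to Object.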

1.1 The List interface

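Characteristics of the List interface:
1. Elements in a List are ordered (retrieval order matches insertion order) and may be repeated.
2. Every element in a List has a corresponding index.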

Common List implementations: ArrayList, LinkedList, Stack, Vector

1.1.1 ArrayList

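1. ArrayList is backed by an array.
2. An ArrayList can contain null.
3. It is not thread-safe. The modCount field counts structural modifications so that iterators can fail fast with a ConcurrentModificationException.
4. A default-constructed ArrayList starts with capacity 0; adding the first element grows it to 10 (via the grow method, which calls Arrays.copyOf()). Each later expansion grows the capacity to 1.5 times its previous value.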

    /**
     * Returns a capacity at least as large as the given minimum capacity.
     * Returns the current capacity increased by 50% if that suffices.
     * Will not return a capacity greater than MAX_ARRAY_SIZE unless
     * the given minimum capacity is greater than MAX_ARRAY_SIZE.
     *
     * @param minCapacity the desired minimum capacity
     * @throws OutOfMemoryError if minCapacity is less than zero
     */
    private int newCapacity(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        // New capacity = old capacity + (old capacity >> 1), i.e. roughly 1.5 times the old capacity
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity <= 0) {
            if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
                return Math.max(DEFAULT_CAPACITY, minCapacity);
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            return minCapacity;
        }
        return (newCapacity - MAX_ARRAY_SIZE <= 0)
            ? newCapacity
            : hugeCapacity(minCapacity);
    }
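
The growth rule can be traced with plain arithmetic. The sketch below is a simplification, assuming a default-constructed list that grows one element at a time; it ignores the overflow and MAX_ARRAY_SIZE handling of the real method, and the class name ArrayListGrowthSketch is hypothetical.

```java
// Hypothetical sketch of the capacity sequence; not the JDK implementation.
class ArrayListGrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Simplified rule: 0 -> 10 on the first add, then old + old / 2.
    static int nextCapacity(int oldCapacity) {
        if (oldCapacity == 0) {
            return DEFAULT_CAPACITY;
        }
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int capacity = 0;
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity);
            System.out.println(capacity); // 10, 15, 22, 33, 49
        }
    }
}
```

Because of the integer shift, the factor is only approximately 1.5 for odd capacities (15 grows to 22, not 22.5).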

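5. The backing elementData array is marked transient, so default serialization skips it. ArrayList instead provides its own writeObject and readObject methods, which write only the first size elements rather than the whole array including unused slots.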

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData; // non-private to simplify nested class access
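
Even though elementData is transient, the elements of an ArrayList survive a serialization round trip. The following is a minimal sketch using in-memory streams; the class name ArrayListSerializationDemo and the roundTrip helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ArrayListSerializationDemo {
    // Serializes the list to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static List<String> roundTrip(ArrayList<String> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (List<String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(roundTrip(list)); // [a, b, c]
    }
}
```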

1.1.2 Vector

1. It is also backed by an Object array.

    /**
     * The array buffer into which the components of the vector are
     * stored. The capacity of the vector is the length of this array buffer,
     * and is at least large enough to contain all the vector's elements.
     *
     * <p>Any array elements following the last element in the Vector are null.
     *
     * @serial
     */
    protected Object[] elementData;

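2. It is thread-safe but slower than ArrayList (its methods are declared synchronized).
3. The default capacity is 10, and each expansion doubles the capacity (using the same copy-based mechanism as ArrayList). Doubling is only the default: if capacityIncrement is set to a positive value, the capacity grows by capacityIncrement each time.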
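
Unlike ArrayList, Vector exposes its capacity through the public capacity() method, so both growth modes can be observed directly. This is a small sketch; the class name VectorGrowthDemo and the capacityAfter helper are hypothetical.

```java
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class VectorGrowthDemo {
    // Adds `count` elements and returns the resulting capacity.
    static int capacityAfter(Vector<Integer> vector, int count) {
        for (int i = 0; i < count; i++) {
            vector.add(i);
        }
        return vector.capacity();
    }

    public static void main(String[] args) {
        // Default constructor: capacity 10, doubles when the 11th element arrives.
        System.out.println(capacityAfter(new Vector<>(), 11));      // 20
        // Initial capacity 10 with capacityIncrement 5: grows by 5 instead.
        System.out.println(capacityAfter(new Vector<>(10, 5), 11)); // 15
    }
}
```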

1.1.3 LinkedList

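1. It is implemented as a doubly linked list and implements both the List and Deque interfaces.
2. It is not thread-safe.
3. It can hold any element, including null.
4. Compared with ArrayList, insertion and removal are cheaper once the node is located, because no elements are shifted; reaching a position by index still requires traversal.
5. add calls linkLast, which appends the element at the tail.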

    /**
     * Links e as last element.
     */
    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

6. The no-argument remove method removes the first element.

    /**
     * Retrieves and removes the head (first element) of this list.
     *
     * @return the head of this list
     * @throws NoSuchElementException if this list is empty
     * @since 1.5
     */
    public E remove() {
        return removeFirst();
    }

7. get calls node internally; because the list is doubly linked, the search starts from whichever end is closer to the index.

    /**
     * Returns the (non-null) Node at the specified element index.
     */
    Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }
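
The behaviour of add, the no-argument remove, and get can be checked with a short example. The class name LinkedListDemo and its demo method are hypothetical; the calls themselves are standard LinkedList methods.

```java
import java.util.Arrays;
import java.util.LinkedList;

// Hypothetical demo class, not part of the JDK.
class LinkedListDemo {
    static String demo() {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        list.add("d");               // appended at the tail: [a, b, c, d]
        String head = list.remove(); // removes the head "a": [b, c, d]
        list.addFirst("z");          // Deque operation at the head: [z, b, c, d]
        String second = list.get(1); // index lookup walks from the nearer end: "b"
        return head + ":" + second + ":" + list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a:b:[z, b, c, d]
    }
}
```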

1.2 The Set interface

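1. Unordered (retrieval order need not match insertion order) and without indexes. This describes HashSet; LinkedHashSet and TreeSet do maintain an order.
2. No duplicate elements (adding a duplicate does not throw; add returns false), and at most one null.
3. Although retrieval order differs from insertion order, it is stable: iterating the same unmodified set yields the same order each time.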
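
The duplicate rule can be seen from the return value of add. Below is a minimal sketch using HashSet; the class name SetDuplicateDemo and the addTwice helper are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class SetDuplicateDemo {
    // Adds the same value twice and returns the result of the second add.
    static boolean addTwice(Set<String> set, String value) {
        set.add(value);
        return set.add(value); // false: the value is already present
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(addTwice(set, "x"));  // false
        System.out.println(addTwice(set, null)); // false: only one null is kept
        System.out.println(set.size());          // 2
    }
}
```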

1.2.1 HashSet

1. It is backed by a HashMap.

    /**
     * Constructs a new, empty set; the backing {@code HashMap} instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
        map = new HashMap<>();
    }

2. How add works (it calls HashMap's put with the element as the key and PRESENT, a shared dummy Object, as the value):

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element {@code e} to this set if
     * this set contains no element {@code e2} such that
     * {@code Objects.equals(e, e2)}.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns {@code false}.
     *
     * @param e element to be added to this set
     * @return {@code true} if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

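First, the hash value is computed by XORing the hashCode with the hashCode unsigned-right-shifted by 16 bits; the bucket index is then (n - 1) & hash, where n is the table length.
If that bucket is empty, the element is added directly.
If the bucket is not empty, the hash values are compared and then equals is called.
If an equal element is found, the add is abandoned and false is returned.
Otherwise the comparison continues down the chain; if no node matches, the element is appended at the end. The method that does this is shown below.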

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
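        // table is the Node array inside the HashMap
        if ((tab = table) == null || (n = tab.length) == 0)
            // If the array is null or has length 0, resize() initializes it: the default capacity is 16
            // and the load factor is 0.75, and the map's threshold is set. Later expansions double the array size.
            n = (tab = resize()).length;
        // Read the slot where the key belongs and assign it to p.
        // If p is null, create a new node and store it there.
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        // Otherwise the slot is already occupied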
        else {
            Node<K,V> e; K k;
            // The first if checks whether the hash of the first node in this bucket equals the hash of the key being added,
            // then checks, by reference comparison or by equals, whether the two keys are the same.
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // The second if checks whether p is a red-black tree node;
            // if so, putTreeVal is called to add the element.
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
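            // The remaining case: the bucket holds a linked list whose first key differs from the key being added.
            else {
                // Compare the key against each node in the list in turn.
                for (int binCount = 0; ; ++binCount) {
                    // If the end is reached without a match, append a new node.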
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
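                        // If the list now holds more than 8 nodes, treeifyBin is called. treeifyBin first checks
                        // the table length: below 64 (MIN_TREEIFY_CAPACITY) it resizes the table instead of treeifying
                        // (the nodes are kept, but their bucket index may change); only at 64 or above is the bin
                        // converted to a red-black tree.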
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // If a node with an equal key is found, stop without adding a node.
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
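            // If e is not null, the key already exists: no node is added, the value is replaced
            // (unless onlyIfAbsent is true and the old value is non-null), and the old value is returned.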
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        // If the number of entries exceeds threshold, the table is resized.
        if (++size > threshold)
            resize();
        // This hook is empty in HashMap; subclasses such as LinkedHashMap override it.
        afterNodeInsertion(evict);
        return null;
    }
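
The hash spreading and index calculation used by putVal can be reproduced in isolation. The sketch below mirrors the XOR-and-shift step and the (n - 1) & hash index; the class name HashIndexSketch and its method names are hypothetical, and the code is an illustration rather than the JDK source.

```java
// Hypothetical sketch; mirrors the arithmetic only, not the full HashMap logic.
class HashIndexSketch {
    // XOR the hashCode with its upper 16 bits so high bits influence the index.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Index into a table whose length is a power of two.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex("a", 16));   // "a".hashCode() is 97, and 97 & 15 = 1
        System.out.println(bucketIndex(null, 16));  // null keys always map to bucket 0
        System.out.println(bucketIndex(65536, 16)); // 1: the high bit 0x10000 is folded into the low bits
    }
}
```

Without the shift, the key 65536 would land in bucket 0 of a 16-slot table, because only its low four bits would be used.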

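Advantages of a red-black tree over a linked list:
Lookup, insertion, and deletion in a red-black tree take O(log n) time, whereas searching a linked list takes O(n). Note also that a list is converted to a red-black tree at a threshold of 8, while a tree is converted back to a list at a threshold of 6. Suppose both thresholds were 8 and a bin currently held 7 nodes. A new key arrives that maps to the same bucket as those seven nodes (a hash collision), so it is appended after the seventh node; the treeify threshold is reached (assuming the MIN_TREEIFY_CAPACITY condition also holds) and the bin becomes a red-black tree. Removing a single node would then drop the bin below the threshold and convert it straight back to a list, so a bin hovering around that size would keep converting back and forth. Keeping a gap between 6 and 8 avoids this repeated conversion.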

1.2.2 LinkedHashSet

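1. LinkedHashSet is backed by a LinkedHashMap (which is built on an array plus a doubly linked list).
2. LinkedHashSet uses each element's hashCode to decide where it is stored, and uses the linked list to maintain the elements' order, so iteration follows insertion order.
3. The underlying array has type HashMap.Node[], but the stored elements are of type LinkedHashMap.Entry (a subclass of HashMap.Node).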

    /**
     * HashMap.Node subclass for normal LinkedHashMap entries.
     */
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

4. Adding follows the same logic as HashSet; after the node is added, it is also linked into the doubly linked list.
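
The ordering guarantee is visible when iterating. A minimal sketch follows; the class name LinkedHashSetOrderDemo and the insertionOrder helper are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class LinkedHashSetOrderDemo {
    // Adds the values in the given order and joins the iteration order with commas.
    static String insertionOrder(String... values) {
        Set<String> set = new LinkedHashSet<>();
        for (String value : values) {
            set.add(value); // a repeated value is ignored and keeps its original position
        }
        return String.join(",", set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("banana", "apple", "cherry", "apple")); // banana,apple,cherry
    }
}
```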

1.2.3 TreeSet

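1. TreeSet offers a constructor that takes a comparator, and elements are ordered by that comparator. On each put, the new key is compared with nodes along the path down from the root; if a comparison returns 0, the element is not added.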

        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else 
                    return t.setValue(value);
            } while (t != null);
        }

2. It is backed by a TreeMap.
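
A consequence of the comparison rule is that the comparator, not equals, decides what counts as a duplicate. The sketch below orders strings by length only; the class name TreeSetComparatorDemo and the byLength helper are hypothetical.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
class TreeSetComparatorDemo {
    // Builds a TreeSet ordered by string length only.
    static TreeSet<String> byLength(String... values) {
        TreeSet<String> set = new TreeSet<>(Comparator.comparingInt(String::length));
        for (String value : values) {
            set.add(value); // rejected when an element of the same length already exists
        }
        return set;
    }

    public static void main(String[] args) {
        // "dd" compares as 0 against "bb", so it is not added.
        System.out.println(byLength("ccc", "a", "bb", "dd")); // [a, bb, ccc]
    }
}
```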

2. The Map interface

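Characteristics of Map implementations:
1. Keys and values can be any reference type. HashMap allows both to be null; some other implementations, such as Hashtable, do not. Keys cannot repeat, but values can. When put is called with an existing key and a different value, the new value replaces the old one and put returns the old value.
2. Entries are stored without a guaranteed order (this holds for HashMap).
3. To make traversal convenient, an EntrySet collection exposes the Entry objects (key-value pairs; Entry provides getKey and getValue).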
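
The replace-and-return behaviour of put can be shown in a few lines with HashMap. The class name MapPutDemo and its demo method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class MapPutDemo {
    static String demo() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("k", "v1");  // no previous mapping: returns null
        String second = map.put("k", "v2"); // replaces "v1" and returns it
        map.put(null, null);                // HashMap permits a null key and null values
        return first + "," + second + "," + map.get("k") + "," + map.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // null,v1,v2,2
    }
}
```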

Ways to traverse a Map:
1. Call keySet to get all keys, then call get for each key to fetch its value.
2. Call values to get all values in the map.
3. Call entrySet to get all entries (each with its key and value).
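
The three traversal styles can be compared side by side. In this sketch each method sums the values of a map in a different way; the class name MapTraversalDemo and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class MapTraversalDemo {
    // 1. Iterate the keys, then look each value up with get.
    static int sumViaKeySet(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet()) {
            sum += map.get(key);
        }
        return sum;
    }

    // 2. Iterate the values directly (keys are not available here).
    static int sumViaValues(Map<String, Integer> map) {
        int sum = 0;
        for (int value : map.values()) {
            sum += value;
        }
        return sum;
    }

    // 3. Iterate the entries, which carry both key and value.
    static int sumViaEntrySet(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            sum += entry.getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumViaKeySet(map));
        System.out.println(sumViaValues(map));
        System.out.println(sumViaEntrySet(map));
    }
}
```

When both key and value are needed, entrySet avoids the extra lookup that the keySet style performs for every key.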

2.1 HashMap

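1. Not thread-safe.
2. Built from an array plus linked lists and red-black trees.
3. The default load factor is 0.75. The table is not allocated until first use; the first resize creates an array of 16, and each later resize doubles it, up to MAXIMUM_CAPACITY (1 << 30). A bin is treeified when its list grows longer than 8 and the array length is at least 64; a tree is converted back to a list at a threshold of 6.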

2.2 Hashtable

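1. Neither keys nor values in a Hashtable can be null.
2. It is used in much the same way as HashMap.
3. It is thread-safe.
4. It is backed by an array (default initial size 11) plus linked lists.
5. The default load factor is also 0.75.
6. On expansion the new capacity is the old capacity * 2 + 1.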
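
The null restriction and the growth rule can be sketched as follows. The class name HashtableNullDemo and its helpers are hypothetical, and nextCapacity is only the arithmetic from the list above, not the JDK's rehash code.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class HashtableNullDemo {
    // Returns true if the map throws NullPointerException for a null key.
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns true if the map throws NullPointerException for a null value.
    static boolean rejectsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Growth arithmetic only: old capacity * 2 + 1.
    static int nextCapacity(int oldCapacity) {
        return oldCapacity * 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullKey(new Hashtable<>()));   // true
        System.out.println(rejectsNullValue(new Hashtable<>())); // true
        System.out.println(rejectsNullKey(new HashMap<>()));     // false
        System.out.println(nextCapacity(11));                    // 23
    }
}
```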

2.3 ConcurrentHashMap

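In multithreaded code, HashMap can get stuck in an infinite loop: in JDK 7 and earlier, concurrent put calls during a resize could link a bucket's list into a cycle, so that no node's next is ever null. Hashtable is safe but comparatively slow, because it guards its methods with synchronized. ConcurrentHashMap addresses both problems. Up to JDK 7 it used segment locking, which lets multiple threads access different segments at the same time; since JDK 8 it locks individual bins and uses CAS operations instead.
The sizeCtl field in ConcurrentHashMap is the control value for table initialization and resizing. When negative, the table is being initialized or resized (-1 means initialization; -N means N-1 threads are resizing). Otherwise, when the table is null, it holds the initial table size to use on creation, or 0 for the default. After initialization, it holds the element count at which the table will next be resized.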
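
A short sketch of safe concurrent updates, assuming a shared counter keyed by the string "hits". The class name ConcurrentCounterDemo and the countConcurrently helper are hypothetical; merge is used because it performs the read-modify-write atomically, which a separate get followed by put would not.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of the JDK.
class ConcurrentCounterDemo {
    // Starts `threads` workers that each increment the same key `incrementsPerThread` times.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000)); // 4000
    }
}
```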