Java Collections: A Source-Code Analysis of List

gaoshanliushuizf

Article link: https://blog.csdn.net/u011976443/article/details/115072696

Collection
List
A second look at how LinkedList and ArrayList differ, and a common misconception in many blogs

Collection

First, look at the Javadoc block of Collection:

/**
 * @param <E> the type of elements in this collection
 *
 * @author  Josh Bloch
 * @author  Neal Gafter
 * @see     Set
 * @see     List
 * @see     Map
 * @see     SortedSet
 * @see     SortedMap
 * @see     HashSet
 * @see     TreeSet
 * @see     ArrayList
 * @see     LinkedList
 * @see     Vector
 * @see     Collections
 * @see     Arrays
 * @see     AbstractCollection
 * @since 1.2
 */

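This @see list covers nearly every collection class we use in day-to-day coding. Work through the source of these classes and the Java collection classes will hold few secrets for you.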

public interface Collection<E> extends Iterable<E> {}

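Collection is itself an interface, and it extends Iterable, the interface that lets an object hand out an Iterator and be used in a for-each loop.

A class uses implements to adopt an interface, but an interface can also appear on the left of extends: one interface may extend another, and it may extend several at once. Collection is an example, since it is an interface that extends Iterable, which is also an interface.
[Figure: methods of the Collection interface]
The figure above lists every method of Collection: those inherited from Iterable together with its own, such as toArray, add, and remove.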
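To make the point about interfaces extending interfaces concrete, here is a minimal sketch. The names Named, NamedRange, and SmallRange are hypothetical and invented for this illustration only; they are not part of the JDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class InterfaceDemo {
    // Hypothetical interface, invented for this illustration.
    interface Named {
        String name();
    }

    // An interface may extend several interfaces at once.
    interface NamedRange extends Named, Iterable<Integer> {
    }

    // A class implements the combined interface and supplies every method.
    static final class SmallRange implements NamedRange {
        private final int end;

        SmallRange(int end) {
            this.end = end;
        }

        @Override
        public String name() {
            return "range[0," + end + ")";
        }

        @Override
        public Iterator<Integer> iterator() {
            return new Iterator<Integer>() {
                private int next = 0;

                @Override
                public boolean hasNext() {
                    return next < end;
                }

                @Override
                public Integer next() {
                    if (next >= end) {
                        throw new NoSuchElementException();
                    }
                    return next++;
                }
            };
        }
    }

    // A for-each loop only requires Iterable, the same contract Collection extends.
    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        SmallRange r = new SmallRange(4);
        System.out.println(r.name() + " sums to " + sum(r));
    }
}
```

Because SmallRange is an Iterable through NamedRange, it works in a for-each loop without being a Collection at all.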

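[Figure: types that implement or extend Collection]

The figure above shows the types below Collection in the hierarchy, including ArrayList, ArrayDeque, ArraySet, and the concurrent list and set implementations. Of the three main collection families (Set, List, and Map), Set and List extend Collection; Map does not.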

List

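List is also an interface. Its declaration:

public interface List<E> extends Collection<E> {}

[Figure: methods of the List interface]
The figure above lists its methods. We will use ArrayList's add, remove, update, and query operations to look at the List methods and the Iterator methods.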

ArrayList

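ArrayList extends AbstractList and implements Serializable, which marks it as a serializable class. It also implements List, RandomAccess, and Cloneable.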

public class ArrayList<E> extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{}

ArrayList stores its data in this field:

 transient Object[] elementData; // non-private to simplify nested class access

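The line above shows that ArrayList is backed by an array whose element type is Object, while the class itself is declared with a generic type parameter. This is why an ArrayList can hold objects of any reference type, including user-defined classes. Primitive values cannot be stored directly; they are first autoboxed into wrapper objects such as Integer.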
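A small sketch of what happens when primitive values are added to an ArrayList: the compiler boxes each int into an Integer, so the Object[] only ever holds references. The class and method names here are hypothetical, chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingDemo {
    // Adds primitive ints to a list; each one is boxed to an Integer first.
    static List<Integer> boxed(int... values) {
        List<Integer> list = new ArrayList<>();
        for (int v : values) {
            list.add(v); // the compiler inserts Integer.valueOf(v) here
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = boxed(1, 2, 3);
        Object first = list.get(0);
        System.out.println(first instanceof Integer); // the stored element is an object
        int unboxed = list.get(1);                    // auto-unboxing back to int
        System.out.println(unboxed);
    }
}
```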

/**
 * Constructs an empty list with the specified initial capacity.
 *
 * @param  initialCapacity  the initial capacity of the list
 * @throws IllegalArgumentException if the specified initial capacity
 *         is negative
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

/**
 * Constructs a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param c the collection whose elements are to be placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

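ArrayList has three constructors: one with no arguments, one that takes an initial capacity, and one that takes a Collection. The no-argument constructor, the one used most often, only assigns DEFAULTCAPACITY_EMPTY_ELEMENTDATA, a shared empty array, to elementData. In the author's view, the purpose is to avoid allocating memory before any element is added.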

private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

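Because the backing store is an array, it has a capacity that is fixed when the array is created. That leads to another ArrayList mechanism covered later: growth.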

The iterator method

First, the iterator method in ArrayList:

/**
 * Returns an iterator over the elements in this list in proper sequence.
 *
 * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
 *
 * @return an iterator over the elements in this list in proper sequence
 */
public Iterator<E> iterator() {
    return new Itr();
}

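The comment says the method returns an iterator over the elements in proper sequence. Iteration is implemented by an inner class, and every call to iterator() creates a new Itr.

The source of Itr: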

/**
 * An optimized version of AbstractList.Itr
 */
private class Itr implements Iterator<E> {
    // Android-changed: Add "limit" field to detect end of iteration.
    // The "limit" of this iterator. This is the size of the list at the time the
    // iterator was created. Adding & removing elements will invalidate the iteration
    // anyway (and cause next() to throw) so saving this value will guarantee that the
    // value of hasNext() remains stable and won't flap between true and false when elements
    // are added and removed from the list.
    protected int limit = ArrayList.this.size;

    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor < limit;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        int i = cursor;
        if (i >= limit)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
            limit--;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        Objects.requireNonNull(consumer);
        final int size = ArrayList.this.size;
        int i = cursor;
        if (i >= size) {
            return;
        }
        final Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length) {
            throw new ConcurrentModificationException();
        }
        while (i != size && modCount == expectedModCount) {
            consumer.accept((E) elementData[i++]);
        }
        // update once at end of iteration to reduce heap write traffic
        cursor = i;
        lastRet = i - 1;

        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

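The following line in the inner class's next() method copies only the reference to the backing array into a local variable; it does not copy the elements. next() reads from that array and never writes to it, so calling next() does not change the list's contents: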

 Object[] elementData = ArrayList.this.elementData;

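The following line is the fail-fast check. It fires when the list is structurally modified while an iteration is in progress, whether by another thread or by the same thread calling add or remove directly inside the loop. It is also an indirect sign that ArrayList is not a thread-safe collection.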

  throw new ConcurrentModificationException();

The core of remove() is the following line, which shows that it operates directly on the enclosing ArrayList.

  ArrayList.this.remove(lastRet);
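The two behaviors can be seen side by side in a short single-threaded sketch. The class and method names are hypothetical. The first method removes through the iterator, which resynchronizes expectedModCount; the second modifies the list directly during a for-each loop and reports whether the iterator noticed. The second case is shown only for a removal at the start of a four-element list; fail-fast detection is best effort and is not guaranteed for every position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class FailFastDemo {
    // Removes even numbers through the iterator, so expectedModCount
    // is updated after each removal and no exception is thrown.
    static List<Integer> removeEvensSafely(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Removes the value 1 by calling the list directly inside a for-each loop.
    // Returns true if the iterator threw ConcurrentModificationException.
    static boolean directRemovalFails(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object): changes modCount behind the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(removeEvensSafely(data));
        System.out.println(directRemovalFails(data));
    }
}
```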

The add method and ArrayList growth

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

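There are two add methods: one appends to the end, the other takes an index and inserts at that position. The insertion itself needs little discussion. Look instead at the ensureCapacityInternal call inside add, which leads to another ArrayList mechanism: growth.

First, the source of ensureCapacityInternal: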

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

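The first method handles capacity initialization for a list created with the no-argument constructor: on the first add, the requested capacity is raised to at least DEFAULT_CAPACITY, a constant equal to 10.

Next comes the key method, the one most often discussed: ArrayList's growth mechanism.

/**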
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

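The line below means that when the current capacity is not enough, the array first tries to grow to 1.5 times its current length. If a single call needs more than that (for example an addAll that adds many elements at once), the new capacity becomes minCapacity, that is, the current size plus the number of elements being added. If the computed capacity exceeds MAX_ARRAY_SIZE, hugeCapacity decides the result, and it throws OutOfMemoryError when the required capacity has overflowed int. (For a list created with the no-argument constructor, the first allocation is 10 if the first add needs 10 slots or fewer, and otherwise exactly the number needed.) So the common blog claim that ArrayList simply grows by 1.5 times is not accurate.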

    int newCapacity = oldCapacity + (oldCapacity >> 1);

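That completes the walkthrough of ArrayList's growth mechanism.

ArrayList also has a set method that replaces the element at a given position. It is simple and can be read with the same approach as above.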
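The capacity arithmetic can be replayed without touching ArrayList's private fields. The sketch below is a simulation under stated assumptions, not the library's code: it copies the two comparisons from grow() and ensureCapacityInternal(), omits the MAX_ARRAY_SIZE branch, and models a list created with the no-argument constructor as starting at capacity 0. The class and method names are hypothetical.

```java
final class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Mirrors the arithmetic of grow(): 1.5x the old capacity, or
    // minCapacity if that is larger. The MAX_ARRAY_SIZE branch is omitted.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Simulated capacity after appending `count` elements one at a time
    // to a list created with the no-argument constructor.
    static int capacityAfterAdds(int count) {
        int capacity = 0; // stands in for the shared empty array
        for (int size = 0; size < count; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 11, 16, 23}) {
            System.out.println(n + " adds -> capacity " + capacityAfterAdds(n));
        }
        // A bulk request larger than 1.5x wins over the 1.5x rule.
        System.out.println(nextCapacity(10, 30));
    }
}
```

Under these assumptions the simulated capacities run 10, 15, 22, 33, while a single request for 30 slots from capacity 10 jumps straight to 30.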

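LinkedList analysis

Linked lists have always seemed a bit complicated, with head and tail pointers to keep track of, but a more involved structure also supports more operations. Let's go through the source.

Constructors

LinkedList is somewhat more complex than ArrayList. Start with the class declaration and the constructors.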

public class LinkedList<E>
extends AbstractSequentialList<E>
implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{}

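The declaration shows that it implements List, Deque, Cloneable, and Serializable: it can be serialized, it can serve as a double-ended queue, and it is also a List.

Next, the fields of the class: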

transient int size = 0;

/**
 * Pointer to first node.
 * Invariant: (first == null && last == null) ||
 *            (first.prev == null && first.item != null)
 */
transient Node<E> first;

/**
 * Pointer to last node.
 * Invariant: (first == null && last == null) ||
 *            (last.next == null && last.item != null)
 */
transient Node<E> last;

There are only three fields: size, the length of the list, and two Node references, first for the head and last for the tail.

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

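Node is a static nested class of LinkedList. Each node holds two links, prev and next, pointing to the previous and the next node, while item holds the element stored in the node.

Now the constructors.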

/**
 * Constructs an empty list.
 */
public LinkedList() {
}

/**
 * Constructs a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param  c the collection whose elements are to be placed into this list
 * @throws NullPointerException if the specified collection is null
 */
public LinkedList(Collection<? extends E> c) {
    this();
    addAll(c);
}

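There are two constructors. We look at the one that takes a collection first and analyze add afterwards.

The work of that constructor is done by addAll. addAll(c) delegates to addAll(size, c), shown here: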

/**
 * Inserts all of the elements in the specified collection into this
 * list, starting at the specified position.  Shifts the element
 * currently at that position (if any) and any subsequent elements to
 * the right (increases their indices).  The new elements will appear
 * in the list in the order that they are returned by the
 * specified collection's iterator.
 *
 * @param index index at which to insert the first element
 *              from the specified collection
 * @param c collection containing elements to be added to this list
 * @return {@code true} if this list changed as a result of the call
 * @throws IndexOutOfBoundsException {@inheritDoc}
 * @throws NullPointerException if the specified collection is null
 */
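public boolean addAll(int index, Collection<? extends E> c) {
    // Check that index is within [0, size]; throws IndexOutOfBoundsException otherwise
    checkPositionIndex(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    if (numNew == 0)
        return false;

    Node<E> pred, succ;
    // The collection constructor passes index == size
    if (index == size) {
        // Appending at the end: no successor, and pred starts as the current tail
        succ = null;
        pred = last;
    } else {
        // Inserting in the middle: succ is the node currently at index and pred is
        // the node before it; the new elements go between the two
        succ = node(index);
        pred = succ.prev;
    }

    // Insert the elements one by one
    for (Object o : a) {
        // Create a node for this element, linked back to pred
        @SuppressWarnings("unchecked") E e = (E) o;
        Node<E> newNode = new Node<>(pred, e, null);
        if (pred == null)
            // No predecessor (empty list or insertion at index 0): the new node becomes the head
            first = newNode;
        else
            // Otherwise link the new node after pred
            pred.next = newNode;

        // Advance pred so it always refers to the most recently inserted node,
        // i.e. the predecessor of the next insertion point
        pred = newNode;
    }

    if (succ == null) {
        // Appended at the end: the last inserted node (pred) becomes the new tail
        last = pred;
    } else {
        // Inserted in the middle: link the last inserted node and succ to each other
        pred.next = succ;
        succ.prev = pred;
    }

    size += numNew;
    modCount++;
    return true;
}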

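The comments above cover addAll. The figure below, borrowed from the web, may help in picturing the structure. It depicts a circular list, whereas the LinkedList described here is a non-circular doubly linked list, so ignore the part inside the red box.

[Figure: linked-list structure]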
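The two branches of addAll (append when index equals size, splice when it is smaller) can be observed from the outside with the real java.util.LinkedList. The helper name splice is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class AddAllDemo {
    // Builds a LinkedList from `base` (the collection constructor),
    // then inserts all of `extra` starting at `index`.
    static List<String> splice(List<String> base, int index, List<String> extra) {
        LinkedList<String> list = new LinkedList<>(base);
        list.addAll(index, extra);
        return list;
    }

    public static void main(String[] args) {
        List<String> base = Arrays.asList("a", "d");
        // index < size: the new elements go before the node currently at index 1
        System.out.println(splice(base, 1, Arrays.asList("b", "c")));
        // index == size: the new elements are appended after the tail
        System.out.println(splice(base, 2, Arrays.asList("b", "c")));
    }
}
```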

The add method

/**
 * Appends the specified element to the end of this list.
 *
 * <p>This method is equivalent to {@link #addLast}.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    linkLast(e);
    return true;
}

The work of add is done by linkLast. As the name suggests, add inserts the element at the last position.

/**
 * Links e as last element.
 */
void linkLast(E e) {
    // Keep a reference to the current tail
    final Node<E> l = last;
    // Create the new node; its prev is the old tail
    final Node<E> newNode = new Node<>(l, e, null);
    // The new node becomes the tail
    last = newNode;
    if (l == null)
        // The list was empty, so the new node is also the head
        first = newNode;
    else
        // Otherwise link the old tail forward to the new node
        l.next = newNode;
    size++;
    modCount++;
}
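
The same linking logic can be tried in isolation. The following is a stripped-down sketch, not the JDK class: TinyLinkedList and its forward and backward helpers are hypothetical names, and modCount, generic List methods, and iterators are all left out. Walking the list in both directions shows that linkLast keeps the prev and next links consistent.

```java
final class TinyLinkedList<E> {
    // Same shape as LinkedList.Node: an item plus links in both directions.
    private static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    // Appends e after the current tail, as in the method above.
    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) {
            first = newNode; // empty list: the new node is head and tail
        } else {
            l.next = newNode;
        }
        size++;
    }

    int size() {
        return size;
    }

    // Concatenates the items from head to tail by following next links.
    String forward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = first; x != null; x = x.next) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    // Concatenates the items from tail to head by following prev links.
    String backward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = last; x != null; x = x.prev) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.linkLast("a");
        list.linkLast("b");
        list.linkLast("c");
        System.out.println(list.forward() + " " + list.backward());
    }
}
```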

Now the add overload that takes an index.

/**
 * Inserts the specified element at the specified position in this list.
 * Shifts the element currently at that position (if any) and any
 * subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}

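If the insertion position equals size, the element is linked at the end; otherwise it is linked before an existing node. What does that mean?

Put simply, if the index you pass is smaller than the list's size, the element is inserted before the node currently at that index. Can index be greater than size? No, because checkPositionIndex runs first and throws IndexOutOfBoundsException for any index outside [0, size]. The source of linkBefore: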

/**
 * Inserts element e before non-null Node succ.
 */
void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    // pred is the node before the insertion target
    final Node<E> pred = succ.prev;
    // Create the new node between pred and succ
    final Node<E> newNode = new Node<>(pred, e, succ);
    // succ's prev now points at the new node
    succ.prev = newNode;
    if (pred == null)
        // succ was the head, so the new node becomes the head
        first = newNode;
    else
        // Otherwise link pred forward to the new node
        pred.next = newNode;
    size++;
    modCount++;
}
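
The three cases of add(index, element) can be checked against the real java.util.LinkedList: insertion before an existing node, insertion at index == size, and rejection of an out-of-range index. The helper names insert and rejects are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class AddIndexDemo {
    // Copies `base` into a LinkedList and inserts `value` at `index`.
    static List<String> insert(List<String> base, int index, String value) {
        LinkedList<String> list = new LinkedList<>(base);
        list.add(index, value);
        return list;
    }

    // Returns true if add(index, ...) is rejected with IndexOutOfBoundsException.
    static boolean rejects(List<String> base, int index) {
        try {
            insert(base, index, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insert(Arrays.asList("a", "c"), 1, "b")); // linkBefore the node at index 1
        System.out.println(insert(Arrays.asList("b", "c"), 0, "a")); // linkBefore the head
        System.out.println(insert(Arrays.asList("a", "b"), 2, "c")); // index == size, linkLast
        System.out.println(rejects(Arrays.asList("a", "b"), 3));     // index > size
    }
}
```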

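That concludes this brief look at the LinkedList source.

Reference: https://blog.csdn.net/qedgbmwyz/article/details/80108618

A second look at how LinkedList and ArrayList differ, and a common misconception in many blogs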

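The fundamental difference between ArrayList and LinkedList is how they store data: one is a sequential (array) structure, the other a linked structure. Physically, the array occupies contiguous memory, while the nodes of a linked list need not be contiguous.

Many blogs say ArrayList is slow at insertion and deletion but fast at lookup, and LinkedList the reverse, and they quote complexities: for ArrayList, insertion and deletion are O(n) and lookup is O(1);

for LinkedList, insertion and deletion are O(1) and lookup is O(n).

This is quite inaccurate. If lookup in a LinkedList is slow, then inserting at a given position must first locate that position and only then insert, so hasn't the cost of the whole operation already gone up? The source confirms it: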

/**
 * Inserts the specified element at the specified position in this list.
 * Shifts the element currently at that position (if any) and any
 * subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}

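The code above means: if the element goes at the end of the list it is linked directly; otherwise linkBefore is called, and one of its arguments is node(index). Here is the source of node():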

/**
 * Returns the (non-null) Node at the specified element index.
 */
Node<E> node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

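The source compares the index with half the list's length: if the index is in the second half it searches backward from the tail, and if it is in the first half it searches forward from the head. This confirms the reasoning above. So it is inaccurate to state flatly that a linked list inserts and deletes efficiently: the relinking is O(1), but locating the position is O(n), so for a long list an insertion or deletion by index is not cheap.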
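To put a number on the traversal cost, the sketch below counts how many pointer hops node(index) performs for a given size and index. It copies only the branch condition and loop bounds from the source above; the class and method names are hypothetical.

```java
final class NodeHops {
    // Number of x = x.next or x = x.prev steps node(index) takes
    // for a list of the given size.
    static int hops(int size, int index) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        }
        return size - 1 - index;     // walk backward from last
    }

    // Largest hop count over all valid element indexes.
    static int worstCase(int size) {
        int worst = 0;
        for (int i = 0; i < size; i++) {
            worst = Math.max(worst, hops(size, i));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(hops(1000, 0));    // head: no traversal
        System.out.println(hops(1000, 999));  // tail: no traversal
        System.out.println(worstCase(1000));  // middle of the list
    }
}
```

The worst case grows in proportion to the list length (about half of it), which is why insertion by index is O(n) even though the relinking itself is constant time.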

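ArrayList is an array-backed list
get(index) reads the slot directly, O(1)
add(E) appends at the end, amortized O(1) (an occasional grow copies the array)
add(index, E) inserts at the given index; the elements from that index onward shift right, O(n)
remove(index) deletes an element; the elements after it shift left, O(n)
LinkedList is a linked list
get(index) walks the nodes from the nearer end, O(n)
add(E) appends at the end, O(1)
add(index, E) must first find the node at that index and then relinks pointers, O(n)
remove(index) likewise must find the node before unlinking it, O(n)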
