java数据结构之List

最新推荐文章于 2024-05-15 17:22:30 发布

luckystar2008

最新推荐文章于 2024-05-15 17:22:30 发布

阅读量9.9k

点赞数 3

文章标签： java List

本文链接：https://blog.csdn.net/qincidong/article/details/82491982

版权

Core Java 专栏收录该内容

155 篇文章 0 订阅

订阅专栏

List是java中重要的数据结构之一，这里介绍常用的3种实现方式：ArrayList、Vector、LinkedList。
类图如下：

可以看到，ArrayList、Vector、LinkedList都是AbstractList的实现。而AbstractList实现了List接口，并扩展自AbstractCollection。

其中，ArrayList和Vector底层使用了数组，LinkedList使用的是循环双向链表。这是2种完全不同的实现，所有这也决定了它们的适应的工作场景不同。

ArrayList和Vector的区别是对多线程的支持，ArrayList没有对任何一个方法做同步，因此不是线程安全的。而Vector中绝大多数方法都做了线程同步。所以在多线程环境中，建议使用Vector，反之则使用ArrayList。

ArrayList和Vector实现除了同步几乎一样，所以下面以ArrayList进行说明。

添加元素

public boolean add(E e) {
    // 确保内部数组有足够的空间
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // 将元素添加到数组的末尾
    elementData[size++] = e;
    return true;
}
private void ensureCapacityInternal(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // 新的数组容量是原来数组的1.5倍
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // 如果新的容量小于需要的最小容量，则使用最小需要的容量
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

在添加元素前，ArrayList先确保数组空间够用，否则会对数组进行扩容。新数组的长度是原数组的1.5倍。

通过代码可以知道，如果ArrayList内部数组的空间足够大，那么添加元素是很快的。在数组的容量不够时，会进行扩容，在扩容的过程中，会进行大量的数组的复制操作。

LinkedList的add操作

public boolean add(E e) {
    linkLast(e);
    return true;
}
void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

将新的元素放到最后。并设置添加前的最后一个元素的下一个元素是要添加的元素。
可以看到，LinkedList不用担心容量的问题，添加的效率也是很高的，只是每次添加时会生成一个新的Node对象，会有一定的性能损耗。在ArrayList容量足够的情况下，ArrayList的速度是非常快的，是直接操作的数组。

添加操作性能测试

1.ArrayList的添加操作

long s = System.currentTimeMillis();
for (int i=0;i<5000000;i++) {
    list.add(i);
}
System.out.println(System.currentTimeMillis() - s);

耗时：161ms.

2.LinkedList的添加操作

long s = System.currentTimeMillis();
for (int i=0;i<5000000;i++) {
    linkedList.add(i);
}
System.out.println(System.currentTimeMillis() - s);

耗时：3476ms.

如果列表容量足够（不用扩容），添加元素到尾部，推荐使用ArrayList，效率高于LinkedList。

添加元素到任意位置

1.ArrayList

public void add(int index, E element) {
    rangeCheckForAdd(index);
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

可以看到，ArrayList添加元素到指定位置时，会导致index之后的所有元素向后移动，如果数据量大而index比较小的话，产生的数组拷贝是非常多的。这样必然导致效率降低。所以，尽量将元素插入到列表的尾部，可以提高该方法的性能。

2.LinkedList

public void add(int index, E element) {
    checkPositionIndex(index);
    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}
void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    final Node<E> pred = succ.prev;
    final Node<E> newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode;
    size++;
    modCount++;
}
Node<E> node(int index) {
    // assert isElementIndex(index);
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

如果index=size，就跟add(E e)是一样的，添加在最后；否则调用linkBefore方法。
在linkBefore方法中，注意第2个参数node(index)，该方法中，如果index在LinkedList的前半部分就在前半部分查找；否则在后半部分查找。所以，可以看到，如果index在靠近中间的位置，效率比较低。

添加元素到任意位置测试

1.ArrayList

long s = System.currentTimeMillis();
for (int i=0;i<50000;i++) {
    list.add(0,i);
}
System.out.println(System.currentTimeMillis() - s);

耗时：236ms。

2.LinkedList

long s = System.currentTimeMillis();
for (int i=0;i<50000;i++) {
    linkedList.add(i);
}
System.out.println(System.currentTimeMillis() - s);

耗时：5ms。

除了一些极端的情况（首尾），添加元素到列表任意位置，LinkedList效率高于ArrayList。所以如果有在任意位置插入元素的需求，可以考虑LinkedList。

删除任意位置的元素

ArrayList

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // Let gc do its work
    return oldValue;
}

可以看到，跟add(index,E e)方法比较类似，在任意位置删除元素后，都需要进行数组的重组。而且，要删除的元素越靠前，开销越小；要删除的元素越靠后，开销越大。

LinkedList

public E remove(int index) {
    checkElementIndex(index);
    return unlink(node(index));
}
 E unlink(Node<E> x) {
    // assert x != null;
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;
    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }
    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }
    x.item = null;
    size--;
    modCount++;
    return element;
}

注意remove方法中，最后调用了unlink方法，该方法中也查找了index位置的元素。代码同add(index,E)。
如果元素在前半段，则在链表的前半段循环查找；否则在链表的后半段循环查找。所以如果要删除的元素在前半段或后半段还好，如果在中间位置效率就比较低了（需要循环一半的链表）。

容量参数

在前面添加元素的地方说了，ArrayList和Vector在容量不够的情况下会进行扩容，新的容量是原来的1.5倍，扩容后会对之前的数据重组，所以性能损耗是比较大的。
在ArrayList和Vector的构造器中，有一个初始容量参数可指定列表的容量，如果在构造列表时预估数据量，将会避免进行列表扩容，从而可以提升效率。

列表遍历

分别使用ArrayList和LinkedList进行for循环、forEach循环、迭代器进行100万条数据的遍历测试。
测试代码：

for (int i=0;i<1000000;i++) {
    list.add(i);
}
long s = System.currentTimeMillis();
for (int i=0;i<list.size();i++) {
    list.get(i);
}
System.out.println("for循环耗时：" + (System.currentTimeMillis() - s) + "ms.");
s = System.currentTimeMillis();
Integer tmp = null;
for (Integer i:list) {
    tmp = i;
}
System.out.println("forEach耗时："+(System.currentTimeMillis() - s) + "ms.");
s = System.currentTimeMillis();
for (Iterator<Integer> iterator = list.iterator();iterator.hasNext();) {
    tmp = iterator.next();
}
System.out.println("迭代器耗时：" + (System.currentTimeMillis() - s) + "ms.");

ArrayList测试结果：
for循环耗时：4ms.
forEach耗时：17ms.
迭代器耗时：10ms.

LinkedList测试结果：
forEach耗时：11ms.
迭代器耗时：15ms.
for循环耗时：无穷大，没法等到执行结束。

注意：上面代码在测试时将for循环放到最后，否则测试LinkedList时，看不到forEach和迭代器的执行结果。

可以看到，对于ArrayList来说，for循环的效率是最高的，forEach反而最低。LinkedList使用for循环完全没法忍受，原因可以看之前的LinkedList查找元素的部分，每次循环都会遍历一次列表。
编译后的代码如下：

long var8 = System.currentTimeMillis();
Integer tmp = null;
Iterator i;
Integer i1;
for(i = list.iterator(); i.hasNext(); i1 = (Integer)i.next()) {
    ;
}
System.out.println("forEach耗时：" + (System.currentTimeMillis() - var8) + "ms.");
var8 = System.currentTimeMillis();
for(i = list.iterator(); i.hasNext(); tmp = (Integer)i.next()) {
    ;
}
System.out.println("迭代器耗时：" + (System.currentTimeMillis() - var8) + "ms.");
var8 = System.currentTimeMillis();
for(int var9 = 0; var9 < list.size(); ++var9) {
    list.get(var9);
}
System.out.println("for循环耗时：" + (System.currentTimeMillis() - var8) + "ms.");

forEach在编译时被优化为使用迭代器，但是多了一个赋值操作，所以比使用迭代器慢一点。

总结

要根据应用场景选择合适的List，否则会导致性能问题。同时也要注意ArrayList和Vector的容量，使用前如果能评估大小，最好初始化List时就指定，这样可以避免数组扩容，导致大量的数据拷贝，导致性能下降。

参考：《java程序性能优化——让你的java程序更快、更稳定》

luckystar2008

关注

3
点赞
踩
13

收藏

觉得还不错? 一键收藏
1
评论
java数据结构之List

List是java中重要的数据结构之一，这里介绍常用的3种实现方式：ArrayList、Vector、LinkedList。类图如下：可以看到，ArrayList、Vector、LinkedList都是AbstractList的实现。而AbstractList实现了List接口，并扩展自AbstractCollection。其中，ArrayList和Vector底层使用了数组，Li...
复制链接

扫一扫