JDK 1.8: ArrayList and LinkedList Source Code Analysis

懵懂无知的青春

Category: Java source code (collection classes). Tags: collections, Java source code

Article link: https://blog.csdn.net/dc12dc34/article/details/81239742

ArrayList is a dynamic array that grows automatically and is not thread-safe.

Below we walk through its implementation at the source level.

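Member variables

DEFAULT_CAPACITY: the default initial capacity, 10. When the element count would exceed the current capacity, the backing array grows automatically.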

/**
 * Default initial capacity.
 */
 private static final int DEFAULT_CAPACITY = 10;

elementData: the array that stores the list's elements

/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
transient Object[] elementData; // non-private to simplify nested class access

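EMPTY_ELEMENTDATA: the shared empty array assigned to elementData when the list is constructed with an explicit capacity of 0 (or from an empty collection)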

/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

DEFAULTCAPACITY_EMPTY_ELEMENTDATA: the initial value of elementData when no initial capacity is specified (the no-argument constructor)

/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

Member methods

trimToSize: releases the unused reserved slots

/**
 * Trims the capacity of this <tt>ArrayList</tt> instance to be the
 * list's current size.  An application can use this operation to minimize
 * the storage of an <tt>ArrayList</tt> instance.
 */
public void trimToSize() {
    modCount++;
    if (size < elementData.length) {
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}

Note: this method is typically used when a list holds many elements, has stopped growing, and memory is tight.
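
As a usage illustration, here is a minimal sketch that pairs ensureCapacity with trimToSize. The class name TrimToSizeSketch and the helper buildCompact are hypothetical, and the extra 100 slots are an arbitrary over-allocation chosen for the example. The capacity itself is not observable through the public API, so the sketch only shows that the contents are unaffected by trimming.

```java
import java.util.ArrayList;

class TrimToSizeSketch {
    // Builds a list of n integers, pre-sizing the backing array before the
    // bulk add and trimming it afterwards. (Hypothetical helper.)
    static ArrayList<Integer> buildCompact(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n + 100);   // deliberately over-allocate
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        list.trimToSize();              // give back the unused slots
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = buildCompact(1000);
        System.out.println(list.size());
    }
}
```

Trimming only pays off when the list will not grow again soon, because the next add has to reallocate the array.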

ensureExplicitCapacity: decides whether the capacity needs to grow

private void ensureExplicitCapacity(int minCapacity) {
     modCount++;

     // overflow-conscious code
     if (minCapacity - elementData.length > 0)
         grow(minCapacity);
}

grow: performs the automatic expansion

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
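    // The new capacity is 1.5 times the old capacity.
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    /* If the 1.5x capacity is still smaller than the capacity required after
     * the add, grow straight to the required capacity. For example, with an
     * empty list of capacity 10, calling addAll with a 20-element list cannot
     * be satisfied by a single 1.5x step, so the new capacity becomes 20.
     */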
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    // Overflow handling: a minCapacity that has overflowed is negative.
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
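
To make the capacity arithmetic easier to try out, the following sketch copies the logic of grow and hugeCapacity into a pure function that allocates nothing. The class name GrowSketch and the method newCapacity are hypothetical names for this illustration; they are not part of the JDK.

```java
class GrowSketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Mirrors the capacity computation of grow() without touching any array.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5x, may overflow
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // minCapacity itself overflowed
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10, 11));  // 15: one 1.5x step is enough
        System.out.println(newCapacity(10, 30));  // 30: jump straight to minCapacity
        // 1.5x overflows int here, so the result is clamped to MAX_ARRAY_SIZE.
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001)); // 2147483639
    }
}
```

The third call shows why the comparisons are written as subtractions: the overflowed newCapacity is negative, yet newCapacity - MAX_ARRAY_SIZE wraps around to a positive value, so the clamp in hugeCapacity is still reached.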

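The code above may raise a few questions.

1. Why is the maximum capacity MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8?
Answer: as the comment says, some VMs reserve header words in an array, so part of the addressable size goes to the object header and related metadata (see arrayOopDesc in the JVM source).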

// Return the maximum length of an array of BasicType.  The length can passed
// to typeArrayOop::object_size(scale, length, header_size) without causing an
// overflow. We also need to make sure that this will not overflow a size_t on
// 32 bit platforms when we convert it to a byte size.
static int32_t max_array_length(BasicType type) {
  assert(type >= 0 && type < T_CONFLICT, "wrong type");
  assert(type2aelembytes(type) != 0, "wrong type");

  const size_t max_element_words_per_size_t =
    align_size_down((SIZE_MAX/HeapWordSize - header_size(type)), MinObjAlignment);
  const size_t max_elements_per_size_t =
    HeapWordSize * max_element_words_per_size_t / type2aelembytes(type);
  if ((size_t)max_jint < max_elements_per_size_t) {
    // It should be ok to return max_jint here, but parts of the code
    // (CollectedHeap, Klass::oop_oop_iterate(), and more) uses an int for
    // passing around the size (in words) of an object. So, we need to avoid
    // overflowing an int when we add the header. See CRs 4718400 and 7110613.
    return align_size_down(max_jint - header_size(type), MinObjAlignment);
  }
  return (int32_t)max_elements_per_size_t;
}

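The code above is how the JVM computes the maximum array length. We can draw the following points from it:

The return type is int32_t, a 4-byte signed integer, the same size as Java's int. The maximum array length therefore cannot exceed Java's largest int, Integer.MAX_VALUE (2^31 - 1). (The maximum length is also bounded by the JVM heap size. To test this with a byte[], make sure the heap is larger than (2^31 - 1) / 2^20 MB, roughly 2048 MB or 2 GB, given 1 MB = 1024 KB and 1 KB = 1024 B.)
As for the reserved 8 elements: they leave room for the object header and related data (8 bytes for the mark word, 4 bytes for the class pointer since compressed pointers are enabled by default, 4 bytes for the array length, plus padding to a multiple of 8 bytes). Note that the size of 8 elements may not match the size of this header data exactly; I will look into this further.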

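Because the JVM uses a uint32_t to record the size of an object, we reach this conclusion:

maximum array length = (maximum uint32_t value - array object header size) / array element size.
So arrays with different element types do not necessarily have the same maximum length.

Take HotSpot VM as an example. On a 32-bit build, or on a 64-bit build with compressed pointers enabled, the elements of int[] and String[] are both 32 bits (4 bytes), so their maximum element counts are the same. On a 64-bit HotSpot VM without compressed pointers, a String[] element is 64 bits (8 bytes), which differs from the int[] element size, so the maximum number of elements differs as well.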

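2. When minCapacity > MAX_ARRAY_SIZE, the list still tries to grow.
The JVM implementation suggests that Java does not allow an array longer than Integer.MAX_VALUE - 8, so why does hugeCapacity not throw OutOfMemoryError directly? I have not found a definitive answer. My guess is that nothing needs to be done here, because the array copy that follows will throw OutOfMemoryError anyway if the allocation is impossible.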

ArrayList iterator (the Itr inner class)

/**
 * Cursor: points at the next element to be visited.
 * Index of element to be returned by subsequent call to next.
 */
int cursor = 0;

/**
 * The element just before the cursor, i.e. the current element.
 * Index of element returned by most recent call to next or
 * previous.  Reset to -1 if this element is deleted by a call
 * to remove.
 */
int lastRet = -1;

/**
 * Modification count, used to detect concurrent modification.
 * java.util.ArrayList is not thread-safe, so if the list is modified by
 * anything other than this iterator while the iterator is in use, a
 * ConcurrentModificationException is thrown. This is the fail-fast strategy.
 * expectedModCount: the expected count; modCount: the actual count.
 * The modCount value that the iterator believes that the backing
 * List should have.  If this expectation is violated, the iterator
 * has detected concurrent modification.
 */
int expectedModCount = modCount;

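The fail-fast mechanism
Every structural modification of an ArrayList increments modCount. When an iterator is created, it copies this value into its own expectedModCount. During iteration it checks whether modCount still equals expectedModCount; if not, the list has been modified by something other than this iterator (another thread, or the same thread calling the list's own methods). Note that modCount is declared in AbstractList as a plain protected transient int, not volatile, so visibility across threads is not guaranteed and the check is best-effort only.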

public boolean hasNext() {
    return cursor != size;
}

@SuppressWarnings("unchecked")
public E next() {
    checkForComodification();
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}

public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        ArrayList.this.remove(lastRet);
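        // Readjust the position so the cursor points at the next element.
        // Like a manual array removal, ArrayList.remove shifts the following
        // elements one slot forward, so the next element now sits at the
        // current position and the cursor is moved back to it.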
        cursor = lastRet;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}

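The iterator implementation is fairly simple. The point worth noting is that every traversal step and every removal performs the fail-fast check to guard against concurrent modification.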
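
The mechanism can be reproduced in a few lines. The sketch below is a simplified, hypothetical container (FailFastSketch is not a JDK class) with a fixed 16-slot array and no remove support; it only demonstrates how comparing modCount against a snapshot detects a change made behind the iterator's back.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class FailFastSketch implements Iterable<String> {
    private final String[] items = new String[16]; // fixed size: sketch only
    private int size = 0;
    private int modCount = 0;                      // bumped on every structural change

    void add(String s) {
        items[size++] = s;
        modCount++;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            int cursor = 0;
            final int expectedModCount = modCount; // snapshot taken at creation

            @Override
            public boolean hasNext() {
                return cursor != size;
            }

            @Override
            public String next() {
                if (modCount != expectedModCount)
                    throw new ConcurrentModificationException();
                if (cursor >= size)
                    throw new NoSuchElementException();
                return items[cursor++];
            }
        };
    }

    // Returns true if an add performed during iteration is detected.
    static boolean detectsModification() {
        FailFastSketch list = new FailFastSketch();
        list.add("a");
        list.add("b");
        Iterator<String> it = list.iterator();
        it.next();
        list.add("c");                 // structural change outside the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(detectsModification()); // true
    }
}
```

Note that the exception is raised in a single thread here, which shows that "concurrent modification" does not require a second thread.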

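Three ways to remove elements while traversing a collection

Take the list [1, 2, 3, 4, 5, 1, 1]. The three approaches below produce the following results.

Plain indexed loop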

for (int i=0 ; i< list.size() ; i++) {
    if(list.get(i).equals(new Integer(1))) {
        list.remove(i);
        //i--;
    }
}

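Result: [2, 3, 4, 5, 1]

The cause is that i is not adjusted after remove. Each removal shifts the following elements one slot forward, so the element right after the removed one is skipped, and the final 1 survives. We could adjust i ourselves (the commented-out i--), but this approach is not recommended.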
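
If an indexed loop is still wanted, one common workaround (not taken from the original article) is to walk the list from the end, so the shifted elements are ones that have already been checked. IndexedRemovalSketch and removeOnes are hypothetical names for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexedRemovalSketch {
    // Removes every 1 from a copy of the source list using a backward loop.
    static List<Integer> removeOnes(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i).equals(1)) {
                list.remove(i);   // int argument: remove(int index)
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeOnes(Arrays.asList(1, 2, 3, 4, 5, 1, 1))); // [2, 3, 4, 5]
    }
}
```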

Enhanced for loop

for (Integer i : list) {
    if(i.equals(new Integer(1))) {
        list.remove(i);
    }
}

Result:

Exception in thread "main" java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
    at java.util.ArrayList$Itr.next(ArrayList.java:859)
    at ItrArrayList.main(ArrayListTest.java:62)

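The reason is that the enhanced for loop is driven by the iterator's next() method, and next() performs the fail-fast check. Inside the loop we call the list's remove rather than the iterator's own remove, so expectedModCount is never updated and the next call to next() fails fast.

Below is the disassembled bytecode of the enhanced for loop; it clearly shows that the loop is implemented with an iterator.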

204: invokeinterface #7,  1// InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
209: astore_2
210: aload_2
211: invokeinterface #8,  1// InterfaceMethod java/util/Iterator.hasNext:()Z
216: ifeq          262
219: aload_2
220: invokeinterface #9,  1// InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
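
Written out by hand, the loop the bytecode corresponds to looks roughly like the sketch below. ForEachDesugarSketch and failsFast are hypothetical names; the method reports whether the ConcurrentModificationException occurred instead of letting it propagate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class ForEachDesugarSketch {
    // Hand-written equivalent of:
    //   for (Integer i : list) { if (i.equals(1)) list.remove(i); }
    static boolean failsFast(List<Integer> list) {
        try {
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                Integer i = it.next();   // the fail-fast check runs here
                if (i.equals(1)) {
                    list.remove(i);      // Integer argument: remove(Object), bypasses the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 1, 1));
        System.out.println(failsFast(list)); // true
    }
}
```

For this input the first element is removed through the list, so the very next call to next() sees a changed modCount and throws.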

Iterator traversal

Iterator<Integer> it = list.iterator();
while (it.hasNext()) {
    if (it.next().equals(new Integer(1))) {
        it.remove();
    }
}

Result: [2, 3, 4, 5]

Summary
When removing elements during traversal, always go through the collection's own iterator, i.e. the third approach.
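
Since this article targets JDK 1.8, it is worth adding (this is not part of the original comparison) that Collection.removeIf, introduced in Java 8, expresses the same removal in one call. RemoveIfSketch and withoutOnes are hypothetical names for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveIfSketch {
    // Returns a copy of the source list with every 1 removed.
    static List<Integer> withoutOnes(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        list.removeIf(x -> x == 1);   // x is unboxed for the comparison
        return list;
    }

    public static void main(String[] args) {
        System.out.println(withoutOnes(Arrays.asList(1, 2, 3, 4, 5, 1, 1))); // [2, 3, 4, 5]
    }
}
```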

Problems when ArrayList is used concurrently

Consider what can go wrong with the following code.

import java.util.ArrayList;
import java.util.List;

public class ArrayListTest {
    public static List<Integer> numberList = new ArrayList<Integer>();
    public static Integer limit = 1000000;

    public static class AddToList implements Runnable {

        int startNum;
        public AddToList (int startNum) {
            this.startNum = startNum;
        }
        @Override
        public void run() {
            int count = 0 ;
            while ((count < limit)) {
                numberList.add(startNum);
                System.out.println(Thread.currentThread().getName() + "--iteration " + (count + 1) + ", added number: " + startNum + "---list size now: " + numberList.size());
                startNum += 2;
                count++;
            }
        }
    }
    public static void main(String[] args) {
        Thread t1 = new Thread(new AddToList(0));
        Thread t2 = new Thread(new AddToList(1));
        t1.start();
        t2.start();
    }
}

Testing shows that two kinds of problems can occur:

Array index out of bounds

Exception in thread "Thread-0" java.lang.ArrayIndexOutOfBoundsException: 549
    at java.util.ArrayList.add(ArrayList.java:463)
    at ArrayListTest$AddToList.run(ArrayListTest.java:23)
    at java.lang.Thread.run(Thread.java:748)

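The failure can only come from elementData[size++] = e;. Going one step back, we find the check if (minCapacity - elementData.length > 0). Suppose threads A and B enter ensureExplicitCapacity in parallel while the array has room for exactly one more element. Both pass the check without growing, both then write, and the second write lands one slot past the end of the array, so one of the threads throws ArrayIndexOutOfBoundsException.

null elements in the list

Again with threads A and B: add ultimately assigns into the array with elementData[size++] = e;. This contains a classic thread-safety problem: size++ is not atomic (read the value of size, add 1, write the result back to size). If A and B reach this statement together, size may end up incremented only once, so one thread's value is overwritten by the other's. A null slot appears when size has been advanced but the matching write is lost, for example when one thread stores into the old array while another thread is replacing elementData in grow. It is tempting to declare size as an AtomicInteger, but that would not fix the concurrency problems elsewhere.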

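Summary
ArrayList is not thread-safe and must not be shared between threads without external synchronization.

Two options for concurrent use:

Collections.synchronizedCollection(list);
This wraps every method in a synchronized block (Collections.synchronizedList is the List-typed equivalent). It is not recommended, because the lock granularity is coarse and it hurts throughput badly.
Use CopyOnWriteArrayList from the java.util.concurrent package.
It does not use a read-write lock: writers take a lock and copy the whole backing array on every modification, while readers work on the current array without locking. It is a recommended thread-safe alternative to ArrayList when reads far outnumber writes. The class will be analyzed in a later post.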
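
For comparison with the failing test program, here is a minimal sketch of the first option, using Collections.synchronizedList. SynchronizedAddSketch and run are hypothetical names, and the per-thread count is reduced to keep the run short; the point is only that the final size is exact and no null slots appear.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedAddSketch {
    // Two threads each add perThread numbers to one synchronized list.
    // Returns the final size of the list.
    static int run(int perThread) {
        List<Integer> numbers = Collections.synchronizedList(new ArrayList<>());
        Thread evens = new Thread(() -> {
            for (int i = 0; i < perThread; i++) numbers.add(2 * i);
        });
        Thread odds = new Thread(() -> {
            for (int i = 0; i < perThread; i++) numbers.add(2 * i + 1);
        });
        evens.start();
        odds.start();
        try {
            evens.join();
            odds.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (numbers.contains(null)) {
            throw new IllegalStateException("unexpected null element");
        }
        return numbers.size();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // 200000
    }
}
```

CopyOnWriteArrayList would also give a correct result here, but a write-heavy loop like this one is its worst case, since every add copies the array.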

Differences between ArrayList and LinkedList

LinkedList member variables

/**
 * Pointer to first node.
 * Invariant: (first == null && last == null) ||
 *            (first.prev == null && first.item != null)
 */
transient Node<E> first;

/**
 * Pointer to last node.
 * Invariant: (first == null && last == null) ||
 *            (last.next == null && last.item != null)
 */
transient Node<E> last;

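As the fields show, LinkedList is internally a doubly linked list. Because the two classes are otherwise implemented in a similar way, LinkedList does not get a separate article.

The differences are as follows:
      Implementation: ArrayList is backed by an array, LinkedList by a linked list. ArrayList grows automatically; LinkedList needs no growth mechanism.
      Behavior: the array-backed ArrayList is fast at lookup by index and slower at insertion and removal. The linked LinkedList is fast at insertion and removal once the position is known, and slow at lookup by index.
      This is really the difference between an array and a linked list. To read a[100] in an array, the position is computed directly from the index, so lookup takes constant time, whereas a linked list has to walk 100 nodes from the head to reach the same element. To delete a[100] from an array, the following elements are shifted forward over it, so the closer the deleted element is to the front, the more expensive the deletion. In a linked list, once the node has been located, deletion and insertion only require relinking a few pointers.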
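
The asymmetry can be seen in a stripped-down doubly linked list. The sketch below is a hypothetical illustration (LinkedUnlinkSketch is not the JDK's LinkedList, and it stores int values for brevity): node(index) walks the chain, while unlink only rewires the neighbours of a node that is already in hand.

```java
class LinkedUnlinkSketch {
    static final class Node {
        final int item;
        Node prev, next;
        Node(int item) { this.item = item; }
    }

    Node first, last;
    int size;

    // Appends a value at the tail in constant time.
    Node addLast(int item) {
        Node n = new Node(item);
        if (last == null) {
            first = n;
        } else {
            last.next = n;
            n.prev = last;
        }
        last = n;
        size++;
        return n;
    }

    // Linear time: walks from the head to the requested position.
    Node node(int index) {
        Node n = first;
        for (int i = 0; i < index; i++) n = n.next;
        return n;
    }

    // Constant time once the node is known: only pointers are relinked.
    void unlink(Node n) {
        if (n.prev == null) first = n.next; else n.prev.next = n.next;
        if (n.next == null) last = n.prev; else n.next.prev = n.prev;
        n.prev = null;
        n.next = null;
        size--;
    }

    int[] toArray() {
        int[] out = new int[size];
        int i = 0;
        for (Node n = first; n != null; n = n.next) out[i++] = n.item;
        return out;
    }

    public static void main(String[] args) {
        LinkedUnlinkSketch list = new LinkedUnlinkSketch();
        for (int i = 0; i < 5; i++) list.addLast(i);
        list.unlink(list.node(2));
        System.out.println(java.util.Arrays.toString(list.toArray())); // [0, 1, 3, 4]
    }
}
```

Removing by index therefore still costs a traversal in a linked list; the advantage shows up when the node is reached through an iterator or sits at either end.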
