How Does ArrayList Grow Its Capacity?

殇月陨

Article link: https://blog.csdn.net/syy_c_j/article/details/76150881

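Conclusions:

An ArrayList created with the no-argument constructor starts out backed by an empty array.
The first add grows it to the default capacity, DEFAULT_CAPACITY = 10.
After that, each expansion grows the capacity to roughly 1.5 times its current value.
Expansion copies the array with System.arraycopy, a native method, so it is quite efficient.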

PS: JDK version 1.8.0_66

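How it started

I was recently asked in an interview: how does ArrayList grow? So here is a closer look at its internals.
The first thing to know is when growth is triggered. Unsurprisingly, adding an element has to check whether there is enough capacity, so let's look at the ArrayList source: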

    /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

ensureCapacityInternal(size + 1) looks like the place where the capacity is checked and the growth happens. Let's keep going:

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

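minCapacity is the minimum capacity this operation requires. For a newly created ArrayList (one built with the no-argument constructor), the larger of DEFAULT_CAPACITY (10) and minCapacity (1) is used, so the first add on a new ArrayList requests 10 slots right away.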
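
To make that rule concrete, here is a minimal sketch of the capacity request. The class and method names (FirstAddSketch, requestedCapacity) and the boolean parameter are my own for illustration; only DEFAULT_CAPACITY = 10 and the Math.max logic come from the source above.

```java
// Hypothetical sketch, not the JDK source.
final class FirstAddSketch {
    static final int DEFAULT_CAPACITY = 10; // same value as in ArrayList

    /**
     * Mirrors the check in ensureCapacityInternal. The boolean stands in for
     * the identity test elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA.
     */
    static int requestedCapacity(boolean usesDefaultEmptyArray, int minCapacity) {
        if (usesDefaultEmptyArray) {
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        return minCapacity;
    }

    public static void main(String[] args) {
        System.out.println(requestedCapacity(true, 1));  // first add on new ArrayList<>(): 10
        System.out.println(requestedCapacity(true, 15)); // a first request larger than 10: 15
        System.out.println(requestedCapacity(false, 1)); // not the default empty array: 1
    }
}
```

The second call shows why Math.max is needed: if the very first request is already larger than 10, the larger value wins.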

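A quick aside

What is elementData?

elementData is where ArrayList actually stores its elements; it is a plain array. When an ArrayList is newly created, elementData is an empty array, and when the first element is added it is expanded to DEFAULT_CAPACITY.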

 /**
  * The array buffer into which the elements of the ArrayList are stored.
  * The capacity of the ArrayList is the length of this array buffer. Any
  * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
  * will be expanded to DEFAULT_CAPACITY when the first element is added.
  */
 transient Object[] elementData; // non-private to simplify nested class access

 /**
  * Shared empty array instance used for default sized empty instances. We
  * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
  * first element is added.
  */
 private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

 /**
  * Default initial capacity.
  */
 private static final int DEFAULT_CAPACITY = 10;

size is the number of elements actually stored, which is not the same thing as the capacity.
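
The difference can be observed from the outside with two public ArrayList methods, ensureCapacity and trimToSize, which change the capacity without touching size. The class and method names below are made up for the demo.

```java
import java.util.ArrayList;

// Hypothetical demo class; ensureCapacity and trimToSize are real ArrayList methods.
final class SizeVsCapacityDemo {
    /** Reserves room in the backing array and reports the resulting size. */
    static int sizeAfterReserving(int capacity) {
        ArrayList<String> list = new ArrayList<>();
        list.ensureCapacity(capacity); // grows the backing array only
        return list.size();
    }

    /** Adds three elements, trims the backing array, and reports the size. */
    static int sizeAfterTrim() {
        ArrayList<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        list.trimToSize(); // shrinks the backing array to exactly 3 slots
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterReserving(1000)); // 0
        System.out.println(sizeAfterTrim());          // 3
    }
}
```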

Getting to the good part

Once the required capacity is settled, the next step is ensureExplicitCapacity(minCapacity):

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

modCount counts how many times the ArrayList has been structurally modified.
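
The most visible consumer of this counter is the iterator: it remembers the modCount it started with and throws ConcurrentModificationException when the value changes underneath it. A small sketch (the class and method names are mine):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class showing the fail-fast behaviour driven by modCount.
final class ModCountDemo {
    /** Returns true if adding during iteration triggers the fail-fast check. */
    static boolean addWhileIterating() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            for (String s : list) {
                list.add(s + "!"); // bumps modCount behind the iterator's back
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(addWhileIterating()); // true
    }
}
```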

When the required capacity minCapacity exceeds the length of the backing array, elementData.length, the array has to grow.
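
The "overflow-conscious code" comment explains why the check is written as a subtraction rather than minCapacity > elementData.length. If size + 1 ever wraps around to a negative number, the subtraction wraps too and still comes out positive, so the request reaches grow, where hugeCapacity rejects the negative value. A sketch comparing the two forms (all names here are mine):

```java
// Hypothetical comparison; only the subtract form appears in ArrayList.
final class OverflowConsciousDemo {
    /** The comparison used in ensureExplicitCapacity. */
    static boolean needsGrowSubtract(int minCapacity, int length) {
        return minCapacity - length > 0;
    }

    /** A naive alternative, shown only for contrast. */
    static boolean needsGrowNaive(int minCapacity, int length) {
        return minCapacity > length;
    }

    public static void main(String[] args) {
        int length = 10;
        int size = Integer.MAX_VALUE;
        int overflowed = size + 1; // wraps to Integer.MIN_VALUE

        System.out.println(needsGrowSubtract(11, length));         // true
        System.out.println(needsGrowNaive(11, length));            // true
        System.out.println(needsGrowSubtract(overflowed, length)); // true: overflow is still noticed
        System.out.println(needsGrowNaive(overflowed, length));    // false: overflow slips through
    }
}
```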

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

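As we can see:

The core of the algorithm is oldCapacity + (oldCapacity >> 1): the old capacity shifted right by one bit (an integer division by 2) plus the old capacity, which amounts to growing to about 1.5 times the old capacity.
If the grown capacity is still smaller than the required minCapacity, the required capacity is used instead.
If the grown capacity exceeds the maximum MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8, a special method, hugeCapacity(minCapacity), recomputes the new capacity.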
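
To see what "about 1.5 times" means in practice, the sketch below replays the first two rules for a list that is filled one element at a time. It leaves out the MAX_ARRAY_SIZE handling, and the class and method names are my own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the growth arithmetic, not the JDK source.
final class GrowthSequenceDemo {
    /** The grow step without the MAX_ARRAY_SIZE handling. */
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    /** Capacities reached when a full list receives one more element, repeatedly. */
    static List<Integer> capacities(int initialCapacity, int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = initialCapacity;
        for (int i = 0; i < steps; i++) {
            capacity = nextCapacity(capacity, capacity + 1);
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 6)); // [15, 22, 33, 49, 73, 109]
        System.out.println(capacities(1, 4));  // [2, 3, 4, 6]
    }
}
```

The second line shows the second rule at work: starting from a capacity of 1, the shift yields 0, so the 1.5x result is too small and minCapacity is used instead.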

Handling huge capacities

The logic is fairly simple; see the code below:

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
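
The interesting part is how this interacts with integer overflow in grow. The sketch below copies the capacity arithmetic from grow and hugeCapacity into a standalone class (the class name and the newCapacity method are mine, and no array is actually allocated) so the edge cases can be tried out.

```java
// Hypothetical sketch: the arithmetic from grow/hugeCapacity without the array copy.
final class HugeCapacityDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) { // overflow
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap to a negative value
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = hugeCapacity(minCapacity);
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        // 1.5 * 1_500_000_000 does not fit in an int, yet the result is clamped correctly.
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001) == MAX_ARRAY_SIZE); // true
        System.out.println(hugeCapacity(MAX_ARRAY_SIZE + 1) == Integer.MAX_VALUE);       // true
        try {
            hugeCapacity(-1);
        } catch (OutOfMemoryError e) {
            System.out.println("negative request rejected");
        }
    }
}
```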

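One step further

How is the existing data moved during growth?

The core method is Arrays.copyOf(elementData, newCapacity). That two-argument call delegates to the three-argument overload below (the Javadoc is too long to paste here):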

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }
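
From the caller's side the behaviour is easy to check: the result is a new array, the existing elements are copied over, and the extra slots are null. A short demo using both public overloads (the class and the growTo helper are mine):

```java
import java.util.Arrays;

// Hypothetical demo class; Arrays.copyOf is the real JDK method.
final class CopyOfDemo {
    /** Returns a longer copy of source, the way grow replaces elementData. */
    static Object[] growTo(Object[] source, int newLength) {
        return Arrays.copyOf(source, newLength);
    }

    public static void main(String[] args) {
        Object[] old = {"a", "b", "c"};
        Object[] grown = growTo(old, 5);
        System.out.println(Arrays.toString(grown)); // [a, b, c, null, null]
        System.out.println(grown != old);           // true: a new array was allocated

        // The three-argument overload lets the caller choose the array type.
        String[] typed = Arrays.copyOf(old, 2, String[].class);
        System.out.println(typed.getClass().getSimpleName()); // String[]
    }
}
```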

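Array creation here uses a small optimization: when the target type is Object[], the array is created directly with new Object[newLength]; otherwise it falls back to Array.newInstance.

Creating the array: `Array.newInstance`

Internally this is a native method, presumably similar to memory allocation in C.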

    /**
     * Creates a new array with the specified component type and
     * length....
     */
    public static Object newInstance(Class<?> componentType, int length)
        throws NegativeArraySizeException {
        return newArray(componentType, length);
    }
    private static native Object newArray(Class<?> componentType, int length)
        throws NegativeArraySizeException;
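
For reference, this is how Array.newInstance behaves when called directly. The component type decides the runtime class of the array, which is why copyOf needs it for anything other than Object[]. The class and the create helper are mine.

```java
import java.lang.reflect.Array;

// Hypothetical demo class; Array.newInstance and Array.getLength are real JDK methods.
final class NewInstanceDemo {
    /** Creates an array whose component type is only known at run time. */
    static Object create(Class<?> componentType, int length) {
        return Array.newInstance(componentType, length);
    }

    public static void main(String[] args) {
        Object strings = create(String.class, 3);
        System.out.println(strings.getClass() == String[].class); // true
        System.out.println(Array.getLength(strings));             // 3

        Object ints = create(int.class, 2);
        System.out.println(ints instanceof int[]);                // true

        try {
            create(String.class, -1);
        } catch (NegativeArraySizeException e) {
            System.out.println("negative length rejected");
        }
    }
}
```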

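Copying the array: `System.arraycopy`

This is a native method. I came across an article online that explains how it works internally, and what follows is reproduced from that original article.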

    public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);
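
Before diving into the native side, here is a quick look at how the five parameters are used. Source and destination may be the same array with overlapping ranges, which is the call shape ArrayList.add(int index, E element) uses in JDK 8 to open a gap. The insertAt helper and class name are mine, and the helper assumes the array still has a free slot at the end.

```java
import java.util.Arrays;

// Hypothetical demo class; System.arraycopy is the real JDK method.
final class ArrayCopyShiftDemo {
    /**
     * Shifts data[index..size-1] one slot to the right and stores value at index.
     * Assumes size < data.length, i.e. there is room for one more element.
     */
    static void insertAt(Object[] data, int size, int index, Object value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null};
        insertAt(data, 3, 2, "c");
        System.out.println(Arrays.toString(data)); // [a, b, c, d]
    }
}
```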

Into the Molten Core

In openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp we find the entry point, JVM_ArrayCopy:

JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                               jobject dst, jint dst_pos, jint length))
  JVMWrapper("JVM_ArrayCopy");
  // Check if we have null pointers
  if (src == NULL || dst == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }
  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
  // Do copy
  Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END

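The preceding statements are all checks; only the final copy_array(s, src_pos, d, dst_pos, length, thread) does the actual copy. For arrays of primitive types, that call leads to openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp (object arrays such as elementData are handled by the corresponding objArrayKlass):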

void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
  assert(s->is_typeArray(), "must be type array");

  // Check destination
  if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }

  // Check is all offsets and lengths are non negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  if  ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check zero copy
  if (length == 0)
    return;

  // This is an attempt to make the copy_array fast.
  int l2es = log2_element_size();
  int ihs = array_header_in_bytes() / wordSize;
  char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
  char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
  Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es); // the actual copy is handled here
}

Most of this function is again a series of checks; only the last statement performs the actual copy.
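
Those checks are what a Java caller sees as exceptions. The sketch below triggers each of them from Java; the class and the outcome helper are mine, and the index case is caught as IndexOutOfBoundsException because that is the type the System.arraycopy Javadoc promises.

```java
// Hypothetical demo class mapping the native checks to Java exceptions.
final class ArrayCopyChecksDemo {
    /** Runs the copy and names the exception it caused, or "ok". */
    static String outcome(Object src, int srcPos, Object dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(null, 0, new int[1], 0, 1));        // null check
        System.out.println(outcome(new int[2], 0, new long[2], 0, 1)); // element type mismatch
        System.out.println(outcome(new int[2], -1, new int[2], 0, 1)); // negative offset
        System.out.println(outcome(new int[2], 0, new int[2], 1, 2));  // range past the end
        System.out.println(outcome(new int[2], 0, new int[2], 0, 0));  // zero-length copy
    }
}
```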

The corresponding function is in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:

// Copy bytes; larger units are filled atomically if everything is aligned.
void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
  address src = (address) from;
  address dst = (address) to;
  uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

  // (Note:  We could improve performance by ignoring the low bits of size,
  // and putting a short cleanup loop after each bulk copy loop.
  // There are plenty of other ways to make this faster also,
  // and it's a slippery slope.  For now, let's keep this code simple
  // since the simplicity helps clarify the atomicity semantics of
  // this operation.  There are also CPU-specific assembly versions
  // which may or may not want to include such optimizations.)

  if (bits % sizeof(jlong) == 0) {
    Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
  } else if (bits % sizeof(jint) == 0) {
    Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
  } else if (bits % sizeof(jshort) == 0) {
    Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
  } else {
    // Not aligned, so no need to be atomic.
    Copy::conjoint_jbytes((void*) src, (void*) dst, size);
  }
}
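
The trick in this function is the OR of the two addresses and the size: a low bit set in any of the three is also set in bits, so one modulo test per width checks the alignment of all three at once. A Java sketch of just that dispatch, using plain non-negative numbers as stand-ins for addresses (the class and method names are mine):

```java
// Hypothetical sketch of the width selection; no memory is actually copied.
final class CopyUnitSketch {
    /** Returns the element width in bytes (8, 4, 2 or 1) the dispatch would pick. */
    static int unitSize(long src, long dst, long size) {
        long bits = src | dst | size; // a low bit set in any operand is set here
        if (bits % 8 == 0) {
            return 8; // conjoint_jlongs_atomic
        } else if (bits % 4 == 0) {
            return 4; // conjoint_jints_atomic
        } else if (bits % 2 == 0) {
            return 2; // conjoint_jshorts_atomic
        }
        return 1;     // conjoint_jbytes
    }

    public static void main(String[] args) {
        System.out.println(unitSize(16, 32, 24)); // 8
        System.out.println(unitSize(4, 8, 12));   // 4
        System.out.println(unitSize(2, 4, 6));    // 2
        System.out.println(unitSize(1, 8, 8));    // 1
    }
}
```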

The code above picks which copy function to use. Taking conjoint_jints_atomic as an example, we look further in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp:

// jints,                 conjoint, atomic on each jint
  static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    assert_params_ok(from, to, LogBytesPerInt);
    pd_conjoint_jints_atomic(from, to, count);
  }

Going one level down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:

static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  _Copy_conjoint_jints_atomic(from, to, count);
}

And one level further, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:

void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    if (from > to) {
      jint *end = from + count;
      while (from < end)
        *(to++) = *(from++);
    }
    else if (from < to) {
      jint *end = from;
      from += count - 1;
      to   += count - 1;
      while (from >= end)
        *(to--) = *(from--);
    }
  }
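
The two branches exist because the ranges may overlap ("conjoint"): when the destination lies ahead of the source, copying front to back would overwrite elements that have not been read yet. Here is a Java port of the loop using array indices in place of pointers, next to a naive forward-only copy for contrast. Both methods and the class name are mine.

```java
import java.util.Arrays;

// Hypothetical Java port of the direction-aware loop; indices replace pointers.
final class ConjointCopySketch {
    /** Copies count ints within one array, choosing the direction like the C++ code. */
    static void conjointCopy(int[] a, int from, int to, int count) {
        if (from > to) {
            for (int i = 0; i < count; i++) {
                a[to + i] = a[from + i];
            }
        } else if (from < to) {
            for (int i = count - 1; i >= 0; i--) {
                a[to + i] = a[from + i];
            }
        }
    }

    /** Always copies front to back; wrong when the destination overlaps ahead of the source. */
    static void naiveForwardCopy(int[] a, int from, int to, int count) {
        for (int i = 0; i < count; i++) {
            a[to + i] = a[from + i];
        }
    }

    public static void main(String[] args) {
        int[] good = {1, 2, 3, 4, 0};
        conjointCopy(good, 0, 1, 4);
        System.out.println(Arrays.toString(good)); // [1, 1, 2, 3, 4]

        int[] bad = {1, 2, 3, 4, 0};
        naiveForwardCopy(bad, 0, 1, 4);
        System.out.println(Arrays.toString(bad));  // [1, 1, 1, 1, 1]
    }
}
```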

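As we can see, it comes down to plain memory-to-memory assignment, which avoids the overhead of shuffling references back and forth one at a time, so it is naturally fast.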
