ArrayList是怎么扩容的

欢迎访问配色更好看的个人站

结论:

  1. 新创建的ArrayList内部存储是一个空数组
  2. 首次添加元素扩容为默认容量 DEFAULT_CAPACITY=10
  3. 日常扩容是当前容量的1.5倍
  4. 扩容时使用 System.arraycopy 复制数组,native 方法,效率很不错

PS: JDK版本1.8.0_66

故事的开始

近期面试被问到一个问题:ArrayList是如何扩容的?在这深入了解一下其内部实现。
首先要知道什么时候会触发 扩容 —– 不难想到,在添加元素时需要考虑容量是否足够,来看一下 ArrayList 的源码:

    /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

看上去 ensureCapacityInternal(size + 1) 应该是验证空间是否足够,并且扩容了。且继续看:

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

minCapacity 是指本次扩容需求的最小空间。如果是新创建的ArrayList,会取 DEFAULT_CAPACITY(10)minCapacity(1) 的最大值,即新创建的ArrayList首次增加元素时会直接需求 10 个存储空间。

我有一句话不知当讲不讲

  1. elementData 是什么?

    elementData 是ArryList实际存储元素的地方,是用一个数组实现d的。新创建一个ArrayList时,elementData 是一个 空数组 ,当添加第一个元素时, elementData会被按照 DEFAULT_CAPACITY 扩容.

     /**
      * The array buffer into which the elements of the ArrayList are stored.
      * The capacity of the ArrayList is the length of this array buffer. Any
      * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
      * will be expanded to DEFAULT_CAPACITY when the first element is added.
      */
     transient Object[] elementData; // non-private to simplify nested class access
    
     /**
      * Shared empty array instance used for default sized empty instances. We
      * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
      * first element is added.
      */
     private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    
     /**
      * Default initial capacity.
      */
     private static final int DEFAULT_CAPACITY = 10;
  2. size 是实际存入的对象数量, 与容量 capacity 不同。

渐入佳境

确定扩充容量后,进行下一步 ensureExplicitCapacity(minCapacity)

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

modCount 用来统计ArrayList变更的次数。

当需求的容量 minCapacity 大于内部存储数组的长度 elementData.length 时,需要进行扩容。

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

可以看到:

  • 扩容的核心算法 oldCapacity + (oldCapacity >> 1) :原始容量右移(除以2) + 原始容量,相当于扩容了 1.5 倍。
  • 如果扩容后的容量小于需求的容量 minCapacity ,就按照需求的容量来扩容;
  • 如果扩容后超出了最大容量 MAX_ARRAY_SIZE=Integer.MAX_VALUE - 8 ,会使用特殊的方法 hugeCapacity(minCapacity) 重新计算扩充后的容量大小。

超大容量处理

具体逻辑看下面的代码,逻辑比较简单:

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

再进一步

扩容时是怎么进行搬运原数据的?

核心搬运方法: Arrays.copyOf(elementData, newCapacity),注释太长就不贴了:

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

这里在创建数组的时候用了一些优化技巧:

创建数组Array.newInstance

内部实现是一个 native 方法,应该是类似 c 语言中的内存申请。

    /**
     * Creates a new array with the specified component type and
     * length....
     */
    public static Object newInstance(Class<?> componentType, int length)
        throws NegativeArraySizeException {
        return newArray(componentType, length);
    }
    private static native Object newArray(Class<?> componentType, int length)
        throws NegativeArraySizeException;

复制数组System.arraycopy

这是一个 native 方法, 在网上看到一篇文章阐述其内原理,搬运过来。 原文

    public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

深入熔岩之心

找到对应的openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp,这里有JVM_ArrayCopy的入口:

JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                               jobject dst, jint dst_pos, jint length))
  JVMWrapper("JVM_ArrayCopy");
  // Check if we have null pointers
  if (src == NULL || dst == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }
  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
  // Do copy
  Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END

前面的语句都是判断,知道最后的copy_array(s, src_pos, d, dst_pos, length, thread)是真正的copy,进一步看这里,在openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp中:

void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
  assert(s->is_typeArray(), "must be type array");

  // Check destination
  if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }

  // Check is all offsets and lengths are non negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  if  ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check zero copy
  if (length == 0)
    return;

  // This is an attempt to make the copy_array fast.
  int l2es = log2_element_size();
  int ihs = array_header_in_bytes() / wordSize;
  char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
  char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
  Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es);//还是在这里处理copy
}

这个函数之前的仍然是一堆判断,直到最后一句才是真实的拷贝语句。

在openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp中找到对应的函数:

// Copy bytes; larger units are filled atomically if everything is aligned.
void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
  address src = (address) from;
  address dst = (address) to;
  uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

  // (Note:  We could improve performance by ignoring the low bits of size,
  // and putting a short cleanup loop after each bulk copy loop.
  // There are plenty of other ways to make this faster also,
  // and it's a slippery slope.  For now, let's keep this code simple
  // since the simplicity helps clarify the atomicity semantics of
  // this operation.  There are also CPU-specific assembly versions
  // which may or may not want to include such optimizations.)

  if (bits % sizeof(jlong) == 0) {
    Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
  } else if (bits % sizeof(jint) == 0) {
    Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
  } else if (bits % sizeof(jshort) == 0) {
    Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
  } else {
    // Not aligned, so no need to be atomic.
    Copy::conjoint_jbytes((void*) src, (void*) dst, size);
  }
}

上面的代码展示了选择哪个copy函数,我们选择conjoint_jints_atomic,在openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp进一步查看:

// jints,                 conjoint, atomic on each jint
  static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    assert_params_ok(from, to, LogBytesPerInt);
    pd_conjoint_jints_atomic(from, to, count);
  }

继续向下查看,在openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp中:

static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  _Copy_conjoint_jints_atomic(from, to, count);
}

继续向下查看,在openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp中:

void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    if (from > to) {
      jint *end = from + count;
      while (from < end)
        *(to++) = *(from++);
    }
    else if (from < to) {
      jint *end = from;
      from += count - 1;
      to   += count - 1;
      while (from >= end)
        *(to--) = *(from--);
    }
  }

可以看到,直接就是内存块赋值的逻辑了,这样避免很多引用来回倒腾的时间,必然就变快了。

  • 3
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值