ArrayList in Detail

Latest recommended article published 2024-09-04 21:44:11

东平王北星

Category column: Java Core

Article link: https://blog.csdn.net/mz4138/article/details/80896645

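Constructors at a glance

Creating an instance

When an instance is created with new ArrayList(), the constructor does nothing more than point elementData at the shared default empty array.

Note that elementData is declared transient. ArrayList implements Serializable and writes its elements itself instead of relying on default serialization (see the serialization section).

No-argument constructor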

    transient Object[] elementData;
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

Constructor with an initial capacity

    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

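Copying from a collection

1. Convert the collection to an array with c.toArray().
2. Check whether the length is 0:
- If it is 0, use EMPTY_ELEMENTDATA.
- If it is not 0, keep the array returned by toArray(). Only when its runtime type is not Object[] is it copied into a new Object[] with Arrays.copyOf, which relies on the native System.arraycopy.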

    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if ((size = elementData.length) != 0) {
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }
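
The three constructors above can be exercised with a short sketch. The class and method names (ConstructorDemo, copyThenAdd, rejectsNegativeCapacity) are made up for illustration; only the ArrayList and Arrays calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConstructorDemo {
    // Returns true if a negative capacity is rejected with IllegalArgumentException.
    static boolean rejectsNegativeCapacity() {
        try {
            new ArrayList<String>(-1);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    // Copies the source into a new ArrayList, then appends to the copy only.
    static List<String> copyThenAdd(List<String> source, String extra) {
        List<String> copy = new ArrayList<>(source);
        copy.add(extra);
        return copy;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b");
        System.out.println(copyThenAdd(source, "c")); // the copy has three elements
        System.out.println(source);                    // the source still has two
        System.out.println(rejectsNegativeCapacity());
    }
}
```

The copy constructor produces an independent backing array, so later changes to the copy do not show up in the source collection.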

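Core operations

add: appending an element

Appending does two things:
1. Check whether the internal capacity needs to grow, and grow it if so.
2. Store the new element in the first free slot (index size) and increment size.

    public boolean add(E e) {
        // Check whether the internal capacity needs to grow (and grow it if so)
        ensureCapacityInternal(size + 1);
        // Store the element in the first free slot, then increment size
        elementData[size++] = e;
        return true;
    }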

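So how is "needs to grow" decided?
1. If the list was created with the no-argument constructor (elementData is still DEFAULTCAPACITY_EMPTY_ELEMENTDATA), the minimum capacity is raised to at least 10. A different starting capacity can be chosen through the constructor that takes an initial capacity.
2. The resulting minimum capacity (10 on the first add) is then passed on to be checked against the actual array.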

    private static final int DEFAULT_CAPACITY = 10;

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

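How is the concrete capacity settled? Next comes the function ensureExplicitCapacity:
1. modCount++: this field is used to detect concurrent modification.
2. If the minimum capacity minCapacity is greater than the length of the internal array elementData, elementData is grown.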

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity); // grow elementData
    }

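Growing the array

This is the place where the array length is actually increased. The candidate capacity is 1.5 times the old one, and then:

1. If the new capacity is smaller than the minimum capacity, the minimum capacity is used instead.
2. If the new capacity is greater than MAX_ARRAY_SIZE, hugeCapacity picks the capacity.
3. Arrays.copyOf (which relies on the native System.arraycopy) replaces elementData with a new array: the length changes, the contents stay the same.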

    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private void grow(int minCapacity) {
        // overflow-conscious code
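        // Length of the internal array; 0 on the first add after new ArrayList()
        int oldCapacity = elementData.length;
        // New length = old length + old length / 2; also 0 on that first add
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) // first add: 0 - 10 < 0, so the new capacity becomes minCapacity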
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
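
To see which capacities this produces in practice, the arithmetic can be replayed outside ArrayList. The sketch below is a simulation, not the JDK code: GrowthDemo, nextCapacity and capacitiesFor are hypothetical names, the list is assumed to come from new ArrayList(), and the MAX_ARRAY_SIZE branch is left out because it is irrelevant for small sizes.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10; // same value as in the source above

    // Mirrors the arithmetic in grow(): 1.5x, but never below minCapacity.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Capacities allocated while appending `count` elements one by one.
    static List<Integer> capacitiesFor(int count) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // the default empty array has length 0
        for (int size = 0; size < count; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(50)); // [10, 15, 22, 33, 49, 73]
    }
}
```

Fifty appends therefore trigger six allocations. Passing an expected size to the constructor avoids the intermediate copies.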

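add: summary

- The default initial capacity is 10 (allocated lazily on the first add); a different initial capacity can be passed to the constructor.
- The growth factor is 1.5: newCapacity = oldCapacity + (oldCapacity >> 1)
- For very large arrays the cap is Integer.MAX_VALUE - 8. It could just as well have been -10 or -3; what matters is that it stays at or below Integer.MAX_VALUE - 2, the limit observed on the author's VM.

Removing by object

Removing an object distinguishes two cases:
1. The argument is null: elements are compared with ==, which is a little faster.
2. The argument is not null: elements are compared with equals, which is a little slower.
Either way the array is scanned from the start until the first match, which is then removed through fastRemove. If nothing matches, the whole array has been traversed and false is returned.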

    public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
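
Because the scan stops at the first match, only one occurrence is removed per call. The sketch below also shows a common pitfall with List<Integer>, where an int argument selects remove(int) rather than remove(Object). RemoveDemo and its method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveDemo {
    // An int argument resolves to remove(int): removes the element at that position.
    static List<Integer> removeByIndex(List<Integer> values, int index) {
        List<Integer> copy = new ArrayList<>(values);
        copy.remove(index);
        return copy;
    }

    // An Integer argument resolves to remove(Object): removes the first equal element.
    static List<Integer> removeByValue(List<Integer> values, int value) {
        List<Integer> copy = new ArrayList<>(values);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(10, 1, 20, 1);
        System.out.println(removeByIndex(values, 2)); // [10, 1, 1]
        System.out.println(removeByValue(values, 1)); // [10, 20, 1]
    }
}
```

The second 1 survives removeByValue, which matches the early return true in the loop above.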

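fastRemove: removing the element

1. Check whether the element to remove is the last one. If it is not, use System.arraycopy to shift every element after it one position to the left.
2. size = size - 1
3. Set the now-unused last slot of the array to null so that the GC can reclaim the object that is no longer in use.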

    private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }

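The third step may be the hardest to understand, so here it is in more detail.

The capacity of the array is not the same as size. Take a list created with the no-argument constructor that holds a single element: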

        ArrayList<String> arrayList = new ArrayList<>();
        arrayList.add("a");

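At this point the internal array has elementData.length = 10, while the ArrayList has size = 1.

So if the list holds five elements and the third is removed, the fourth and fifth are moved into the third and fourth positions.

The fifth slot is then set to null. Otherwise the array would keep referencing an object that nothing outside can reach any more, and its memory could never be reclaimed (a memory leak).

State	Array contents	Code
Original data, nothing removed	"a","b","c","d","e"
After removing the third element	"a","b","d","e","e"	System.arraycopy(elementData, index+1, elementData, index, numMoved);
After setting the last slot to null	"a","b","d","e",null	elementData[4] = null;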
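
The same three steps can be replayed on a plain array. This is a minimal sketch: FastRemoveDemo is a hypothetical class, and its fastRemove takes the array and size as parameters instead of using fields.

```java
import java.util.Arrays;

class FastRemoveDemo {
    // Removes the element at `index` from the first `size` slots and returns the new size.
    static int fastRemove(Object[] elementData, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift the tail one slot to the left, overwriting the removed element.
            System.arraycopy(elementData, index + 1, elementData, index, numMoved);
        }
        elementData[--size] = null; // drop the stale duplicate reference in the last used slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", "e"};
        int size = fastRemove(data, 5, 2);
        System.out.println(size);                  // 4
        System.out.println(Arrays.toString(data)); // [a, b, d, e, null]
    }
}
```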

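Less critical operations

Removing by index

As with removing by object, the last slot of the array still has to be set to null so that the GC can collect the element.
1. Range check
2. Read the old value
3. Shift the following elements left in one bulk copy
4. Set the last slot to null
5. Return the old value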


    public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work

        return oldValue;
    }

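get: reading an element

All this does is validate the requested position: an index greater than or equal to the number of stored elements causes an exception.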


    E elementData(int index) {
        return (E) elementData[index];
    }

    // Check whether the requested index is greater than or equal to the number of stored elements
    private void rangeCheck(int index) {
         if (index >= size)
             throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    public E get(int index) {
        rangeCheck(index);

        return elementData(index);
    }
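
The check is against size, not against the length of the backing array, so an index inside the allocated capacity can still be rejected. A small sketch follows (GetDemo and throwsOutOfBounds are hypothetical names). The rangeCheck shown above only tests the upper bound; a negative index is still expected to fail, in that version through the array access itself, and the sketch only assumes that some IndexOutOfBoundsException is raised.

```java
import java.util.ArrayList;
import java.util.List;

class GetDemo {
    // Returns true if list.get(index) throws IndexOutOfBoundsException (or a subclass).
    static boolean throwsOutOfBounds(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(); // the backing array grows to 10 on the first add
        list.add("a");
        System.out.println(throwsOutOfBounds(list, 0));  // false: index 0 holds "a"
        System.out.println(throwsOutOfBounds(list, 1));  // true: 1 >= size
        System.out.println(throwsOutOfBounds(list, -1)); // true: negative index
    }
}
```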

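set: replacing the value at a position

1. Verify that the position is not greater than or equal to the number of stored elements
2. Read the old value
3. Store the new value
4. Return the old value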


    E elementData(int index) {
        return (E) elementData[index];
    }

    // Check whether the requested index is greater than or equal to the number of stored elements
    private void rangeCheck(int index) {
         if (index >= size)
             throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }
    public E set(int index, E element) {
        rangeCheck(index);

        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }

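contains

Checks whether indexOf finds an index for the element.

    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }

indexOf

Iterates in one of two ways, depending on whether the argument is null.

    /**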
     * Returns the index of the first occurrence of the specified element
     * in this list, or -1 if this list does not contain the element.
     * More formally, returns the lowest index <tt>i</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
     * or -1 if there is no such index.
     */
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }
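
The null branch means that null is a perfectly searchable element. A brief sketch, with IndexOfDemo and sample as made-up names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDemo {
    // A list that stores null twice.
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", null, "b", null));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf(null));  // 1: the first null wins
        System.out.println(list.contains(null)); // true
        System.out.println(list.indexOf("z"));   // -1: not present
    }
}
```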

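Further topics

Serialization

Writing

1. Write the default header information
2. Write the size of the list
3. Serialize each stored element in order
4. Check for concurrent modification

s.defaultWriteObject() writes the fields of the ArrayList object that default serialization covers, that is, the non-static, non-transient ones such as size.

elementData is declared transient, so default serialization does not write it. The elements are written one by one instead, and only the first size slots, so unused capacity never reaches the stream.

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData;

    private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{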
        // Write out element count, and any hidden stuff
        int expectedModCount = modCount;
        s.defaultWriteObject();

        // Write out size as capacity for behavioural compatibility with clone()
        s.writeInt(size);

        // Write out all elements in the proper order.
        for (int i=0; i<size; i++) {
            s.writeObject(elementData[i]);
        }

        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
    }
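
A round trip through the standard object streams shows that the custom writeObject still preserves the elements even though elementData is transient. This is a sketch with hypothetical names (SerializationDemo, roundTrip); checked exceptions are wrapped so the helper is easy to call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class SerializationDemo {
    // Serializes the list to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static ArrayList<String> roundTrip(ArrayList<String> list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list); // ends up in ArrayList.writeObject
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ArrayList<String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(1000); // large capacity, few elements
        list.add("a");
        list.add(null);
        list.add("b");
        System.out.println(roundTrip(list).equals(list)); // true
    }
}
```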

Maximum capacity

Code:

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private static int hugeCapacity(int minCapacity) {
        // When a + b exceeds Integer.MAX_VALUE the int result is negative, which signals overflow
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

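The caller has already checked the condition: this method body is entered only when newCapacity - MAX_ARRAY_SIZE > 0.

In other words, it is reached only when the computed new capacity is beyond Integer.MAX_VALUE - 8 (or has overflowed).

minCapacity less than 0

When can it be less than 0?
When a + b > Integer.MAX_VALUE the int result wraps around to a negative number, so if the requested value is already negative an OutOfMemoryError is thrown straight away.

minCapacity greater than MAX_ARRAY_SIZE

The required minimum lies above Integer.MAX_VALUE - 8, so the method returns Integer.MAX_VALUE.

minCapacity not greater than MAX_ARRAY_SIZE

When the minimum capacity is at most Integer.MAX_VALUE - 8, the array capacity is set to Integer.MAX_VALUE - 8.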
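
The three cases, and the reason the source compares with subtraction ("overflow-conscious code"), can be checked with a sketch that repeats only the capacity arithmetic and allocates nothing. CapacityOverflowDemo, newCapacity and throwsOnNegative are hypothetical names; the two constants and the branch logic are copied from the listings above.

```java
class CapacityOverflowDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Same decision as the hugeCapacity method shown above.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) {
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    // Capacity arithmetic of grow(), without the array copy.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap to a negative int
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = hugeCapacity(minCapacity);
        }
        return newCapacity;
    }

    // Returns true if hugeCapacity rejects the value with OutOfMemoryError.
    static boolean throwsOnNegative(int minCapacity) {
        try {
            hugeCapacity(minCapacity);
            return false;
        } catch (OutOfMemoryError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true: int addition wraps
        System.out.println(newCapacity(10, 11));                        // 15
        // 1.5 * 1_500_000_000 overflows int, yet the result is clamped to MAX_ARRAY_SIZE.
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001) == MAX_ARRAY_SIZE); // true
    }
}
```

With an old capacity of 1,500,000,000 the 1.5x value wraps to a negative int, but newCapacity - MAX_ARRAY_SIZE wraps back to a positive number, so the hugeCapacity branch is still taken. A plain comparison newCapacity > MAX_ARRAY_SIZE would have missed this case.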

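This raises a question: why -8, and not -10, -1, -5 or something else?

The comment on MAX_ARRAY_SIZE shows that the margin is there to leave room for the header words that some VMs reserve in an array.

For example, on the Sun JVM used here, the largest array length that can be allocated is Integer.MAX_VALUE - 2 (a native check at allocation time decides whether the request can be satisfied).

With Integer.MAX_VALUE - 1 or Integer.MAX_VALUE, the allocation throws java.lang.OutOfMemoryError: Requested array size exceeds VM limit

Other VMs might reserve a different number of header words, perhaps 5 or 3. Choosing -8 simply leaves a little extra margin; if array headers ever needed more than 8, ArrayList would have to change along with them.

Next, the memory taken by a newly allocated array of maximum length for each element type:

Type	Bytes per element	Declaration	Approximate size
boolean	1	boolean[] arr = new boolean[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 1 bytes, close to 2 GB
byte	1	byte[] arr = new byte[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 1 bytes, close to 2 GB
char	2	char[] arr = new char[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 2 bytes, close to 4 GB
short	2	short[] arr = new short[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 2 bytes, close to 4 GB
int	4	int[] arr = new int[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 4 bytes, close to 8 GB
float	4	float[] arr = new float[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 4 bytes, close to 8 GB
long	8	long[] arr = new long[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 8 bytes, close to 16 GB
double	8	double[] arr = new double[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 8 bytes, close to 16 GB
Object (empty reference slots)	4	Object[] arr = new Object[Integer.MAX_VALUE - 2]	(Integer.MAX_VALUE - 2) * 4 bytes, close to 8 GB

Before testing the maximum length of an Object array, set the JVM memory options: -Xms9g -Xmx9g -Xmn20M (a 20 MB young generation is enough).

Otherwise the allocation fails with OutOfMemoryError, because the array does not fit within the VM's default memory limit.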

More on the maximum array length:
- https://stackoverflow.com/questions/35756277/why-the-maximum-array-size-of-arraylist-is-integer-max-value-8
- https://blog.csdn.net/renfufei/article/details/78170188?locationNum=2&fps=1

Concurrent-modification checking in ArrayList

    /**
     * The number of times this list has been <i>structurally modified</i>.
     * Structural modifications are those that change the size of the
     * list, or otherwise perturb it in such a fashion that iterations in
     * progress may yield incorrect results.
     *
     * <p>This field is used by the iterator and list iterator implementation
     * returned by the {@code iterator} and {@code listIterator} methods.
     * If the value of this field changes unexpectedly, the iterator (or list
     * iterator) will throw a {@code ConcurrentModificationException} in
     * response to the {@code next}, {@code remove}, {@code previous},
     * {@code set} or {@code add} operations.  This provides
     * <i>fail-fast</i> behavior, rather than non-deterministic behavior in
     * the face of concurrent modification during iteration.
     *
     * <p><b>Use of this field by subclasses is optional.</b> If a subclass
     * wishes to provide fail-fast iterators (and list iterators), then it
     * merely has to increment this field in its {@code add(int, E)} and
     * {@code remove(int)} methods (and any other methods that it overrides
     * that result in structural modifications to the list).  A single call to
     * {@code add(int, E)} or {@code remove(int)} must add no more than
     * one to this field, or the iterators (and list iterators) will throw
     * bogus {@code ConcurrentModificationExceptions}.  If an implementation
     * does not wish to provide fail-fast iterators, this field may be
     * ignored.
     */
    protected transient int modCount = 0;

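modCount counts how many times the structure of the elementData array has been modified.

The five operations add, remove, clear, replaceAll and sort change it.

set does not change modCount.

Operations such as subList, writeObject (serialization), forEach and Iterator verify that no concurrent modification has taken place.

Even if the counter is incremented past the int range and wraps around to the minimum value, nothing breaks: the check only compares the current modCount with a previously saved value, so all that matters is that nobody changed it in between.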
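
The difference between a structural change (remove) and a non-structural one (set) is easy to observe in a single thread, since the check also fires when the iterating code itself modifies the list. ModCountDemo and its methods are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class ModCountDemo {
    // Removing during a for-each loop changes modCount, so the iterator's next() fails.
    static boolean removeDuringIterationFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // set() leaves modCount alone, so replacing values during iteration is accepted.
    static List<String> setDuringIteration() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        for (String s : list) {
            list.set(list.indexOf(s), s.toUpperCase());
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringIterationFails()); // true
        System.out.println(setDuringIteration());         // [A, B, C]
    }
}
```

When elements have to be removed while iterating, Iterator.remove or removeIf keeps the iterator's expected count in sync.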

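Iteration details

Code:

        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i] == null) {
                    // do something
                }
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i])) {
                    // do something
                }
        }

The comparison distinguishes two cases:
1. null: compared with ==, which is faster
2. A non-null Object: the equals method of the actual object is called, which is somewhat slower

contains, remove(Object) and indexOf all rely on this traversal. When the object used for the search is not of the same type as the elements, a lambda is recommended; for objects of the same type, override equals and hashCode.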

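Example 1:

Suppose the list is an ArrayList<Teacher> and an element should be removed by passing a Water, a completely unrelated class.

It is enough to override equals in Water so that it decides when the two count as equal.

// Teacher is a hypothetical element type, sketched here only so the example compiles
class Teacher {
    private final int waterNumber;
    Teacher(int waterNumber) { this.waterNumber = waterNumber; }
    public int getWaterNumber() { return waterNumber; }
}

public class Water {

    private int number;

    @Override
    public int hashCode() {
        return number;
    }

    // Note: this equals is asymmetric (a Teacher does not consider itself equal to a Water),
    // so it departs from the general equals contract. It works here only because
    // ArrayList.remove(Object) calls o.equals(element) on the argument.
    @Override
    public boolean equals(Object o) {
        if (o instanceof Teacher) {
            return ((Teacher) o).getWaterNumber() == number;
        }
        if (o instanceof Water) {
            return ((Water) o).number == number;
        }
        return false;
    }
}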
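
For the cross-type case, the lambda route mentioned earlier avoids touching equals at all. The sketch below assumes a hypothetical Teacher class with a name and a waterNumber field (neither is defined in the original) and uses the standard removeIf method.

```java
import java.util.ArrayList;
import java.util.List;

class RemoveIfDemo {
    // Hypothetical element type, defined only for this sketch.
    static class Teacher {
        private final String name;
        private final int waterNumber;

        Teacher(String name, int waterNumber) {
            this.name = name;
            this.waterNumber = waterNumber;
        }

        String getName() { return name; }
        int getWaterNumber() { return waterNumber; }
    }

    // Removes every teacher with the given water number and returns the remaining names.
    static List<String> namesAfterRemoving(int waterNumber) {
        List<Teacher> teachers = new ArrayList<>();
        teachers.add(new Teacher("Ann", 1));
        teachers.add(new Teacher("Bob", 2));
        teachers.add(new Teacher("Cid", 3));

        teachers.removeIf(t -> t.getWaterNumber() == waterNumber);

        List<String> names = new ArrayList<>();
        for (Teacher t : teachers) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesAfterRemoving(2)); // [Ann, Cid]
    }
}
```

Unlike remove(Object), removeIf removes every matching element rather than only the first one, and the equals contract of both classes stays untouched.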