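ArrayList is the dynamic array implementation provided by Java. Its superclass is AbstractList<E>, and it implements four interfaces: List<E>, RandomAccess, Cloneable, and Serializable. The source excerpts below appear to come from JDK 8; later releases restructure some of these methods.
1. Initializing an ArrayList
ArrayList has a static constant DEFAULT_CAPACITY, fixed at 10, which defines the default initial capacity of an ArrayList object. When a list is created with new ArrayList() and no capacity is specified, it is nominally given a capacity of 10. As the code shows, however, the backing array starts out empty and only grows to 10 when the first element is added.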
/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
transient Object[] elementData; // non-private to simplify nested class access
/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
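DEFAULTCAPACITY_EMPTY_ELEMENTDATA is another static constant in ArrayList: an empty Object array. elementData is where ArrayList stores its data; it is just a plain Object array.
When an initial capacity is specified, the constructor allocates an Object array of that size and assigns it to elementData. If the value is 0, elementData is set to EMPTY_ELEMENTDATA. Both constants are empty Object arrays, but they act as different markers. DEFAULTCAPACITY_EMPTY_ELEMENTDATA, used by the no-argument constructor, marks a list whose capacity should become 10 on the first add. EMPTY_ELEMENTDATA marks a list whose capacity is 0 because the user asked for it. If the specified capacity is negative, IllegalArgumentException is thrown.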
/**
 * Constructs an empty list with the specified initial capacity.
 *
 * @param initialCapacity the initial capacity of the list
 * @throws IllegalArgumentException if the specified initial capacity
 * is negative
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    }
}
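The three branches of this constructor can be observed from user code. The following is a minimal sketch; the class name CapacityConstructorDemo and the helper isRejected are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityConstructorDemo {
    // Hypothetical helper: reports whether the constructor rejects this capacity.
    static boolean isRejected(int initialCapacity) {
        try {
            new ArrayList<String>(initialCapacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // A pre-sized list has room reserved, but it still contains no elements.
    static int sizeOfPresizedList() {
        List<String> presized = new ArrayList<>(100);
        return presized.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeOfPresizedList()); // 0
        System.out.println(isRejected(0));        // false: zero is a legal capacity
        System.out.println(isRejected(-1));       // true
    }
}
```

Note that the capacity argument affects only the length of the backing array, never size().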
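An ArrayList can also be initialized from a collection, which sets the contents of the new list directly. The constructor first calls toArray on the argument and assigns the result to elementData. Because toArray is not guaranteed to return an array whose runtime type is Object[] (the bug 6260652 mentioned in the comment), the constructor checks the runtime type of elementData and, if necessary, copies it into a real Object[]. If the collection is empty, elementData is replaced with EMPTY_ELEMENTDATA.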
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
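Because this constructor copies the elements into its own array, the new list is structurally independent of the source collection. A small sketch of that behavior follows; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyConstructorDemo {
    // Copies the source list, then mutates the source to show the copy is unaffected.
    static List<String> copyThenMutateSource() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> copy = new ArrayList<>(source);
        source.add("d");
        source.set(0, "z");
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutateSource()); // [a, b, c]
    }
}
```

The copy is shallow: both lists refer to the same element objects, only the backing arrays differ.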
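2. Capacity and Growth
First, note that ArrayList has no field that records the capacity. The capacity is simply the length of the elementData array, that is, the maximum number of elements it can hold at the moment. Going beyond that requires growing elementData, which means allocating a new, larger array and copying the existing elements into it.
The growth-related methods all take a minCapacity parameter: the ArrayList must end up with room for at least minCapacity elements. There is one special case, the DEFAULTCAPACITY_EMPTY_ELEMENTDATA marker used during initialization. A list initialized this way has an array of length 0, so its capacity is 0, but when the first element is added via add, the array is grown directly to DEFAULT_CAPACITY, i.e. 10.
trimToSize reduces the capacity to the current size. It calls Arrays.copyOf to create a new array of exactly size elements and assigns it to elementData, or switches to EMPTY_ELEMENTDATA when the list is empty.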
/**
 * Trims the capacity of this <tt>ArrayList</tt> instance to be the
 * list's current size. An application can use this operation to minimize
 * the storage of an <tt>ArrayList</tt> instance.
 */
public void trimToSize() {
    modCount++;
    if (size < elementData.length) {
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}
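grow is the method that physically grows the array. The other growth methods only compute minCapacity and decide whether grow needs to be called.
oldCapacity is the current length of elementData. grow first computes newCapacity as oldCapacity plus half of oldCapacity. If that is still smaller than minCapacity, newCapacity is set to minCapacity. Finally, Arrays.copyOf creates a new array of length newCapacity that contains the existing elements.
Remark: Suppose $n$ elements are to be inserted and the initial capacity is $m_0$. Each growth multiplies the capacity by 1.5, so $k$ growths turn $m_0$ into $1.5^k \times m_0$. Inserting all $n$ elements therefore takes about $k = \lceil \log_{1.5}\frac{n}{m_0} \rceil$ growths. With $n = 1000000$ and the default $m_0 = 10$, that is 29 growths. (The real count can be slightly higher in general, because the capacity is rounded down at every step.)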
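The estimate in the remark can be checked by replaying the integer arithmetic of grow. This is only a sketch: GrowthCount and its methods are hypothetical names, and the simulation ignores the initial step from the empty default array to capacity 10.

```java
class GrowthCount {
    // Counts the 1.5x growth steps needed to take initialCapacity to at least n,
    // using the same integer arithmetic as grow: capacity + (capacity >> 1).
    static int countGrowths(int initialCapacity, int n) {
        if (initialCapacity < 2) {
            // With capacity 0 or 1, capacity >> 1 is 0 and this loop would never advance.
            throw new IllegalArgumentException("initialCapacity must be at least 2");
        }
        int capacity = initialCapacity;
        int growths = 0;
        while (capacity < n) {
            capacity += capacity >> 1;
            growths++;
        }
        return growths;
    }

    // Closed-form estimate: ceil(log_1.5(n / m0)).
    static int estimate(int initialCapacity, int n) {
        return (int) Math.ceil(Math.log((double) n / initialCapacity) / Math.log(1.5));
    }

    public static void main(String[] args) {
        System.out.println(countGrowths(10, 1_000_000)); // 29
        System.out.println(estimate(10, 1_000_000));     // 29
    }
}
```

For these particular values the simulation and the formula agree; for other inputs the rounding at each step may push the simulated count above the estimate.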
/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
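MAX_ARRAY_SIZE is the largest array size that ArrayList normally tries to allocate, fixed at Integer.MAX_VALUE - 8. If the newCapacity computed in grow exceeds it, hugeCapacity is called to settle the value, which ends up as either Integer.MAX_VALUE (when minCapacity itself exceeds MAX_ARRAY_SIZE) or MAX_ARRAY_SIZE. Attempting to allocate an array larger than MAX_ARRAY_SIZE may cause an OutOfMemoryError. In other words, the theoretical maximum capacity of an ArrayList is Integer.MAX_VALUE, but VM limits may make an array that large impossible to allocate.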
/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
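To see why grow compares with subtraction (newCapacity - minCapacity < 0) rather than directly, it helps to run the arithmetic without allocating anything. The sketch below copies the logic of grow and hugeCapacity into a hypothetical GrowSketch class; it is an illustration, not the JDK code itself.

```java
class GrowSketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Mirrors the capacity arithmetic of grow/hugeCapacity, minus the array copy.
    static int newCapacity(int oldCapacity, int minCapacity) {
        // For very large oldCapacity this sum overflows and becomes negative.
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        // Subtraction-based comparisons still behave sensibly after overflow.
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) { // the requested size itself overflowed
                throw new OutOfMemoryError();
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10, 11));  // 15: 1.5x is enough
        System.out.println(newCapacity(10, 100)); // 100: 1.5x is too small, use minCapacity
        // 1.5x overflows int here, so the result is clamped to MAX_ARRAY_SIZE.
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001) == MAX_ARRAY_SIZE); // true
    }
}
```

In the last call the overflowed newCapacity is negative, yet newCapacity - MAX_ARRAY_SIZE wraps around to a positive value, which is what routes the computation into the hugeCapacity branch.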
ensureExplicitCapacity decides whether growth is needed and, if so, calls grow so that the ArrayList has a capacity of at least minCapacity.
private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
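ensureCapacity is the only public method for growing the capacity. It first checks whether elementData is DEFAULTCAPACITY_EMPTY_ELEMENTDATA. If so, the list is treated as if its capacity were already 10 (the array is not really that long yet; it only becomes 10 after the first add), so nothing needs to be done when the requested minCapacity is 10 or less. Otherwise it calls ensureExplicitCapacity to decide whether to grow.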
public void ensureCapacity(int minCapacity) {
    int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
        // any size if not default element table
        ? 0
        // larger than default for default empty table. It's already
        // supposed to be at default size.
        : DEFAULT_CAPACITY;
    if (minCapacity > minExpand) {
        ensureExplicitCapacity(minCapacity);
    }
}
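A typical use of ensureCapacity is to reserve space once before a bulk insertion, so that the repeated 1.5x growth steps are avoided. A minimal sketch, with the hypothetical names EnsureCapacityDemo and fill:

```java
import java.util.ArrayList;

class EnsureCapacityDemo {
    // Fills a list with 0..n-1 after reserving room for all n elements up front.
    static ArrayList<Integer> fill(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n); // declared on ArrayList, not on the List interface
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fill(1000);
        System.out.println(list.size());   // 1000
        System.out.println(list.get(999)); // 999
    }
}
```

Whether this is measurably faster depends on the workload; the visible behavior of the list is the same either way.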
ensureCapacityInternal is the method add uses to grow the list. The DEFAULTCAPACITY_EMPTY_ELEMENTDATA special case matters here too: the capacity after growth must be at least 10, so minCapacity may need to be raised.
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    ensureExplicitCapacity(minCapacity);
}
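Summary: grow performs the actual growth. ensureExplicitCapacity decides whether growth is needed and passes the required minimum capacity on to grow. ensureCapacityInternal is for the add methods, and ensureCapacity is for users.
3. Finding Elements
indexOf takes an Object parameter o, searches for the first occurrence of o in the ArrayList, and returns its index, or -1 if it is not present. o may be null.
lastIndexOf searches from back to front and returns the index of the last occurrence of o, or -1 if it is not present.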
/**
 * Returns the index of the first occurrence of the specified element
 * in this list, or -1 if this list does not contain the element.
 * More formally, returns the lowest index <tt>i</tt> such that
 * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
 * or -1 if there is no such index.
 */
public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}
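The behavior of indexOf and lastIndexOf with duplicates and null can be sketched as follows; IndexOfDemo and sample are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDemo {
    // A list with a duplicate ("b") and a null element.
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", "b", null, "b"));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("b"));     // 1: first occurrence
        System.out.println(list.lastIndexOf("b")); // 3: last occurrence
        System.out.println(list.indexOf(null));    // 2: null is searched with ==
        System.out.println(list.indexOf("z"));     // -1: not present
    }
}
```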
/**
 * Returns a shallow copy of this <tt>ArrayList</tt> instance. (The
 * elements themselves are not copied.)
 *
 * @return a clone of this <tt>ArrayList</tt> instance
 */
public Object clone() {
    try {
        ArrayList<?> v = (ArrayList<?>) super.clone();
        v.elementData = Arrays.copyOf(elementData, size);
        v.modCount = 0;
        return v;
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError(e);
    }
}
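As the Javadoc of clone says, the copy is shallow: the clone gets its own backing array, but the elements are shared. A small sketch of what that means in practice, using the hypothetical class CloneDemo:

```java
import java.util.ArrayList;

class CloneDemo {
    // Returns true if the clone is structurally independent but shares its elements.
    static boolean cloneIsShallow() {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("x"));

        @SuppressWarnings("unchecked") // clone() is declared to return Object
        ArrayList<StringBuilder> copy = (ArrayList<StringBuilder>) original.clone();

        copy.add(new StringBuilder("y")); // changes only the copy's array
        boolean independentStructure = original.size() == 1 && copy.size() == 2;
        boolean sharedElements = original.get(0) == copy.get(0); // same object
        return independentStructure && sharedElements;
    }

    public static void main(String[] args) {
        System.out.println(cloneIsShallow()); // true
    }
}
```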
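The elementData(int) method and the get method both return the element at a given index. The difference is that elementData performs no range check against size. Its @SuppressWarnings("unchecked") annotation only silences the compiler warning for the unchecked cast to E.
get first calls rangeCheck to compare the index with the current size. If the index is greater than or equal to size, it throws IndexOutOfBoundsException.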
@SuppressWarnings("unchecked")
E elementData(int index) {
    return (E) elementData[index];
}
/**
 * Returns the element at the specified position in this list.
 *
 * @param index index of the element to return
 * @return the element at the specified position in this list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    rangeCheck(index);
    return elementData(index);
}
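Remark: Because elementData(int) does not check the index against size, it can read any slot of the allocated array, which is loosely reminiscent of what vector.resize() makes possible in the C++ STL. It is package-private, however, so user code cannot call it, and it never changes size. ensureCapacity only reserves space (closer to vector.reserve()); slots beyond size remain inaccessible through get and set.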
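The distinction between reserved capacity and size can be demonstrated from user code. The sketch below also shows one possible way to emulate a resize by padding with nulls; padTo is a hypothetical helper, not part of the ArrayList API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SizeVsCapacityDemo {
    // Reserved capacity does not make indexes valid: get(0) still fails on an empty list.
    static boolean getFailsOnReservedSlot() {
        ArrayList<String> list = new ArrayList<>(100);
        try {
            list.get(0);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // Hypothetical helper: appends nulls until the list holds newSize elements.
    static <T> void padTo(List<T> list, int newSize) {
        int missing = newSize - list.size();
        if (missing > 0) {
            list.addAll(Collections.<T>nCopies(missing, null));
        }
    }

    static int paddedSize() {
        ArrayList<String> list = new ArrayList<>();
        padTo(list, 5);
        list.set(4, "last"); // now a valid index
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(getFailsOnReservedSlot()); // true
        System.out.println(paddedSize());             // 5
    }
}
```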
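4. Inserting and Removing Elements
set replaces the element at the given position with the given object and returns the old value. Like get, it first calls rangeCheck to validate the index.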
/**
 * Replaces the element at the specified position in this list with
 * the specified element.
 *
 * @param index index of the element to replace
 * @param element element to be stored at the specified position
 * @return the element previously at the specified position
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E set(int index, E element) {
    rangeCheck(index);
    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}
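add can either append an element to the end of the ArrayList or insert one at a given position. Before inserting, it calls ensureCapacityInternal to make sure there is room for the new element. When inserting at a given index, it calls System.arraycopy to move the element at index and everything after it one position to the right. After the insertion, size is incremented by 1.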
/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1); // Increments modCount!!
    elementData[size++] = e;
    return true;
}
/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    rangeCheckForAdd(index);
    ensureCapacityInternal(size + 1); // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
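The shifting performed by add(int, E) and the index range it accepts can be sketched like this; AddDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddDemo {
    static List<String> insertInMiddleAndAtEnd() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        list.add(1, "x");           // shifts "b" and "c" one slot to the right
        list.add(list.size(), "z"); // index == size is allowed and appends
        return list;
    }

    // An index greater than size is rejected by the range check.
    static boolean rejectsIndexPastSize() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        try {
            list.add(2, "y");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertInMiddleAndAtEnd()); // [a, x, b, c, z]
        System.out.println(rejectsIndexPastSize());   // true
    }
}
```

Since every insertion in the middle copies the tail of the array, inserting near the front of a long list costs time proportional to its length.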
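addAll inserts all elements of a collection at the end of the ArrayList or at a given position, again using System.arraycopy. Note that if the argument is modified while addAll is running, the result is undefined. The argument may therefore be the ArrayList itself, but the Javadoc leaves the behavior of that call undefined when the list is nonempty.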
/**
 * Appends all of the elements in the specified collection to the end of
 * this list, in the order that they are returned by the
 * specified collection's Iterator. The behavior of this operation is
 * undefined if the specified collection is modified while the operation
 * is in progress. (This implies that the behavior of this call is
 * undefined if the specified collection is this list, and this
 * list is nonempty.)
 *
 * @param c collection containing elements to be added to this list
 * @return <tt>true</tt> if this list changed as a result of the call
 * @throws NullPointerException if the specified collection is null
 */
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew); // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}
public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew); // Increments modCount
    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);
    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}
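A short usage sketch of both addAll variants, including the return value for an empty argument. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AddAllDemo {
    static List<Integer> insertAllInMiddle() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 5));
        list.addAll(2, Arrays.asList(3, 4)); // moves 5 two slots right, then copies 3 and 4 in
        return list;
    }

    // Adding an empty collection leaves the list unchanged, so addAll returns false.
    static boolean emptyAddAllReportsChange() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 5));
        return list.addAll(Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        System.out.println(insertAllInMiddle());        // [1, 2, 3, 4, 5]
        System.out.println(emptyAddAllReportsChange()); // false
    }
}
```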
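remove can take an index, to remove the element at a given position, or an Object, to remove the first occurrence of that object.
Removing by position moves everything after the removed position one slot to the left using System.arraycopy. That leaves the last slot holding a stale reference, which is set to null so that the GC can reclaim the object once nothing else refers to it.
remove(Object) scans the array from the front, comparing with o.equals (or with == null when o is null). At the first match it calls fastRemove with that index. It returns a boolean that indicates whether the object was found and removed.
fastRemove does the same work as remove(int index), except that it skips the bounds check and does not return the removed value. This is safe because it is a private method that, in this source, is only called from remove(Object o), which always passes a valid index.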
/**
 * Removes the element at the specified position in this list.
 * Shifts any subsequent elements to the left (subtracts one from their
 * indices).
 *
 * @param index the index of the element to be removed
 * @return the element that was removed from the list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}
/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present. If the list does not contain the element, it is
 * unchanged. More formally, removes the element with the lowest index
 * <tt>i</tt> such that
 * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>
 * (if such an element exists).
 *
 * @param o element to be removed from this list, if present
 * @return <tt>true</tt> if this list contained the specified element
 */
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
/*
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
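The two remove overloads are easy to confuse on a list of Integer, because an int argument selects remove(int index) rather than remove(Object o). A minimal sketch of the difference, with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        list.remove(1); // int overload: removes the element at index 1 (the value 2)
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        list.remove(Integer.valueOf(1)); // Object overload: removes the first element equal to 1
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [1, 3]
        System.out.println(removeByValue()); // [2, 3]
    }
}
```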
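5. Iterators
ArrayList has two kinds of iterator, the Itr class and the ListItr class. Both are inner classes of ArrayList, and ListItr is a subclass of Itr.
Itr has three fields: cursor is the index of the next element to return, lastRet is the index of the last element returned, and expectedModCount is used to detect modification of the ArrayList during iteration.
next returns the next element. The first call returns the first element, because cursor starts at 0; each call to next then increments it by 1.
checkForComodification checks whether the ArrayList has been modified during iteration. ArrayList has a field modCount, which is incremented on every structural modification such as an insertion or removal. expectedModCount in Itr is initialized to modCount, and checkForComodification verifies that the two are still equal. If they differ, an element was inserted or removed during the iteration, and ConcurrentModificationException is thrown.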
private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;
    public boolean hasNext() {
        return cursor != size;
    }
    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }
    // remove and forEachRemaining are omitted here
    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
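The fail-fast check can be triggered from user code by modifying the list directly inside a for-each loop, which iterates through an Itr behind the scenes. The sketch below contrasts that with removal through the iterator itself; FailFastDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removing through the list changes modCount behind the iterator's back.
    static boolean removeInsideForEachFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object): modCount no longer matches
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to next()
        }
    }

    // Removing through the iterator is the supported way to delete during iteration.
    static List<Integer> removeEvensViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeInsideForEachFails()); // true
        System.out.println(removeEvensViaIterator());   // [1, 3]
    }
}
```

The check is best-effort: some modification patterns, such as removing the second-to-last element in a for-each loop, end the loop early via hasNext() without an exception, so the exception should not be relied on for correctness.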
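ListItr extends Itr and can be created at a given index, so that the iterator starts at that position. It adds methods such as hasPrevious(), previous(), and previousIndex(), which make it possible to move backward to the elements before the iterator's current position.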
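From user code, a ListItr is obtained through listIterator(int index). Starting it at size() and walking backward visits the elements in reverse order, as in this minimal sketch (ListIteratorDemo and reversed are hypothetical names):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Builds a reversed copy by starting the iterator past the last element.
    static List<String> reversed(List<String> source) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = source.listIterator(source.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(reversed(list)); // [c, b, a]
    }
}
```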