Source Code Analysis of Java Single-Column Collections (Collection -> List), Based on JDK 1.8
List
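ArrayList (not thread-safe)
Characteristics: backed by an array; lookup by index is fast, while insertion and removal are slow; with no synchronization overhead it is efficient
Growth mechanism
An ArrayList created with the no-arg constructor has a nominal default capacity of 10, but its backing array actually has length 0 until the first element is added; the array of length 10 is allocated by the first add. The relevant ArrayList source is as follows:
// elementData refers to this empty array when the int-arg constructor is called with an initial capacity of 0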
/**
* Shared empty array instance used for empty instances.
*/
private static final Object[] EMPTY_ELEMENTDATA = {};
// Shared empty elementData array used by the no-arg constructor
/**
* Shared empty array instance used for default sized empty instances. We
* distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
* first element is added.
*/
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
// Default capacity
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
// No-arg constructor
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
// Constructor taking an initial capacity
/**
* Constructs an empty list with the specified initial capacity.
*
* @param initialCapacity the initial capacity of the list
* @throws IllegalArgumentException if the specified initial capacity
* is negative
*/
public ArrayList(int initialCapacity) {
if (initialCapacity > 0) {
this.elementData = new Object[initialCapacity];
} else if (initialCapacity == 0) {
this.elementData = EMPTY_ELEMENTDATA;
} else {
throw new IllegalArgumentException("Illegal Capacity: "+
initialCapacity);
}
}
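Once the ArrayList holds 10 elements, adding the 11th triggers automatic growth. The increment is 0.5 × the current capacity, so the new capacity is 1.5 times the old one (rounded down). There is also a second growth path, which the source walkthrough below reaches in due course.
The source, step by step: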
/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
When an element is added, the internal capacity is checked first by ensureCapacityInternal(), which receives the current element count plus one (size + 1).
private void ensureCapacityInternal(int minCapacity) {
ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}
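Looking at this method, the argument it receives is minCapacity (the minimum capacity required).
It calls two methods. Start with calculateCapacity, which takes two parameters: elementData (the backing array) and minCapacity (the minimum capacity).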
private static int calculateCapacity(Object[] elementData, int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
return Math.max(DEFAULT_CAPACITY, minCapacity);
}
return minCapacity;
}
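Here the static field DEFAULTCAPACITY_EMPTY_ELEMENTDATA from the beginning of this section reappears. It is the shared empty array that the no-arg constructor assigns to elementData when an ArrayList is created; the constructor is repeated here for reference: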
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
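calculateCapacity checks whether elementData is still that default empty array (an identity comparison, not a test for emptiness). If it is, the method returns the larger of the minimum capacity and the default capacity; otherwise it returns the minimum capacity unchanged.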
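The identity check can be reproduced outside the JDK. The following is a minimal sketch rather than the JDK code itself: the class name CalculateCapacityDemo and its two sentinel arrays are stand-ins introduced for illustration, mirroring the two shared empty arrays in ArrayList.

```java
class CalculateCapacityDemo {
    // Hypothetical stand-ins for ArrayList's two shared empty arrays.
    static final Object[] EMPTY = {};
    static final Object[] DEFAULT_EMPTY = {};
    static final int DEFAULT_CAPACITY = 10;

    // Same logic as calculateCapacity: only the default sentinel is inflated to 10.
    static int calculateCapacity(Object[] elementData, int minCapacity) {
        if (elementData == DEFAULT_EMPTY) {
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        return minCapacity;
    }

    public static void main(String[] args) {
        // First add on a list built with the no-arg constructor: minCapacity is 1.
        System.out.println(calculateCapacity(DEFAULT_EMPTY, 1));  // 10
        // First add on a list built with an explicit initial capacity of 0.
        System.out.println(calculateCapacity(EMPTY, 1));          // 1
        // A first addAll of 25 elements on a default list asks for more than 10.
        System.out.println(calculateCapacity(DEFAULT_EMPTY, 25)); // 25
    }
}
```

This is why a list created with an explicit capacity of 0 grows in small steps at first, while a default list jumps straight to 10.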
After that method returns, control is back in ensureCapacityInternal, which passes the result to ensureExplicitCapacity to decide whether growth is needed. Source:
private void ensureExplicitCapacity(int minCapacity) {
modCount++;
// overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}
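This makes the condition explicit: if the required minimum capacity minus the current length of the backing array is greater than 0, grow is called to enlarge the array. The source of grow: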
/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}
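newCapacity is set to the old array length plus the old array length shifted right by one bit (numerically, oldCapacity >> 1 is half of oldCapacity, rounded down).
Reading on, once this first candidate size has been computed it is compared with the minimum capacity; if the candidate is smaller than the minimum capacity, the minimum capacity is used as the new capacity instead.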
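The arithmetic of these two steps can be tried in isolation. The sketch below is an illustration under assumptions: GrowthDemo and nextCapacity are names made up here, and the MAX_ARRAY_SIZE check that follows in grow is deliberately left out.

```java
class GrowthDemo {
    // Mirrors the first two steps of ArrayList.grow (the MAX_ARRAY_SIZE check is omitted).
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5x, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity; // 1.5x is still too small: use the requested minimum
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 10;
        StringBuilder sequence = new StringBuilder("10");
        // Ask for one slot more than the current capacity, five times in a row.
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity, capacity + 1);
            sequence.append(" -> ").append(capacity);
        }
        System.out.println(sequence);             // 10 -> 15 -> 22 -> 33 -> 49 -> 73
        System.out.println(nextCapacity(15, 31)); // 31: 22 would not be enough
        System.out.println(nextCapacity(1, 2));   // 2: 1 + (1 >> 1) is still 1
    }
}
```

The last call shows why the comparison with minCapacity is needed even for single adds: for very small arrays the 1.5x rule does not grow the array at all.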
The next statement introduces another static constant, MAX_ARRAY_SIZE, the largest array size that ArrayList will normally try to allocate.
/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*/
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
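As shown, this limit is Integer.MAX_VALUE - 8. Integer.MAX_VALUE is defined in the Integer class as the hexadecimal value 0x7fffffff, the largest int. The reason for subtracting 8 has to do with how Java objects are laid out: as the comment says, some VMs reserve header words in an array. Other articles on Java object layout cover this in more detail.
Back in grow: when the new capacity exceeds MAX_ARRAY_SIZE, hugeCapacity is called. Source: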
private static int hugeCapacity(int minCapacity) {
if (minCapacity < 0) // overflow
throw new OutOfMemoryError();
return (minCapacity > MAX_ARRAY_SIZE) ?
Integer.MAX_VALUE :
MAX_ARRAY_SIZE;
}
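If minCapacity is greater than MAX_ARRAY_SIZE, the method returns Integer.MAX_VALUE, that is 0x7fffffff; otherwise it returns MAX_ARRAY_SIZE. The check for a negative minCapacity covers int overflow in the addition that produced it (0x7fffffff + 1 < 0).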
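The "overflow-conscious code" comments refer to comparing by subtraction (a - b > 0) instead of directly (a > b). The sketch below shows the difference once an int has wrapped around. OverflowDemo and exceedsBySubtraction are illustrative names, and hugeCapacity is copied from the source above.

```java
class OverflowDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Subtraction-based comparison, as used in ensureExplicitCapacity and grow.
    static boolean exceedsBySubtraction(int a, int b) {
        return a - b > 0;
    }

    // Same logic as hugeCapacity in the source above.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) {
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        int wrapped = Integer.MAX_VALUE + 1; // wraps around to Integer.MIN_VALUE
        System.out.println(wrapped < 0);                                   // true
        System.out.println(wrapped > MAX_ARRAY_SIZE);                      // false
        System.out.println(exceedsBySubtraction(wrapped, MAX_ARRAY_SIZE)); // true
        System.out.println(hugeCapacity(MAX_ARRAY_SIZE + 1) == Integer.MAX_VALUE); // true
    }
}
```

A direct comparison would treat the wrapped value as small and skip growth. The subtraction form still routes it into grow and on to hugeCapacity, where the negative value is turned into an OutOfMemoryError.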
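Returning to grow once more: its last statement copies the backing array with Arrays.copyOf. grow calls the two-argument overload, which delegates to the three-argument overload shown here, passing original.getClass() as newType: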
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
@SuppressWarnings("unchecked")
T[] copy = ((Object)newType == (Object)Object[].class)
? (T[]) new Object[newLength]
: (T[]) Array.newInstance(newType.getComponentType(), newLength);
System.arraycopy(original, 0, copy, 0,
Math.min(original.length, newLength));
return copy;
}
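As shown, the returned copy is a brand-new array object. newType is the class of the array itself (for elementData, Object[].class), so the new backing array is created as new Object[newLength]; System.arraycopy then copies the elements of the original array into it.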
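The behaviour of Arrays.copyOf is easy to observe on an ordinary array. A small sketch, where CopyOfDemo and its grow helper are names introduced here for illustration only:

```java
import java.util.Arrays;

class CopyOfDemo {
    // Thin wrapper so the copy step can be called on its own.
    static String[] grow(String[] source, int newLength) {
        return Arrays.copyOf(source, newLength);
    }

    public static void main(String[] args) {
        String[] original = {"a", "b", "c"};
        String[] grown = grow(original, 5);

        System.out.println(grown != original);                // true: a new array object
        System.out.println(Arrays.toString(grown));           // [a, b, c, null, null]
        System.out.println(grown.getClass().getSimpleName()); // String[]
        System.out.println(grown[0] == original[0]);          // true: references are copied, not the objects
    }
}
```

The copy is shallow, so growth costs one pass over the references regardless of how large the stored objects are.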
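Summary of the ArrayList growth mechanism: growth is triggered when an element is added that would exceed the current capacity. There are two paths. Path one: the new capacity is 1.5 times the old capacity. Path two: when 1.5 times the old capacity is still not enough, the new capacity is the required minimum capacity, that is the current size plus the number of elements being added (this can happen with ArrayList.addAll()).
Testing the growth mechanism
Because the backing array of ArrayList is private, reflection is used here to read the capacity. The code and its result are as follows: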
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/**
 * Written for JDK 1.8. On newer JDKs (16+), reflective access to java.util
 * internals needs --add-opens java.base/java.util=ALL-UNNAMED.
 *
 * @author divergent
 */
public class ArrayListCapacityTest {
    public static int getArrayListCapacity(List<String> list) throws NoSuchFieldException, IllegalAccessException {
        // Look up the private backing-array field
        Field elementData = list.getClass().getDeclaredField("elementData");
        // Make the private field accessible
        elementData.setAccessible(true);
        Object[] o = (Object[]) elementData.get(list);
        return o.length;
    }

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        List<String> list = new ArrayList<>();
        List<String> list2 = new ArrayList<>();
        System.out.println("Capacity before adding anything: " + getArrayListCapacity(list));
        for (int i = 0; i < 10; i++) {
            list.add("sssssss");
        }
        System.out.println("Capacity after adding elements 1 - 10: " + getArrayListCapacity(list));
        list.add("sss");
        System.out.println("Capacity after a single add triggers growth: " + getArrayListCapacity(list));
        for (int i = 0; i < 20; i++) {
            list2.add("sssssss");
        }
        list.addAll(list2);
        System.out.println("Capacity after addAll triggers the second growth path: " + getArrayListCapacity(list));
    }
}
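Result, with the reasoning behind each value:
Before any element is added, the capacity is 0
Adding elements 1 through 10 triggers the first growth, to the default capacity of 10
Adding the 11th element triggers the first growth path: 10 + (10 >> 1) = 10 + 5 = 15 (the parentheses matter, because + binds more tightly than >> in Java)
After the 11th element, addAll inserts a list of 20 elements, so the required minimum capacity is 11 + 20 = 31; 15 + (15 >> 1) = 22 is not enough, so the capacity becomes 31 (size goes 0 -> 10 -> 11 -> 31)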
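Lookup mechanism
Lookup by index. Note that the excerpt below is the get method of the inner class ArrayList.SubList, as the references to offset and ArrayList.this show. ArrayList.get itself does the same thing in simpler form: it calls rangeCheck(index) and returns elementData(index), with no comodification check.
// Lookup by index (ArrayList.SubList version)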
public E get(int index) {
rangeCheck(index);
checkForComodification();
return ArrayList.this.elementData(offset + index);
}
// Bounds check
private void rangeCheck(int index) {
if (index < 0 || index >= this.size)
throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}
private void checkForComodification() {
if (ArrayList.this.modCount != this.modCount)
throw new ConcurrentModificationException();
}
As the source shows, get uses the index directly as a subscript into the elementData array, so lookup takes constant time.
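One consequence of the range check is that get is bounded by size, not by the capacity of the backing array. A brief sketch, with GetBoundsDemo and isRejected as illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

class GetBoundsDemo {
    // Returns true if get(index) is rejected by the range check.
    static boolean isRejected(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(10); // backing array of length 10, size 0
        System.out.println(isRejected(list, 0)); // true: index 0 >= size 0
        list.add("x");
        System.out.println(isRejected(list, 0)); // false
        System.out.println(isRejected(list, 1)); // true: only one element is present
    }
}
```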
Insertion mechanism
Insertion at an index; source below:
/**
* Inserts the specified element at the specified position in this
* list. Shifts the element currently at that position (if any) and
* any subsequent elements to the right (adds one to their indices).
*
* @param index index at which the specified element is to be inserted
* @param element element to be inserted
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacityInternal(size + 1); // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}
As the source shows, insertion at an index first checks bounds and capacity, then calls System.arraycopy, whose source is:
/**
* Copies an array from the specified source array, beginning at the
* specified position, to the specified position of the destination array.
* A subsequence of array components are copied from the source
* array referenced by <code>src</code> to the destination array
* referenced by <code>dest</code>. The number of components copied is
* equal to the <code>length</code> argument. The components at
* positions <code>srcPos</code> through
* <code>srcPos+length-1</code> in the source array are copied into
* positions <code>destPos</code> through
* <code>destPos+length-1</code>, respectively, of the destination
* array.
* <p>
* If the <code>src</code> and <code>dest</code> arguments refer to the
* same array object, then the copying is performed as if the
* components at positions <code>srcPos</code> through
* <code>srcPos+length-1</code> were first copied to a temporary
* array with <code>length</code> components and then the contents of
* the temporary array were copied into positions
* <code>destPos</code> through <code>destPos+length-1</code> of the
* destination array.
* <p>
* If <code>dest</code> is <code>null</code>, then a
* <code>NullPointerException</code> is thrown.
* <p>
* If <code>src</code> is <code>null</code>, then a
* <code>NullPointerException</code> is thrown and the destination
* array is not modified.
* <p>
* Otherwise, if any of the following is true, an
* <code>ArrayStoreException</code> is thrown and the destination is
* not modified:
* <ul>
* <li>The <code>src</code> argument refers to an object that is not an
* array.
* <li>The <code>dest</code> argument refers to an object that is not an
* array.
* <li>The <code>src</code> argument and <code>dest</code> argument refer
* to arrays whose component types are different primitive types.
* <li>The <code>src</code> argument refers to an array with a primitive
* component type and the <code>dest</code> argument refers to an array
* with a reference component type.
* <li>The <code>src</code> argument refers to an array with a reference
* component type and the <code>dest</code> argument refers to an array
* with a primitive component type.
* </ul>
* <p>
* Otherwise, if any of the following is true, an
* <code>IndexOutOfBoundsException</code> is
* thrown and the destination is not modified:
* <ul>
* <li>The <code>srcPos</code> argument is negative.
* <li>The <code>destPos</code> argument is negative.
* <li>The <code>length</code> argument is negative.
* <li><code>srcPos+length</code> is greater than
* <code>src.length</code>, the length of the source array.
* <li><code>destPos+length</code> is greater than
* <code>dest.length</code>, the length of the destination array.
* </ul>
* <p>
* Otherwise, if any actual component of the source array from
* position <code>srcPos</code> through
* <code>srcPos+length-1</code> cannot be converted to the component
* type of the destination array by assignment conversion, an
* <code>ArrayStoreException</code> is thrown. In this case, let
* <b><i>k</i></b> be the smallest nonnegative integer less than
* length such that <code>src[srcPos+</code><i>k</i><code>]</code>
* cannot be converted to the component type of the destination
* array; when the exception is thrown, source array components from
* positions <code>srcPos</code> through
* <code>srcPos+</code><i>k</i><code>-1</code>
* will already have been copied to destination array positions
* <code>destPos</code> through
* <code>destPos+</code><i>k</I><code>-1</code> and no other
* positions of the destination array will have been modified.
* (Because of the restrictions already itemized, this
* paragraph effectively applies only to the situation where both
* arrays have component types that are reference types.)
*
* @param src the source array.
* @param srcPos starting position in the source array.
* @param dest the destination array.
* @param destPos starting position in the destination data.
* @param length the number of array elements to be copied.
* @exception IndexOutOfBoundsException if copying would cause
* access of data outside array bounds.
* @exception ArrayStoreException if an element in the <code>src</code>
* array could not be stored into the <code>dest</code> array
* because of a type mismatch.
* @exception NullPointerException if either <code>src</code> or
* <code>dest</code> is <code>null</code>.
*/
public static native void arraycopy(Object src, int srcPos,
Object dest, int destPos,
int length);
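The native keyword on this method is a fairly involved topic in its own right and is worth reading up on separately. Because the method is native, its implementation lives in the JVM rather than in Java source, so it cannot be read here. The Javadoc, however, states clearly that length is "the number of array elements to be copied". Comparing that with the arguments in add shows that the length passed to arraycopy is size - index, the number of elements that need to move. src is the source array, srcPos the starting position in the source array, dest the destination array, and destPos the starting position in the destination array. Putting this together with the arguments passed by add: the elements of elementData from index to the end of the list are copied back into elementData starting at index + 1. In other words, everything from index onward shifts one position to the right, freeing the slot at index, which is then assigned the inserted element.
Summary: insertion into an ArrayList involves shifting array elements by copying, so inserting anywhere but the end is comparatively slow.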
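The shift can be reproduced on a plain array. The following is a sketch under assumptions: ShiftInsertDemo and insert are illustrative names, and the caller is assumed to guarantee a free slot, which ensureCapacityInternal takes care of in ArrayList.

```java
import java.util.Arrays;

class ShiftInsertDemo {
    // Inserts element at index into data, which holds `size` live elements
    // and must have at least one free slot. Returns the new size.
    static int insert(String[] data, int size, int index, String element) {
        // Shift data[index .. size-1] one slot to the right, as add(int, E) does.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = element;
        return size + 1;
    }

    public static void main(String[] args) {
        String[] data = {"a", "b", "c", "d", null};
        int size = insert(data, 4, 1, "X");
        System.out.println(size);                  // 5
        System.out.println(Arrays.toString(data)); // [a, X, b, c, d]
    }
}
```

Inserting at index 0 moves every element, while inserting at index == size moves none, which is why appending is cheap and inserting near the front is not.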
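Vector (thread-safe)
- Characteristics: backed by an array; lookup is fast, insertion and removal are slow; thread-safe (methods such as insertion, growth and copying are marked synchronized), which makes it less efficient
- The lookup and insertion mechanisms are the same as in ArrayList
- A new Vector has an initial capacity of 10; the constructors are shown below:
// Constructors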
/**
* Constructs an empty vector so that its internal data array
* has size {@code 10} and its standard capacity increment is
* zero.
*/
public Vector() {
this(10);
}
/**
* Constructs an empty vector with the specified initial capacity and
* with its capacity increment equal to zero.
*
* @param initialCapacity the initial capacity of the vector
* @throws IllegalArgumentException if the specified initial capacity
* is negative
*/
public Vector(int initialCapacity) {
this(initialCapacity, 0);
}
/**
* Constructs an empty vector with the specified initial capacity and
* capacity increment.
*
* @param initialCapacity the initial capacity of the vector
* @param capacityIncrement the amount by which the capacity is
* increased when the vector overflows
* @throws IllegalArgumentException if the specified initial capacity
* is negative
*/
public Vector(int initialCapacity, int capacityIncrement) {
super();
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal Capacity: "+
initialCapacity);
this.elementData = new Object[initialCapacity];
this.capacityIncrement = capacityIncrement;
}
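As the source shows, the no-arg constructor actually delegates to the constructor that takes the default capacity and a capacity increment of 0. The growth flow is almost the same as in ArrayList: the two grow methods differ in only one statement. Here is the source of Vector's grow method: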
/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*/
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + ((capacityIncrement > 0) ?
capacityIncrement : oldCapacity);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
elementData = Arrays.copyOf(elementData, newCapacity);
}
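When newCapacity is computed, a capacity increment less than or equal to 0 means the old capacity is added to itself, so the new capacity is twice the old one; a positive increment is added instead. Everything else is the same as in ArrayList.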
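The rule can be written out as a small function. This is an illustration only: VectorGrowthDemo and nextCapacity are names made up here, and the MAX_ARRAY_SIZE check is left out.

```java
class VectorGrowthDemo {
    // Mirrors the capacity computation in Vector.grow (MAX_ARRAY_SIZE check omitted).
    static int nextCapacity(int oldCapacity, int capacityIncrement, int minCapacity) {
        int newCapacity = oldCapacity
                + ((capacityIncrement > 0) ? capacityIncrement : oldCapacity);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(nextCapacity(10, 0, 11)); // 20: no increment, so the capacity doubles
        System.out.println(nextCapacity(5, 3, 6));   // 8: the increment of 3 is added
        System.out.println(nextCapacity(5, 3, 30));  // 30: 8 would be too small
    }
}
```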
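Testing the growth mechanism
Vector's fields are protected (apart from the private serialVersionUID), so elementData cannot be read directly from unrelated code. Vector does offer a public capacity() method, but to stay consistent with the ArrayList test the capacity is read by reflection here. The code is as follows: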
import java.lang.reflect.Field;
import java.util.Vector;

/**
 * @author divergent
 */
public class VectorCapacityTest {
    public static int getVectorCapacity(Vector<String> v) throws NoSuchFieldException, IllegalAccessException {
        Field elementData = v.getClass().getDeclaredField("elementData");
        elementData.setAccessible(true);
        Object[] o = (Object[]) elementData.get(v);
        return o.length;
    }

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        // No-arg constructor
        Vector<String> v1 = new Vector<>();
        // Constructor with an initial capacity
        Vector<String> v2 = new Vector<>(5);
        // Constructor with an initial capacity and a capacity increment
        Vector<String> v3 = new Vector<>(5, 3);
        System.out.println("No-arg constructor, initial capacity v1: " + getVectorCapacity(v1));
        System.out.println("Initial capacity 5 v2: " + getVectorCapacity(v2));
        System.out.println("Initial capacity 5, increment 3 v3: " + getVectorCapacity(v3));
        for (int i = 0; i < 6; i++) {
            v1.add("sss");
            v2.add("sss");
            v3.add("sss");
        }
        System.out.println("---------- After adding 6 elements to each Vector --------------");
        System.out.println("No-arg constructor, capacity v1: " + getVectorCapacity(v1));
        System.out.println("Initial capacity 5 v2: " + getVectorCapacity(v2));
        System.out.println("Initial capacity 5, increment 3 v3: " + getVectorCapacity(v3));
    }
}
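Result:
Following grow through by hand, the capacities before the adds are 10, 5 and 5, and after six adds they are 10 (no growth needed), 10 (5 doubled) and 8 (5 + 3). So when neither an initial capacity nor an increment is given, the initial capacity is 10; when an increment is given, growth adds the increment instead of doubling the capacity.
Vector also has the second growth path, like ArrayList; no test for it was written here.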
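A test for that second path might look like the sketch below. It is not part of the original tests: VectorAddAllDemo and capacityAfterAddAll are names introduced here, and the capacity is read through the public Vector.capacity() method rather than reflection.

```java
import java.util.Collections;
import java.util.Vector;

class VectorAddAllDemo {
    // Capacity of a new Vector(initialCapacity, increment) after one addAll of `count` elements.
    static int capacityAfterAddAll(int initialCapacity, int increment, int count) {
        Vector<String> v = new Vector<>(initialCapacity, increment);
        v.addAll(Collections.nCopies(count, "sss"));
        return v.capacity(); // public accessor for the length of the backing array
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterAddAll(5, 0, 6));  // 10: doubling is enough
        System.out.println(capacityAfterAddAll(5, 0, 20)); // 20: doubling to 10 is too small
        System.out.println(capacityAfterAddAll(5, 3, 20)); // 20: 5 + 3 is too small
    }
}
```

In the last two cases the new capacity equals the required minimum, the current element count plus the number of elements added, just as in ArrayList.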
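LinkedList (not thread-safe)
- Characteristics: backed by a doubly linked list; insertion and removal are fast, lookup is slow; with no synchronization overhead it is efficient
Because LinkedList is a doubly linked list, there is no notion of capacity. It has three fields; the source is: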
transient int size = 0;
// Head node
/**
* Pointer to first node.
* Invariant: (first == null && last == null) ||
* (first.prev == null && first.item != null)
*/
transient Node<E> first;
// Tail node
/**
* Pointer to last node.
* Invariant: (first == null && last == null) ||
* (last.next == null && last.item != null)
*/
transient Node<E> last;
The Node class appears here; its source is:
private static class Node<E> {
E item;
Node<E> next;
Node<E> prev;
Node(Node<E> prev, E element, Node<E> next) {
this.item = element;
this.next = next;
this.prev = prev;
}
}
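Node has three fields: the item held by the node, the previous node, and the next node.
LinkedList also offers more add methods than ArrayList and Vector: addFirst calls linkFirst, while addLast and add both call linkLast.
The source of linkFirst and linkLast: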
/**
* Links e as first element.
*/
private void linkFirst(E e) {
final Node<E> f = first;
final Node<E> newNode = new Node<>(null, e, f);
first = newNode;
if (f == null)
last = newNode;
else
f.prev = newNode;
size++;
modCount++;
}
/**
* Links e as last element.
*/
void linkLast(E e) {
final Node<E> l = last;
final Node<E> newNode = new Node<>(l, e, null);
last = newNode;
if (l == null)
first = newNode;
else
l.next = newNode;
size++;
modCount++;
}
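As the source shows, inserting at the head or tail just creates a new node, wires its prev or next reference to the old head or tail, and updates first or last; no existing elements are moved.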
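To see the wiring in action, here is a stripped-down sketch of the same idea. MiniLinkedList is a hypothetical class written for this illustration, not the JDK's LinkedList; it keeps only the two link operations plus two traversal helpers.

```java
import java.util.ArrayList;
import java.util.List;

class MiniLinkedList<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;

    // Same steps as linkFirst: the new node's next is the old head.
    void addFirst(E e) {
        Node<E> f = first;
        Node<E> newNode = new Node<>(null, e, f);
        first = newNode;
        if (f == null) last = newNode; else f.prev = newNode;
    }

    // Same steps as linkLast: the new node's prev is the old tail.
    void addLast(E e) {
        Node<E> l = last;
        Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) first = newNode; else l.next = newNode;
    }

    // Walks the next references from the head.
    List<E> forward() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = first; x != null; x = x.next) out.add(x.item);
        return out;
    }

    // Walks the prev references from the tail.
    List<E> backward() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = last; x != null; x = x.prev) out.add(x.item);
        return out;
    }

    public static void main(String[] args) {
        MiniLinkedList<String> list = new MiniLinkedList<>();
        list.addLast("b");
        list.addLast("c");
        list.addFirst("a");
        System.out.println(list.forward());  // [a, b, c]
        System.out.println(list.backward()); // [c, b, a]
    }
}
```

Walking the list in both directions confirms that both the next and the prev references are kept consistent by each insertion.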
Lookup mechanism
The source for lookup in LinkedList:
/**
* Returns the element at the specified position in this list.
*
* @param index index of the element to return
* @return the element at the specified position in this list
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E get(int index) {
checkElementIndex(index);
return node(index).item;
}
/**
* Returns the (non-null) Node at the specified element index.
*/
Node<E> node(int index) {
// assert isElementIndex(index);
if (index < (size >> 1)) {
Node<E> x = first;
for (int i = 0; i < index; i++)
x = x.next;
return x;
} else {
Node<E> x = last;
for (int i = size - 1; i > index; i--)
x = x.prev;
return x;
}
}
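As the source shows, LinkedList lookup also checks bounds first, then calls node, which decides whether index lies in the first or second half of the list and walks node by node from the nearer end to the target position.
Summary: compared with ArrayList and Vector, the lookup cost depends on the index and grows with the distance from the nearer end, so it is slower.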
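The cost of node(index) can be counted directly from its two loops. A small sketch, with NodeStepsDemo and hops as illustrative names:

```java
class NodeStepsDemo {
    // Number of next/prev hops node(index) performs for a list of the given size.
    static int hops(int size, int index) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        } else {
            return size - 1 - index; // walk backward from last
        }
    }

    public static void main(String[] args) {
        int size = 1000;
        System.out.println(hops(size, 0));   // 0
        System.out.println(hops(size, 499)); // 499
        System.out.println(hops(size, 500)); // 499
        System.out.println(hops(size, 999)); // 0
    }
}
```

The worst case is the middle of the list, at roughly size / 2 hops, whereas ArrayList reaches any index in a single array access.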
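LinkedList also supports insertion at a given position, linking the new element in before the node currently at that index.
There is no growth mechanism; insertion and removal are fast, but lookup has to walk the nodes one by one.
Closing remarks
This article reflects conclusions the author reached by reading the source; the wording and summaries are the author's own line of thought, and no UML was used in the analysis. The author is still a junior programmer, so corrections to the summaries or the reasoning are welcome.
The article is for reference only. If you get the chance, read the source yourself, and let's improve together.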