Collection
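As we know, Java collections fall into two broad kinds, ordered (List) and unordered (Set), and List has three commonly used implementations.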
Collection
├List
│├LinkedList
│├ArrayList
│└Vector
│ └Stack
└Set
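List implementations
ArrayList, Vector and LinkedList all implement the List interface, but they differ in several ways.
Storage
In terms of storage, ArrayList and Vector keep their data in an array whose length is usually larger than the number of elements actually stored, which leaves room for growth and insertion. Both therefore support direct access by index, but inserting an element anywhere other than the end means shifting the elements after it, so indexed reads are fast and inserts are slow. The relevant source: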
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // ....
    transient Object[] elementData; // non-private to simplify nested class access
    // ....

public class Vector<E>
    extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // ....
    protected Object[] elementData;

    /**
     * The number of valid components in this {@code Vector} object.
     * Components {@code elementData[0]} through
     * {@code elementData[elementCount-1]} are the actual items.
     *
     * @serial
     */
    protected int elementCount;
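LinkedList, by contrast, stores its data in a doubly linked list. Accessing an element by index requires walking the list forward from the head or backward from the tail, but inserting an element only requires updating the links of the neighboring nodes, so insertion is fast once the position has been reached. The relevant source: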
public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    /**
     * Pointer to first node.
     * Invariant: (first == null && last == null) ||
     *            (first.prev == null && first.item != null)
     */
    transient Node<E> first;

    /**
     * Pointer to last node.
     * Invariant: (first == null && last == null) ||
     *            (last.next == null && last.item != null)
     */
    transient Node<E> last;

    /**
     * Constructs an empty list.
     */
    public LinkedList() {
    }
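To make the trade-off concrete, here is a minimal sketch of a doubly linked list. TinyLinkedList is a hypothetical teaching class, not the JDK implementation; it only illustrates that adding at either end relinks a couple of nodes, while get(index) has to walk the chain from the nearer end.

```java
// Hypothetical teaching class; it is not the JDK's LinkedList.
class TinyLinkedList<E> {
    private static final class Node<E> {
        E item;
        Node<E> prev;
        Node<E> next;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    // Append at the tail: only the old tail and the new node are touched.
    void addLast(E e) {
        Node<E> l = last;
        Node<E> n = new Node<>(l, e, null);
        last = n;
        if (l == null) {
            first = n;
        } else {
            l.next = n;
        }
        size++;
    }

    // Insert at the head: no elements are shifted, unlike in an array.
    void addFirst(E e) {
        Node<E> f = first;
        Node<E> n = new Node<>(null, e, f);
        first = n;
        if (f == null) {
            last = n;
        } else {
            f.prev = n;
        }
        size++;
    }

    // Indexed access walks the chain from whichever end is closer.
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node<E> x;
        if (index < (size >> 1)) {
            x = first;
            for (int i = 0; i < index; i++) {
                x = x.next;
            }
        } else {
            x = last;
            for (int i = size - 1; i > index; i--) {
                x = x.prev;
            }
        }
        return x.item;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.addLast("b");
        list.addLast("c");
        list.addFirst("a");
        System.out.println(list.get(0) + list.get(1) + list.get(2)); // prints "abc"
    }
}
```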
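Synchronization
Vector's methods are synchronized (that is, thread-safe), so it performs slightly worse than ArrayList.
Like ArrayList, LinkedList has no synchronized methods, so synchronization must be added when several threads access a LinkedList. One option is to wrap the list in a thread-safe view when creating it: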
List list = Collections.synchronizedList(new LinkedList(...));
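As a rough illustration of what the wrapper buys you, the sketch below lets two threads append to the same wrapped LinkedList and checks that no element is lost. The class and method names (SyncListDemo, fillFromTwoThreads, run) are made up for this example.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class SyncListDemo {
    // Adds perThread elements from each of two threads and returns the final size.
    static int fillFromTwoThreads(List<Integer> list, int perThread) {
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(i); // each add() call is synchronized by the wrapper
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    // Convenience entry point: wraps a fresh LinkedList and fills it.
    static int run(int perThread) {
        List<Integer> safe = Collections.synchronizedList(new LinkedList<Integer>());
        return fillFromTwoThreads(safe, perThread);
    }

    public static void main(String[] args) {
        System.out.println(run(10_000)); // prints 20000
    }
}
```

Passing a bare LinkedList to fillFromTwoThreads instead may lose elements or corrupt the list, which is exactly the problem the wrapper avoids.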
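null elements
Because a List is an ordered sequence that permits duplicates, all three implementations can store null, even more than once.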
@Test
public void testCollection() {
    ArrayList<String> al = new ArrayList<>();
    al.add(null);
    al.add(null);
    LinkedList<String> ll = new LinkedList<>();
    ll.add(null);
    ll.add(null);
    Vector<String> v = new Vector<>();
    v.add(null);
    v.add(null);
    System.out.println("ArrayList " + al.size());
    System.out.println("LinkedList " + ll.size());
    System.out.println("Vector " + v.size());
}
Output:
ArrayList 2
LinkedList 2
Vector 2
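Capacity growth
With collections, capacity is the detail most easily overlooked. The source below is from JDK 1.8, starting with ArrayList.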
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity); // default capacity is 10
    }
    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++; // counts structural modifications; used by fail-fast iterators
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) { // where the actual growth happens
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1); // oldCapacity >> 1 is half the old capacity, so the new capacity is about 1.5x
    if (newCapacity - minCapacity < 0) // check whether the new capacity is sufficient
        newCapacity = minCapacity; // if not, use the requested minimum instead
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity); // copy the old elements into a new array and point elementData at it
}
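The arithmetic in grow() is easy to replay by hand. The sketch below mirrors only the 1.5x step and the "use the minimum if that is not enough" fallback; it leaves out the MAX_ARRAY_SIZE branch and uses Math.max instead of the overflow-conscious subtraction, so it is a simplification rather than the JDK code. ArrayListGrowthSketch and nextCapacity are names invented here.

```java
class ArrayListGrowthSketch {
    // Simplified version of the capacity step in ArrayList.grow() (JDK 1.8).
    // Assumption: values are small enough that int overflow is not a concern.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    public static void main(String[] args) {
        int capacity = 10;
        StringBuilder trace = new StringBuilder("10");
        for (int i = 0; i < 4; i++) {
            // Each step asks for room for one more element than currently fits.
            capacity = nextCapacity(capacity, capacity + 1);
            trace.append(" -> ").append(capacity);
        }
        System.out.println(trace); // prints 10 -> 15 -> 22 -> 33 -> 49
    }
}
```

Because of the integer shift, the factor is only approximately 1.5: 15 grows to 22, not 22.5.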
Vector's growth adds one more ingredient, the capacity increment (capacityIncrement).
Its default capacity is also 10.
public Vector(int initialCapacity) {
    this(initialCapacity, 0);
}

/**
 * Constructs an empty vector so that its internal data array
 * has size {@code 10} and its standard capacity increment is
 * zero.
 */
public Vector() {
    this(10);
}
The growth algorithm:
public synchronized void ensureCapacity(int minCapacity) { // synchronized makes this method thread-safe
    if (minCapacity > 0) {
        modCount++;
        ensureCapacityHelper(minCapacity);
    }
}

/**
 * This implements the unsynchronized semantics of ensureCapacity.
 * Synchronized methods in this class can internally call this
 * method for ensuring capacity without incurring the cost of an
 * extra synchronization.
 *
 * @see #ensureCapacity(int)
 */
private void ensureCapacityHelper(int minCapacity) {
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

private void grow(int minCapacity) { // where the actual growth happens
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + ((capacityIncrement > 0) ?
                                     capacityIncrement : oldCapacity);
    // The key part:
    // if capacityIncrement > 0, new capacity = old capacity + capacityIncrement
    // otherwise, new capacity = old capacity * 2
    if (newCapacity - minCapacity < 0) // likewise, if the new capacity is not enough, use the requested minimum
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    elementData = Arrays.copyOf(elementData, newCapacity);
}
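Unlike ArrayList, Vector exposes its current capacity through the public capacity() method, so both growth rules can be observed directly. The following is a small sketch; VectorGrowthDemo and its helper methods are names chosen for this example.

```java
import java.util.Vector;

class VectorGrowthDemo {
    // Adds n elements to the vector and reports the resulting capacity.
    static int capacityAfter(Vector<Integer> v, int n) {
        for (int i = 0; i < n; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    // Default constructor: capacity 10, capacityIncrement 0, so growth doubles.
    static int defaultGrowth() {
        return capacityAfter(new Vector<Integer>(), 11);
    }

    // Initial capacity 10, capacityIncrement 5, so growth adds 5 at a time.
    static int incrementGrowth() {
        return capacityAfter(new Vector<Integer>(10, 5), 11);
    }

    public static void main(String[] args) {
        System.out.println(defaultGrowth());   // prints 20
        System.out.println(incrementGrowth()); // prints 15
    }
}
```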
LinkedList is a doubly linked list, so in theory it has no capacity limit. In practice it is still bounded by the available heap memory.
Collections.synchronizedList
ArrayList
As mentioned above, the static method Collections.synchronizedList() produces a thread-safe list, for example:
List<String> alist = Collections.synchronizedList(new ArrayList<String>());
But which object does the returned list lock on? The method's source is:
public static <T> List<T> synchronizedList(List<T> list) {
    return (list instanceof RandomAccess ?
            new SynchronizedRandomAccessList<>(list) :
            new SynchronizedList<>(list));
}
The ArrayList source shows that it implements the RandomAccess interface.
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // ....
}
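So synchronizedList() returns an instance of SynchronizedRandomAccessList. The source below shows that this class calls its superclass constructor and that its synchronized methods lock on the mutex field.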
static class SynchronizedRandomAccessList<E>
    extends SynchronizedList<E>
    implements RandomAccess {

    SynchronizedRandomAccessList(List<E> list) {
        super(list);
    }

    SynchronizedRandomAccessList(List<E> list, Object mutex) {
        super(list, mutex);
    }

    public List<E> subList(int fromIndex, int toIndex) {
        synchronized (mutex) {
            return new SynchronizedRandomAccessList<>(
                list.subList(fromIndex, toIndex), mutex);
        }
    }
    // ...
}
That field is declared in the ancestor class SynchronizedCollection:
static class SynchronizedCollection<E> implements Collection<E>, Serializable {
    // ...
    final Collection<E> c;  // Backing Collection
    final Object mutex;     // Object on which to synchronize

    SynchronizedCollection(Collection<E> c) {
        this.c = Objects.requireNonNull(c);
        mutex = this;
    }
}
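This shows that mutex is simply the current instance. By inheritance, the mutex of a SynchronizedRandomAccessList is that SynchronizedRandomAccessList's own this reference, so every method call on the object returned below is thread-safe.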
Collections.synchronizedList(new ArrayList<String>());
LinkedList
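Looking back at the code above, the same holds for LinkedList. It does not implement RandomAccess, so synchronizedList() returns a plain SynchronizedList, which locks on the same mutex field.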
List<String> llist = Collections.synchronizedList(new LinkedList<String>());
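One practical consequence of mutex being the wrapper itself: the wrapper synchronizes individual method calls, but a traversal spans many calls, and the Collections.synchronizedList documentation asks callers to synchronize on the returned list manually while iterating. A minimal sketch, with SyncIterationDemo, sum and demo as made-up names:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncIterationDemo {
    // Sums the list while holding the wrapper's own lock, so no other thread
    // can modify it through the wrapper during the traversal.
    static int sum(List<Integer> syncList) {
        int total = 0;
        synchronized (syncList) { // same object the wrapper uses as mutex
            for (int value : syncList) {
                total += value;
            }
        }
        return total;
    }

    static int demo() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        list.add(1);
        list.add(2);
        list.add(3);
        return sum(list);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6
    }
}
```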
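Traversal
The Alibaba Java Development Manual (《阿里巴巴Java开发手册》) contains this rule:
Do not remove or add elements inside a foreach loop. To remove elements, use an Iterator; under concurrent access, lock the Iterator object.
Consider the following code: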
public void traverse() {
    List<String> a = new ArrayList<>();
    a.add("1");
    a.add("2");
    for (String temp : a) {
        if ("2".equals(temp)) {
            a.remove(temp); // this removal leads to a java.util.ConcurrentModificationException
        }
    }
}
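The cause is the iterator's fail-fast check, which interested readers can explore on their own. By contrast, the following code does not throw.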
public void traverse2() {
    List<String> a = new ArrayList<>();
    a.add("1");
    a.add("2");
    Iterator<String> it = a.iterator();
    while (it.hasNext()) {
        String temp = it.next();
        if ("2".equals(temp)) {
            it.remove();
        }
    }
    System.out.println(a.size()); // size is 1
}
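The two behaviors can be checked side by side, along with a third option that is not part of the rule quoted above: since Java 8, Collection.removeIf removes matching elements without an explicit Iterator. The sketch below uses made-up names (RemoveDuringTraversalDemo, forEachRemoveThrows, removeWithRemoveIf). It also shows that the foreach version happens not to throw when the removed element is the second-to-last one, which is one more reason not to rely on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class RemoveDuringTraversalDemo {
    // Returns true if removing target inside a for-each loop throws
    // ConcurrentModificationException on a copy of source.
    static boolean forEachRemoveThrows(List<String> source, String target) {
        List<String> a = new ArrayList<>(source);
        try {
            for (String temp : a) {
                if (target.equals(temp)) {
                    a.remove(temp);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Java 8 alternative: let the list remove matching elements itself.
    static List<String> removeWithRemoveIf(List<String> source, String target) {
        List<String> a = new ArrayList<>(source);
        a.removeIf(target::equals);
        return a;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("1", "2");
        System.out.println(forEachRemoveThrows(data, "2")); // prints true
        System.out.println(forEachRemoveThrows(data, "1")); // prints false
        System.out.println(removeWithRemoveIf(data, "2"));  // prints [1]
    }
}
```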