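Preface: the code in this article is based on JDK 1.8.
Non-thread-safe Lists
- java.util.LinkedList
- java.util.ArrayList
Thread-safe Lists
- java.util.Vector
- java.util.Stack (a subclass of Vector that adds last-in-first-out stack operations such as push, pop and peek)
- java.util.Collections.SynchronizedList (a static nested class of Collections, obtained through Collections.synchronizedList)
- java.util.concurrent.CopyOnWriteArrayList
How does Vector guarantee thread safety?
Let's first look at the key code of Vector: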
public class Vector<E>
    extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    protected Object[] elementData;

    public synchronized E get(int index) {
        if (index >= elementCount)
            throw new ArrayIndexOutOfBoundsException(index);
        return elementData(index);
    }

    public synchronized E set(int index, E element) {
        if (index >= elementCount)
            throw new ArrayIndexOutOfBoundsException(index);
        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }

    public synchronized boolean add(E e) {
        modCount++;
        ensureCapacityHelper(elementCount + 1);
        elementData[elementCount++] = e;
        return true;
    }

    public synchronized E remove(int index) {
        modCount++;
        if (index >= elementCount)
            throw new ArrayIndexOutOfBoundsException(index);
        E oldValue = elementData(index);
        int numMoved = elementCount - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--elementCount] = null; // Let gc do its work
        return oldValue;
    }
}
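As we can see, Vector's approach is simple and blunt: every method is declared synchronized, so all operations on the same Vector instance lock that instance and run one at a time.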
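As a small illustration of what those synchronized methods buy us, here is a minimal sketch in which several threads append to one Vector and no update is lost. The class name VectorAddDemo and the helper concurrentAdds are made up for this example and are not part of the JDK.

```java
import java.util.List;
import java.util.Vector;

public class VectorAddDemo {
    // Hypothetical helper: starts `threads` threads that each append
    // `perThread` elements to the list, waits for them, and returns the final size.
    public static int concurrentAdds(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // Vector.add is synchronized, so appends do not overwrite each other
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(new Vector<>(), 4, 10_000)); // prints 40000
    }
}
```

Passing a plain ArrayList to the same helper may lose elements or throw, because its add method is not synchronized.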
How does Collections.SynchronizedList guarantee thread safety?
static class SynchronizedList<E>
    extends SynchronizedCollection<E>
    implements List<E> {
    final List<E> list;

    SynchronizedList(List<E> list) {
        super(list);
        this.list = list;
    }

    public E get(int index) {
        synchronized (mutex) {return list.get(index);}
    }
    public E set(int index, E element) {
        synchronized (mutex) {return list.set(index, element);}
    }
    public void add(int index, E element) {
        synchronized (mutex) {list.add(index, element);}
    }
    public E remove(int index) {
        synchronized (mutex) {return list.remove(index);}
    }
}
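As we can see, the List that needs locking is passed to the SynchronizedList constructor. SynchronizedList is a wrapper around that List: each method acquires the shared mutex and then delegates to the wrapped List, so the List should be accessed only through the methods SynchronizedList provides.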
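One practical consequence of the wrapper design is worth a sketch. Only individual method calls take the mutex, so a traversal has to be locked by the caller; the Javadoc of Collections.synchronizedList requires synchronizing on the returned list while iterating. The class SynchronizedListDemo and its sum method below are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListDemo {
    // Sums the elements while holding the wrapper's lock for the whole traversal.
    public static int sum(List<Integer> syncList) {
        int total = 0;
        synchronized (syncList) { // the wrapper returned by synchronizedList locks on itself
            for (int value : syncList) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1); // each call acquires the mutex, then delegates to the ArrayList
        list.add(2);
        list.add(3);
        System.out.println(sum(list)); // prints 6
    }
}
```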
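Why do the get methods of Vector and Collections.SynchronizedList need a lock?
My personal understanding is that synchronizing get guarantees ordering and real-time consistency: a thread that reads is guaranteed to see everything other threads wrote before they released the lock. In addition, the backing arrays used by Vector and Collections.SynchronizedList are not declared volatile, so without the lock there would be no visibility guarantee either.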
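CopyOnWriteArrayList
Introduction
CopyOnWriteArrayList is a replacement for synchronized Lists. In some situations it offers better concurrent performance, and iterating over it requires neither locking nor copying the container.
The thread safety of a copy-on-write container rests on the fact that once an effectively immutable object has been properly published, no further synchronization is needed to access it. Mutability is achieved by creating and republishing a new copy of the container on every modification.
Obviously, though, copying the underlying array on every modification has a cost, especially when the container is large.
Copy-on-write containers should be used only when iteration is far more frequent than modification.
In the CopyOnWriteArrayList source we can see that it maintains an Object array named array to store the elements, that the array is accessed through getArray and setArray, and that it is initialized with length 0 by default.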
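To make the idea concrete before reading the real source, here is a stripped-down sketch of a copy-on-write list. TinyCowList is a hypothetical class written for this article, not the JDK implementation: it supports only add, get and size, and it assumes a volatile array field plus a ReentrantLock for writers.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

public class TinyCowList<E> {
    private final ReentrantLock lock = new ReentrantLock();
    // Published arrays are never modified afterwards; volatile makes each new array visible to readers.
    private volatile Object[] array = new Object[0];

    public boolean add(E e) {
        lock.lock(); // writers are serialized
        try {
            Object[] old = array;
            Object[] copy = Arrays.copyOf(old, old.length + 1); // copy instead of mutating in place
            copy[old.length] = e;
            array = copy; // publish the new array with a volatile write
            return true;
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        return (E) array[index]; // lock-free read of whichever array is current
    }

    public int size() {
        return array.length;
    }

    // Exposes the current backing array so the example can show that old arrays stay untouched.
    public Object[] snapshot() {
        return array;
    }

    public static void main(String[] args) {
        TinyCowList<String> list = new TinyCowList<>();
        list.add("a");
        Object[] before = list.snapshot();
        list.add("b");
        System.out.println(before.length + " " + list.size()); // prints "1 2"
    }
}
```

A reader that grabbed the array before the second add keeps seeing a one-element array, which is the property the real iterator relies on.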
Main methods
Adding an element
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock(); // acquire the lock
    try {
        Object[] elements = getArray(); // get the old array
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1); // copy into a new array that is one slot longer
        newElements[len] = e;
        setArray(newElements); // replace the old array with the new one
        return true;
    } finally {
        lock.unlock();
    }
}
As we can see, adding an element takes the lock, copies the old array, appends the element to the copy, and then writes the copy back to the array field, replacing the old array.
Removing an element
public E remove(int index) {
    final ReentrantLock lock = this.lock;
    lock.lock(); // acquire the lock
    try {
        Object[] elements = getArray();
        int len = elements.length;
        E oldValue = get(elements, index);
        int numMoved = len - index - 1;
        // If the last element is removed, copy everything except it;
        // otherwise copy the parts before and after index separately
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        else {
            Object[] newElements = new Object[len - 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index + 1, newElements, index,
                             numMoved);
            setArray(newElements);
        }
        return oldValue; // return the removed element
    } finally {
        lock.unlock();
    }
}
Updating an element
public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        E oldValue = get(elements, index);
        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}
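Getting an element
public E get(int index) {
    return get(getArray(), index);
}
As we can see, reads take no lock: get simply reads the current array through getArray. The array field is declared volatile, so a reader sees the most recently published array.
Iterating over the list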
public Iterator<E> iterator() {
    return new COWIterator<E>(getArray(), 0);
}

static final class COWIterator<E> implements ListIterator<E> {
    private final Object[] snapshot;
    private int cursor;

    private COWIterator(Object[] elements, int initialCursor) {
        cursor = initialCursor;
        snapshot = elements;
    }

    public boolean hasNext() {
        return cursor < snapshot.length;
    }

    public boolean hasPrevious() {
        return cursor > 0;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        if (! hasNext())
            throw new NoSuchElementException();
        return (E) snapshot[cursor++];
    }

    @SuppressWarnings("unchecked")
    public E previous() {
        if (! hasPrevious())
            throw new NoSuchElementException();
        return (E) snapshot[--cursor];
    }

    public int nextIndex() {
        return cursor;
    }

    public int previousIndex() {
        return cursor-1;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    public void set(E e) {
        throw new UnsupportedOperationException();
    }

    public void add(E e) {
        throw new UnsupportedOperationException();
    }

    @Override
    public void forEachRemaining(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        Object[] elements = snapshot;
        final int size = elements.length;
        for (int i = cursor; i < size; i++) {
            @SuppressWarnings("unchecked") E e = (E) elements[i];
            action.accept(e);
        }
        cursor = size;
    }
}
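As we can see, iterating over a CopyOnWriteArrayList works on a snapshot reference to the array. The iterator itself cannot modify the list (its remove, set and add methods throw UnsupportedOperationException). Modifications made to the list by other threads do not affect an iteration in progress and are invisible to it, so the iterator never throws ConcurrentModificationException and the fail-fast problem does not arise.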
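The snapshot behaviour is easy to observe with a small experiment. The sketch below, with the made-up class name SnapshotIterationDemo, appends to the list from inside a for-each loop over it and then checks that the iterator's remove method is rejected.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIterationDemo {
    // Appends one element per visited element and returns how many elements the loop visited.
    public static int countVisitedWhileAppending(List<Integer> list) {
        int visited = 0;
        for (Integer value : list) { // the iterator holds the array as it was when the loop began
            list.add(value + 10);    // no ConcurrentModificationException with CopyOnWriteArrayList
            visited++;
        }
        return visited;
    }

    // Returns true if removing through the iterator throws UnsupportedOperationException.
    public static boolean iteratorRemoveIsUnsupported(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(countVisitedWhileAppending(list)); // prints 3
        System.out.println(list.size());                      // prints 6
        System.out.println(iteratorRemoveIsUnsupported(list)); // prints true
    }
}
```

The loop visits only the three elements present when it started, even though the list has grown to six by the time it ends.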
Performance test
Below we use JMH to benchmark Vector, Collections.SynchronizedList and CopyOnWriteArrayList.
Write operations
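import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@Fork(1)
@Threads(1)
@State(value = Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Measurement(iterations = 5, time = 1)
public class ListWriteTest {
    private static final int SIZE = 100_000;
    private static final int THREAD_SIZE = 4;

    private void testListWrite(List<Integer> list) {
        // Each worker thread appends SIZE elements to the shared list
        Runnable runnable = () -> {
            for (int i = 0; i < SIZE; ++i) {
                list.add(i);
            }
        };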
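        // IntStream.range(0, THREAD_SIZE) yields THREAD_SIZE values, one thread per value
        // (IntStream.of(THREAD_SIZE) would yield a single value and therefore start only one thread)
        List<Thread> threadList = IntStream.range(0, THREAD_SIZE).mapToObj(num -> new Thread(runnable)).collect(Collectors.toList());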
        threadList.forEach(Thread::start);
        for (Thread thread : threadList) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    @Benchmark
    public void testVectorWrite() {
        testListWrite(new Vector<>());
    }

    @Benchmark
    public void testSynchronizedWrite() {
        testListWrite(Collections.synchronizedList(new ArrayList<>()));
    }

    @Benchmark
    public void testCopyOnWriteArrayListWrite() {
        testListWrite(new CopyOnWriteArrayList<>());
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(ListWriteTest.class.getSimpleName())
                .result("result.json")
                .resultFormat(ResultFormatType.JSON).build();
        new Runner(opt).run();
    }
}
The results are as follows. CopyOnWriteArrayList is more than 2,000 times slower than Vector and more than 1,600 times slower than Collections.SynchronizedList.
Write performance: Vector > Collections.SynchronizedList > CopyOnWriteArrayList
Benchmark                                    Mode  Cnt          Score          Error  Units
ListWriteTest.testCopyOnWriteArrayListWrite  avgt    5  717519212.500 ± 41981491.296  ns/op
ListWriteTest.testSynchronizedWrite          avgt    5     437740.739 ±    29130.674  ns/op
ListWriteTest.testVectorWrite                avgt    5     353874.172 ±     6534.240  ns/op
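A rough cost model, offered as a back-of-the-envelope estimate rather than a measurement, helps explain the size of this gap. Assume a single writer appends N elements to an initially empty CopyOnWriteArrayList and that array copying dominates the cost. The k-th add then copies the k-1 element references already in the list:

```latex
\[
\sum_{k=1}^{N} (k-1) = \frac{N(N-1)}{2},
\qquad
N = 10^{5} \;\Rightarrow\; \frac{10^{5}\,(10^{5}-1)}{2} = 4\,999\,950\,000 \approx 5 \times 10^{9}
\]
```

Vector and ArrayList grow their backing arrays geometrically, so the same N appends copy only O(N) references in total. The quadratic term is what separates the two.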
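Read operations
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@Fork(1)
@Threads(1)
@State(value = Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Measurement(iterations = 5, time = 1)
public class ListReadTest {
    private static final int SIZE = 100_000;
    private static final int THREAD_SIZE = 4;
    private static final List<Integer> vector = new Vector<>();
    private static final List<Integer> synchronizedList = Collections.synchronizedList(new ArrayList<>());
    private static final List<Integer> cowList = new CopyOnWriteArrayList<>();

    public ListReadTest() {
        // Pre-fill each list with 100 elements to read from
        for (int i = 0; i < 100; ++i) {
            vector.add(i);
            synchronizedList.add(i);
            cowList.add(i);
        }
    }

    private void testListRead(List<Integer> list) {
        // Each worker thread performs SIZE reads at random indexes
        Runnable runnable = () -> {
            ThreadLocalRandom current = ThreadLocalRandom.current();
            for (int i = 0; i < SIZE; ++i) {
                list.get(current.nextInt(0, 100));
            }
        };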
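        // IntStream.range(0, THREAD_SIZE) yields THREAD_SIZE values, one thread per value
        // (IntStream.of(THREAD_SIZE) would yield a single value and therefore start only one thread)
        List<Thread> threadList = IntStream.range(0, THREAD_SIZE).mapToObj(num -> new Thread(runnable)).collect(Collectors.toList());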
        threadList.forEach(Thread::start);
        for (Thread thread : threadList) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    @Benchmark
    public void testVectorRead() {
        testListRead(vector);
    }

    @Benchmark
    public void testSynchronizedRead() {
        testListRead(synchronizedList);
    }

    @Benchmark
    public void testCopyOnWriteArrayListRead() {
        testListRead(cowList);
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(ListReadTest.class.getSimpleName())
                .result("result.json")
                .resultFormat(ResultFormatType.JSON).build();
        new Runner(opt).run();
    }
}
As we can see, CopyOnWriteArrayList is about 3.8 times as fast as Collections.SynchronizedList and about 3.4 times as fast as Vector.
Read performance: CopyOnWriteArrayList > Vector > Collections.SynchronizedList
Benchmark                                  Mode  Cnt        Score       Error  Units
ListReadTest.testCopyOnWriteArrayListRead  avgt    5   285867.076 ± 21779.424  ns/op
ListReadTest.testSynchronizedRead          avgt    5  1086997.915 ± 19161.630  ns/op
ListReadTest.testVectorRead                avgt    5   967604.177 ± 11735.727  ns/op
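Summary:
CopyOnWriteArrayList is built on copy-on-write: reads are lock-free and writes take a lock, which reflects the idea of separating reads from writes. It cannot provide real-time consistency, however. When reads far outnumber writes, consider using CopyOnWriteArrayList in place of a synchronized List.
Advantages
Copy-on-write works well for read-mostly data that changes very rarely, such as configuration or blacklists and whitelists. Reads are lock-free, which helps a program achieve higher concurrency.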
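As a sketch of the blacklist case, the class below keeps blocked IP addresses in a CopyOnWriteArrayList. IpBlacklist and its method names are hypothetical, chosen for illustration. Lookups happen on every request and take no lock, while the rare updates pay the copying cost.

```java
import java.util.concurrent.CopyOnWriteArrayList;

public class IpBlacklist {
    private final CopyOnWriteArrayList<String> blocked = new CopyOnWriteArrayList<>();

    // Rare write path: copies the array; returns false if the address was already blocked.
    public boolean block(String ip) {
        return blocked.addIfAbsent(ip);
    }

    // Rare write path: copies the array without the address, if present.
    public boolean unblock(String ip) {
        return blocked.remove(ip);
    }

    // Hot read path: scans the current array without taking a lock.
    public boolean isBlocked(String ip) {
        return blocked.contains(ip);
    }

    public static void main(String[] args) {
        IpBlacklist blacklist = new IpBlacklist();
        blacklist.block("10.0.0.1");
        System.out.println(blacklist.isBlocked("10.0.0.1")); // prints true
        System.out.println(blacklist.isBlocked("10.0.0.2")); // prints false
    }
}
```

The lookup is a linear scan, so this sketch suits short lists. For large sets a different read-mostly structure may fit better.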
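Disadvantages
Data consistency: CopyOnWriteArrayList provides only eventual consistency and cannot guarantee real-time consistency.
Write performance: in the write benchmark above, CopyOnWriteArrayList was roughly 1,600 to 2,000 times slower than the lock-based Lists, because every write copies the whole array. Frequent writes also allocate and discard many arrays, which makes the JVM garbage-collect more often.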