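Data inconsistency when using ThreadPoolExecutor together with ThreadLocal
A while ago I wrote some test code and found that a value stored in a ThreadLocal came back inconsistent. I had used ThreadLocal many times before without ever seeing this, so I was puzzled and dug into the source code for this case.
Unit test code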
/**
 * <p>
 * Tests whether using ThreadLocal together with ThreadPoolExecutor can yield inconsistent data.
 * </p>
 *
 * @author sunla
 * @version 1.0
 */
public class ThreadLocalTest {
    private static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(2, 2, 1000, TimeUnit.MILLISECONDS,
            new LinkedBlockingDeque<Runnable>(20), new ThreadFactoryBuilder().setNameFormat("name-%s").build(),
            new ThreadPoolExecutor.AbortPolicy());
    private static final ThreadLocal<String> LOCAL = new ThreadLocal<String>();

    @Test
    public void startTest() throws Exception {
        // Stored on the thread that runs the test
        LOCAL.set("main start");
        EXECUTOR.execute(() -> {
            // Read on a thread owned by the pool
            System.out.println(
                    String.format("value is %s", LOCAL.get()));
            try {
                TimeUnit.MILLISECONDS.sleep(1000);
            }
            catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
        // Read again on the thread that runs the test
        System.out.println(
                String.format("value is %s", LOCAL.get()));
    }
}
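Output
value is null
value is main start
This looks odd (the order of the two lines may vary, since the task runs asynchronously). We usually think of ThreadLocal as a way to share data within a session, so why do the two reads disagree here? The answer lies in how ThreadLocal is implemented internally.
ThreadLocal internals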
/**
 * Returns the value in the current thread's copy of this
 * thread-local variable. If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // ThreadLocalMap is a map implemented inside ThreadLocal.
        // Its distinguishing feature is that the key is a reference to the ThreadLocal.
        // Why it is designed this way is covered further down.
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

/**
 * Variant of set() to establish initialValue. Used instead
 * of set() in case user has overridden the set() method.
 *
 * @return the initial value
 */
private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}

/** Every thread has its own independent ThreadLocalMap. */
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}
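To make the get() / setInitialValue() path above concrete, here is a minimal sketch. The class name InitialValueDemo and the string values are made up for illustration; ThreadLocal.withInitial (Java 8+) is used here as one way to supply initialValue().

```java
class InitialValueDemo {
    // withInitial supplies the value that initialValue() returns for this ThreadLocal
    private static final ThreadLocal<String> LOCAL = ThreadLocal.withInitial(() -> "default");

    /** Returns what get() yields before set(), after set(), and after remove(). */
    static String[] run() {
        LOCAL.remove();                  // start from a clean state on the calling thread
        String beforeSet = LOCAL.get();  // no entry yet, so get() falls through to setInitialValue()
        LOCAL.set("explicit");
        String afterSet = LOCAL.get();   // entry found in the current thread's map
        LOCAL.remove();
        String afterRemove = LOCAL.get(); // entry gone, so the initial value is computed again
        LOCAL.remove();
        return new String[] { beforeSet, afterSet, afterRemove };
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

With a plain new ThreadLocal<>() the initial value is null, which is what the pool thread in the test above gets back.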
ThreadLocalMap implementation
static class ThreadLocalMap {
    /**
     * The entries in this hash map extend WeakReference, using
     * its main ref field as the key (which is always a
     * ThreadLocal object). Note that null keys (i.e. entry.get()
     * == null) mean that the key is no longer referenced, so the
     * entry can be expunged from table. Such entries are referred to
     * as "stale entries" in the code that follows.
     */
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    // ... (remaining code omitted)
}
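From the ThreadLocal source we can see that every Thread holds its own ThreadLocalMap (the threadLocals field). The key of each entry is the ThreadLocal instance itself, held through a weak reference, and the value is the data stored for that thread. A lookup therefore always goes through the map of the thread that calls get().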
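A small sketch of that per-thread lookup, assuming nothing beyond the JDK. The class and method names (PerThreadCopyDemo, readFromOtherThread) are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class PerThreadCopyDemo {
    private static final ThreadLocal<String> LOCAL = new ThreadLocal<>();

    /** Sets a value on the calling thread, then reads LOCAL from a brand-new thread. */
    static String readFromOtherThread(String valueForCaller) {
        LOCAL.set(valueForCaller);
        AtomicReference<String> seen = new AtomicReference<>("unset");
        // The new thread has its own, still empty, ThreadLocalMap
        Thread other = new Thread(() -> seen.set(LOCAL.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    /** Reads LOCAL on the calling thread. */
    static String readFromCaller() {
        return LOCAL.get();
    }

    public static void main(String[] args) {
        System.out.println("other thread sees: " + readFromOtherThread("caller value"));
        System.out.println("caller sees: " + readFromCaller());
    }
}
```

The same ThreadLocal object is used in both reads; only the thread, and therefore the map being searched, differs.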
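So were the two reads in the test above perhaps made on different threads?
Anyone who has read the ThreadPoolExecutor implementation knows that they really were.
ThreadPoolExecutor implementation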
/**
 * Adds a worker thread.
 */
private boolean addWorker(Runnable firstTask, boolean core) {
    // ... (omitted)
    w = new Worker(firstTask);
    final Thread t = w.thread;
    // ... (omitted)
}

/** Worker class; implements the Runnable interface. */
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    Worker(Runnable firstTask) {
        setState(-1);
        this.firstTask = firstTask;
        // The pool asks its ThreadFactory for a new thread to run this worker
        this.thread = getThreadFactory().newThread(this);
    }

    /** Implements run(); the actual work is done by runWorker. */
    public void run() {
        runWorker(this);
    }
}
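Now the cause is clear: the thread pool creates its own threads to execute tasks. A value put into the ThreadLocal on the main thread is stored in the main thread's ThreadLocalMap, whereas the task calls get() on a worker thread created by the pool, whose map has no entry for it. That is why the two reads disagree.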
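One possible way around this, sketched under my own assumptions rather than taken from the original test, is to read the value on the submitting thread and hand it to the task explicitly. PassValueToPoolDemo and its run method are hypothetical names; the pool here is a plain Executors.newFixedThreadPool instead of the hand-built executor above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PassValueToPoolDemo {
    private static final ThreadLocal<String> LOCAL = new ThreadLocal<>();

    /** Returns { what a task reads from LOCAL directly, what a task gets from a captured copy }. */
    static String[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            LOCAL.set("main start");

            // Evaluated on the pool thread: looks in the worker's own map and finds nothing
            Callable<String> readDirectly = () -> LOCAL.get();
            Future<String> direct = pool.submit(readDirectly);

            // Evaluated on the submitting thread, then captured by the lambda
            final String captured = LOCAL.get();
            Callable<String> readCaptured = () -> captured;
            Future<String> passed = pool.submit(readCaptured);

            return new String[] { direct.get(), passed.get() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            LOCAL.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("read inside task: " + result[0]);
        System.out.println("captured before submit: " + result[1]);
    }
}
```

Capturing the value keeps the task independent of which thread happens to run it, at the cost of passing the data around by hand.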
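### A question: ThreadLocal has a remove method.
What happens if we don't call it explicitly?
Part 2: ThreadLocal memory leaks
While reading the source above, we saw that Entry, the inner class of ThreadLocalMap, extends WeakReference: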
static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}
The constructor passes the ThreadLocal key to the superclass constructor:
public class WeakReference<T> extends Reference<T> {
    /**
     * Creates a new weak reference that refers to the given object. The new
     * reference is not registered with any queue.
     *
     * @param referent object the new weak reference will refer to
     */
    public WeakReference(T referent) {
        super(referent);
    }

    /**
     * Creates a new weak reference that refers to the given object and is
     * registered with the given queue.
     *
     * @param referent object the new weak reference will refer to
     * @param q the queue with which the reference is to be registered,
     *          or <tt>null</tt> if registration is not required
     */
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}
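The net effect is the same as creating a weak reference (the key) that points to the ThreadLocal:
WeakReference<ThreadLocal<?>> key = new WeakReference<ThreadLocal<?>>(new ThreadLocal<Object>());
So what is a weak reference for?
Java has four kinds of references, all of which affect garbage collection: strong, soft, weak, and phantom. The three relevant here are:
strong reference
Student student = new Student();
The object is reclaimed only once it is unreachable from every GC root and a GC runs.
weak reference
Student student = new Student();
WeakReference<Student> weak = new WeakReference<>(student);
Student stu = weak.get();
Once no strong reference points to the Student object (here, once student and stu no longer refer to it) and only the weak reference remains, the object becomes eligible for reclamation at the next GC.
soft reference
Student student = new Student();
SoftReference<Student> soft = new SoftReference<>(student);
Student stu = soft.get();
Once no strong reference points to the Student object and only the soft reference remains, it is reclaimed only when a GC runs and memory is running low.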
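The weak-reference rule can be observed with a small experiment. This is only a sketch: WeakReferenceDemo is a made-up class, and System.gc() is merely a request, so the second method reports what happened on a given run and promises nothing.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    /** While a strong reference exists, the weak reference must still return the object. */
    static boolean survivesWhileStronglyReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint to the JVM
        // 'strong' is still used here, so the object cannot have been collected
        return weak.get() == strong;
    }

    /** With no strong reference left, the referent is usually cleared after a GC (not guaranteed). */
    static boolean clearedAfterStrongReferenceDropped() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        for (int i = 0; i < 50 && weak.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("alive while strongly reachable: " + survivesWhileStronglyReachable());
        System.out.println("cleared once only weakly reachable: " + clearedAfterStrongReferenceDropped());
    }
}
```

The first result follows from the reachability rules; the second depends on the collector actually running, which is why it is bounded by a retry loop.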
Back to the main question: does skipping an explicit remove cause a memory leak?
Consider this code:
ThreadLocal<String> threadLocal = new ThreadLocal<String>();
threadLocal.set("haha");
threadLocal = null;
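Applying what we know about weak references, consider two hypothetical cases.
1. The key of a ThreadLocalMap entry is a strong reference
Then, after the code explicitly sets
threadLocal = null;
will the ThreadLocal object be reclaimed (while the current thread is still running)? The answer is NO,
because the entry array inside the ThreadLocalMap still holds a strong reference to it.
2. The key of a ThreadLocalMap entry is a weak reference
Will it be reclaimed now (while the current thread is still running)? The answer is YES.
Does that mean there is no memory leak? Not quite: we have forgotten about the Entry object itself.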
private void set(ThreadLocal<?> key, Object value) {
    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);

    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    // The table holds a strong reference to the Entry, and the Entry holds a strong reference to value
    tab[i] = new Entry(key, value);
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
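Even after the key has been collected, the Entry and its value stay strongly reachable from the thread's table until a later set, get, or remove happens to expunge the stale entry, or until the thread terminates. With pooled threads that live for a long time, such values can linger. So calling remove explicitly is a form of self-protection and a good habit.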
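Besides memory, a pooled thread that keeps its entry can also hand a stale value to the next task. The sketch below shows this with a single-thread pool and the usual try/finally cleanup; RemoveInFinallyDemo and secondTaskSees are hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RemoveInFinallyDemo {
    private static final ThreadLocal<String> LOCAL = new ThreadLocal<>();

    /** Runs two tasks on the same pooled thread and returns what the second task reads. */
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Task 1: stores a value, and removes it afterwards only if cleanUp is true
            pool.submit(() -> {
                try {
                    LOCAL.set("left over from task 1");
                } finally {
                    if (cleanUp) {
                        LOCAL.remove();
                    }
                }
            }).get();

            // Task 2: runs on the same worker thread and reads whatever is in its map
            Callable<String> read = () -> LOCAL.get();
            return pool.submit(read).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove: " + secondTaskSees(false));
        System.out.println("with remove:    " + secondTaskSees(true));
    }
}
```

Putting remove() in a finally block means the cleanup runs even when the task body throws.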