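Reading this title, a sharp reader might think: isn't this just a deadlock? The situation discussed today is not a deadlock in the strict sense, though; at most it is a deadlock in the broad sense. What is more, jstack reports no deadlock for it.
Threads waiting on each other because a thread pool is misused
Today's example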
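import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class Test {
    // A single worker thread (core = max = 1) backed by an unbounded queue.
    static ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Future<String> outterFuture = threadPoolExecutor.submit(() -> {
            Future<String> innerFuture = threadPoolExecutor.submit(() -> {
                System.out.println("inner finish");
                return "inner finish";
            });
            // Blocks forever: the only worker is busy running this outer task.
            String s = innerFuture.get();
            System.out.println("outter get inner finish:" + s);
            System.out.println("outter finish");
            return "outter finish";
        });
        String s = outterFuture.get();
        System.out.println("process get outter finish:" + s);
    }
}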
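The idea: the main thread submits task 1 to the pool, task 1 submits task 2 to the same pool, and task 1 then waits for task 2's result. Some readers will spot the problem right away; this is of course a simplified version, and real-world thread pool usage can hide it far better. Running this method hangs immediately: the main thread waits for the outer task, and the outer task waits for an inner task that never starts.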
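To reproduce the hang without freezing the JVM, the inner wait can be given a timeout. The following is only a sketch: the class name `StarvationTimeoutDemo`, the method `innerTimesOut` and the 500 ms timeout are made up for illustration, and `Executors.newFixedThreadPool(1)` stands in for the hand-built pool from the example (one worker, unbounded queue).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StarvationTimeoutDemo {

    /** Returns true if the outer task gave up waiting for the inner task. */
    static boolean innerTimesOut() throws InterruptedException, ExecutionException {
        // Same shape as the example: one worker thread, unbounded queue.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                try {
                    inner.get(500, TimeUnit.MILLISECONDS); // illustrative timeout
                    return false; // not reached: no free worker can run inner
                } catch (TimeoutException e) {
                    return true;  // inner was still sitting in the queue
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inner timed out: " + innerTimesOut());
    }
}
```

A timeout does not fix the design, it only turns a silent hang into a visible `TimeoutException`, which is often enough to locate the offending call in a real system.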
What jstack shows
2020-09-12 09:52:41
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode):
"Attach Listener" #11 daemon prio=9 os_prio=0 tid=0x00007fbf38001000 nid=0x37c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007fbf9819c800 nid=0x7932 waiting on condition [0x00007fbf77af9000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c8e08478> (a java.util.concurrent.FutureTask)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at Test.lambda$main$1(Test.java:24)
at Test$$Lambda$1/1418481495.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007fbf980d2000 nid=0x7930 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007fbf980c7000 nid=0x792f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fbf980c4800 nid=0x792e waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fbf980c3000 nid=0x792d waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fbf980c0000 nid=0x792c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fbf980be800 nid=0x792b runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fbf9808b800 nid=0x792a in Object.wait() [0x00007fbf84371000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fbf98086800 nid=0x7929 in Object.wait() [0x00007fbf84472000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00007fbf98008800 nid=0x791e waiting on condition [0x00007fbf9e635000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c8e177b8> (a java.util.concurrent.FutureTask)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at Test.main(Test.java:31)
"VM Thread" os_prio=0 tid=0x00007fbf9807f000 nid=0x7928 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fbf9801d800 nid=0x791f runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fbf9801f800 nid=0x7920 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fbf98021800 nid=0x7921 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fbf98023000 nid=0x7922 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fbf98025000 nid=0x7923 runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fbf98027000 nid=0x7925 runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fbf98028800 nid=0x7926 runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fbf9802a800 nid=0x7927 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fbf980d5000 nid=0x7931 waiting on condition
JNI global references: 201
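jstack did not report a deadlock on its own. In a real system, with many business and framework threads, the problem is even harder to spot.
Thread pool parameters
Below is the ThreadPoolExecutor constructor that takes the most parameters: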
public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler) {
...... }
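The parameters mean the following (look up the details yourself):
- corePoolSize: the number of core threads in the pool
- maximumPoolSize: the maximum number of threads in the pool
- keepAliveTime: how long an idle thread is kept alive
- unit: the time unit of keepAliveTime
- workQueue: the queue that buffers tasks waiting to run
- threadFactory: the factory the pool uses to create threads
- handler: the policy for handling rejected tasks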
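How these parameters interact is easier to see by filling a small pool step by step. The sketch below is illustrative only: the class `PoolParamsDemo`, the method `describe`, and the concrete values (core 1, max 2, queue capacity 1, `AbortPolicy`) are assumptions chosen to make each stage visible, not settings taken from the example above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolParamsDemo {

    /** Returns "poolSize/queueSize/rejected" after submitting four blocking tasks. */
    static String describe() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1,                                     // corePoolSize
                2,                                     // maximumPoolSize
                30L, TimeUnit.SECONDS,                 // keepAliveTime and unit
                new ArrayBlockingQueue<>(1),           // workQueue with capacity 1
                Executors.defaultThreadFactory(),      // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // handler: throw on rejection
        // Every task blocks until released, so the pool state stays frozen.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean rejected = false;
        try {
            pool.execute(blocker); // 1st: starts the core thread
            pool.execute(blocker); // 2nd: core thread busy, so the task is queued
            pool.execute(blocker); // 3rd: queue full, so a second thread is started
            pool.execute(blocker); // 4th: queue full and pool at max, so it is rejected
        } catch (RejectedExecutionException e) {
            rejected = true;
        }
        String result = pool.getPoolSize() + "/" + pool.getQueue().size() + "/" + rejected;
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(describe()); // prints 2/1/true
    }
}
```

Note the order: extra threads beyond corePoolSize are created only after the queue is full. With an unbounded LinkedBlockingQueue the queue never fills, so maximumPoolSize never comes into play.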
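Cause analysis 1
In the example both the core pool size and the maximum pool size are 1, so the pool can run only one task at a time, and a LinkedBlockingQueue holds the tasks waiting to run. The outer task is submitted first and takes the only worker thread. Inside it, the outer task submits the inner task and waits for its result. The inner task never runs: the worker is busy and the pool is already at its maximum size, so the inner task sits in the queue until a worker becomes free. The outer task therefore holds the pool's entire capacity while waiting for the inner task, and the inner task waits for a worker that the outer task never releases.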
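One common way out, sketched here under assumptions rather than taken from the original example, is to make sure a task never waits on work queued behind it in the same pool. The class `SeparatePoolsDemo` and the names `outerPool` and `innerPool` are hypothetical; the point is only that the inner task gets its own worker.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SeparatePoolsDemo {

    /** Runs the outer/inner pair on two different pools and returns the outer result. */
    static String run() throws Exception {
        ExecutorService outerPool = Executors.newFixedThreadPool(1);
        ExecutorService innerPool = Executors.newFixedThreadPool(1);
        try {
            Future<String> outer = outerPool.submit(() -> {
                // The inner task goes to a different pool, so a free worker exists for it.
                Future<String> inner = innerPool.submit(() -> "inner finish");
                return "outer got: " + inner.get();
            });
            // The timeout is a safety net only; the call is expected to return at once.
            return outer.get(5, TimeUnit.SECONDS);
        } finally {
            outerPool.shutdownNow();
            innerPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints "outer got: inner finish"
    }
}
```

Enlarging the single pool would also make this tiny example pass, but it only postpones the problem: with enough concurrent outer tasks every worker can again end up waiting on queued inner tasks.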
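Cause analysis 2
The jstack output does in fact reveal the problem, though not where it might first seem. The Reference Handler and Finalizer threads show the same address after waiting on and locked, which looks like a thread waiting on itself, but this is just how Object.wait() appears in a dump: the thread took the monitor and then released it inside wait(). Both are JVM housekeeping threads, and the same pattern shows up in the dump of any healthy JVM. The real clue is pool-1-thread-1: its stack contains ThreadPoolExecutor.runWorker, so it is the pool's only worker, and it is parked in FutureTask.get waiting for a task that only this pool can run. The main thread is parked in FutureTask.get as well, waiting for that worker.
Deadlock
First, a quick refresher on deadlock.
Deadlock conditions
- Mutual exclusion: while one thread is using (holding) a resource, no other thread can use it
- No preemption: a requester cannot forcibly take a resource from its holder; only the holder can release it
- Hold and wait: a thread keeps the resources it already holds while requesting others
- Circular wait: there is a cycle of waiting threads, for example P1 holds a resource P2 needs, P2 holds a resource P3 needs, and P3 holds a resource P1 needs.
Deadlock example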
public class DeadLock implements Runnable {
    private static Object obj1 = new Object();
    private static Object obj2 = new Object();
    private boolean flag;

    public DeadLock(boolean flag) {
        this.flag = flag;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " running");
        if (flag) {
            synchronized (obj1) {
                System.out.println(Thread.currentThread().getName() + " has locked obj1");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (obj2) {
                    // Never reached
                    System.out.println("After 1 second, " + Thread.currentThread().getName()
                            + " locked obj2");
                }
            }
        } else {
            synchronized (obj2) {
                System.out.println(Thread.currentThread().getName() + " has locked obj2");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (obj1) {
                    // Never reached
                    System.out.println("After 1 second, " + Thread.currentThread().getName()
                            + " locked obj1");
                }
            }
        }
    }

    public static void main(String[] args) {
        // Thread names are left as in the original so they match the jstack dump.
        Thread t1 = new Thread(new DeadLock(true), "线程1");
        Thread t2 = new Thread(new DeadLock(false), "线程2");
        t1.start();
        t2.start();
    }
}
What jstack shows
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode):
"DestroyJavaVM" #13 prio=5 os_prio=0 tid=0x0000000003866000 nid=0x2ffc waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"线程2" #12 prio=5 os_prio=0 tid=0x000000001e6b8000 nid=0x20e4 waiting for monitor entry [0x000000001f8bf000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
- waiting to lock <0x000000076b47b980> (a java.lang.Object)
- locked <0x000000076b47b990> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:748)
"线程1" #11 prio=5 os_prio=0 tid=0x000000001eec8800 nid=0x11d8 waiting for monitor entry [0x000000001f7bf000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
- waiting to lock <0x000000076b47b990> (a java.lang.Object)
- locked <0x000000076b47b980> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #10 daemon prio=9 os_prio=0 tid=0x000000001e607000 nid=0x3888 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #9 daemon prio=9 os_prio=2 tid=0x000000001e57c800 nid=0x1a1c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #8 daemon prio=9 os_prio=2 tid=0x000000001e56f000 nid=0x37b4 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #7 daemon prio=9 os_prio=2 tid=0x000000001e56e800 nid=0x1eb0 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Monitor Ctrl-Break" #6 daemon prio=5 os_prio=0 tid=0x000000001e56a800 nid=0x2298 runnable [0x000000001e9be000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0x000000076b4cf910> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0x000000076b4cf910> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:61)
"Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000000001cf8a000 nid=0x1e84 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000000001cf74000 nid=0x2330 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000000001cf4e800 nid=0x4168 in Object.wait() [0x000000001e2bf000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:212)
"Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x0000000003956000 nid=0x3478 in Object.wait() [0x000000001e1bf000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"VM Thread" os_prio=2 tid=0x000000001cf27000 nid=0x47a4 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x000000000387b800 nid=0x1ec8 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x000000000387d000 nid=0x47a0 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x000000000387e800 nid=0x3364 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x0000000003881800 nid=0x4848 runnable
"VM Periodic Task Thread" os_prio=2 tid=0x000000001e5e5800 nid=0x1318 waiting on condition
JNI global references: 12
Found one Java-level deadlock:
=============================
"线程2":
waiting to lock monitor 0x000000001cf4b598 (object 0x000000076b47b980, a java.lang.Object),
which is held by "线程1"
"线程1":
waiting to lock monitor 0x000000001cf4ded8 (object 0x000000076b47b990, a java.lang.Object),
which is held by "线程2"
Java stack information for the threads listed above:
===================================================
"线程2":
at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
- waiting to lock <0x000000076b47b980> (a java.lang.Object)
- locked <0x000000076b47b990> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:748)
"线程1":
at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
- waiting to lock <0x000000076b47b990> (a java.lang.Object)
- locked <0x000000076b47b980> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:748)
Found 1 deadlock.
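Here the resources after waiting to lock and locked in 线程1 and 线程2 make the situation obvious at a glance: each thread is waiting for the monitor the other has locked. The end of the jstack output also says so explicitly: Found one Java-level deadlock.
Why jstack cannot detect the pool case on its own
In the thread pool example no thread is blocked trying to acquire a lock held by another thread. The worker and the main thread are parked on FutureTask objects, and a FutureTask has no owner thread, so there is no chain of lock owners for jstack to follow. That is why jstack does not count it as a deadlock. In the deadlock example, by contrast, two threads each hold one monitor and wait for the other's, so jstack finds the cycle and reports it.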
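The same blind spot can be checked from inside the JVM with the standard `ThreadMXBean.findDeadlockedThreads()` call, which looks for cycles over monitors and ownable synchronizers and returns null when it finds none. The sketch below is an assumption-laden illustration: the class `DeadlockDetectorDemo`, the method `detectorSeesNothing` and the 200 ms pause are invented for the demo, and the call requires a JVM that supports synchronizer monitoring (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DeadlockDetectorDemo {

    /** Returns true if the JVM's detector reports nothing while the pool is stuck. */
    static boolean detectorSeesNothing() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch outerStarted = new CountDownLatch(1);
        try {
            pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                outerStarted.countDown();
                return inner.get(); // parks until interrupted: no free worker
            });
            outerStarted.await();
            Thread.sleep(200); // illustrative pause so the worker reaches get()
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            long[] ids = mx.findDeadlockedThreads(); // null means no deadlock found
            return ids == null;
        } finally {
            pool.shutdownNow(); // interrupts the parked worker so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("detector found nothing: " + detectorSeesNothing());
    }
}
```

Running the same check against the DeadLock example would return the ids of the two blocked threads instead of null, which mirrors what jstack prints at the end of its dump.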
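How to recognize deadlock-like mutual waiting
When something like this happens and jstack gives no hint, it is hard to find the problem just by reading through the business threads. I compared the thread dumps of the two examples and noticed the keywords waiting on, waiting to lock, parking to wait for, and locked, then looked them up.
- waiting on condition: the thread is waiting on something other than Object.wait, for example inside sleep or park
- waiting on <address>: the thread is inside Object.wait() on that monitor
- parking to wait for: the thread called park; the object shown is what it is parked on
- waiting to lock: the thread is blocked waiting to enter a monitor held by another thread
My guess is that jstack detects the deadlock example by matching each thread's waiting to lock against another thread's locked, which is deadlock in the strict sense. The thread pool case shows neither: the stuck threads show parking to wait for a java.util.concurrent.FutureTask. To spot this kind of deadlock-like waiting, look for threads whose stack contains both ThreadPoolExecutor.runWorker and FutureTask.get, then check whether that pool has any worker left to run the task being waited for. A matching waiting on and locked address within one thread, as in Reference Handler and Finalizer, is normal and is not a sign of trouble.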
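The check for pool workers stuck in FutureTask.get can also be automated as a rough heuristic. This is a sketch, not a reliable detector: the class `PoolWaitScanner`, its methods, and the thread name `stuck-worker` are hypothetical, and a match only means "a pool worker is currently blocked on a Future", which is suspicious but can also be a legitimate wait on another pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolWaitScanner {

    /** Names of live threads that are pool workers currently inside FutureTask.get. */
    static List<String> workersBlockedOnFuture() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            boolean isPoolWorker = false;
            boolean inFutureGet = false;
            for (StackTraceElement frame : entry.getValue()) {
                String cls = frame.getClassName();
                String method = frame.getMethodName();
                if (cls.equals("java.util.concurrent.ThreadPoolExecutor") && method.equals("runWorker")) {
                    isPoolWorker = true;
                }
                if (cls.equals("java.util.concurrent.FutureTask") && method.equals("get")) {
                    inFutureGet = true;
                }
            }
            if (isPoolWorker && inFutureGet) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    /** Reproduces the stuck pool with a hypothetical thread name, then scans for it. */
    static List<String> demo() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1, r -> new Thread(r, "stuck-worker"));
        try {
            pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                return inner.get(); // never completes: no free worker
            });
            List<String> found = workersBlockedOnFuture();
            // Poll briefly until the worker has actually reached get().
            for (int i = 0; i < 50 && !found.contains("stuck-worker"); i++) {
                Thread.sleep(100);
                found = workersBlockedOnFuture();
            }
            return found;
        } finally {
            pool.shutdownNow(); // interrupt the parked worker
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

The frame names match the ones visible in the pool-1-thread-1 stack in the first dump, so the same filter can be applied by hand to a jstack file with a text search.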