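This is a note on a question I was asked while interviewing at a wholly owned Tencent subsidiary: if a deadlock occurs in production, how do you track it down? My answer was to find the affected machine and the process ID, then run jstack pid to locate the cause of the deadlock. Below I walk through a concrete example to reinforce my own memory.
First, write some code that deadlocks: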
public class Atr implements Runnable {
    private final String lockA; // acquired first
    private final String lockB; // acquired second, while lockA is still held
    public Atr(String lockA, String lockB) {
        this.lockB = lockB;
        this.lockA = lockA;
    }
    @Override
    public void run() {
        synchronized (lockA) {
            System.out.println("Acquired lock A");
            try {
                Thread.sleep(1000); // pause so both threads hold their first lock, which makes the deadlock reliable
                synchronized (lockB) {
                    System.out.println("Acquired lock B");
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
    public static void main(String[] args) {
        String lockA = "lockA";
        String lockB = "lockB";
        new Thread(new Atr(lockA, lockB), "VVVVV").start();
        new Thread(new Atr(lockB, lockA), "BBBBB").start();
    }
}
Then, in the same directory, run the following command to compile the source into Atr.class:
javac Atr.java
Then run the following command in the same directory:
java Atr
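Note: if you see "Could not find or load main class", first check that the Java CLASSPATH environment variable is set correctly. For reference, here is the Java environment configuration on my Linux server (for Windows, look up the equivalent setup online):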
export JAVA_HOME=/usr/local/jdk1.8.0_211
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
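Once the environment is sorted out, run java Atr to start the program. It prints only two lines, one per thread, and then nothing more, yet the process does not exit. This is a deadlock. Thread "VVVVV" uses synchronized to lock the lockA string created in main, while thread "BBBBB" uses synchronized to lock the lockB string. Once both threads have finished their first print, "VVVVV" tries to lock lockB and "BBBBB" tries to lock lockA. But "BBBBB" still holds lockB and "VVVVV" still holds lockA, so neither thread can proceed, and the result is a deadlock.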
Next, use jps -l to find the PID of the running Java program. In my run, the PID was 15107.
Then run jstack with that PID:
jstack 15107
The console shows the following stack information:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode):

"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007f162c001000 nid=0x3c92 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" #11 prio=5 os_prio=0 tid=0x00007f167c009800 nid=0x3b04 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"BBBBB" #10 prio=5 os_prio=0 tid=0x00007f167c0e3800 nid=0x3b15 waiting for monitor entry [0x00007f1661de4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bf90> (a java.lang.String)
        - locked <0x00000000e1c5bfc8> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

"VVVVV" #9 prio=5 os_prio=0 tid=0x00007f167c0e2000 nid=0x3b14 waiting for monitor entry [0x00007f1661ee5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bfc8> (a java.lang.String)
        - locked <0x00000000e1c5bf90> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f167c0ce800 nid=0x3b12 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f167c0c1800 nid=0x3b11 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f167c0bf800 nid=0x3b10 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f167c0bc800 nid=0x3b0f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f167c0bb000 nid=0x3b0e runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f167c088000 nid=0x3b0d in Object.wait() [0x00007f16625ec000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e1c08ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - locked <0x00000000e1c08ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f167c085800 nid=0x3b0c in Object.wait() [0x00007f16626ed000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e1c06bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000000e1c06bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"VM Thread" os_prio=0 tid=0x00007f167c07b800 nid=0x3b0b runnable

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f167c01e800 nid=0x3b05 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f167c020800 nid=0x3b06 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f167c022000 nid=0x3b07 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f167c024000 nid=0x3b08 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f167c026000 nid=0x3b09 runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f167c027800 nid=0x3b0a runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007f167c0d1800 nid=0x3b13 waiting on condition

JNI global references: 5

Found one Java-level deadlock:
=============================
"BBBBB":
  waiting to lock monitor 0x00007f16380062c8 (object 0x00000000e1c5bf90, a java.lang.String),
  which is held by "VVVVV"
"VVVVV":
  waiting to lock monitor 0x00007f1638004e28 (object 0x00000000e1c5bfc8, a java.lang.String),
  which is held by "BBBBB"

Java stack information for the threads listed above:
===================================================
"BBBBB":
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bf90> (a java.lang.String)
        - locked <0x00000000e1c5bfc8> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)
"VVVVV":
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bfc8> (a java.lang.String)
        - locked <0x00000000e1c5bf90> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
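The deadlock report at the end of the jstack dump can also be produced from inside the JVM. The following is a minimal sketch and not part of the original walkthrough. It relies on the standard ThreadMXBean.findDeadlockedThreads() call from java.lang.management. The class name DeadlockProbe, the helper methods, and the use of plain Object locks with a CountDownLatch are my own assumptions for illustration. The worker threads are marked as daemon threads so that the JVM can still exit while they remain deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {

    /** Returns one description per deadlocked thread, or an empty list if none is found. */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    /** Starts a daemon thread that locks first, waits for its peer, then tries to lock second. */
    static void startLocker(String name, Object first, Object second, CountDownLatch bothHold) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHold.countDown();
                try {
                    bothHold.await(); // proceed only once both threads hold their first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    System.out.println(name + " acquired both locks"); // not reached in this demo
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though the thread stays blocked
        t.start();
    }

    /** Recreates the two-thread deadlock and polls (up to about 5 seconds) until it is reported. */
    static List<String> demo() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHold = new CountDownLatch(2);
        startLocker("VVVVV", a, b, bothHold);
        startLocker("BBBBB", b, a, bothHold);
        List<String> report = describeDeadlocks();
        try {
            for (int i = 0; i < 50 && report.size() < 2; i++) {
                Thread.sleep(100);
                report = describeDeadlocks();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Each line of the returned report names a blocked thread, the lock it is waiting for, and the thread that holds that lock, which is the same information jstack prints under "Found one Java-level deadlock".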
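In this stack dump, the section "Found one Java-level deadlock" identifies the two threads involved in the deadlock. The section "Java stack information for the threads listed above" then gives more detail. It reads as follows:
Thread "BBBBB", at line 17, currently holds resource <0x00000000e1c5bfc8> but is waiting for resource <0x00000000e1c5bf90>.
Thread "VVVVV", at line 17, currently holds resource <0x00000000e1c5bf90> but is waiting for resource <0x00000000e1c5bfc8>.
Because each thread holds one resource and needs the one held by the other, a deadlock results. With the cause identified, we can analyze the specific situation and fix the deadlock.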
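One common way to remove this kind of circular wait is to make every thread acquire the locks in the same order. The sketch below shows one possible fix under stated assumptions, and it is not the only option. OrderedAtr and runBoth are hypothetical names, and ordering the locks with String.compareTo assumes the two lock strings have different values.

```java
public class OrderedAtr implements Runnable {
    private final String first;  // always the lexicographically smaller lock name
    private final String second; // always the lexicographically larger lock name

    public OrderedAtr(String lockA, String lockB) {
        // Normalize the order so that every instance locks in the same sequence,
        // no matter in which order the caller passed the two locks.
        boolean inOrder = lockA.compareTo(lockB) <= 0;
        this.first = inOrder ? lockA : lockB;
        this.second = inOrder ? lockB : lockA;
    }

    @Override
    public void run() {
        synchronized (first) {
            try {
                Thread.sleep(100); // same kind of pause as in Atr, shortened for the demo
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                System.out.println(Thread.currentThread().getName() + " acquired both locks");
            }
        }
    }

    /** Runs two threads with opposite argument order; true if both finish within the timeout. */
    static boolean runBoth(long timeoutMillis) {
        String lockA = "lockA";
        String lockB = "lockB";
        Thread t1 = new Thread(new OrderedAtr(lockA, lockB), "VVVVV");
        Thread t2 = new Thread(new OrderedAtr(lockB, lockA), "BBBBB");
        t1.setDaemon(true); // daemon threads, so a regression cannot keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runBoth(5000) ? "both threads finished" : "threads are still blocked");
    }
}
```

Because both threads now request lockA before lockB, whichever thread gets lockA first can also get lockB, finish, and release both, after which the other thread proceeds.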