偶尔会遇到一些jvm占用cup资源高的问题,有些命令不经常用,到用的时候需要现查,在这里做下记录:
java自带的jstack命令非常有用,能够抓取到java当前所有线程的线程栈信息,有助于我们排查一些线程问题,如死锁,资源占用高,被长期阻塞,高并发下锁导致的性能等问题;jstack线程栈命令打印出来线程信息如下:
"http-nio-8080-exec-2" #28 daemon prio=5 os_prio=0 tid=0x00007f85c0004800 nid=0x10ec waiting on condition [0x00007f85b0df1000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e88726f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
先来说说第一行的含义 :
"http-nio-8080-exec-2" 这个就是线程的名称
prio=5 是这个线程的优先级
tid=0x00007f85c0004800 就是java的线程id
nid=0x10ec 就是native id ,就是我们操作系统查看线程看到的id,只是这里是16进制而已,这个是我们查找的关键
waiting on condition 是线程当前的状态
[0x00007f85b0df1000] 这个是线程栈的地址
第二行的含义:
线程状态WAITING (parking),调用UNSafe的parking 阻塞等待;
大家也许看出来了,这个例子其实就是我随意从tomcat线程栈信息里面拿的一个栈信息;
标黄色的nid=0x10ec就是我们要找的线程标识,0x10ec转换为10进制就是4332;它就是我们在linux下看到的线程id;
第二步就是如何在linux下找到这个线程;
查看进程线程的方式有很多种,top 命令下 按 shift+h,是可以开启线程模式的;但是这样太麻烦,我们假设通过top命令已经找到了进程id为4255的java进程:
root@iZuf6c2qeenlv7dqgioq2eZ bin]# top
top - 22:53:03 up 40 days, 7:55, 4 users, load average: 0.14, 0.07, 0.10
Tasks: 91 total, 1 running, 90 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922244k total, 1833912k used, 88332k free, 108644k buffers
Swap: 0k total, 0k used, 0k free, 786740k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
964 root 20 0 66372 4428 2200 S 0.7 0.2 111:19.22 AliYunDun
4255 root 20 0 2521m 139m 13m S 0.7 7.4 0:38.57 java
23890 root 20 0 148m 21m 2120 S 0.7 1.2 204:43.83 AliHids
1 root 20 0 19232 812 520 S 0.0 0.0 0:01.30 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:05.47 ksoftirqd/0
那么我们这样就可以去查看这个java进程所有的线程 top -H -p 4255
[root@iZuf6c2qeenlv7dqgioq2eZ bin]# top -H -p 4255
top - 22:54:42 up 40 days, 7:57, 4 users, load average: 0.09, 0.08, 0.10
Tasks: 37 total, 0 running, 37 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922244k total, 1834400k used, 87844k free, 108676k buffers
Swap: 0k total, 0k used, 0k free, 786740k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4255 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4256 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.65 java
4257 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4258 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.51 java
4259 root 20 0 2521m 139m 13m S 0.0 7.4 0:01.02 java
4260 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.02 java
4261 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4262 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4263 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4264 root 20 0 2521m 139m 13m S 0.0 7.4 0:05.28 java
4265 root 20 0 2521m 139m 13m S 0.0 7.4 0:02.44 java
4266 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4267 root 20 0 2521m 139m 13m S 0.0 7.4 0:13.21 java
4268 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.81 java
4271 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4272 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.46 java
4273 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.61 java
4276 root 20 0 2521m 139m 13m S 0.0 7.4 0:06.60 java
4277 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.01 java
4278 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.02 java
4279 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.00 java
4280 root 20 0 2521m 139m 13m S 0.0 7.4 0:01.36 java
4281 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.78 java
这时候我们发现没有我们想要找的4322的进程怎么办;通过shift+ > 或shift <来上下翻页
[root@iZuf6c2qeenlv7dqgioq2eZ bin]# top -H -p 4255
top - 22:58:53 up 40 days, 8:01, 4 users, load average: 0.08, 0.08, 0.10
Tasks: 37 total, 0 running, 37 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922244k total, 1835640k used, 86604k free, 108876k buffers
Swap: 0k total, 0k used, 0k free, 786760k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4267 root 20 0 2521m 139m 13m S 0.0 7.4 0:13.37 java
4276 root 20 0 2521m 139m 13m S 0.0 7.4 0:06.67 java
4264 root 20 0 2521m 139m 13m S 0.0 7.4 0:05.30 java
4265 root 20 0 2521m 139m 13m S 0.0 7.4 0:02.45 java
4280 root 20 0 2521m 139m 13m S 0.0 7.4 0:01.38 java
4259 root 20 0 2521m 139m 13m S 0.0 7.4 0:01.04 java
4268 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.83 java
4281 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.79 java
4283 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.75 java
4256 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.65 java
4273 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.61 java
4258 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.52 java
4272 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.47 java
4292 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.30 java
4383 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.30 java
4479 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.28 java
4332 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.15 java
4333 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.14 java
4473 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.14 java
4435 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.12 java
4474 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.11 java
4480 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.11 java
4483 root 20 0 2521m 139m 13m S 0.0 7.4 0:00.11 java
到这里就将关键的地方讲解完毕了:
这是反向的演示了如何去查找jvm特定的线程信息,但是正常情况下我们排查的步骤是这样的:
1.通过top命令显示出java进程,通过shift+p 按cpu资源占用排序或shift+m按内存资源占用排序,找到最耗费资源的java进程的进程id;
2.通过“top -H -p 进程id”找到这个进程最耗费资源的线程id;
3.将线程id转换为16进制(字母要小写,否则第4步你什么也得不到);
4.执行“jstack java进程id |grep -A 10 java线程id的16进制”,查看线程状态信息;
查看进程占用cpu最多的线程:
ps -mp pid-o THREAD,tid,time
[qz@report bin]$ ps -mp 12101 -o THREAD,tid,time
USER %CPU PRI SCNT WCHAN USER SYSTEM TID TIME
qz 2.0 - - - - - - 00:30:10
qz 0.0 19 - futex_ - - 12101 00:00:00
qz 0.0 19 - poll_s - - 12108 00:00:01
qz 0.0 19 - futex_ - - 12115 00:00:01
qz 0.0 19 - futex_ - - 12116 00:00:01
qz 0.0 19 - futex_ - - 12117 00:00:01
qz 0.0 19 - futex_ - - 12118 00:00:01
qz 0.0 19 - futex_ - - 12119 00:00:16
qz 0.0 19 - futex_ - - 12120 00:00:00
qz 0.0 19 - futex_ - - 12121 00:00:02
qz 0.0 19 - futex_ - - 12122 00:00:00
qz 0.1 19 - futex_ - - 12123 00:02:06
qz 0.1 19 - futex_ - - 12124 00:01:46
qz 0.0 19 - futex_ - - 12125 00:00:19
qz 0.0 19 - futex_ - - 12126 00:00:00
qz 0.0 19 - futex_ - - 12127 00:00:35
qz 0.0 19 - futex_ - - 12128 00:00:03