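We recently received a production alert that Tomcat's thread count had exceeded its limit, so I set out to find the cause.
First, log in to the production machine and run ps -aux to list the JVM process. This gives the path of the JDK that Tomcat is running on: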
/home/work/app/.jdk/bin/java
That tells us where the JDK tools such as jstack and jps live, so we can use jps to find the JVM process:
/home/work/app/.jdk/bin/jps
29145 Jps
208 Bootstrap
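So the Tomcat JVM process has PID 208 (Bootstrap is Tomcat's startup class).
Dump the thread stacks to a file and download it locally: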
jstack -l 208 > log.txt
After downloading the file, I found that the thread dump contained a large number of entries like these:
"pool-103-thread-1" prio=10 tid=0x00007f038001e000 nid=0x759d waiting on condition [0x00007f022e5e4000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000912fab28> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:662)

"pool-102-thread-1" prio=10 tid=0x00007f0380011000 nid=0x71ed waiting on condition [0x00007f022e6e5000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000912fa170> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:662)
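As the dump shows, each thread is in the WAITING state, blocked while trying to take a task from a task queue (LinkedBlockingQueue.take). That queue is the work queue of a ThreadPoolExecutor, which its worker threads poll for tasks.
In other words, these threads are all idle, simply waiting for a task to arrive.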
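The same state is easy to reproduce outside Tomcat. The following is a minimal sketch, assuming Java 8 or later; the class name IdleWorkerDemo and the method idleWorkerState are made up for illustration and are not from the original code. It runs one task on a fixed-size pool and then reports the state of the worker thread once it has gone back to waiting on the pool's queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: shows that an idle pool worker sits in WAITING.
final class IdleWorkerDemo {

    /** Runs one task on a fresh pool and returns the state of the idle worker afterwards. */
    static Thread.State idleWorkerState() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        AtomicReference<Thread> worker = new AtomicReference<>();
        CountDownLatch taskDone = new CountDownLatch(1);
        try {
            pool.execute(() -> {
                worker.set(Thread.currentThread()); // remember which thread ran the task
                taskDone.countDown();
            });
            taskDone.await();

            // Give the worker a moment to return to the queue and park there.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            Thread.State state = worker.get().getState();
            while (state != Thread.State.WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = worker.get().getState();
            }
            return state;
        } finally {
            pool.shutdown(); // without this, the worker would stay parked indefinitely
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("idle worker state: " + idleWorkerState());
    }
}
```

Running jstack against a process that skips the shutdown() call should show a stack very similar to the one above, with the worker parked under LinkedBlockingQueue.take.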
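Some background on LinkedBlockingQueue:
BlockingQueue in the java.util.concurrent library is a rather interesting type. As the name suggests, it is a blocking queue, and LinkedBlockingQueue is one of its implementations. Its two key methods are put() and take(): put() adds an object to the tail of the queue and, if the queue is full, waits until space becomes available; take() removes an object from the head and, if the queue is empty, waits until an object is available.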
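To make the blocking behaviour concrete, here is a small sketch (the class BlockingQueueDemo and the method handOff are illustrative names, not part of the original post). The consumer calls take() on an empty queue and stays blocked until a producer thread calls put() a little later.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class: one producer hands a single message to one consumer.
final class BlockingQueueDemo {

    /** Returns the message after it has travelled through a blocking queue. */
    static String handOff(String message) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1); // capacity 1

        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(100);  // the consumer is already blocked in take() by now
                queue.put(message); // would itself block if the queue were full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "demo-producer");
        producer.start();

        String received = queue.take(); // blocks until the producer has put something
        producer.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(handOff("hello"));
    }
}
```

A pool worker does essentially what the consumer does here: it calls take() on the pool's work queue and parks until a task is submitted.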
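Once the problem was pinned down, the rest was simple. Searching the code turned up a place that created a thread pool and submitted a task to it, but never shut the pool down after the task had finished and returned. That was the cause: each leftover pool kept its idle worker thread alive, which is consistent with the pool-102, pool-103 names in the dump.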
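The original code is not shown, so the following is only a sketch of the pattern under stated assumptions: the class PoolShutdownDemo and both method names are hypothetical, and a fixed-size pool stands in for whatever pool the real code created. The first method reproduces the leak; the second shuts the pool down in a finally block so the worker thread can exit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class contrasting the leaking pattern with the fixed one.
final class PoolShutdownDemo {

    /** Leaking pattern: the task completes, but the pool's worker stays parked in take(). */
    static ExecutorService runAndLeak(Callable<Integer> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(task).get();
        return pool; // returned only so the demo can inspect it and clean up
    }

    /** Fixed pattern: always shut the pool down once its work is done. */
    static ExecutorService runAndShutdown(Callable<Integer> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            pool.submit(task).get();
        } finally {
            pool.shutdown(); // no new tasks accepted; the idle worker is released
        }
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService leaked = runAndLeak(() -> 1);
        ExecutorService closed = runAndShutdown(() -> 1);
        System.out.println("leaked pool terminated? " + leaked.isTerminated());
        System.out.println("closed pool terminated? " + closed.isTerminated());
        leaked.shutdown(); // clean up so the demo JVM can exit
    }
}
```

If a pool is needed on every request, another option is to create one shared pool at startup and reuse it, shutting it down when the application stops.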
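Summary:
1. When you submit tasks through an ExecutorService, remember to shut the executor down once it is no longer needed.
2. When starting new threads, give them a meaningful name; it makes problems much easier to track down in a thread dump.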
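For the second point, one common approach is to pass a custom ThreadFactory when creating the pool. This is a minimal sketch; the class NamedThreadFactory, the helper nameOfFirstWorker and the prefix "order-export" are invented for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: names threads "<prefix>-1", "<prefix>-2", ...
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        return new Thread(r, prefix + "-" + counter.getAndIncrement());
    }

    /** Creates a pool with this factory and returns the name of the thread that ran a task. */
    static String nameOfFirstWorker(String prefix) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2, new NamedThreadFactory(prefix));
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nameOfFirstWorker("order-export"));
    }
}
```

With names like these, a dump entry points straight at the code that owns the pool, whereas the default pool-N-thread-M names only reveal that some executor created the thread.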
Article URL: http://www.crazyant.net/1858.html. Please credit the source when reposting.