Java应用调试工具之jstack




   本文经历的事情发生在2008年


   本人比较懒,碰到问题才会想到去解决, 一次偶然的机会,发现线上环境的tomcat自从把连接方式改成nio以后,时不时的发现有服务器CPU占用100%的情况。很奇怪的现象,为整个系统带来了不稳定因素,需要尽快解决。于是去网上查找资料,但是没有找到现成的答案。

   定下心来自己分析,先需要知道tomcat中的线程在干什么,于是找到了jstack命令。

命令格式:jstack ${pid} 
 


   其中pid可以通过jps命令获得。

   于是我们可以得到当前应用中工作线程的堆栈, 各个线程分别在做什么 , 比如很多线程都执行在同一个代码块,那这部分就可能有问题。

    结果如下:

 

"MaintThread" daemon prio=10 tid=0x91cd3400 nid=0x4f40 waiting on condition [0x8faec000..0x8faecfb0]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.danga.MemCached.SockIOPool$MaintThread.run(SockIOPool.java:1492)

"Thread-3" prio=10 tid=0x8f9ff800 nid=0x4f3f waiting on condition [0x8fb2d000..0x8fb2df30]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.tudou.usersrv.client.HotSpareProxyMemcachedClient$1.run(HotSpareProxyMemcachedClient.java:71)

"MaintThread" daemon prio=10 tid=0x91dcfc00 nid=0x4f3e waiting on condition [0x8fb6e000..0x8fb6eeb0]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.danga.MemCached.SockIOPool$MaintThread.run(SockIOPool.java:1492)

"MaintThread" daemon prio=10 tid=0x91cd2800 nid=0x4f3d waiting on condition [0x8fbaf000..0x8fbafe30]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.danga.MemCached.SockIOPool$MaintThread.run(SockIOPool.java:1492)

"Low Memory Detector" daemon prio=10 tid=0x09a16c00 nid=0x4f37 runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x09a15000 nid=0x4f36 waiting on condition [0x00000000..0x9250f688]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x09a13800 nid=0x4f35 waiting on condition [0x00000000..0x92590608]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x09a12800 nid=0x4f34 runnable [0x00000000..0x925d1c10]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x099ff800 nid=0x4f33 in Object.wait() [0x92812000..0x92812eb0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked <0x978b2300> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x099ff000 nid=0x4f32 in Object.wait() [0x92853000..0x92853e30]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x998d4160> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x09918800 nid=0x4f2c runnable [0xf7fb9000..0xf7fba208]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
        - locked <0x978b24f8> (a java.net.SocksSocketImpl)
        at java.net.ServerSocket.implAccept(ServerSocket.java:453)
        at java.net.ServerSocket.accept(ServerSocket.java:421)
        at org.apache.catalina.core.StandardServer.await(StandardServer.java:389)
        at org.apache.catalina.startup.Catalina.await(Catalina.java:642)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:602)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)

"VM Thread" prio=10 tid=0x099fc400 nid=0x4f31 runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0x0991f400 nid=0x4f2d runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0x09920400 nid=0x4f2e runnable 

"GC task thread#2 (ParallelGC)" prio=10 tid=0x09921400 nid=0x4f2f runnable 

"GC task thread#3 (ParallelGC)" prio=10 tid=0x09922400 nid=0x4f30 runnable 

"VM Periodic Task Thread" prio=10 tid=0x09a18400 nid=0x4f38 waiting on condition 
 

   
    使用linux自带的ps命令:

ps -eL -o pid,%cpu,lwp|grep -i 16907
 


    列出子线程的cpu占用率等情况,再对照jstack命令得到的线程号,就可以知道cpu占用高的线程执行到了那里。

    通过这种方式,确定了是由于tomcat的nio用到了jdk6的nio包,命中了其中的一个死锁bug。apache组织的人也建议在高并发的情况下不要开启nio.

 

     在生产环境中,jstack命令帮助解决了内存泄露引起的垃圾回收线程占用高CPU, 以及服务线程阻塞在某个远程接口处导致无响应等问题。

 

     希望这篇文章可以帮助到大家解决更多问题,谢谢!

 

展开阅读全文

没有更多推荐了,返回首页