描述
记录一次最近公司服务器CPU占用率过高原因排查过程。
定位进程
通过Linux自带命令top
查看当前占用CPU最高的进程,记录进程号。通过top -p pid
查看指定pid
进程CPU占用情况。
定位线程
输入命令 H
或 shift+h
查看进程内所有线程CPU占用情况。
定位代码位置
定位占用CPU最高的线程PID为15153
,
通过JDK自带命令jstack
获取进程号为7643
进程的堆栈调用信息。
执行jstack 7643 > /tmp/stack.info
将进程堆栈信息到处到文件 。
上面我们得出占用CPU最高线程PID为15153
,将其转化为16进制
,结果为3b31
。
在 /tmp/stack.info
中搜索0x3b31
,找出nid为0x3b31
的执行线程,查看堆栈信息。
"ForkJoinPool-3-worker-1" daemon prio=10 tid=0x00007fd35429b800 nid=0x3b31 runnable [0x00007fd32f218000]
java.lang.Thread.State: RUNNABLE
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2328)
at java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2818)
at java.io.ObjectInputStream.readHandle(ObjectInputStream.java:1452)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1511)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.ArrayList.readObject(ArrayList.java:771)
at sun.reflect.GeneratedMethodAccessor5178.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.GeneratedMethodAccessor266.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at net.rubyeye.xmemcached.transcoders.BaseSerializingTranscoder.deserialize(BaseSerializingTranscoder.java:112)
at net.rubyeye.xmemcached.transcoders.SerializingTranscoder.decode0(SerializingTranscoder.java:98)
at net.rubyeye.xmemcached.transcoders.SerializingTranscoder.decode(SerializingTranscoder.java:90)
at net.rubyeye.xmemcached.XMemcachedClient.fetch0(XMemcachedClient.java:657)
at net.rubyeye.xmemcached.XMemcachedClient.get0(XMemcachedClient.java:1085)
at net.rubyeye.xmemcached.XMemcachedClient.get(XMemcachedClient.java:1043)
at net.rubyeye.xmemcached.XMemcachedClient.get(XMemcachedClient.java:1054)
at com.test.infrastructure.cache.xmemcached.XMemcachedServiceImpl.get(XMemcachedServiceImpl.java:302)
at com.test.trade.product.storage.cache.ProductCacheStorageImpl.get(ProductCacheStorageImpl.java:81)
at com.test.trade.product.storage.ProductStorageImpl.getProduct(ProductStorageImpl.java:201)
at com.test.trade.product.storage.ProductStorageImpl.getProduct(ProductStorageImpl.java:51)
at com.test.trade.product.storage.ProductStorageImpl$$FastClassBySpringCGLIB$$172364a8.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:651)
at com.test.trade.product.storage.ProductStorageImpl$$EnhancerBySpringCGLIB$$b00016c5.getProduct(<generated>)
at com.test.trade.product.service.ProductServiceImpl.getProduct(ProductServiceImpl.java:148)
at com.test.trade.product.service.ProductServiceImpl.getProduct(ProductServiceImpl.java:45)
at com.test.trade.product.service.ProductServiceImpl$$FastClassBySpringCGLIB$$2346c968.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:651)
at com.test.trade.product.service.ProductServiceImpl$$EnhancerBySpringCGLIB$$c269a385.getProduct(<generated>)
at com.test.trade.rule.service.PromotionServiceImpl.checkPromotionByGoods(PromotionServiceImpl.java:532)
at com.test.trade.rule.service.PromotionServiceImpl.checkPromotionByGoodsId(PromotionServiceImpl.java:439)
at com.test.trade.rule.service.PromotionServiceImpl.checkPromotionByGoods(PromotionServiceImpl.java:476)
at com.test.web.controller.goods.GoodsController$RefreshPromotionTagTask.compute(GoodsController.java:1599)
at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:177)
at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:377)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:654)
at java.util.concurrent.ForkJoinTask.invokeAll(ForkJoinTask.java:685)
at com.test.web.controller.goods.GoodsController$RefreshPromotionTagTask.compute(GoodsController.java:1613)
at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:177)
at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:377)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:654)
at java.util.concurrent.ForkJoinTask.invokeAll(ForkJoinTask.java:685)
at com.test.web.controller.goods.GoodsController$RefreshPromotionTagTask.compute(GoodsController.java:1613)
at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:177)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:334)
at java.util.concurrent.ForkJoinWorkerThread.execTask(ForkJoinWorkerThread.java:604)
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:784)
at java.util.concurrent.ForkJoinPool.work(ForkJoinPool.java:646)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:398)
从中可以得出我们自己的业务代码为GoodsController
类执行代码位置GoodsController$RefreshPromotionTagTask.compute(GoodsController.java:1613)
。
对线程堆栈关键信息说明
"ForkJoinPool-3-worker-1" daemon prio=10 tid=0x00007fd35429b800 nid=0x3b31 runnable [0x00007fd32f218000] java.lang.Thread.State: RUNNABLE
ForkJoinPool-3-worker-1:线程名
daemon:标明为守护线程
prio=10:线程优先级,默认优先级为5,值越小优先级越大,范围1~10
tid=0x00007fd35429b800:线程内部ID
nid=0x3b31:操作系统分配给线程的ID
runnable:线程状态为可运行状态
[0x00007fd32f218000]:pc计数器的位置,表明线程运行到哪了。可用来分析死锁
java.lang.Thread.State: 线程状态
线程状态说明
代码中共有除RUNNING之外的6种状态,为了表示线程正在执行,特加了RUNNING这种状态。我们需要重点关注RUNNABLE、BLOCKED、WAITING和TIME_WAITING四种状态,jstack打印的线程堆栈中也会时时出现。
1)BLOCKED:很好理解,就是线程在等待获取锁进入同步块或者同步方法中。两个死锁的线程即是Blocked。
2)WAITING:比BLOCKED状态进步一些,指我已经获得锁了,但由于有些条件不满足,我自己等会,调用object.wait()方法。等条件满足了,别的线程调用notify再叫我。另外也可以调用Thread.join()方法,顾名思义就是调用别的线程的join方法,让别人join进来先执行,那我就只能等会了。但是由于wait()和notify()以及notifyAll()用于协调对共享资源的存取,所以必须在synchronized块中使用。所以即便wait状态的线程被notfiy唤醒了,也需要再次获得锁,所以唤醒后进入Blocked状态。
3)TIMED_WAITING:类比WAITING,差异是不需要notify()或者notifyAlL()方法唤醒,时间到了我自己醒了。另外sleep比较好理解,就是让当前线程睡一会,与wait的区别是它不释放锁。
4)RUNNABLE:不用多说,在JAVA虚拟机中已经在运行,但是在等待操作系统资源,比如CPU时间片。
结语
定位代码位置后,接下来就是结合业务逻辑分析优化代码了。
Good Luck~😊
更多信息可关注我的公众号 Java知识星球
,不定时推送更多原创技术知识文章。