FULL GC有可能导致JVM暂停

http://hllvm.group.iteye.com/group/topic/28745

与另外个程序交互有个心跳检测, 15s/次, 31s没有收到心跳检测的返回消息就认为连接断了, 

现在遇到一个诡异问题, 不知什么原因 导致程序中1分多钟内日志无任何输出, 接着就是检测到超时, 可是接着连续输出多次发送keepalive. 

发送keepalive就是很简单的java.util.Timer, 速率固定 

启动参数 
-Xms16m -Xmx128m -XX:+HeapDumpOnOutOfMemoryError 

程序启动时间[15:55:10.225] [23.12.2011] 


第一次 
Java代码   收藏代码
  1. [09:51:16.288] [26.12.2011] [Thread-14][DEBUG] Keepalive receiver started or reseted.  
  2.   
  3. [09:53:26.819] [26.12.2011] [Timer-9197][ERROR] Keepalive timeout reached  
  4.   
  5. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  6.   
  7. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  8.   
  9. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  10.   
  11. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  12.   
  13. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  14.   
  15. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  16.   
  17. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  18.   
  19. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent        
  20.   
  21. [09:53:26.819] [26.12.2011] [Timer-9197][DEBUG] Sent <Disconnect/>  


重连启动后第二次 
Java代码   收藏代码
  1. [09:57:17.053] [26.12.2011] [Thread-9207][DEBUG] Keepalive receiver started or reseted.  
  2.   
  3. [09:58:29.381] [26.12.2011] [Timer-9259][ERROR] Keepalive timeout reached  
  4.   
  5. [09:58:29.381] [26.12.2011] [Timer-9259][DEBUG] Sent <Disconnect/>  
  6.   
  7. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  8.   
  9. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  10.   
  11. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  12.   
  13. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
qianhd 2011-12-30
另外 -Xloggc 这个jvm的gc日志能以压缩的形式输出吗? 或者过滤一些.
RednaxelaFX 2011-12-30
GC暂停超过1分钟的情况不是没有,但以-Xmx128m的配置来看这个很诡异。
请问能使用 -XX:+PrintSafepointStatistics 来看看VM暂停的统计信息么?
一共是这几个参数:

C++代码   收藏代码
  1. product(bool, PrintSafepointStatistics, false,                            \  
  2.         "print statistics about safepoint synchronization")               \  
  3.                                                                           \  
  4. product(intx, PrintSafepointStatisticsCount, 300,                         \  
  5.         "total number of safepoint statistics collected "                 \  
  6.         "before printing them out")                                       \  
  7.                                                                           \  
  8. product(intx, PrintSafepointStatisticsTimeout,  -1,                       \  
  9.         "print safepoint statistics only when safepoint takes"            \  
  10.         " more than PrintSafepointSatisticsTimeout in millis")            \  
qianhd 2011-12-30
应该可以 
但是这问题也就遇到过这一次  未必能重现 
这地方只要出现过一次的问题 都能报bug 

就连temp目录垃圾文件太多 导致程序里创建临时文件太慢 使得程序响应变慢 都能报个bug.
william_ai 2011-12-30
3分钟以内,但是频率不能太高了。
xiaoyu 2011-12-31
Threaddump 一下. 先看看有没有其他东西在block住... 对于2边网络, 也最好使用tcpdump看一下是否出现什么问题(是不是ARP失效等, 是不是发送, 接收缓冲区满了等等)...

x_root 2012-01-01
你可以将readTimeout时间设大点,这个问题我以前也遇到过,GC时间过长,导致服务端以为连接无效而关闭了。

不过你的设置-Xmx128m却导致GC要一分多钟却有点不可思议。
iaimstar 2012-01-04
Xmx128m 也能一分钟以上?你这fullgc是坑爹啊
moonese 2012-02-27
会不会是对方程序的问题? 最后查出原因了?
qianhd 2012-02-29
最终没有查出原因, 因为很难复现 
这破烂货一般来说只要开3天以上, 就会有很多莫名其妙的问题. 所以建议用户不用就关掉 
从上面的[Timer-9200]就可以看出代码质量有多少差 
当然这破烂货不是我写的.
jjshanwei 2012-03-07
Xmx128m 的fullgc 要一分钟? 应改不是gc本身的问题
moonese 2012-03-08
jjshanwei 写道
Xmx128m 的fullgc 要一分钟? 应改不是gc本身的问题


不知道是不是这台机器太老,内存太小,上面跑的程序太多的缘故。 

建议加上打印gc日志的参数 -verbose:gc -XX:+PrintGCDetails -Xloggc:timer_gc.log,分析gc的日志,就知道是不是 GC 导致的问题了。 

或是用 jconsole / jvisualvm 监控下程序的 heap 使用情况。 

怀疑是程序里面有死锁或是代码写的不好,有性能问题,这个也可以通过工具查出的。
qianhd 2012-08-21
RednaxelaFX 写道
GC暂停超过1分钟的情况不是没有,但以-Xmx128m的配置来看这个很诡异。 
请问能使用 -XX:+PrintSafepointStatistics 来看看VM暂停的统计信息么? 
一共是这几个参数: 

C++代码   收藏代码
  1. product(bool, PrintSafepointStatistics, false,                            \  
  2.         "print statistics about safepoint synchronization")               \  
  3.                                                                           \  
  4. product(intx, PrintSafepointStatisticsCount, 300,                         \  
  5.         "total number of safepoint statistics collected "                 \  
  6.         "before printing them out")                                       \  
  7.                                                                           \  
  8. product(intx, PrintSafepointStatisticsTimeout,  -1,                       \  
  9.         "print safepoint statistics only when safepoint takes"            \  
  10.         " more than PrintSafepointSatisticsTimeout in millis")            \  




有没有资料详细讲一下这个参数?  
噩梦的问题又来了~
RednaxelaFX 2012-08-21
例如这帖:http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-August/002968.html ?
你可以搜一下OpenJDK的hotspot-gc-dev和hotspot-gc-use这两个邮件列表的过往内容。这个参数不是“标准参数”所以也没啥公开文档描述它的意义的。
qianhd 2012-08-21
RednaxelaFX 写道
例如这帖:http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-August/002968.html ? 
你可以搜一下OpenJDK的hotspot-gc-dev和hotspot-gc-use这两个邮件列表的过往内容。这个参数不是“标准参数”所以也没啥公开文档描述它的意义的。



Orz 
你真是24小时在线~ 

如果我设置个 MaxGCPauseMillis这个参数 
有意义吗? 
RednaxelaFX 2012-08-21
qianhd 写道

如果我设置个 MaxGCPauseMillis这个参数 
有意义吗? 

不是完全没意义不过这参数我印象中只有在ParallelScavenge和G1中的作用才比较明显,CMS虽然也有用这个参数但我印象中它没啥用。你最好到hotspot-gc-use那边问问,靠谱些。
qianhd 2012-08-23
我加了3个参数 
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -Xloggc:gclog.log 

今天早上一看, 果然断了, 晚上8点17的样子 
common GC竟然花了41秒 诧异~ 

另外,不是full GC才会stop all threads吗? 

再另外,我设定过最大和最小内存 -Xms64m -Xmx256m 
但是实际上最大时内存占用达到了400多m 
这是为啥? 

再再另外,有没有办法记录当时的JVM的一些状态信息? 

求解惑 

引用

2012-08-22T20:17:27.921+0800: 16489.272: [GC 191334K->84242K(299840K), 41.4177511 secs] 
Total time for which application threads were stopped: 41.4731411 seconds 
Total time for which application threads were stopped: 0.0489481 seconds 
Total time for which application threads were stopped: 0.1110761 seconds 
Total time for which application threads were stopped: 0.0007286 seconds 
Total time for which application threads were stopped: 0.0001268 seconds 
Total time for which application threads were stopped: 2.8581772 seconds 
Total time for which application threads were stopped: 0.0112389 seconds 
Total time for which application threads were stopped: 0.0001335 seconds 
Total time for which application threads were stopped: 0.0003246 seconds 
Total time for which application threads were stopped: 0.0003774 seconds 
Total time for which application threads were stopped: 0.0001439 seconds 
Total time for which application threads were stopped: 0.0000897 seconds 
Total time for which application threads were stopped: 0.0001408 seconds 
Total time for which application threads were stopped: 0.0001687 seconds 
Total time for which application threads were stopped: 0.0001380 seconds 
Total time for which application threads were stopped: 0.0001374 seconds 
Total time for which application threads were stopped: 0.0002028 seconds 
Total time for which application threads were stopped: 0.0001171 seconds 
Total time for which application threads were stopped: 0.0001277 seconds 
Total time for which application threads were stopped: 0.0002257 seconds 
Total time for which application threads were stopped: 0.0353492 seconds 
Total time for which application threads were stopped: 0.0004783 seconds 
Total time for which application threads were stopped: 0.0001612 seconds 
Total time for which application threads were stopped: 0.0075786 seconds 
Total time for which application threads were stopped: 0.0001754 seconds 
Total time for which application threads were stopped: 0.0002405 seconds 
Total time for which application threads were stopped: 0.0005079 seconds 
Total time for which application threads were stopped: 0.0001257 seconds 
Total time for which application threads were stopped: 0.0004294 seconds 
Total time for which application threads were stopped: 0.0001970 seconds 
Total time for which application threads were stopped: 0.0001310 seconds 
Total time for which application threads were stopped: 0.0001763 seconds 
Total time for which application threads were stopped: 0.0001685 seconds 
Total time for which application threads were stopped: 0.0004185 seconds 
Total time for which application threads were stopped: 0.0001067 seconds 
Total time for which application threads were stopped: 0.0005286 seconds 
Total time for which application threads were stopped: 0.0001140 seconds 
Total time for which application threads were stopped: 0.0002386 seconds 
Total time for which application threads were stopped: 0.0003646 seconds 
Total time for which application threads were stopped: 0.0001048 seconds 
Total time for which application threads were stopped: 0.0003132 seconds 
Total time for which application threads were stopped: 0.0001953 seconds 
Total time for which application threads were stopped: 0.0003942 seconds 
Total time for which application threads were stopped: 0.0004210 seconds 
Total time for which application threads were stopped: 0.0001009 seconds 
Total time for which application threads were stopped: 0.0001430 seconds 
Total time for which application threads were stopped: 0.0000816 seconds 
Total time for which application threads were stopped: 0.0297057 seconds 
Total time for which application threads were stopped: 0.0462548 seconds 
Total time for which application threads were stopped: 0.0003872 seconds 
Total time for which application threads were stopped: 0.0004657 seconds 
Total time for which application threads were stopped: 0.0004777 seconds 
Total time for which application threads were stopped: 0.0004568 seconds 
Total time for which application threads were stopped: 0.0003623 seconds 
Total time for which application threads were stopped: 0.0004906 seconds 
Total time for which application threads were stopped: 0.0003693 seconds 
Total time for which application threads were stopped: 0.0004129 seconds 
Total time for which application threads were stopped: 0.0003953 seconds 
Total time for which application threads were stopped: 0.0052699 seconds 
Total time for which application threads were stopped: 0.0002953 seconds 
Total time for which application threads were stopped: 0.0000514 seconds 
Total time for which application threads were stopped: 0.0000310 seconds 
Total time for which application threads were stopped: 0.0000603 seconds 
Total time for which application threads were stopped: 0.0000293 seconds 
Total time for which application threads were stopped: 0.0000777 seconds 
Total time for which application threads were stopped: 0.0001048 seconds 
Total time for which application threads were stopped: 0.0003676 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0001201 seconds 
Total time for which application threads were stopped: 0.0003676 seconds 
Total time for which application threads were stopped: 0.0003995 seconds 
Total time for which application threads were stopped: 0.0004118 seconds 
Total time for which application threads were stopped: 0.0004026 seconds 
Total time for which application threads were stopped: 0.0003911 seconds 
Total time for which application threads were stopped: 0.0367284 seconds 
Total time for which application threads were stopped: 0.0003959 seconds 
Total time for which application threads were stopped: 0.0004009 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0003727 seconds 
Total time for which application threads were stopped: 0.0002084 seconds 
Total time for which application threads were stopped: 0.0003738 seconds 
Total time for which application threads were stopped: 0.0003671 seconds 
Total time for which application threads were stopped: 0.0004165 seconds 
Total time for which application threads were stopped: 0.0004026 seconds 
Total time for which application threads were stopped: 0.0003674 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0004165 seconds 
Total time for which application threads were stopped: 0.0003908 seconds 
Total time for which application threads were stopped: 0.0004109 seconds 
Total time for which application threads were stopped: 0.0004540 seconds 
Total time for which application threads were stopped: 0.0003730 seconds 
Total time for which application threads were stopped: 0.0003889 seconds 
Total time for which application threads were stopped: 0.0003495 seconds 
Total time for which application threads were stopped: 0.0003665 seconds 
Total time for which application threads were stopped: 0.0003805 seconds 
Total time for which application threads were stopped: 0.0004277 seconds 
Total time for which application threads were stopped: 0.0003864 seconds 
Total time for which application threads were stopped: 0.0003939 seconds 
Total time for which application threads were stopped: 0.0002249 seconds 
Total time for which application threads were stopped: 0.0003922 seconds 
Total time for which application threads were stopped: 0.0006319 seconds 
Total time for which application threads were stopped: 0.0003757 seconds 
Total time for which application threads were stopped: 0.0004816 seconds 
Total time for which application threads were stopped: 0.0004068 seconds 
Total time for which application threads were stopped: 0.0003878 seconds 
Total time for which application threads were stopped: 0.0004177 seconds 
Total time for which application threads were stopped: 0.0003771 seconds 
Total time for which application threads were stopped: 0.0003903 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0004051 seconds 
Total time for which application threads were stopped: 0.0003984 seconds 
Total time for which application threads were stopped: 0.0003757 seconds 
Total time for which application threads were stopped: 0.0004886 seconds 
Total time for which application threads were stopped: 0.0004056 seconds 
Total time for which application threads were stopped: 0.0003822 seconds 
Total time for which application threads were stopped: 0.0003660 seconds 
Total time for which application threads were stopped: 0.0004520 seconds 
Total time for which application threads were stopped: 0.0003788 seconds 
Total time for which application threads were stopped: 0.0003671 seconds 
Total time for which application threads were stopped: 0.0003752 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0002159 seconds 
Total time for which application threads were stopped: 0.0005065 seconds 
Total time for which application threads were stopped: 0.0003685 seconds 
Total time for which application threads were stopped: 0.0004872 seconds 
Total time for which application threads were stopped: 0.0006339 seconds 
Total time for which application threads were stopped: 0.0003903 seconds 
Total time for which application threads were stopped: 0.0004392 seconds 
qianhd 2012-08-24
引用

2012-08-24T10:47:26.359+0800: 62319.697: [Full GC62334.410: [SoftReference, 1591 refs, 0.0004489 secs]62334.411: [WeakReference, 7980 refs, 0.0011812 secs]62334.412: [FinalReference, 1683 refs, 0.0012770 secs]62334.413: [PhantomReference, 125 refs, 0.0000475 secs]62334.413: [JNI Weak Reference, 0.0003093 secs] [PSYoungGen: 7912K->0K(183360K)] [PSOldGen: 146039K->70830K(145024K)] 153952K->70830K(328384K) [PSPermGen: 42129K->42129K(65536K)], 17.9215166 secs] [Times: user=0.75 sys=0.25, real=17.92 secs] 



引用

2012-08-24T14:21:54.218+0800: 75187.848: [GC75197.407: [SoftReference, 0 refs, 0.0000056 secs]75197.407: [WeakReference, 102 refs, 0.0000212 secs]75197.407: [FinalReference, 1318 refs, 0.6698083 secs]75198.077: [PhantomReference, 1549 refs, 0.0088953 secs]75198.086: [JNI Weak Reference, 0.0315859 secs] [PSYoungGen: 170048K->1232K(171328K)] 240878K->72062K(316352K), 10.2911859 secs] [Times: user=2.01 sys=0.01, real=10.30 secs] 



从这个GC日志 能看出来17秒和10秒时间分别花在什么地方了吗
qianhd 2012-08-24
我发现一个现象, 长时间的GC的前面一般3,4个小时都没有发生过GC, 因为是个GUI程序,我都最下化, 这样内存占用会减少很多, 但是虚拟内存占用还是原来那么多. 
会不会在GC的时候,需要重新把虚拟内存的数据读回到内存, 这样就触发 先腾出内存,(某些进程的内容先暂存虚拟内存),然后把虚拟内存的数据读进来. 

会不会是这种原因导致的??
qianhd 2012-08-30
引用

2012-08-29T19:14:30.968+0800:  10069.800: [GC 10099.225: [SoftReference, 0 refs, 0.0000109 secs]10099.226: [WeakReference, 4072 refs, 0.0012099 secs]10099.227: [FinalReference, 984 refs, 1.5822450 secs]10100.809: [PhantomReference, 251 refs, 0.0001394 secs]10100.809: [JNI Weak Reference, 0.0994015 secs] [PSYoungGen: 175672K->8528K(167360K)] 251523K->100182K(353152K), 31.1580402 secs] [Times: user=0.61 sys=0.52, real=31.16 secs] 



其实GC执行, 并没有花很多时间, 从上面这段信息看, 开始准备GC在10069.800, 真正GC开始在10099.225, 中间花费了约30秒, 貌似时间大部分是花费在GC的准备上. 
这需要看看openJDK里面,执行GC前哪些操作比较耗时, 
我前面的猜测是可能和windows内存和虚拟内存有关, 这东西越来越底层了, 有点力不从心
qianhd 2012-09-28
搞定~ 

windows会在最小化GUI程序的时候trim这个进程的working set, 详见: 
http://support.microsoft.com/default.aspx?scid=kb;en-us;293215 

这样就导致了在下一次GC的时候, 需要重新从页面文件中reload数据, 这样就导致时间花费的很多. 

如何在最小化GUI程序的时候keep working set呢? MS给出了解决方案是针对VC应用程序的, java的话需要使用这个参数 -Dsun.awt.keepWorkingSetOnMinimize=true 

用Intellij的同学可能对这个参数有印象,"可以让IDEA最小化到任务栏时依然保持以占有的内存,当你重新回到IDEA,能够被快速显示,而不是由灰白的界面逐渐显现整个界面,加快回复到原界面的速度。" 

可惜我没用过Intellij,所以不知道有这参数, 
qianhd 2012-09-28
@RednaxelaFX 

能不能从openJDK源码的角度告诉我下, 这gclog的这些时间, 比如9622.177 9638.861 分别是表示什么时间点 

2012-09-26T19:00:49.468+0800:  9622.177: [GC  9638.861: [SoftReference, 0 refs, 0.0000050 secs]9638.861: [WeakReference, 3176 refs, 0.0005855 secs]9638.862: [FinalReference, 1005 refs, 0.0661215 secs]9638.928: [PhantomReference, 751 refs, 0.0058066 secs]9638.934: [JNI Weak Reference, 0.0040648 secs] [PSYoungGen: 147456K->19930K(154624K)] 207743K->80218K(312448K), 16.7716327 secs] [Times: user=2.31 sys=0.42, real=16.77 secs] 


对应的JVM参数 -XX:+PrintReferenceGC -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -Xloggc:gclog.log 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值