近日开始关注JVM的问题,先用jstat -gcutil pid随意看了一套web系统的JVM情况(容器是tomcat 7,jdk是1.7),结果非常惊讶,Full GC频繁,且周期性出现。
问题出现了,好兴奋啊,搞定它我就能增长经验了哇!!于是乎,和我们亲爱的SA同事交流了一番,我用nohup jstat -gccause pid 1s >> ~/xxx/xxx.log &先把gccause日志记录下来,经过半天日志记录追踪之后,发现LGCC是System.gc()。很好奇怎么会出现这东西,我写的代码肯定没有System.gc(),有点怀疑是tomcat的问题(当时也只是怀疑,无任何根据的)。为了解决这个问题,我先尝试在JVM启动参数中加入-XX:+DisableExplicitGC来禁止System.gc(),大半天后再用jstat看gc情况,FGC的次数是0,然后问题解决了。
Java HotSpot(TM) 64-Bit Server VM (25.241-b07) for linux-amd64 JRE (1.8.0_241-b07), built on Dec 11 2019 02:22:16 by "java_re" with gcc 7.3.0
Memory: 4k page, physical 263613744k(172270432k free), swap 4194300k(3143420k free)
CommandLine flags: -XX:InitialHeapSize=10737418240 -XX:MaxHeapSize=10737418240 -XX:MaxNewSize=1073741824 -XX:NewSize=2147483648 -XX:ParallelGCThreads=16 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:ThreadStackSize=1024 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
0.813: [GC (Allocation Failure) [PSYoungGen: 1572864K->7780K(1835008K)] 1572864K->7860K(10223616K), 0.0127777 secs] [Times: user=0.10 sys=0.02, real=0.02 secs]
2.530: [GC (Metadata GC Threshold) [PSYoungGen: 705920K->56496K(1835008K)] 706000K->56592K(10223616K), 0.0445969 secs] [Times: user=0.31 sys=0.05, real=0.05 secs]
2.575: [Full GC (Metadata GC Threshold) [PSYoungGen: 56496K->0K(1835008K)] [ParOldGen: 96K->53292K(8388608K)] 56592K->53292K(10223616K), [Metaspace: 20837K->20837K(1069056K)], 0.0693738 secs] [Times: user=0.79 sys=0.05, real=0.07 secs]
5.206: [GC (Allocation Failure) [PSYoungGen: 1572864K->262129K(1835008K)] 1626156K->738213K(10223616K), 0.1631593 secs] [Times: user=1.85 sys=0.70, real=0.16 secs]
25.057: [GC (Allocation Failure) [PSYoungGen: 1834993K->262118K(1835008K)] 2311077K->1040702K(10223616K), 0.1081239 secs] [Times: user=0.56 sys=0.84, real=0.11 secs]
35.064: [GC (System.gc()) [PSYoungGen: 1226735K->262134K(1835008K)] 2005319K->1046434K(10223616K), 0.0458713 secs] [Times: user=0.43 sys=0.02, real=0.04 secs]
35.110: [Full GC (System.gc()) [PSYoungGen: 262134K->0K(1835008K)] [ParOldGen: 784299K->995569K(8388608K)] 1046434K->995569K(10223616K), [Metaspace: 27727K->27727K(1075200K)], 0.8586229 secs] [Times: user=9.74 sys=0.82, real=0.86 secs]
36.267: [GC (System.gc()) [PSYoungGen: 131215K->1760K(1835008K)] 1126785K->997337K(10223616K), 0.0056979 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
36.273: [Full GC (System.gc()) [PSYoungGen: 1760K->0K(1835008K)] [ParOldGen: 995577K->989870K(8388608K)] 997337K->989870K(10223616K), [Metaspace: 27733K->27733K(1075200K)], 0.8482330 secs] [Times: user=12.50 sys=0.00, real=0.85 secs]
37.124: [GC (System.gc()) [PSYoungGen: 4836K->416K(1835008K)] 994706K->990286K(10223616K), 0.0056634 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
37.130: [Full GC (System.gc()) [PSYoungGen: 416K->0K(1835008K)] [ParOldGen: 989870K->988993K(8388608K)] 990286K->988993K(10223616K), [Metaspace: 27733K->27733K(1075200K)], 0.6378842 secs] [Times: user=9.58 sys=0.00, real=0.64 secs]
37.774: [GC (System.gc()) [PSYoungGen: 23548K->1568K(1835008K)] 1012541K->990561K(10223616K), 0.0044864 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
37.778: [Full GC (System.gc()) [PSYoungGen: 1568K->0K(1835008K)] [ParOldGen: 988993K->989068K(8388608K)] 990561K->989068K(10223616K), [Metaspace: 27733K->27733K(1075200K)], 0.8374397 secs] [Times: user=12.68 sys=0.00, real=0.84 secs]
38.621: [GC (System.gc()) [PSYoungGen: 32558K->1728K(1835008K)] 1021627K->990796K(10223616K), 0.0043954 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
38.626: [Full GC (System.gc()) [PSYoungGen: 1728K->0K(1835008K)] [ParOldGen: 989068K->989332K(8388608K)] 990796K->989332K(10223616K), [Metaspace: 27733K->27733K(1075200K)], 0.7232080 secs] [Times: user=10.88 sys=0.00, real=0.72 secs]
39.359: [GC (System.gc()) [PSYoungGen: 52028K->2688K(1835008K)] 1041361K->992020K(10223616K), 0.0047027 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
虽然问题临时解决了,为了知其然知其所以然,我继续深究了一番,最终找到了问题所在:是tomcat 7的一个默认配置(JreMemoryLeakPreventionListener)所导致的,更合适的解决方法似乎是 <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" gcDaemonProtection="false"/>。
Thank you Konstantin and Chris for your attention.
As stated in the initial post:
'We have recently deployed tomcat-6.0.28 in our organization and are
noticing every hour, a Full GC is occurring. The same application,
same JVM, same JVM args, just a new tomcat release.'
Using the default JreMemoryLeakPreventionListener configuration that
has 'gcDaemonProtection=true' will result in 1hr FullGCs using Sun
1.6 b18, b20 and b21on Solaris and Windows. We've tested and
successfully 'contained' the FullGC behavior using one of the below
configurations:
1) suppress the FullGC using JVM arg -XX:+DisableExplicitGC
2) keep the FullGC but to defer to the CMS collector using JVM arg
-XX:+ExplicitGCInvokesConcurrent
3) <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener"
gcDaemonProtection="false"/>
4) Disable the listener altogether
We've decided to go with option 3.