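A customer ran into a problem: a Tomcat service that had been running normally would suddenly stop without warning.
When the problem occurred, Tomcat's catalina.out contained an error log similar to the following.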
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f85942f6e76, pid=1887, tid=0x00007f8592efa700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x5cae76] G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x196
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1887.log
[thread 140211673609984 also had an error]
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
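When several of these crashes need to be compared, it can help to pull the signal name and the problematic frame out of the header automatically. The following is only a minimal sketch: the class name CrashHeader and its method names are hypothetical, and it assumes the header has the same layout as the log shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts key fields from a HotSpot fatal error header.
final class CrashHeader {
    // Matches a line such as "# SIGSEGV (0xb) at pc=..." and captures the signal name.
    private static final Pattern SIGNAL = Pattern.compile("(?m)^#\\s+(SIG[A-Z]+) \\(");

    /** Returns the signal name (e.g. "SIGSEGV"), or null if none is found. */
    static String signal(String log) {
        Matcher m = SIGNAL.matcher(log);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the line that follows "# Problematic frame:", without the leading "#". */
    static String problematicFrame(String log) {
        String[] lines = log.split("\\r?\\n");
        for (int i = 0; i + 1 < lines.length; i++) {
            if (lines[i].trim().equals("# Problematic frame:")) {
                return lines[i + 1].replaceFirst("^#\\s*", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String sample = "#\n"
                + "# SIGSEGV (0xb) at pc=0x00007f85942f6e76, pid=1887, tid=0x00007f8592efa700\n"
                + "#\n"
                + "# Problematic frame:\n"
                + "# V [libjvm.so+0x5cae76] G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x196\n"
                + "#\n";
        System.out.println("signal = " + signal(sample));
        System.out.println("frame  = " + problematicFrame(sample));
    }
}
```

A frame that begins with "V" is inside the JVM itself (libjvm.so), which is what points the investigation away from the application code.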
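I first examined our own code and log files and found nothing wrong.
After some Googling, it seems very likely that this is a bug in Java itself.
Below are some similar reports that I found. Many people have filed this issue, but none of the reports had been resolved (as of October 2018).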
- https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8179505
- https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8163536
- https://www.oracle.com/search/results?Ntt=G1ParScanThreadState%3A%3Acopy_to_survivor_space&Dy=1&Nty=1&cat=bugs&Ntk=S3
The replies can be summarized as follows:
This kind of issues can be caused by any bug that corrupts heap memory.
It could be an issue with GC, with the compiler, with bad native code,
If you have strong reproducer kindly share with us. We will reproduce at our end and try to fix the issue.
Further Googling turned up a blog post suggesting that the crash is related to G1GC.
Our customer's JVM settings do indeed use G1GC:
-Xmx6144M -Xms6144M -Xss1024k -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Xloggc:/opt/jakarta-tomcat/logs/gc_%p_%t.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+DisableExplicitGC -XX:+DoEscapeAnalysis -XX:MaxGCPauseMillis=50 -XX:-OptimizeStringConcat -XX:+PrintClassHistogramAfterFullGC -XX:+PrintClassHistogramBeforeFullGC -XX:+UseCompressedOops -XX:+UseG1GC
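To confirm from inside a running JVM which collector is actually active, rather than relying on the startup script, the standard java.lang.management API can be queried. This is a minimal sketch, not part of the original investigation; the class name GcCheck and its method names are hypothetical. On HotSpot 8 with G1 enabled, the collector beans are typically named "G1 Young Generation" and "G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the JVM arguments and the active garbage collectors.
final class GcCheck {
    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** True if the given JVM arguments explicitly enable G1 via -XX:+UseG1GC. */
    static boolean requestsG1(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseG1GC");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (e.g. -Xmx..., -XX:...), not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM args: " + jvmArgs);
        System.out.println("Collectors: " + collectorNames());
        System.out.println("G1 requested explicitly: " + requestsG1(jvmArgs));
    }
}
```

Running it with the same options as Tomcat (for example `java -XX:+UseG1GC GcCheck`) shows whether the flag from the settings above takes effect on that JVM.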
In summary, when a crash like this occurs, it is very likely caused by a bug in Java itself.