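Overview
This post records an incident in which a Spring Boot application deployed on Windows (JDK 8) crashed late at night for no apparent reason.
Symptoms
- The Spring Boot process crashed suddenly.
- Heap dumps on out-of-memory were enabled with -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./java_heapdump.hprof, but no dump file was produced.
- The JVM wrote an hs_err_pid***.log file and an mdmp file.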
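A missing .hprof file is expected in this situation: HeapDumpOnOutOfMemoryError only fires when a java.lang.OutOfMemoryError is thrown, whereas a fatal error inside the JVM produces an hs_err_pid log instead. To rule out a mistyped flag, the effective values can be read from inside the running process. The following is a minimal sketch, assuming a HotSpot JVM; the class name HeapDumpFlagCheck and the helper vmOption are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original application.
class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the JVM does not know the option.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

If the first line prints false, the flag never reached the JVM, for example because the -XX: prefix was dropped in the start script.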
Analyzing the hs_err_pid file
Event: 44970.639 GC heap after
Heap after GC invocations=129 (full 3):
PSYoungGen total 186368K, used 753K [0x000000066b380000, 0x0000000683700000, 0x00000007c0000000)
eden space 185344K, 0% used [0x000000066b380000,0x000000066b380000,0x0000000676880000)
from space 1024K, 73% used [0x0000000683600000,0x00000006836bc470,0x0000000683700000)
to space 1536K, 0% used [0x0000000683400000,0x0000000683400000,0x0000000683580000)
ParOldGen total 1027072K, used 49986K [0x00000003c1a00000, 0x0000000400500000, 0x000000066b380000)
object space 1027072K, 4% used [0x00000003c1a00000,0x00000003c4ad0848,0x0000000400500000)
Metaspace used 68053K, capacity 70002K, committed 70400K, reserved 1112064K
class space used 7401K, capacity 7812K, committed 7936K, reserved 1048576K
}
Deoptimization events (10 events):
Event: 42463.612 Thread 0x00000000454c7000 Uncommon trap: reason=unreached action=reinterpret pc=0x0000000004ff4430 method=com.mysql.jdbc.MysqlCharset.getMatchingJavaEncoding(Ljava/lang/String;)Ljava/lang/String; @ 1
Event: 42463.613 Thread 0x00000000454c7000 Uncommon trap: reason=unreached action=reinterpret pc=0x00000000053abc4c method=com.mysql.jdbc.Field.getStringFromBytes(II)Ljava/lang/String; @ 53
Event: 42463.613 Thread 0x00000000454c7000 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00000000053f1920 method=sun.nio.cs.ThreadLocalCoders$1.hasName(Ljava/lang/Object;Ljava/lang/Object;)Z @ 30
Event: 42463.613 Thread 0x00000000454c7000 Uncommon trap: reason=null_check action=make_not_entrant pc=0x0000000005a9f018 method=sun.reflect.GeneratedConstructorAccessor74.newInstance([Ljava/lang/Object;)Ljava/lang/Object; @ 239
Event: 42463.614 Thread 0x00000000454c7000 Uncommon trap: reason=unreached action=reinterpret pc=0x0000000005960928 method=com.alibaba.druid.proxy.jdbc.WrapperProxyImpl.getAttribute(Ljava/lang/String;)Ljava/lang/Object; @ 4
Event: 42463.615 Thread 0x00000000454c7000 Uncommon trap: reason=unreached action=reinterpret pc=0x0000000003ab1bb0 method=com.alibaba.druid.proxy.jdbc.ConnectionProxyImpl.createChain()Lcom/alibaba/druid/filter/FilterChainImpl; @ 6
Event: 42463.615 Thread 0x00000000454c7000 Uncommon trap: reason=unreached action=reinterpret pc=0x00000000058d18fc method=com.mysql.jdbc.MysqlIO.sendCommand(ILjava/lang/String;Lcom/mysql/jdbc/Buffer;ZLjava/lang/String;I)Lcom/mysql/jdbc/Buffer; @ 160
Event: 42463.623 Thread 0x00000000454c3800 Uncommon trap: reason=unloaded action=reinterpret pc=0x0000000004d81d40 method=sun.reflect.GeneratedMethodAccessor210.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; @ 66
Event: 42463.641 Thread 0x00000000454c3800 Uncommon trap: reason=unreached action=reinterpret pc=0x0000000005996138 method=com.mysql.jdbc.MysqlIO.getSharedSendPacket()Lcom/mysql/jdbc/Buffer; @ 4
Internal exceptions (10 events):
Event: 42440.557 Thread 0x00000000454c7000 Exception <a 'java/net/NoRouteToHostException': No route to host: connect> (0x0000000674accaf0) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jni.cpp, line 742]
Event: 42441.064 Thread 0x00000000454c7000 Exception <a 'java/net/NoRouteToHostException': No route to host: connect> (0x0000000674afc7f0) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jni.cpp, line 742]
Event: 42441.570 Thread 0x00000000454c7000 Exception <a 'java/net/NoRouteToHostException': No route to host: connect> (0x0000000674b2b000) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jni.cpp, line 742]
Event: 42463.079 Thread 0x00000000454c7000 Exception <a 'java/net/ConnectException': Connection timed out: connect> (0x0000000674b59818) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jni.cpp, line 742]
Event: 42463.613 Thread 0x00000000454c7000 Implicit null exception at 0x0000000005a9ea6d to 0x0000000005a9eff9
Event: 42463.614 Thread 0x00000000454c7000 Implicit null exception at 0x0000000005960504 to 0x0000000005960915
Event: 42463.615 Thread 0x00000000454c7000 Implicit null exception at 0x00000000058d164d to 0x00000000058d18c9
Event: 42463.618 Thread 0x00000000454c3800 Exception <a 'java/security/PrivilegedActionException'> (0x00000006726215a8) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jvm.cpp, line 1312]
Event: 42463.618 Thread 0x00000000454c3800 Exception <a 'java/security/PrivilegedActionException'> (0x0000000672621fc8) thrown at [C:\workspace\8-2-build-windows-amd64-cygwin\jdk8u31\2394\hotspot\src\share\vm\prims\jvm.cpp, line 1312]
Event: 42463.641 Thread 0x00000000454c3800 Implicit null exception at 0x00000000059945ff to 0x0000000005996115
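- The log shows that GC was healthy. Traffic is low at night, only one scheduled job was running, and heap usage was small (the old generation was 4% full), so heap exhaustion can be ruled out.
- Every crash was associated with 'java/net/ConnectException': Connection timed out: connect.
- The deoptimization events show that Druid connection creation and packet sending were involved, for example method=com.alibaba.druid.proxy.jdbc.ConnectionProxyImpl.createChain()Lcom/alibaba/druid/filter/FilterChainImpl;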
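When several hs_err_pid files have to be compared, tallying the exception types in the "Internal exceptions" section makes the pattern easier to see. The following is a rough sketch that relies only on the line format visible in the excerpt above; the class name HsErrExceptionCounter and the regular expression are assumptions and may need adjusting for other JVM builds.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for reading hs_err_pid logs.
class HsErrExceptionCounter {

    // Matches lines such as:
    // Event: 42463.079 Thread 0x... Exception <a 'java/net/ConnectException': ...
    private static final Pattern EXCEPTION =
            Pattern.compile("^Event: [\\d.]+ Thread \\S+ Exception <a '([^']+)'");

    /** Counts exception class names, keeping the order of first appearance. */
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = EXCEPTION.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // ISO-8859-1 accepts any byte sequence, so odd bytes in the log cannot abort the read.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        count(lines).forEach((type, n) -> System.out.println(n + "\t" + type));
    }
}
```

Run against the excerpt above, this would report three NoRouteToHostException entries, one ConnectException, and two PrivilegedActionException entries, all within about 25 seconds of each other.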
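Analyzing the mdmp file
- Installed the Windows SDK.
- The mdmp file can be opened with windbg.exe.
- It only shows, roughly, that the JVM accessed memory illegally.
Analyzing the application log
The logback log contains quite a few connection-dropped errors.
Conclusion
- The crash is probably related to the MySQL database connection. My guess is that a bug in Druid, in the JDBC driver, or in this JVM build (jdk8u31 according to the log) caused an illegal memory access, which killed the process outright.
- The application had neither a periodic connection check nor a validation step when borrowing a connection.
Solution
JDBC connection configuration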
url: jdbc:mysql://****&autoReconnect=true
Druid configuration
validationQuery: "select 1 "
testOnBorrow: true
initialSize: 5
minIdle: 5
maxActive: 50
maxWait: 60000
timeBetweenEvictionRunsMillis: 60000
minEvictableIdleTimeMillis: 300000
After applying this configuration, the crash has not recurred.
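The idea behind testOnBorrow combined with validationQuery is that a connection is probed before the application gets it, so a link that died overnight is discarded instead of being used. The following plain-JDBC sketch illustrates that idea only; it is not Druid's actual implementation, and the class name BorrowValidation and the method isUsable are hypothetical.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical illustration of a validate-on-borrow check.
class BorrowValidation {

    /**
     * Runs the validation query on the connection.
     * Returns false instead of throwing when the connection is no longer usable.
     */
    static boolean isUsable(Connection conn, String validationQuery, int timeoutSeconds) {
        try (Statement st = conn.createStatement()) {
            st.setQueryTimeout(timeoutSeconds);
            try (ResultSet rs = st.executeQuery(validationQuery)) {
                // "select 1" yields exactly one row on a healthy connection.
                return rs.next();
            }
        } catch (SQLException e) {
            // Broken socket, timeout, closed connection: treat all of them as unusable.
            return false;
        }
    }
}
```

A pool built around such a check would close any connection for which isUsable returns false and hand out a fresh one. The price is one extra round trip per borrow, which is negligible for a lightly loaded nightly job like the one here.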