java程序异常退出的OOM问题使用MAT分析排查过程

7 篇文章 0 订阅

测试同事反馈测试环境集群中某个服务挂了, 查看日志没有任何异常,业务日志正常打印,然后突然程序挂了。

首先查看业务日志是正常打印的,

然后想到去查看linux日志信息

通过grep -C 50 java messsages

通过linux日志可以看到被系统killed

此时怀疑是否程序jvm配置的资源内存不够导致被linux系统杀掉了 ,

进入程序目录发现生成了一个hs_error_pid28351.log

这个文件的生成应该是jvm退出的时候生成的,通过more hs_error_pid28351.log 查看   

fatal error: OutOfMemory encountered: Requested array size exceeds VM limit

此时程序退出了,不知道是哪些对象把jvm内存耗完了。。, 然后又去看了下程序启动 jvm的配置参数

JAVA_OPTS="-Xms256m -Xmx24g -Xmn8g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:log/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=100M -XX:+CrashOnOutOfMemoryError"

其中 -XX:+CrashOnOutOfMemoryError  当发生oom后程序会立即退出,JVM还会生成文本和二进制崩溃文件

也打印了gc详细信息,去查看了 下gc日志,发现gc到某个时刻,  gc日志不打印了, 同时发现上面的日志经过多次 gc后,日志内存还是无法释放,

2022-06-07T15:40:27.089+0800: 64.273: [Full GC (Ergonomics) [PSYoungGen: 88064K->88024K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874344K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2765189 secs] [Times: user=8.19 sys=0.00, real=1.28 secs] 
2022-06-07T15:40:28.366+0800: 65.550: [Full GC (Ergonomics) [PSYoungGen: 88064K->88041K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874361K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2386290 secs] [Times: user=8.59 sys=0.00, real=1.24 secs] 
2022-06-07T15:40:29.605+0800: 66.789: [Full GC (Ergonomics) [PSYoungGen: 88064K->88044K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874364K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.9070903 secs] [Times: user=9.89 sys=0.03, real=1.91 secs] 
2022-06-07T15:40:31.513+0800: 68.697: [Full GC (Ergonomics) [PSYoungGen: 88064K->88044K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874364K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2258152 secs] [Times: user=8.14 sys=0.00, real=1.23 secs] 
2022-06-07T15:40:32.739+0800: 69.923: [Full GC (Ergonomics) [PSYoungGen: 88064K->88045K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874364K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8916665 secs] [Times: user=9.02 sys=0.02, real=1.88 secs] 
2022-06-07T15:40:34.631+0800: 71.815: [Full GC (Ergonomics) [PSYoungGen: 88064K->88050K(175104K)] [ParOldGen: 786319K->786319K(786432K)] 874383K->874369K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8324609 secs] [Times: user=9.25 sys=0.05, real=1.83 secs] 
2022-06-07T15:40:36.464+0800: 73.648: [Full GC (Ergonomics) [PSYoungGen: 88064K->88050K(175104K)] [ParOldGen: 786319K->786318K(786432K)] 874383K->874368K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8423418 secs] [Times: user=8.45 sys=0.03, real=1.84 secs] 
2022-06-07T15:40:38.307+0800: 75.491: [Full GC (Ergonomics) [PSYoungGen: 88064K->88053K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874371K(961536K), [Metaspace: 93032K->93032K(1132544K)], 2.3724553 secs] [Times: user=10.83 sys=0.05, real=2.36 secs] 
2022-06-07T15:40:40.680+0800: 77.864: [Full GC (Ergonomics) [PSYoungGen: 88064K->88054K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874373K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.7755906 secs] [Times: user=9.44 sys=0.00, real=1.77 secs] 
2022-06-07T15:40:42.456+0800: 79.640: [Full GC (Ergonomics) [PSYoungGen: 88064K->88057K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874376K(961536K), [Metaspace: 93032K->93032K(1132544K)], 3.1007377 secs] [Times: user=11.95 sys=0.06, real=3.09 secs] 
2022-06-07T15:40:45.557+0800: 82.741: [Full GC (Ergonomics) [PSYoungGen: 88064K->88059K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874377K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2195819 secs] [Times: user=8.74 sys=0.00, real=1.22 secs] 
2022-06-07T15:40:46.777+0800: 83.961: [Full GC (Ergonomics) [PSYoungGen: 88064K->88062K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874381K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2158907 secs] [Times: user=8.30 sys=0.03, real=1.22 secs] 
2022-06-07T15:40:47.993+0800: 85.177: [Full GC (Ergonomics) [PSYoungGen: 88064K->88062K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874381K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2364385 secs] [Times: user=8.42 sys=0.00, real=1.24 secs] 
2022-06-07T15:40:49.230+0800: 86.414: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874381K(961536K), [Metaspace: 93032K->93032K(1132544K)], 2.1153734 secs] [Times: user=8.77 sys=0.03, real=2.11 secs] 
2022-06-07T15:40:51.346+0800: 88.530: [Full GC (Ergonomics) [PSYoungGen: 88064K->88055K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874374K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2330226 secs] [Times: user=8.31 sys=0.01, real=1.23 secs] 
2022-06-07T15:40:52.580+0800: 89.764: [Full GC (Ergonomics) [PSYoungGen: 88064K->88058K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874376K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2285841 secs] [Times: user=8.25 sys=0.02, real=1.23 secs] 
2022-06-07T15:40:53.808+0800: 90.993: [Full GC (Ergonomics) [PSYoungGen: 88064K->88059K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874377K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8795535 secs] [Times: user=9.52 sys=0.01, real=1.87 secs] 
2022-06-07T15:40:55.688+0800: 92.872: [Full GC (Ergonomics) [PSYoungGen: 88063K->88056K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874382K->874374K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8573714 secs] [Times: user=8.75 sys=0.00, real=1.86 secs] 
2022-06-07T15:40:57.546+0800: 94.730: [Full GC (Ergonomics) [PSYoungGen: 88063K->88057K(175104K)] [ParOldGen: 786318K->786318K(786432K)] 874381K->874375K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.1935241 secs] [Times: user=8.06 sys=0.02, real=1.19 secs] 
2022-06-07T15:40:58.739+0800: 95.924: [Full GC (Ergonomics) [PSYoungGen: 88064K->88057K(175104K)] [ParOldGen: 786318K->786317K(786432K)] 874382K->874374K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.9287577 secs] [Times: user=9.53 sys=0.02, real=1.93 secs] 
2022-06-07T15:41:00.669+0800: 97.853: [Full GC (Ergonomics) [PSYoungGen: 88064K->88060K(175104K)] [ParOldGen: 786317K->786316K(786432K)] 874381K->874377K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8469753 secs] [Times: user=9.25 sys=0.03, real=1.85 secs] 
2022-06-07T15:41:02.517+0800: 99.701: [Full GC (Ergonomics) [PSYoungGen: 88064K->88061K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874378K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2223578 secs] [Times: user=8.34 sys=0.00, real=1.22 secs] 
2022-06-07T15:41:03.739+0800: 100.923: [Full GC (Ergonomics) [PSYoungGen: 88064K->88062K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.7926885 secs] [Times: user=9.27 sys=0.08, real=1.79 secs] 
2022-06-07T15:41:05.532+0800: 102.716: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8188174 secs] [Times: user=9.42 sys=0.05, real=1.82 secs] 
2022-06-07T15:41:07.351+0800: 104.535: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.1728942 secs] [Times: user=8.23 sys=0.00, real=1.17 secs] 
2022-06-07T15:41:08.525+0800: 105.709: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2216016 secs] [Times: user=8.59 sys=0.00, real=1.22 secs] 
2022-06-07T15:41:09.747+0800: 106.931: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2610382 secs] [Times: user=8.86 sys=0.03, real=1.26 secs] 
2022-06-07T15:41:11.008+0800: 108.192: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786316K(786432K)] 874380K->874380K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.2108993 secs] [Times: user=8.45 sys=0.02, real=1.21 secs] 
2022-06-07T15:41:12.219+0800: 109.403: [Full GC (Ergonomics) [PSYoungGen: 88064K->88063K(175104K)] [ParOldGen: 786316K->786315K(786432K)] 874380K->874379K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.7839150 secs] [Times: user=8.97 sys=0.02, real=1.78 secs] 
2022-06-07T15:41:14.003+0800: 111.187: [Full GC (Ergonomics) [PSYoungGen: 88064K->83455K(175104K)] [ParOldGen: 786315K->786269K(786432K)] 874379K->869725K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.5506682 secs] [Times: user=10.41 sys=0.06, real=1.55 secs] 
2022-06-07T15:41:15.564+0800: 112.748: [Full GC (Ergonomics) [PSYoungGen: 88064K->85227K(175104K)] [ParOldGen: 786269K->786266K(786432K)] 874333K->871493K(961536K), [Metaspace: 93032K->93032K(1132544K)], 1.8170896 secs] [Times: user=8.78 sys=0.00, real=1.81 secs] 
2022-06-07T15:41:17.385+0800: 114.570: [Full GC (Ergonomics) [PSYoungGen: 88064K->85303K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874330K->871569K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.8459706 secs] [Times: user=8.89 sys=0.03, real=1.84 secs] 
2022-06-07T15:41:19.239+0800: 116.423: [Full GC (Ergonomics) [PSYoungGen: 88064K->86548K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874330K->872814K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.2155478 secs] [Times: user=8.05 sys=0.03, real=1.22 secs] 
2022-06-07T15:41:20.457+0800: 117.642: [Full GC (Ergonomics) [PSYoungGen: 88006K->87101K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874272K->873367K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.2462476 secs] [Times: user=8.86 sys=0.00, real=1.25 secs] 
2022-06-07T15:41:21.708+0800: 118.892: [Full GC (Ergonomics) [PSYoungGen: 88064K->87514K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874330K->873780K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.2630408 secs] [Times: user=8.73 sys=0.01, real=1.26 secs] 
2022-06-07T15:41:22.971+0800: 120.156: [Full GC (Ergonomics) [PSYoungGen: 88064K->87767K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874330K->874033K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.2148857 secs] [Times: user=8.34 sys=0.00, real=1.22 secs] 
2022-06-07T15:41:24.188+0800: 121.372: [Full GC (Ergonomics) [PSYoungGen: 88064K->87903K(175104K)] [ParOldGen: 786266K->786266K(786432K)] 874330K->874169K(961536K), [Metaspace: 93036K->93036K(1132544K)], 1.2650010 secs] [Times: user=8.56 sys=0.02, real=1.27 secs] 
2022-06-07T15:41:25.470+0800: 122.654: [Full GC (Ergonomics) [PSYoungGen: 88064K->0K(175104K)] [ParOldGen: 786266K->32867K(315904K)] 874330K->32867K(491008K), [Metaspace: 93040K->93040K(1132544K)], 0.1703121 secs] [Times: user=0.33 sys=0.06, real=0.17 secs] 

 gc前后内存都无法回收。

但是并没看到生成二进蹦溃的文件。 通过/var/log/messages 可以看到 

Jun  6 17:12:50 localhost abrt-hook-ccpp: Process 28351 (java) of user 0 killed by SIGABRT - dumping core
Jun  6 17:14:13 localhost systemd-logind: Removed session 664882.
Jun  6 17:14:13 localhost systemd: Started Session 665121 of user root.
Jun  6 17:14:13 localhost systemd-logind: New session 665121 of user root.
Jun  6 17:14:14 localhost abrt-server: Executable '/usr/share/jdk1.8/bin/java' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Jun  6 17:14:14 localhost abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2022-06-06-17:12:50-28351' exited with 1
Jun  6 17:14:14 localhost abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2022-06-06-17:12:50-28351'

 可以dump core文件,失败了。abrt-server需要配置  ProcessUnpackaged is set to 'no'

这个先不管了。后续再去看。 我还是先更改下 jvm的启动参数

改为 

-XX:HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=log/gpl-dump.hprof

当发送oom的时候, 程序不会退出,但是会自动生成堆栈信息,以便于分析问题原因

告知测试继续测试,再出现问题,会生成一些堆栈信息,

过一段时间后测试又反馈,后台服务无法响应,进程还在,此时在log目录下看到了生成的 gpl-dump.hprof

接着把文件down下来使用mat工具分析

打开文件后

通过概览页发现,821.7M的某类对象占用

接着点击Leak_Suspects 查看系统视图

 

 通过keyword的关键信息基本可以可以确定程序执行这个方法生成的大量对象无法回收

此时进一步查看大量对象的信息都是什么

 

 发现大量的linkedhashmap

 通过上一步的 DatasetUtil.loadTaskresultBySql 关键方法找到一下代码,通过数据获取的数据rs太多了,然后每条列记录都用LinkedHashMap保存放到list中,导致内存溢出,最后更新程序限制查询条数解决此问题。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

奔跑的窝窝牛

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值