strace java_用gdb & strace追踪 jdk OOM bug

本文通过strace和gdb分析了一个JDK OOM问题,发现频繁的mprotect调用可能与JVM的编译优化逻辑有关,特别是与类的编译次数相关。进一步调查发现使用的JDK版本过旧(1.6.0_35),建议升级到最新版本以解决可能的bug。
摘要由CSDN通过智能技术生成

如上所述方案,得到故障期间系统调用异常的地方为:

[pid 21832] 17:15:26 clock_gettime(CLOCK_MONOTONIC,

[pid 21751] 17:15:26 futex(0x42564324, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1419585326, 152497000}, ffffffff

[pid 21747] 17:15:26 mprotect(0x7ff8e1e5d000, 32768, PROT_READ|PROT_WRITE

[pid 21832] 17:15:26 <... clock_gettime resumed> {19389, 378140695}) = 0 <0.000111>

[pid 21747] 17:15:26 <... mprotect resumed> ) = 0 <0.000096>

[pid 21832] 17:15:26 clock_gettime(CLOCK_MONOTONIC, {19389, 378329426}) = 0 <0.000045>

[pid 21832] 17:15:26 clock_gettime(CLOCK_MONOTONIC, {19389, 378442545}) = 0 <0.000044>

[pid 21832] 17:15:26 gettimeofday({1419585326, 103206}, NULL) = 0 <0.000046>

[pid 21832] 17:15:26 futex(0x7ff9b878d1e4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1419585326, 153206000}, ffffffff

[pid 21747] 17:15:26 mprotect(0x7ff8e1e65000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000060>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e6d000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000045>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e75000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000062>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e7d000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000043>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e85000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000056>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e8d000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000104>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e95000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000044>

[pid 21747] 17:15:26 mprotect(0x7ff8e1e9d000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000062>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ea5000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000044>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ead000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000055>

[pid 21747] 17:15:26 mprotect(0x7ff8e1eb5000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000057>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ebd000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000045>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ec5000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000045>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ecd000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000043>

[pid 21747] 17:15:26 mprotect(0x7ff8e1ed5000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000044>

[pid 21747] 17:15:26 mprotect(0x7ff8e1edd000, 32768, PROT_READ|PROT_WRITE) = 0 <0.000055>

可以发现mprotect方法调用频繁【结合故障出现之间的系统调用进行对比】,且全在21747的线程内,查看doc可知每次会malloc 32K的内存

用gdb suspend进程后,查看21747对应的线程,并执行dt得到其堆栈:

Breakpoint 1, 0x00007ff9e12b14e0 in mprotect () from /lib/x86_64-linux-gnu/libc.so.6

(gdb) bt

#0 0x00007ff9e12b14e0 in mprotect () from /lib/x86_64-linux-gnu/libc.so.6

#1 0x00007ff9e1254671 in ?? () from /lib/x86_64-linux-gnu/libc.so.6

#2 0x00007ff9e1255b90 in malloc () from /lib/x86_64-linux-gnu/libc.so.6

#3 0x00007ff9e0cd83f8 in os::malloc(unsigned long) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#4 0x00007ff9e07f5f8c in ChunkPool::allocate(unsigned long) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#5 0x00007ff9e07f572a in Chunk::operator new(unsigned long, unsigned long) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#6 0x00007ff9e07f5d11 in Arena::grow(unsigned long) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#7 0x00007ff9e0cbffa8 in Node::out_grow(unsigned int) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#8 0x00007ff9e07c63e1 in Node::add_out(Node*) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#9 0x00007ff9e0c4b785 in PhaseIdealLoop::clone_loop(IdealLoopTree*, Node_List&, int, Node*) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#10 0x00007ff9e0c4fa59 in PhaseIdealLoop::partial_peel(IdealLoopTree*, Node_List&) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#11 0x00007ff9e0c32dac in IdealLoopTree::iteration_split_impl(PhaseIdealLoop*, Node_List&) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#12 0x00007ff9e0c33000 in IdealLoopTree::iteration_split(PhaseIdealLoop*, Node_List&) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#13 0x00007ff9e0c32f68 in IdealLoopTree::iteration_split(PhaseIdealLoop*, Node_List&) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#14 0x00007ff9e0c41095 in PhaseIdealLoop::build_and_optimize(bool, bool) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#15 0x00007ff9e097134f in Compile::Optimize() () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#16 0x00007ff9e096de84 in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#17 0x00007ff9e08f0d3e in C2Compiler::compile_method(ciEnv*, ciMethod*, int) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#18 0x00007ff9e09786aa in CompileBroker::invoke_compiler_on_method(CompileTask*) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#19 0x00007ff9e0977f95 in CompileBroker::compiler_thread_loop() () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#20 0x00007ff9e0df0539 in compiler_thread_entry(JavaThread*, Thread*) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#21 0x00007ff9e0de9a41 in JavaThread::run() () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#22 0x00007ff9e0ce0d1f in java_start(Thread*) () from /global/install/jdk1.6.0_35/jre/lib/amd64/server/libjvm.so

#23 0x00007ff9e176eb50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0

#24 0x00007ff9e12b4a7d in clone () from /lib/x86_64-linux-gnu/libc.so.6

#25 0x0000000000000000 in ?? ()

(gdb) cont

Continuing.

由各frame中各方法名:CompileBroker::compiler_thread_loop(), Compile::Optimize()等,可以推断出此时应该与jvm类编译优化相关逻辑有关。结合已有先验知识,知道jvm对类的编译会与调用次数等因素有关。

在mprotect处下断点再cont发现会继续进入该断点,多次cont依旧如此

至此基本可认定是jvm的bug,查看机器的jdk版本:

java -version

java version "1.6.0_35"

Java(TM) SE Runtime Environment (build 1.6.0_35-b10)

Java HotSpot(TM) 64-Bit Server VM (build 20.10-b01, mixed mode)

而1.6最新的为1.6.0_45, 因此先不深究jvm具体的bug所在,先做升级

升级是否修复该故障,且听下回分解:)

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值