记一次生产上线的问题

       昨天项目上线,在完成验证之后,我深吸一口气,完美上线,未出任何问题。然而半个小时之后,收到运维团队发过来的告警,说是有一台服务器的几个端口未被探测到。当时听说之后,我心中一紧,不对呀,我们服务器之后都观察了系统日志,交易一切正常,如果系统启动异常,日志是不会打印的。于是,我赶紧去操作间登录服务器,检测端口。

       netstat -anpl | grep 1099

       netstat -anpl | grep 8900

       输出显示,端口确实未启动。但系统对外服务的端口一切正常。于是我们找到系统启动日志,定位到如下异常:

       

       于是在网上搜索关于Failed to write core dump的文章,很多,但跟我们这次遇到的问题并不一致。

       于此同时,我们发现在启动脚本目录下多了一个日志文件hs_err_pid13893.log,这个文件就是JVM在启动时出现异常的线程堆栈信息,内容描述如下:

       

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007fa0a94e14e2, pid=13893, tid=0x00007f9e8fefe700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_172-b11) (build 1.8.0_172-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.172-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0x124e2]  newEntry+0x62
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  T H R E A D  ---------------

Current thread (0x00007f9ed4111000):  JavaThread "pool-FramePools-thread2" [_thread_in_native, id=14081, stack(0x00007f9e8fdfe000,0x00007f9e8feff000)]

siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x00007f9efc0cb0fa

Registers:
RAX=0x00007f9ee830cc60, RBX=0x00007f9f6e0e1e00, RCX=0x0000000000000000, RDX=0x00007f9ee830cc60
RSP=0x00007f9e8fefb4b0, RBP=0x00007f9e8fefb500, RSI=0x0000000000000000, RDI=0x00007f9ee8000020
R8 =0x00007f9ee8000088, R9 =0x00007f9e8fefb580, R10=0x0000000000000048, R11=0x00000000c6bfe5f0
R12=0x00007f9efc0cb0dd, R13=0x00007f9ee830cc60, R14=0x000000001cb26b69, R15=0x00007f9f6e0de110
RIP=0x00007fa0a94e14e2, EFLAGS=0x0000000000010206, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007f9e8fefb4b0)
0x00007f9e8fefb4b0:   00000000c6b13fe0 0000000100000000
0x00007f9e8fefb4c0:   00007f9f6e0de1e0 00007fa0b4c1a02e
0x00007f9e8fefb4d0:   00007f9ee830cc60 00007f9f6e0e1e00
0x00007f9e8fefb4e0:   00007f9f6e0e10b0 00007f9f6e0de110
0x00007f9e8fefb4f0:   000000001cb26b69 000000000000004f
0x00007f9e8fefb500:   00007f9e8fefb560 00007fa0a94e1c3f
0x00007f9e8fefb510:   00007f9ed4111000 0100000000000000
0x00007f9e8fefb520:   00007f9e8fefb580 00007f9e8fefb5d0
0x00007f9e8fefb530:   00007f9e8fefb5ce 00007f9e8fefb580
0x00007f9e8fefb540:   000000000000004f 00007f9ed41111f8
0x00007f9e8fefb550:   00007f9e8fefb9d8 0000000000000001
0x00007f9e8fefb560:   00007f9e8fefb9c0 00007fa0a94d2e90
0x00007f9e8fefb570:   00007fa09a4cf7a7 00007f9f6e0de110
0x00007f9e8fefb580:   672f63642f6d6f63 636e616e7265766f
0x00007f9e8fefb590:   61646174656d2f65 736c706d692f6174
0x00007f9e8fefb5a0:   656d69746e75722f 5f72656b6361502f
0x00007f9e8fefb5b0:   6465525f736e6f63 56626d4362625a6d
0x00007f9e8fefb5c0:   4f534a5f6f745f31 007373616c632e4e
0x00007f9e8fefb5d0:   00007f9e8fefb650 0000000000800000
0x00007f9e8fefb5e0:   00007f9ee8000078 00007f9ee8000020
0x00007f9e8fefb5f0:   00007f9ee8000020 0000000000000000
0x00007f9e8fefb600:   0000000000008010 00000038ad67a567
0x00007f9e8fefb610:   00007f9ed4111000 0000000000008000
0x00007f9e8fefb620:   00007f9e8fefb690 00007f9e8fefb868
0x00007f9e8fefb630:   000000768fefb690 0000000000000200
0x00007f9e8fefb640:   0000000000000040 0000000000000000
0x00007f9e8fefb650:   0000000000000001 0000000000000008
0x00007f9e8fefb660:   0000000000008030 0000023000000801
0x00007f9e8fefb670:   0000007c0000009b 0000007600000078
0x00007f9e8fefb680:   00007fa09a515f20 00007f9ee8000020
0x00007f9e8fefb690:   0000000000008000 0000000000000000
0x00007f9e8fefb6a0:   0000000000000019 00000000c6bfe409 

Instructions: (pc=0x00007fa0a94e14e2)
0x00007fa0a94e14c2:   00 48 c7 40 28 00 00 00 00 41 80 7f 30 00 0f 84
0x00007fa0a94e14d2:   8a 02 00 00 4c 8b 63 08 4d 2b 67 28 4d 03 67 18
0x00007fa0a94e14e2:   41 0f b6 5c 24 1d 41 0f b6 44 24 1c c1 e3 08 09
0x00007fa0a94e14f2:   c3 41 0f b6 44 24 1e 88 45 bd 41 0f b6 54 24 20 

Register to memory mapping:

RAX=0x00007f9ee830cc60 is an unknown value
RBX=0x00007f9f6e0e1e00 is an unknown value
RCX=0x0000000000000000 is an unknown value
RDX=0x00007f9ee830cc60 is an unknown value
RSP=0x00007f9e8fefb4b0 is pointing into the stack for thread: 0x00007f9ed4111000
RBP=0x00007f9e8fefb500 is pointing into the stack for thread: 0x00007f9ed4111000
RSI=0x0000000000000000 is an unknown value
RDI=0x00007f9ee8000020 is an unknown value
R8 =0x00007f9ee8000088 is an unknown value
R9 =0x00007f9e8fefb580 is pointing into the stack for thread: 0x00007f9ed4111000
R10=0x0000000000000048 is an unknown value
R11=

     关键内容如下:

     

    大致含义是java虚拟机在启动的时候发生了原始异常,产生问题的框架是:C  [libzip.so+0x124e2]  newEntry+0x62。于是又在网上搜索C  [libzip.so+0x124e2]  newEntry+0x62,最终搜到这篇文章:https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8214788,这是JVM的一个BUG,java在操作zip文件的时候会调用linux系统libzip.so库,如果在jvm运行的时候做了zip文件操作,而此时zip文件做了修改,就会有JVM崩溃的风险。在oracle的java bug  database中搜索,发现有很多类似的问题:

     

当然,Oracle官方也给出了解决方案:

  1、在JVM启动参数中加入:-Dsun.zip.disableMemoryMapping=true ,但该方案有一个风险,会导致jvm在磁盘读取jar文件的时候有巨大的性能风险,当然前提是你一遍又一遍的读。然而,我在网上搜索关于类似的问题的时候,发现有网友加了该启动参数却又遇到了其他问题。

  2、升级到JDK9。该问题在这个版本得到了完美解决。

  总结: 我们系统在启动的时候会加载zip文件,可能当时我们的操作人员未等其他服务器启动完毕,就立即重启了下一台服务器,碰巧两台服务器都在对同一个zip文件做操作,就导致JVM没有完全启动。

     

   

  • 0
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值