系统jvm调整经验(marjo担保失败)

这个是以前的问题回顾,把以前的问题补上点。


2013-4-7,发假第一天上班,系统出现了Minor GC的promotion failed情况,gc.log出错部分如下:

2013-04-07T08:09:37.385+0800: 833169.883: [GC 833169.883: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  321434576 bytes,  321434576 total
: 3503867K->338237K(3670016K), 0.2794700 secs] 3964702K->1129238K(7864320K), 0.2798160 secs] [Times: user=6.02 sys=0.09, real=0.28 secs] 
2013-04-07T08:15:13.275+0800: 833505.773: [GC 833505.773: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  317076840 bytes,  317076840 total
: 3483965K->318520K(3670016K), 0.2949600 secs] 4274966K->1418110K(7864320K), 0.2953000 secs] [Times: user=7.15 sys=0.07, real=0.29 secs] 
2013-04-07T08:21:10.953+0800: 833863.451: [GC 833863.451: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  320728160 bytes,  320728160 total
: 3464248K->319616K(3670016K), 0.3284790 secs] 4563838K->1723198K(7864320K), 0.3288180 secs] [Times: user=7.85 sys=0.18, real=0.33 secs] 
2013-04-07T08:26:50.551+0800: 834203.049: [GC 834203.049: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:   16506744 bytes,   16506744 total
: 3465344K->24223K(3670016K), 0.3425360 secs] 4868926K->1734913K(7864320K), 0.3428830 secs] [Times: user=8.00 sys=0.18, real=0.34 secs] 
2013-04-07T08:32:13.423+0800: 834525.921: [GC 834525.921: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  320113704 bytes,  320113704 total
- age   2:   10293776 bytes,  330407480 total
: 3169951K->329032K(3670016K), 0.2273480 secs] 4880641K->2039722K(7864320K), 0.2276750 secs] [Times: user=5.62 sys=0.06, real=0.23 secs] 
2013-04-07T08:36:47.349+0800: 834799.847: [GC 834799.847: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  316340320 bytes,  316340320 total
: 3474760K->324097K(3670016K), 0.3457390 secs] 5185450K->2348522K(7864320K), 0.3460950 secs] [Times: user=8.01 sys=0.23, real=0.34 secs] 
2013-04-07T08:42:03.167+0800: 835115.665: [GC 835115.665: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:   14631544 bytes,   14631544 total
: 3469825K->19313K(3670016K), 0.3222880 secs] 5494250K->2347456K(7864320K), 0.3226280 secs] [Times: user=7.91 sys=0.20, real=0.32 secs] 
2013-04-07T08:46:04.353+0800: 835356.851: [GC 835356.851: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:   12043952 bytes,   12043952 total
- age   2:   10291272 bytes,   22335224 total
: 3165041K->27367K(3670016K), 0.0304040 secs] 5493184K->2355511K(7864320K), 0.0307580 secs] [Times: user=0.46 sys=0.00, real=0.03 secs] 
2013-04-07T08:50:30.606+0800: 835623.103: [GC 835623.104: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  320838872 bytes,  320838872 total
- age   2:    8439048 bytes,  329277920 total
- age   3:    8783736 bytes,  338061656 total
: 3173095K->340297K(3670016K), 0.2662150 secs] 5501239K->2668441K(7864320K), 0.2665650 secs] [Times: user=6.58 sys=0.06, real=0.26 secs] 
2013-04-07T08:55:04.642+0800: 835897.140: [GC 835897.140: [ParNew (promotion failed)
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  319886912 bytes,  319886912 total
: 3479337K->3469211K(3670016K), 2.7858560 secs]835899.926: [CMS: 2390293K->2806333K(4194304K), 7.2327670 secs] 5807481K->2806333K(7864320K), [CMS Perm : 142232K->139274K(262144K)], 10.0190850 secs] [Times: user=14.94 sys=0.08, real=10.02 secs] 
2013-04-07T09:00:05.172+0800: 836197.669: [GC 836197.670: [ParNew
Desired survivor size 268435456 bytes, new threshold 1 (max 7)
- age   1:  347800688 bytes,  347800688 total
: 3145728K->345613K(3670016K), 0.2649090 secs] 5952061K->3151947K(7864320K), 0.2652750 secs] [Times: user=6.55 sys=0.06, real=0.26 secs] 
2013-04-07T09:05:05.820+0800: 836498.318: [GC 836498.319: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:   21618496 bytes,   21618496 total
: 3491341K->32357K(3670016K), 0.3228680 secs] 6297675K->3175375K(7864320K), 0.3232160 secs] [Times: user=6.47 sys=0.11, real=0.32 secs] 
2013-04-07T09:05:06.145+0800: 836498.643: [GC [1 CMS-initial-mark: 3143018K(4194304K)] 3179072K(7864320K), 0.0915790 secs] [Times: user=0.09 sys=0.00, real=0.10 secs] 
2013-04-07T09:05:06.238+0800: 836498.735: [CMS-concurrent-mark-start]
2013-04-07T09:05:06.570+0800: 836499.068: [CMS-concurrent-mark: 0.332/0.332 secs] [Times: user=4.09 sys=0.02, real=0.33 secs] 
2013-04-07T09:05:06.570+0800: 836499.068: [CMS-concurrent-preclean-start]
2013-04-07T09:05:06.609+0800: 836499.107: [CMS-concurrent-preclean: 0.039/0.039 secs] [Times: user=0.04 sys=0.00, real=0.04 secs] 
2013-04-07T09:05:06.609+0800: 836499.107: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2013-04-07T09:05:08.112+0800: 836500.610: [CMS-concurrent-abortable-preclean: 1.403/1.503 secs] [Times: user=1.66 sys=0.03, real=1.50 secs] 
2013-04-07T09:05:08.114+0800: 836500.612: [GC[YG occupancy: 63468 K (3670016 K)]2013-04-07T09:05:08.114+0800: 836500.612: [GC 836500.612: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:     361232 bytes,     361232 total
- age   2:   17726288 bytes,   18087520 total
: 63468K->24161K(3670016K), 0.0197520 secs] 3206486K->3167179K(7864320K), 0.0200000 secs] [Times: user=0.37 sys=0.01, real=0.02 secs] 
836500.632: [Rescan (parallel) , 0.0154310 secs]836500.648: [weak refs processing, 0.0003580 secs] [1 CMS-remark: 3143018K(4194304K)] 3167179K(7864320K), 0.0360780 secs] [Times: user=0.74 sys=0.02, real=0.04 secs] 
2013-04-07T09:05:08.151+0800: 836500.649: [CMS-concurrent-sweep-start]
2013-04-07T09:05:08.856+0800: 836501.354: [CMS-concurrent-sweep: 0.705/0.705 secs] [Times: user=0.79 sys=0.01, real=0.71 secs] 
2013-04-07T09:05:08.857+0800: 836501.354: [CMS-concurrent-reset-start]
2013-04-07T09:05:08.885+0800: 836501.383: [CMS-concurrent-reset: 0.029/0.029 secs] [Times: user=0.03 sys=0.00, real=0.02 secs] 
2013-04-07T09:05:10.887+0800: 836503.385: [GC [1 CMS-initial-mark: 3098897K(4194304K)] 3155998K(7864320K), 0.1923090 secs] [Times: user=0.19 sys=0.00, real=0.19 secs] 
2013-04-07T09:05:11.080+0800: 836503.578: [CMS-concurrent-mark-start]
2013-04-07T09:05:11.446+0800: 836503.944: [CMS-concurrent-mark: 0.366/0.366 secs] [Times: user=4.15 sys=0.03, real=0.37 secs] 
2013-04-07T09:05:11.446+0800: 836503.944: [CMS-concurrent-preclean-start]
2013-04-07T09:05:11.490+0800: 836503.987: [CMS-concurrent-preclean: 0.042/0.043 secs] [Times: user=0.05 sys=0.00, real=0.04 secs] 
2013-04-07T09:05:11.490+0800: 836503.987: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2013-04-07T09:05:12.889+0800: 836505.386: [CMS-concurrent-abortable-preclean: 1.198/1.399 secs] [Times: user=1.27 sys=0.02, real=1.40 secs] 
2013-04-07T09:05:12.890+0800: 836505.388: [GC[YG occupancy: 76717 K (3670016 K)]2013-04-07T09:05:12.891+0800: 836505.388: [GC 836505.388: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:     469248 bytes,     469248 total
- age   2:     109400 bytes,     578648 total
- age   3:   17610152 bytes,   18188800 total
: 76717K->24207K(3670016K), 0.0173870 secs] 3175615K->3123105K(7864320K), 0.0176370 secs] [Times: user=0.38 sys=0.00, real=0.02 secs] 
836505.406: [Rescan (parallel) , 0.0168270 secs]836505.423: [weak refs processing, 0.0000440 secs] [1 CMS-remark: 3098897K(4194304K)] 3123105K(7864320K), 0.0348030 secs] [Times: user=0.72 sys=0.01, real=0.03 secs] 
2013-04-07T09:05:12.926+0800: 836505.424: [CMS-concurrent-sweep-start]
2013-04-07T09:05:13.600+0800: 836506.098: [CMS-concurrent-sweep: 0.674/0.674 secs] [Times: user=0.77 sys=0.01, real=0.68 secs] 
2013-04-07T09:05:13.601+0800: 836506.098: [CMS-concurrent-reset-start]
2013-04-07T09:05:13.629+0800: 836506.127: [CMS-concurrent-reset: 0.029/0.029 secs] [Times: user=0.03 sys=0.00, real=0.03 secs] 
2013-04-07T09:05:15.631+0800: 836508.129: [GC [1 CMS-initial-mark: 3098862K(4194304K)] 3144523K(7864320K), 0.1552790 secs] [Times: user=0.16 sys=0.00, real=0.15 secs] 
2013-04-07T09:05:15.787+0800: 836508.285: [CMS-concurrent-mark-start]
2013-04-07T09:05:16.093+0800: 836508.591: [CMS-concurrent-mark: 0.306/0.306 secs] [Times: user=3.67 sys=0.00, real=0.30 secs] 
2013-04-07T09:05:16.094+0800: 836508.591: [CMS-concurrent-preclean-start]
2013-04-07T09:05:16.135+0800: 836508.633: [CMS-concurrent-preclean: 0.041/0.041 secs] [Times: user=0.05 sys=0.01, real=0.04 secs] 
2013-04-07T09:05:16.135+0800: 836508.633: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2013-04-07T09:05:17.278+0800: 836509.776: [CMS-concurrent-abortable-preclean: 1.043/1.143 secs] [Times: user=1.11 sys=0.01, real=1.15 secs] 
2013-04-07T09:05:17.280+0800: 836509.778: [GC[YG occupancy: 56675 K (3670016 K)]2013-04-07T09:05:17.280+0800: 836509.778: [GC 836509.778: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:     583600 bytes,     583600 total
- age   2:     167936 bytes,     751536 total
- age   3:      89224 bytes,     840760 total
- age   4:   17586936 bytes,   18427696 total
: 56675K->23067K(3670016K), 0.0174440 secs] 3155537K->3121929K(7864320K), 0.0177180 secs] [Times: user=0.38 sys=0.01, real=0.02 secs] 
836509.796: [Rescan (parallel) , 0.0180150 secs]836509.814: [weak refs processing, 0.0000430 secs] [1 CMS-remark: 3098862K(4194304K)] 3121929K(7864320K), 0.0360570 secs] [Times: user=0.73 sys=0.01, real=0.03 secs] 
2013-04-07T09:05:17.317+0800: 836509.815: [CMS-concurrent-sweep-start]
2013-04-07T09:05:17.993+0800: 836510.491: [CMS-concurrent-sweep: 0.676/0.676 secs] [Times: user=0.73 sys=0.01, real=0.67 secs] 
2013-04-07T09:05:17.993+0800: 836510.491: [CMS-concurrent-reset-start]
2013-04-07T09:05:18.022+0800: 836510.520: [CMS-concurrent-reset: 0.029/0.029 secs] [Times: user=0.05 sys=0.00, real=0.03 secs] 
2013-04-07T09:05:20.024+0800: 836512.521: [GC [1 CMS-initial-mark: 3098786K(4194304K)] 3166510K(7864320K), 0.2469520 secs] [Times: user=0.25 sys=0.00, real=0.25 secs] 
2013-04-07T09:05:20.271+0800: 836512.769: [CMS-concurrent-mark-start]
2013-04-07T09:05:20.586+0800: 836513.084: [CMS-concurrent-mark: 0.314/0.315 secs] [Times: user=3.87 sys=0.01, real=0.31 secs] 
2013-04-07T09:05:20.587+0800: 836513.084: [CMS-concurrent-preclean-start]
2013-04-07T09:05:20.630+0800: 836513.128: [CMS-concurrent-preclean: 0.043/0.044 secs] [Times: user=0.05 sys=0.00, real=0.05 secs] 
2013-04-07T09:05:20.630+0800: 836513.128: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2013-04-07T09:05:21.915+0800: 836514.413: [CMS-concurrent-abortable-preclean: 1.284/1.284 secs] [Times: user=1.45 sys=0.03, real=1.28 secs] 
2013-04-07T09:05:21.917+0800: 836514.414: [GC[YG occupancy: 96886 K (3670016 K)]2013-04-07T09:05:21.917+0800: 836514.415: [GC 836514.415: [ParNew
Desired survivor size 268435456 bytes, new threshold 7 (max 7)
- age   1:    1175104 bytes,    1175104 total
- age   2:     196944 bytes,    1372048 total
- age   3:     155496 bytes,    1527544 total
- age   4:      84448 bytes,    1611992 total
- age   5:   17459744 bytes,   19071736 total
: 96886K->26168K(3670016K), 0.0182610 secs] 3195673K->3124954K(7864320K), 0.0185400 secs] [Times: user=0.38 sys=0.01, real=0.02 secs] 
836514.433: [Rescan (parallel) , 0.0157270 secs]836514.449: [weak refs processing, 0.0000550 secs] [1 CMS-remark: 3098786K(4194304K)] 3124954K(7864320K), 0.0346020 secs] [Times: user=0.74 sys=0.02, real=0.04 secs] 
2013-04-07T09:05:21.952+0800: 836514.450: [CMS-concurrent-sweep-start]
2013-04-07T09:05:22.645+0800: 836515.142: [CMS-concurrent-sweep: 0.692/0.693 secs] [Times: user=0.87 sys=0.01, real=0.69 secs] 
2013-04-07T09:05:22.645+0800: 836515.142: [CMS-concurrent-reset-start]
2013-04-07T09:05:22.674+0800: 836515.171: [CMS-concurrent-reset: 0.029/0.029 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 


jvm进行如了没完没了的CMS GC,直到11点后(负载小了)才开始进行正常的Minor GC。


总结了问题的原因:

由于Minjo分配过大,Minjo GC是暂停所有的线程,回收时锁定了现在的正在运行的线程对象,而这些线程对象时正在被使用的因此不允许回收。Minjo越大不能被回收的对象就越多。再由于使用了Minjo的eden和Survivor的SurvivorRatio=6和默认的Desired survivor size 50%=256m设置,因此不断的age=1为300m进入到marjo,marjo担保失败,进了传说的悲观策略。

悲观策略文章引用:http://tmalltesting.com/archives/663  说的很明白,major的cms进入了2秒的不间断循环(这时的负载也比较大)。

调整方案:

eden设置的小一点,有3072m调整到2048m,Desired survivor size为1024m。minor回收时锁定的对象少了,回收频繁了,大的Survivor提供足够的空间,threshold有7调整到默认的4。系统恢复正常了。









  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值