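Introduction
1. A garbage collector has three jobs:
Allocate memory: the design of the collection algorithm usually constrains how memory can be allocated;
Make sure live objects are never reclaimed;
Reclaim garbage objects (garbage means objects that are no longer in use).
2. Whatever the algorithm, every garbage collector follows the same basic flow:
Scan to find the roots -> trace from the roots to find the live objects they reference -> delete the objects that are no longer referenced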
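The three-step flow above can be sketched on a toy object graph. This is only an illustration: Obj and ToyCollector are hypothetical names invented for this example, and a real collector works on raw heap memory rather than Java lists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy heap object; not a HotSpot type.
class Obj {
    final String name;
    final List<Obj> refs = new ArrayList<>();
    Obj(String name) { this.name = name; }
}

class ToyCollector {
    // Trace from the roots and return every reachable (live) object.
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> live = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (live.add(o)) {            // only follow references on the first visit
                pending.addAll(o.refs);
            }
        }
        return live;
    }

    // Keep only the objects that were marked; everything else is garbage.
    static List<Obj> sweep(List<Obj> heap, Set<Obj> live) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : heap) {
            if (live.contains(o)) {
                survivors.add(o);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);                               // a -> b; c is unreachable
        List<Obj> heap = Arrays.asList(a, b, c);
        List<Obj> after = sweep(heap, mark(Arrays.asList(a)));
        System.out.println(after.size());            // prints 2
    }
}
```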
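I. Characteristics of the G1 garbage collector
1. G1's design principle is to collect the regions that yield the most garbage first (Garbage First); in other words, it does not try to reclaim all garbage in a single collection.
2. G1 partitions memory into regions (Region): the heap is divided into many equally sized regions.
3. G1 is a generational collector and reclaims garbage in both the young and the old generation. It has two collection modes: young GC (YGC) and mixed GC (a YGC plus collection of some old regions). The algorithm has two phases: the marking cycle phase and the evacuation phase.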
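II. G1 terminology
1. Region
G1 divides the heap into equally sized regions. In the HotSpot implementation the whole heap is split into roughly 2048 regions. Each region is between 1 MB and 32 MB (always a power of two), depending on the heap size. The region size can also be set explicitly with the -XX:G1HeapRegionSize option. Each region in use is tagged as one of E (Eden), S (Survivor), O (Old) or H (Humongous).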
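To make the sizing rule concrete, here is a minimal sketch of how a region size could be derived from the heap size: aim for about 2048 regions, round down to a power of two, and clamp to the 1 MB to 32 MB range. RegionSizing is a hypothetical class written for this article; the real HotSpot heuristic may differ in detail between JDK versions.

```java
// Simplified sketch of the region-size rule; not HotSpot code.
class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048;

    static long regionSize(long heapBytes) {
        // Aim for about 2048 regions, but never go below 1 MB.
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        // Round down to a power of two.
        size = Long.highestOneBit(size);
        // Never exceed 32 MB.
        return Math.min(size, MAX_REGION);
    }

    public static void main(String[] args) {
        // A 2048 MB heap, as in the logs later in this article.
        System.out.println(regionSize(2048 * MB) / MB);  // prints 1
    }
}
```

With a 2048 MB heap this gives 1 MB regions, which is consistent with the whole-megabyte Eden and Survivor sizes in the logs further down.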
2. Card Table
Every region is divided into fixed-size cards (Card). One byte per card records whether that card has been modified. The card table is the collection of these bytes.
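A card table can be pictured as a byte array indexed by (address - heap base) / card size. The sketch below assumes 512-byte cards and a simple 0 = clean, 1 = dirty encoding; both the CardTable class and that encoding are illustrative assumptions, not the values HotSpot uses internally.

```java
// Illustrative card table; the class and the byte encoding are assumptions.
class CardTable {
    static final int CARD_SHIFT = 9;     // 2^9 = 512-byte cards (assumed)
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    private final long heapBase;
    private final byte[] cards;          // one byte per card

    CardTable(long heapBase, long heapBytes) {
        this.heapBase = heapBase;
        long cardSize = 1L << CARD_SHIFT;
        this.cards = new byte[(int) ((heapBytes + cardSize - 1) >>> CARD_SHIFT)];
    }

    // Map a heap address to the index of the card that covers it.
    int cardIndex(long address) {
        return (int) ((address - heapBase) >>> CARD_SHIFT);
    }

    // Called when a reference field at this address is written.
    void markDirty(long address) {
        cards[cardIndex(address)] = DIRTY;
    }

    boolean isDirty(int index) {
        return cards[index] == DIRTY;
    }

    int cardCount() {
        return cards.length;
    }

    public static void main(String[] args) {
        CardTable table = new CardTable(0x1000L, 4096L);  // 8 cards
        table.markDirty(0x1000L + 513);                   // falls in card 1
        System.out.println(table.isDirty(1));             // prints true
    }
}
```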
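3. Remembered Set (RSet)
Each region has an RSet that records references from other regions into this region. Whenever an old-generation object references a young-generation object, that reference is recorded, so a young collection can find its roots without scanning the whole old generation. Each entry's key is the start address of a referencing region and its value is a set of card indexes. When a region is about to be collected, G1 scans the region's RSet to decide whether the objects that reference into it are alive, and from that which objects in the region are alive, avoiding a full-heap scan.
PS: My understanding is that an RSet is kept on the region being referenced, so for YGC it is the young regions' RSets that matter (some articles online say the record is kept on the old-generation side, which I do not agree with). A YGC has to know which young objects are referenced from the old generation, and with the RSet it does not need to scan the whole old generation. References in the other direction, from young to old, do not need to be recorded: a mixed GC collects the whole young generation along with some old regions, and few young objects survive, so scanning the young generation is fast. Old regions still keep their own RSets, though, so that a mixed GC can find references into them from old regions outside the CSet.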
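The key/value layout described above (referencing region -> set of card indexes) can be sketched as a small map. RememberedSet here is a hypothetical class for illustration; HotSpot uses more compact structures, but the lookup idea is the same.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative per-region remembered set; not the HotSpot data structure.
class RememberedSet {
    // key: start address of a referencing region; value: indexes of the
    // cards in that region that contain references into the owning region.
    private final Map<Long, Set<Integer>> cardsByRegion = new TreeMap<>();

    void recordReference(long fromRegionStart, int cardIndex) {
        cardsByRegion.computeIfAbsent(fromRegionStart, k -> new TreeSet<>()).add(cardIndex);
    }

    Set<Integer> cardsFrom(long fromRegionStart) {
        return cardsByRegion.getOrDefault(fromRegionStart, Collections.<Integer>emptySet());
    }

    // Total number of cards a collection of the owning region must scan.
    int cardsToScan() {
        int total = 0;
        for (Set<Integer> cards : cardsByRegion.values()) {
            total += cards.size();
        }
        return total;
    }

    public static void main(String[] args) {
        RememberedSet rset = new RememberedSet();
        rset.recordReference(0x100000L, 3);
        rset.recordReference(0x100000L, 3);   // same card again: no new entry
        rset.recordReference(0x100000L, 7);
        rset.recordReference(0x200000L, 1);
        System.out.println(rset.cardsToScan());  // prints 3
    }
}
```

Collecting the owning region then means scanning three cards instead of the whole heap.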
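4. CSet
The Collection Set: the set of regions that will actually be collected. Every region in the CSet is freed, and the live objects inside are moved (evacuated) to newly allocated free regions.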
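5. TLAB
Thread-local allocation buffer. When a thread allocates memory it first tries its own TLAB; when the TLAB runs out of space the thread obtains a new buffer from a region. This reduces contention between threads.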
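The allocation path just described can be sketched as bump-pointer allocation inside a per-thread buffer, with a synchronized refill as the only shared step. Tlab and ToyAllocator are hypothetical names, addresses are plain offsets, and the sketch assumes every request fits into a single TLAB.

```java
// Illustrative TLAB: a private buffer with a bump pointer.
class Tlab {
    private long top;          // next free address in this buffer
    private final long end;    // first address past this buffer

    Tlab(long start, long size) {
        this.top = start;
        this.end = start + size;
    }

    // Returns the allocated address, or -1 if the buffer is exhausted.
    long allocate(long bytes) {
        if (end - top < bytes) {
            return -1;
        }
        long address = top;
        top += bytes;
        return address;
    }
}

class ToyAllocator {
    private final long tlabSize;
    private long regionTop = 0;                          // shared by all threads
    private final ThreadLocal<Tlab> current = new ThreadLocal<>();

    ToyAllocator(long tlabSize) {
        this.tlabSize = tlabSize;
    }

    // The only step that needs synchronization: carving a new buffer out of the region.
    private synchronized Tlab refill() {
        Tlab fresh = new Tlab(regionTop, tlabSize);
        regionTop += tlabSize;
        return fresh;
    }

    // Fast path: bump the pointer in the thread's own TLAB.
    // Assumes bytes <= tlabSize; larger requests return -1 in this sketch.
    long allocate(long bytes) {
        Tlab tlab = current.get();
        long address = (tlab == null) ? -1 : tlab.allocate(bytes);
        if (address < 0) {
            tlab = refill();
            current.set(tlab);
            address = tlab.allocate(bytes);
        }
        return address;
    }

    public static void main(String[] args) {
        ToyAllocator allocator = new ToyAllocator(64);
        System.out.println(allocator.allocate(40));  // prints 0
        System.out.println(allocator.allocate(40));  // prints 64 (a new TLAB was needed)
        System.out.println(allocator.allocate(16));  // prints 104
    }
}
```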
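6. SATB
Snapshot-at-the-beginning: a logical snapshot of the objects that are live when marking starts (the initial-mark phase). During concurrent marking the application may overwrite existing references, so before each such write G1 records the reference that is about to be overwritten (in a log buffer). The final marking phase (remark) reads these records and marks the recorded objects, so everything in the snapshot is treated as live.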
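The idea of logging the old value before a reference is overwritten can be shown with a tiny model. SatbNode and SatbMarker are hypothetical names; in the JVM the barrier is emitted by the JIT compiler and the buffers are per-thread, which this sketch leaves out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy object with a single reference field, to keep the sketch small.
class SatbNode {
    final String name;
    SatbNode field;
    boolean marked;
    SatbNode(String name) { this.name = name; }
}

class SatbMarker {
    boolean markingActive;
    final Deque<SatbNode> satbBuffer = new ArrayDeque<>();

    // Pre-write barrier: while marking is active, log the reference
    // that is about to be overwritten, then perform the write.
    void writeField(SatbNode holder, SatbNode newValue) {
        if (markingActive && holder.field != null) {
            satbBuffer.push(holder.field);
        }
        holder.field = newValue;
    }

    // Remark: drain the buffer and mark everything reachable from the logged values.
    void remark() {
        while (!satbBuffer.isEmpty()) {
            SatbNode n = satbBuffer.pop();
            while (n != null && !n.marked) {
                n.marked = true;
                n = n.field;
            }
        }
    }

    public static void main(String[] args) {
        SatbNode a = new SatbNode("a"), b = new SatbNode("b");
        a.field = b;                    // snapshot at marking start: a -> b
        SatbMarker gc = new SatbMarker();
        gc.markingActive = true;
        a.marked = true;                // the marker has visited a, but not yet b
        gc.writeField(a, null);         // the application cuts a -> b
        gc.remark();
        System.out.println(b.marked);   // prints true
    }
}
```

Without the barrier, b would never be visited even though it was live when marking started; with it, b is conservatively kept alive until the next cycle.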
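III. The G1 collection process
The whole process can be split into two major parts:
Marking cycle phase: the marking phase, which runs repeatedly as a cycle;
Evacuation phase: copies the live objects of some regions into empty regions and then reclaims the original regions. This phase is STW (stop-the-world).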
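1. Marking cycle phase
1.1. initial-mark: G1 scans all the roots. This step piggybacks on a young GC pause (STW).
1.2. concurrent-root-region-scan: scans the survivor regions to find references from the young generation into the old generation. This step must finish before the next YGC starts, because a YGC changes the survivor regions.
PS: A live old-generation object may be referenced only by young-generation objects. During a YGC the surviving young objects are copied into survivor regions, so these survivor regions have to be scanned for references that point to old-generation objects; those references become part of the roots used to scan the old generation during concurrent marking.
1.3. concurrent-mark: marks the live objects across the whole heap. This step can be interrupted by a YGC. Reference updates made while it runs are recorded by the SATB write barrier.
1.4. remark: only needs to process the SATB (snapshot-at-the-beginning) buffers, handling the references to live objects that were recorded during the concurrent phase (STW).
1.5. cleanup: computes the amount of live data in every region and puts regions with no live objects at all straight onto the free list. It also resets the Remembered Sets. The part that counts live objects per region is STW, whereas resetting the Remembered Sets can run concurrently.
PS: This step mainly uses the RSets and the marking bitmap to count live objects, reset the RSets, and put empty regions on the free-region list. The live-object statistics are used to sort the regions for the next CSet selection. The cleanup step does not clean up garbage objects and does not copy live objects.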
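What cleanup does with the per-region live counts can be sketched as two small functions: one that picks out completely empty regions, and one that ranks the rest by how much garbage they hold. RegionStats and CleanupSketch are hypothetical names, and the sizes in the example are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical per-region statistics produced by marking.
class RegionStats {
    final int id;
    final long capacityBytes;
    final long liveBytes;

    RegionStats(int id, long capacityBytes, long liveBytes) {
        this.id = id;
        this.capacityBytes = capacityBytes;
        this.liveBytes = liveBytes;
    }

    long garbageBytes() {
        return capacityBytes - liveBytes;
    }
}

class CleanupSketch {
    // Regions with no live data can go straight to the free list.
    static List<Integer> emptyRegionIds(List<RegionStats> regions) {
        List<Integer> free = new ArrayList<>();
        for (RegionStats r : regions) {
            if (r.liveBytes == 0) {
                free.add(r.id);
            }
        }
        return free;
    }

    // Remaining regions, most garbage first, as candidates for a later CSet.
    static List<RegionStats> rankCandidates(List<RegionStats> regions) {
        List<RegionStats> candidates = new ArrayList<>();
        for (RegionStats r : regions) {
            if (r.liveBytes > 0) {
                candidates.add(r);
            }
        }
        candidates.sort((x, y) -> Long.compare(y.garbageBytes(), x.garbageBytes()));
        return candidates;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<RegionStats> regions = Arrays.asList(
                new RegionStats(0, mb, 0),             // completely empty
                new RegionStats(1, mb, 700 * 1024),    // mostly live
                new RegionStats(2, mb, 100 * 1024));   // mostly garbage
        System.out.println(emptyRegionIds(regions));            // prints [0]
        System.out.println(rankCandidates(regions).get(0).id);  // prints 2
    }
}
```

Note that nothing is copied or freed object by object here, which matches the point that cleanup itself does not evacuate anything.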
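2. Evacuation phase (STW)
Main steps: the first step selects a number of regions to collect; the selected regions are called the Collection Set (CSet). The second step copies the live objects in those regions into free regions and puts the reclaimed regions on the free-region list.
These two steps can be broken down into three tasks:
- Update the RSets from the RSet log: an RSet is only guaranteed to be accurate and complete once its log has been processed, which is an important reason why evacuation is STW;
- Scan the RSets and the other roots to determine the live objects: this task relies mostly on the RSets;
- Copy the live objects: starting from the roots determined in the second task, follow the reference chains and copy the live objects into new regions. During this process some young-generation objects may be promoted to the old generation.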
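The copy task can be sketched as a trace that starts from the roots and only follows references while it stays inside the CSet. HeapObj and Evacuator are hypothetical names, and "copying" is modelled by simply changing the object's region number; a real evacuation moves the bytes and installs forwarding pointers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy object that knows which region it lives in.
class HeapObj {
    final String name;
    int region;
    final List<HeapObj> refs = new ArrayList<>();

    HeapObj(String name, int region) {
        this.name = name;
        this.region = region;
    }
}

class Evacuator {
    // roots: objects referenced from external roots or from RSet entries.
    // Returns the number of objects moved out of the CSet.
    static int evacuate(List<HeapObj> roots, Set<Integer> cset, int targetRegion) {
        Set<HeapObj> visited = new HashSet<>();
        Deque<HeapObj> work = new ArrayDeque<>(roots);
        int copied = 0;
        while (!work.isEmpty()) {
            HeapObj o = work.pop();
            if (!visited.add(o)) {
                continue;                       // already handled
            }
            if (cset.contains(o.region)) {
                o.region = targetRegion;        // "copy" the live object out
                copied++;
                work.addAll(o.refs);            // keep following the chain
            }
            // Objects outside the CSet are not being collected, so the trace stops there.
        }
        return copied;
    }

    public static void main(String[] args) {
        HeapObj a = new HeapObj("a", 1), b = new HeapObj("b", 1);
        HeapObj c = new HeapObj("c", 1), d = new HeapObj("d", 2);
        a.refs.add(b);
        b.refs.add(d);                          // c is unreachable garbage
        int moved = evacuate(Arrays.asList(a), Collections.singleton(1), 9);
        System.out.println(moved);              // prints 2
        System.out.println(c.region);           // prints 1 (left behind, freed with the region)
    }
}
```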
IV. G1 garbage collector log analysis
1. YGC log with initial-mark
2019-05-12T12:22:53.142+0800: 232724.880: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0261880 secs]
[Parallel Time: 21.7 ms, GC Workers: 8]
[GC Worker Start (ms): Min: 232724880.2, Avg: 232724880.3, Max: 232724880.4, Diff: 0.2]
[Ext Root Scanning (ms): Min: 0.5, Avg: 0.9, Max: 2.7, Diff: 2.2, Sum: 6.9]
[Code Root Marking (ms): Min: 0.7, Avg: 2.3, Max: 6.8, Diff: 6.1, Sum: 18.7]
[Update RS (ms): Min: 3.7, Avg: 7.5, Max: 8.4, Diff: 4.7, Sum: 59.8]
[Processed Buffers: Min: 10, Avg: 15.1, Max: 21, Diff: 11, Sum: 121]
[Scan RS (ms): Min: 0.1, Avg: 0.3, Max: 0.4, Diff: 0.3, Sum: 2.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[Object Copy (ms): Min: 9.3, Avg: 10.2, Max: 10.5, Diff: 1.2, Sum: 82.0]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 1.0]
[GC Worker Total (ms): Min: 21.2, Avg: 21.4, Max: 21.6, Diff: 0.4, Sum: 171.2]
[GC Worker End (ms): Min: 232724901.6, Avg: 232724901.7, Max: 232724901.9, Diff: 0.2]
[Code Root Fixup: 0.1 ms]
[Code Root Migration: 0.2 ms]
[Clear CT: 0.5 ms]
[Other: 3.7 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.7 ms]
[Ref Enq: 0.0 ms]
[Free CSet: 2.0 ms]
[Eden: 889.0M(889.0M)->0.0B(888.0M) Survivors: 16.0M->16.0M Heap: 1826.3M(2048.0M)->938.2M(2048.0M)]
[Times: user=0.16 sys=0.01, real=0.03 secs]
(Log of the marking cycle phase)
2019-05-12T12:22:53.169+0800: 232724.906: [GC concurrent-root-region-scan-start]
2019-05-12T12:22:53.183+0800: 232724.921: [GC concurrent-root-region-scan-end, 0.0145378 secs]
2019-05-12T12:22:53.183+0800: 232724.921: [GC concurrent-mark-start]
2019-05-12T12:22:53.266+0800: 232725.004: [GC concurrent-mark-end, 0.0827598 secs]
2019-05-12T12:22:53.267+0800: 232725.004: [GC remark 232725.005: [GC ref-proc, 0.0044491 secs], 0.0111185 secs]
[Times: user=0.01 sys=0.01, real=0.01 secs]
2019-05-12T12:22:53.278+0800: 232725.016: [GC cleanup 944M->774M(2048M), 0.0098818 secs]
[Times: user=0.06 sys=0.00, real=0.01 secs]
2019-05-12T12:22:53.289+0800: 232725.026: [GC concurrent-cleanup-start]
2019-05-12T12:22:53.290+0800: 232725.027: [GC concurrent-cleanup-end, 0.0010002 secs]
2. YGC log without initial-mark
2019-05-09T19:44:10.108+0800: 1.846: [GC pause (G1 Evacuation Pause) (young), 0.0382501 secs]
[Parallel Time: 16.5 ms, GC Workers: 8]
[GC Worker Start (ms): Min: 1846.6, Avg: 1854.5, Max: 1860.1, Diff: 13.5]
[Ext Root Scanning (ms): Min: 0.0, Avg: 0.6, Max: 2.1, Diff: 2.1, Sum: 4.5]
[Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.8, Diff: 0.8, Sum: 1.1]
[Object Copy (ms): Min: 0.0, Avg: 4.6, Max: 11.3, Diff: 11.3, Sum: 36.5]
[Termination (ms): Min: 0.0, Avg: 0.5, Max: 1.8, Diff: 1.8, Sum: 4.1]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[GC Worker Total (ms): Min: 0.0, Avg: 5.8, Max: 13.5, Diff: 13.5, Sum: 46.7]
[GC Worker End (ms): Min: 1860.1, Avg: 1860.3, Max: 1861.7, Diff: 1.6]
[Code Root Fixup: 0.8 ms]
[Code Root Migration: 1.5 ms]
[Clear CT: 0.3 ms]
[Other: 19.1 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 18.7 ms]
[Ref Enq: 0.1 ms]
[Free CSet: 0.2 ms]
[Eden: 102.0M(102.0M)->0.0B(89.0M) Survivors: 0.0B->13.0M Heap: 102.0M(2048.0M)->15.4M(2048.0M)]
(Here the heap shrinks by 86.6M, which is roughly the size of Eden.)
3. Mixed GC log
2019-05-12T12:26:04.184+0800: 232915.921: [GC pause (G1 Evacuation Pause) (mixed), 0.0193298 secs]
[Parallel Time: 14.5 ms, GC Workers: 8]
[GC Worker Start (ms): Min: 232915922.5, Avg: 232915923.7, Max: 232915931.5, Diff: 9.0]
[Ext Root Scanning (ms): Min: 0.0, Avg: 0.5, Max: 1.2, Diff: 1.2, Sum: 3.8]
[Update RS (ms): Min: 0.0, Avg: 4.4, Max: 5.2, Diff: 5.2, Sum: 35.1]
[Processed Buffers: Min: 0, Avg: 10.1, Max: 18, Diff: 18, Sum: 81]
[Scan RS (ms): Min: 0.0, Avg: 1.8, Max: 2.1, Diff: 2.1, Sum: 14.4]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[Object Copy (ms): Min: 4.5, Avg: 5.7, Max: 6.1, Diff: 1.6, Sum: 45.9]
[Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, Sum: 2.2]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[GC Worker Total (ms): Min: 4.8, Avg: 12.7, Max: 14.1, Diff: 9.3, Sum: 101.9]
[GC Worker End (ms): Min: 232915936.4, Avg: 232915936.4, Max: 232915936.8, Diff: 0.4]
[Code Root Fixup: 0.1 ms]
[Code Root Migration: 0.2 ms]
[Clear CT: 0.5 ms]
[Other: 4.0 ms]
[Choose CSet: 0.9 ms]
[Ref Proc: 0.7 ms]
[Ref Enq: 0.0 ms]
[Free CSet: 2.1 ms]
[Eden: 93.0M(93.0M)->0.0B(93.0M) Survivors: 9216.0K->9216.0K Heap: 592.6M(2048.0M)->295.2M(2048.0M)]
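[Times: user=0.09 sys=0.02, real=0.02 secs]
(Old-generation regions were collected here: the heap shrinks by 297.4M, far more than the 93M of Eden that was reclaimed.)
As the logs above show, the marking phase and the reclamation phase are separate. Because G1 tries to finish each GC within the pause-time target set by the user, a single GC does not reclaim all of the marked garbage at once.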
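The comparison between "heap freed" and "Eden size" made for the young and mixed pauses can be checked mechanically. The following is a small sketch that parses the Eden/Heap summary line in the log format shown above; G1LogLine is a hypothetical helper written for this article, and the regular expression only targets this particular (JDK 8 style) format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the "[Eden: ... Heap: ...]" summary line.
class G1LogLine {
    private static final Pattern SUMMARY = Pattern.compile(
            "Eden: ([0-9.]+)([BKMG])\\(.*?Heap: ([0-9.]+)([BKMG])\\([^)]*\\)->([0-9.]+)([BKMG])\\(");

    // Convert a value with a B/K/M/G suffix to megabytes.
    static double toMb(String number, String unit) {
        double value = Double.parseDouble(number);
        switch (unit) {
            case "B": return value / (1024.0 * 1024.0);
            case "K": return value / 1024.0;
            case "G": return value * 1024.0;
            default:  return value;   // "M"
        }
    }

    // Returns { Eden used before the pause, heap freed by the pause }, both in MB.
    static double[] summarize(String line) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a G1 summary line: " + line);
        }
        double edenBefore = toMb(m.group(1), m.group(2));
        double heapFreed = toMb(m.group(3), m.group(4)) - toMb(m.group(5), m.group(6));
        return new double[] { edenBefore, heapFreed };
    }

    public static void main(String[] args) {
        String young = "[Eden: 102.0M(102.0M)->0.0B(89.0M) Survivors: 0.0B->13.0M Heap: 102.0M(2048.0M)->15.4M(2048.0M)]";
        String mixed = "[Eden: 93.0M(93.0M)->0.0B(93.0M) Survivors: 9216.0K->9216.0K Heap: 592.6M(2048.0M)->295.2M(2048.0M)]";
        for (String line : new String[] { young, mixed }) {
            double[] s = summarize(line);
            System.out.println("eden=" + s[0] + "M freed=" + s[1] + "M");
        }
    }
}
```

For the young pause the freed amount stays below the Eden size, while for the mixed pause it is several times larger, which is the sign that old regions were reclaimed as well.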
Main reference:
https://www.jianshu.com/p/aef0f4765098