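1. Introduction
A young GC (YGC) runs entirely under stop-the-world (STW). To keep pauses short, old-generation collection has to do most of its work concurrently with the Mutator, so G1 introduces the mixed GC, which, like CMS, relies on concurrent marking.
A mixed collection consists of the following sub-phases:
- Initial mark
- Concurrent mark
- Remark
- Cleanup
- Garbage collection (evacuation)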
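2. Algorithm overview
2.1 Marking algorithm overview
Because the mixed GC marks concurrently, the Mutator can change object references at any time, which can lead to missed marks (a live object is never marked) and false marks (a dead object stays marked as live).
A false mark only produces floating garbage and does not cause a runtime error. A missed mark causes a live object to be reclaimed, which is a serious error. To avoid missed marks, G1 builds on tri-color marking:
- White: objects the collector has not reached yet
- Gray: live objects the collector has reached but whose fields have not been scanned yet
- Black: live objects whose fields have all been scanned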
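The three colors can be illustrated with a small worklist-based marker. This is a minimal single-threaded sketch over a toy object graph, not G1's implementation; `ToyObject`, `Color`, and `tri_color_mark` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy model; none of these names come from HotSpot.
enum class Color { White, Gray, Black };

struct ToyObject {
  Color color = Color::White;
  std::vector<std::size_t> fields;  // indices of the objects this one references
};

// Marks everything reachable from `roots` and returns the number of black objects.
std::size_t tri_color_mark(std::vector<ToyObject>& heap,
                           const std::vector<std::size_t>& roots) {
  std::deque<std::size_t> gray;  // worklist holding the gray objects
  for (std::size_t r : roots) {
    if (heap[r].color == Color::White) {
      heap[r].color = Color::Gray;
      gray.push_back(r);
    }
  }
  std::size_t black_count = 0;
  while (!gray.empty()) {
    std::size_t cur = gray.front();
    gray.pop_front();
    // Scan every field: white referents become gray.
    for (std::size_t ref : heap[cur].fields) {
      if (heap[ref].color == Color::White) {
        heap[ref].color = Color::Gray;
        gray.push_back(ref);
      }
    }
    // All fields scanned: the object turns black.
    heap[cur].color = Color::Black;
    ++black_count;
  }
  return black_count;
}
```

When the worklist is empty, every object that is still white is unreachable from the roots and can be treated as garbage.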
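2.2 SATB overview
SATB (snapshot-at-the-beginning): if a reference changes during concurrent marking, the write barrier that G1 attaches to reference stores (for example in the putfield bytecode) records the change, specifically the reference value being overwritten, in G1SATBMarkQueueSet and G1SATBMarkQueue. The recorded entries are then processed in the concurrent mark and remark sub-phases.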
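The idea behind the barrier can be sketched as follows. This is a simplified illustration under stated assumptions, not HotSpot code: `Node`, `ToySatbQueue`, `store_with_satb_barrier`, and `drain_satb_queue` are hypothetical names, there is a single queue instead of per-thread buffers, and draining marks the object directly instead of handing it to a marking task.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types; not the HotSpot classes.
struct Node {
  Node* field = nullptr;
  bool marked = false;
};

struct ToySatbQueue {
  bool marking_active = false;  // true only while concurrent marking runs
  std::vector<Node*> buffer;    // overwritten references awaiting processing
};

// Reference store preceded by a SATB-style pre-write barrier:
// the old value is recorded before it is overwritten.
inline void store_with_satb_barrier(ToySatbQueue& q, Node& holder, Node* new_value) {
  if (q.marking_active && holder.field != nullptr) {
    q.buffer.push_back(holder.field);
  }
  holder.field = new_value;
}

// Processes the queue: every recorded object is treated as live.
// Returns how many objects were newly marked.
inline std::size_t drain_satb_queue(ToySatbQueue& q) {
  std::size_t newly_marked = 0;
  for (Node* n : q.buffer) {
    if (!n->marked) {
      n->marked = true;
      ++newly_marked;
    }
  }
  q.buffer.clear();
  return newly_marked;
}
```

Because the overwritten referent is kept alive, an object that was reachable when marking started cannot be missed, at the cost of some floating garbage.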
3. Source code analysis
3.1 Deciding whether to start concurrent marking
g1Policy.cpp
G1IHOPControl* G1Policy::create_ihop_control(const G1Predictions* predictor){
if (G1UseAdaptiveIHOP) {
return new G1AdaptiveIHOPControl(InitiatingHeapOccupancyPercent,
predictor,
G1ReservePercent,
G1HeapWastePercent);
} else {
return new G1StaticIHOPControl(InitiatingHeapOccupancyPercent);
}
}
bool G1Policy::need_to_start_conc_mark(const char* source, size_t alloc_word_size) {
if (about_to_start_mixed_phase()) {
return false;
}
size_t marking_initiating_used_threshold = _ihop_control->get_conc_mark_start_threshold();
size_t cur_used_bytes = _g1h->non_young_capacity_bytes();
size_t alloc_byte_size = alloc_word_size * HeapWordSize;
size_t marking_request_bytes = cur_used_bytes + alloc_byte_size;
bool result = false;
if (marking_request_bytes > marking_initiating_used_threshold) {
result = collector_state()->in_young_only_phase() && !collector_state()->in_young_gc_before_mixed();
log_debug(gc, ergo, ihop)("%s occupancy: " SIZE_FORMAT "B allocation request: " SIZE_FORMAT "B threshold: " SIZE_FORMAT "B (%1.2f) source: %s",
result ? "Request concurrent cycle initiation (occupancy higher than threshold)" : "Do not request concurrent cycle initiation (still doing mixed collections)",
cur_used_bytes, alloc_byte_size, marking_initiating_used_threshold, (double) marking_initiating_used_threshold / _g1h->capacity() * 100, source);
}
return result;
}
void G1Policy::record_new_heap_size(uint new_number_of_regions) {
// re-calculate the necessary reserve
double reserve_regions_d = (double) new_number_of_regions * _reserve_factor;
// We use ceiling so that if reserve_regions_d is > 0.0 (but
// smaller than 1.0) we'll get 1.
_reserve_regions = (uint) ceil(reserve_regions_d);
_young_gen_sizer->heap_size_changed(new_number_of_regions);
_ihop_control->update_target_occupancy(new_number_of_regions * HeapRegion::GrainBytes);
}
g1IHOPControl.hpp (G1StaticIHOPControl)
size_t get_conc_mark_start_threshold() {
guarantee(_target_occupancy > 0, "Target occupancy must have been initialized.");
return (size_t) (_initial_ihop_percent * _target_occupancy / 100.0);
}
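- At the end of a YGC, G1 decides whether to start concurrent marking
- The decision compares the non-young space already in use plus the pending allocation against a threshold
- The threshold is controlled by the JVM flag InitiatingHeapOccupancyPercent (default 45); when G1UseAdaptiveIHOP is set, the code above passes the same value to G1AdaptiveIHOPControl as its initial value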
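The arithmetic behind this check can be reproduced in isolation. The sketch below mirrors only the static formula from get_conc_mark_start_threshold and the comparison in need_to_start_conc_mark; the function names are hypothetical, and the collector-phase checks are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>

// threshold = percent * target occupancy / 100, truncated to whole bytes.
inline std::size_t conc_mark_start_threshold(double ihop_percent,
                                             std::size_t target_occupancy_bytes) {
  return static_cast<std::size_t>(ihop_percent * target_occupancy_bytes / 100.0);
}

// True when the used non-young bytes plus the pending allocation exceed the threshold.
inline bool occupancy_exceeds_threshold(std::size_t non_young_bytes,
                                        std::size_t alloc_bytes,
                                        std::size_t threshold_bytes) {
  return non_young_bytes + alloc_bytes > threshold_bytes;
}
```

For example, with a target occupancy of 1000 bytes and the default of 45, the threshold is 450 bytes: 400 bytes in use plus a 51-byte request crosses it, while a 50-byte request does not.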
If concurrent marking is needed, the concurrent marking thread is notified:
g1CollectedHeap.cpp
void G1CollectedHeap::do_concurrent_mark() {
MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
if (!_cm_thread->in_progress()) {
_cm_thread->set_started();
CGC_lock->notify();
}
}
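3.2 Initial mark sub-phase
The initial mark sub-phase requires STW.
The root Regions of a mixed GC are the Survivor Regions left by the YGC.
The entry point for scanning the root Regions is in g1ConcurrentMark.cpp. As the assertion in G1CMRootRegionScanTask shows, the scan itself runs on concurrent GC threads: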
void G1ConcurrentMark::scan_root_regions() {
// scan_in_progress() will have been set to true only if there was
// at least one root region to scan. So, if it's false, we
// should not attempt to do any further work.
if (root_regions()->scan_in_progress()) {
assert(!has_aborted(), "Aborting before root region scanning is finished not supported.");
_num_concurrent_workers = MIN2(calc_active_marking_workers(),
// We distribute work on a per-region basis, so starting
// more threads than that is useless.
root_regions()->num_root_regions());
assert(_num_concurrent_workers <= _max_concurrent_workers,
"Maximum number of marking threads exceeded");
G1CMRootRegionScanTask task(this);
log_debug(gc, ergo)("Running %s using %u workers for %u work units.",
task.name(), _num_concurrent_workers, root_regions()->num_root_regions());
_concurrent_workers->run_task(&task, _num_concurrent_workers);
// It's possible that has_aborted() is true here without actually
// aborting the survivor scan earlier. This is OK as it's
// mainly used for sanity checking.
root_regions()->scan_finished();
}
}
- Runs G1CMRootRegionScanTask on the concurrent GC worker gang
class G1CMRootRegionScanTask : public AbstractGangTask {
G1ConcurrentMark* _cm;
public:
G1CMRootRegionScanTask(G1ConcurrentMark* cm) :
AbstractGangTask("G1 Root Region Scan"), _cm(cm) { }
void work(uint worker_id) {
assert(Thread::current()->is_ConcurrentGC_thread(),
"this should only be done by a conc GC thread");
G1CMRootRegions* root_regions = _cm->root_regions();
HeapRegion* hr = root_regions->claim_next();
while (hr != NULL) {
_cm->scan_root_region(hr, worker_id);
hr = root_regions->claim_next();
}
}
};
- The while loop iterates over the list of root Regions
- scan_root_region is called to scan each root Region
void G1ConcurrentMark::scan_root_region(HeapRegion* hr, uint worker_id) {
assert(hr->is_old() || (hr->is_survivor() && hr->next_top_at_mark_start() == hr->bottom()),
"Root regions must be old or survivor but region %u is %s", hr->hrm_index(), hr->get_type_str());
G1RootRegionScanClosure cl(_g1h, this, worker_id);
const uintx interval = PrefetchScanIntervalInBytes;
HeapWord* curr = hr->next_top_at_mark_start();
const HeapWord* end = hr->top();
while (curr < end) {
Prefetch::read(curr, interval);
oop obj = oop(curr);
int size = obj->oop_iterate_size(&cl);
assert(size == obj->size(), "sanity");
curr += size;
}
}
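- Applies the G1RootRegionScanClosure closure to the objects in the Region, walking from next_top_at_mark_start() to top()
The logic of G1RootRegionScanClosure is in g1OopClosures.inline.hpp
template <class T>
inline void G1RootRegionScanClosure::do_oop_work(T* p) {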
T heap_oop = RawAccess<MO_VOLATILE>::oop_load(p);
if (CompressedOops::is_null(heap_oop)) {
return;
}
oop obj = CompressedOops::decode_not_null(heap_oop);
_cm->mark_in_next_bitmap(_worker_id, obj);
}
- Calls mark_in_next_bitmap to mark the objects referenced from the root Region
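Marking "in a bitmap" means setting one bit per object position, and several workers may try to mark the same object at once. The following is a minimal sketch of such a bitmap with an atomic test-and-set. It is an assumption-laden illustration, not G1's bitmap class: `ToyMarkBitmap`, `par_mark`, and `is_marked` are names chosen here, and slots are plain indices rather than heap addresses.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical mark bitmap: one bit per slot, 64 slots per word.
class ToyMarkBitmap {
 public:
  explicit ToyMarkBitmap(std::size_t num_slots)
      : _num_words((num_slots + 63) / 64),
        _words(new std::atomic<std::uint64_t>[_num_words]) {
    for (std::size_t i = 0; i < _num_words; ++i) {
      _words[i].store(0, std::memory_order_relaxed);
    }
  }

  // Atomically sets the bit. Returns true only for the caller that flipped it,
  // so exactly one worker "wins" the right to process the object.
  bool par_mark(std::size_t slot) {
    const std::uint64_t mask = std::uint64_t{1} << (slot % 64);
    const std::uint64_t old =
        _words[slot / 64].fetch_or(mask, std::memory_order_relaxed);
    return (old & mask) == 0;
  }

  bool is_marked(std::size_t slot) const {
    const std::uint64_t mask = std::uint64_t{1} << (slot % 64);
    return (_words[slot / 64].load(std::memory_order_relaxed) & mask) != 0;
  }

 private:
  std::size_t _num_words;
  std::unique_ptr<std::atomic<std::uint64_t>[]> _words;
};
```

The return value of `par_mark` is what lets a marker avoid queueing the same object twice.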
3.3 Concurrent mark sub-phase
The concurrent mark sub-phase runs concurrently with the Mutator.
Its entry point is the work method of G1CMConcurrentMarkingTask
void work(uint worker_id) {
assert(Thread::current()->is_ConcurrentGC_thread(), "Not a concurrent GC thread");
ResourceMark rm;
double start_vtime = os::elapsedVTime();
{
SuspendibleThreadSetJoiner sts_join;
assert(worker_id < _cm->active_tasks(), "invariant");
G1CMTask* task = _cm->task(worker_id);
task->record_start_time();
if (!_cm->has_aborted()) {
do {
task->do_marking_step(G1ConcMarkStepDurationMillis,
true /* do_termination */,
false /* is_serial*/);
_cm->do_yield_check();
} while (!_cm->has_aborted() && task->has_aborted());
}
task->record_end_time();
guarantee(!task->has_aborted() || _cm->has_aborted(), "invariant");
}
double end_vtime = os::elapsedVTime();
_cm->update_accum_task_vtime(worker_id, end_vtime - start_vtime);
}
- Calls do_marking_step to perform concurrent marking
- The JVM flag G1ConcMarkStepDurationMillis sets the maximum duration of a single marking step (default 10 ms)
void G1CMTask::do_marking_step(double time_target_ms,
bool do_termination,
bool is_serial)
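The code of do_marking_step is long and complex, so it is not reproduced here. Its main responsibilities are:
- Process the SATB queues, in a way similar to how the DCQS (dirty card queue set) is processed
- Scan all gray objects and recursively mark through each of their fields
- Once its own work is done, steal work from the queues of other tasks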
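The three responsibilities listed above can be sketched as a single-threaded simulation. This is not the HotSpot function: `ToyObj`, `ToyTask`, and `toy_marking_step` are hypothetical, the time target is replaced by an object-count budget, and stealing is simulated by taking from another task's queue without any synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy types; objects in a queue are already marked (gray).
struct ToyObj {
  bool marked = false;
  std::vector<std::size_t> fields;
};

struct ToyTask {
  std::deque<std::size_t> queue;
};

// Returns true when no work is left anywhere, false when the budget ran out
// first (the caller is expected to invoke the step again).
bool toy_marking_step(std::vector<ToyObj>& heap,
                      std::vector<std::size_t>& satb_buffer,
                      std::vector<ToyTask>& tasks,
                      std::size_t self,
                      std::size_t budget) {
  ToyTask& me = tasks[self];
  // 1. Drain the SATB buffer: recorded objects become gray.
  for (std::size_t idx : satb_buffer) {
    if (!heap[idx].marked) {
      heap[idx].marked = true;
      me.queue.push_back(idx);
    }
  }
  satb_buffer.clear();
  while (true) {
    // 2. Scan gray objects from the local queue, marking through their fields.
    while (!me.queue.empty()) {
      if (budget == 0) return false;
      --budget;
      std::size_t cur = me.queue.back();
      me.queue.pop_back();
      for (std::size_t ref : heap[cur].fields) {
        if (!heap[ref].marked) {
          heap[ref].marked = true;
          me.queue.push_back(ref);
        }
      }
    }
    // 3. Local queue empty: try to steal one entry from another task.
    bool stolen = false;
    for (std::size_t i = 0; i < tasks.size() && !stolen; ++i) {
      if (i != self && !tasks[i].queue.empty()) {
        me.queue.push_back(tasks[i].queue.front());
        tasks[i].queue.pop_front();
        stolen = true;
      }
    }
    if (!stolen) return true;
  }
}
```

Returning false when the budget is exhausted plays the role of the task "aborting" a step, which is why the callers shown in this section invoke do_marking_step in a loop.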
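3.4 Remark sub-phase
Because the concurrent mark sub-phase runs alongside the Mutator, references can still change while it executes. The remark sub-phase therefore stops the world and processes all remaining SATB entries.
The entry point of the remark sub-phase is G1CMRemarkTask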
class G1CMRemarkTask : public AbstractGangTask {
G1ConcurrentMark* _cm;
public:
void work(uint worker_id) {
G1CMTask* task = _cm->task(worker_id);
task->record_start_time();
{
ResourceMark rm;
HandleMark hm;
G1RemarkThreadsClosure threads_f(G1CollectedHeap::heap(), task);
Threads::threads_do(&threads_f);
}
do {
task->do_marking_step(1000000000.0 /* something very large */,
true /* do_termination */,
false /* is_serial */);
} while (task->has_aborted() && !_cm->has_overflown());
// If we overflow, then we do not want to restart. We instead
// want to abort remark and do concurrent marking again.
task->record_end_time();
}
G1CMRemarkTask(G1ConcurrentMark* cm, uint active_workers) :
AbstractGangTask("Par Remark"), _cm(cm) {
_cm->terminator()->reset_for_reuse(active_workers);
}
};
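- do_marking_step is still used, but with a target time of 1000000000 ms, which means the step is expected to run to completion in all cases
3.5 Cleanup sub-phase
The cleanup sub-phase handles RSet cleanup and the selection of Regions to reclaim, but it does not copy objects or reclaim Regions. It still requires STW; its entry point is the cleanup method: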
void G1ConcurrentMark::cleanup() {
assert_at_safepoint_on_vm_thread();
// If a full collection has happened, we shouldn't do this.
if (has_aborted()) {
return;
}
G1Policy* g1p = _g1h->g1_policy();
g1p->record_concurrent_mark_cleanup_start();
double start = os::elapsedTime();
verify_during_pause(G1HeapVerifier::G1VerifyCleanup, VerifyOption_G1UsePrevMarking, "Cleanup before");
{
GCTraceTime(Debug, gc, phases) debug("Update Remembered Set Tracking After Rebuild", _gc_timer_cm);
G1UpdateRemSetTrackingAfterRebuild cl(_g1h);
_g1h->heap_region_iterate(&cl);
}
if (log_is_enabled(Trace, gc, liveness)) {
G1PrintRegionLivenessInfoClosure cl("Post-Cleanup");
_g1h->heap_region_iterate(&cl);
}
verify_during_pause(G1HeapVerifier::G1VerifyCleanup, VerifyOption_G1UsePrevMarking, "Cleanup after");
// We need to make this be a "collection" so any collection pause that
// races with it goes around and waits for Cleanup to finish.
_g1h->increment_total_collections();
// Local statistics
double recent_cleanup_time = (os::elapsedTime() - start);
_total_cleanup_time += recent_cleanup_time;
_cleanup_times.add(recent_cleanup_time);
{
GCTraceTime(Debug, gc, phases) debug("Finalize Concurrent Mark Cleanup", _gc_timer_cm);
_g1h->g1_policy()->record_concurrent_mark_cleanup_end();
}
}
- G1UpdateRemSetTrackingAfterRebuild sets the RSet state of each Region to Complete
- record_concurrent_mark_cleanup_end is called to choose which Regions to reclaim
G1UpdateRemSetTrackingAfterRebuild
class G1UpdateRemSetTrackingAfterRebuild : public HeapRegionClosure {
G1CollectedHeap* _g1h;
public:
G1UpdateRemSetTrackingAfterRebuild(G1CollectedHeap* g1h) : _g1h(g1h) { }
virtual bool do_heap_region(HeapRegion* r) {
_g1h->g1_policy()->remset_tracker()->update_after_rebuild(r);
return false;
}
};
- Calls the update_after_rebuild method of G1RemSetTrackingPolicy
update_after_rebuild is defined in the G1RemSetTrackingPolicy class
void G1RemSetTrackingPolicy::update_after_rebuild(HeapRegion* r) {
assert(SafepointSynchronize::is_at_safepoint(), "should be at safepoint");
if (r->is_old_or_humongous_or_archive()) {
if (r->rem_set()->is_updating()) {
assert(!r->is_archive(), "Archive region %u with remembered set", r->hrm_index());
r->rem_set()->set_state_complete();
}
// (remaining code omitted)
}
}
- Sets the RSet state to Complete
g1Policy.cpp
void G1Policy::record_concurrent_mark_cleanup_end() {
cset_chooser()->rebuild(_g1h->workers(), _g1h->num_regions());
bool mixed_gc_pending = next_gc_should_be_mixed("request mixed gcs", "request young-only gcs");
if (!mixed_gc_pending) {
clear_collection_set_candidates();
abort_time_to_mixed_tracking();
}
collector_state()->set_in_young_gc_before_mixed(mixed_gc_pending);
collector_state()->set_mark_or_rebuild_in_progress(false);
double end_sec = os::elapsedTime();
double elapsed_time_ms = (end_sec - _mark_cleanup_start_sec) * 1000.0;
_analytics->report_concurrent_mark_cleanup_times_ms(elapsed_time_ms);
_analytics->append_prev_collection_pause_end_ms(elapsed_time_ms);
record_pause(Cleanup, _mark_cleanup_start_sec, end_sec);
}
- Calls the rebuild method of CollectionSetChooser to choose the CSet candidates
- Calls next_gc_should_be_mixed to check whether the reclaimable space among the candidates exceeds the threshold
collectionSetChooser.cpp
void CollectionSetChooser::rebuild(WorkGang* workers, uint n_regions) {
clear();
uint n_workers = workers->active_workers();
uint chunk_size = calculate_parallel_work_chunk_size(n_workers, n_regions);
prepare_for_par_region_addition(n_workers, n_regions, chunk_size);
ParKnownGarbageTask par_known_garbage_task(this, chunk_size, n_workers);
workers->run_task(&par_known_garbage_task);
sort_regions();
}
void CollectionSetChooser::sort_regions() {
// First trim any unused portion of the top in the parallel case.
if (_first_par_unreserved_idx > 0) {
assert(_first_par_unreserved_idx <= regions_length(),
"Or we didn't reserved enough length");
regions_trunc_to(_first_par_unreserved_idx);
}
_regions.sort(order_regions);
assert(_end <= regions_length(), "Requirement");
#ifdef ASSERT
for (uint i = 0; i < _end; i++) {
assert(regions_at(i) != NULL, "Should be true by sorting!");
}
#endif // ASSERT
if (log_is_enabled(Trace, gc, liveness)) {
G1PrintRegionLivenessInfoClosure cl("Post-Sorting");
for (uint i = 0; i < _end; ++i) {
HeapRegion* r = regions_at(i);
cl.do_heap_region(r);
}
}
verify();
}
static int order_regions(HeapRegion* hr1, HeapRegion* hr2) {
if (hr1 == NULL) {
if (hr2 == NULL) {
return 0;
} else {
return 1;
}
} else if (hr2 == NULL) {
return -1;
}
double gc_eff1 = hr1->gc_efficiency();
double gc_eff2 = hr2->gc_efficiency();
if (gc_eff1 > gc_eff2) {
return -1;
} if (gc_eff1 < gc_eff2) {
return 1;
} else {
return 0;
}
}
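- Uses ParKnownGarbageTask to evaluate the garbage in each Region in parallel
- Sorts the Regions; as order_regions shows, the sort key is gc_efficiency, highest first
The gc_efficiency of a Region is computed in heapRegion.cpp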
void HeapRegion::calc_gc_efficiency() {
// GC efficiency is the ratio of how much space would be
// reclaimed over how long we predict it would take to reclaim it.
G1CollectedHeap* g1h = G1CollectedHeap::heap();
G1Policy* g1p = g1h->g1_policy();
// Retrieve a prediction of the elapsed time for this region for
// a mixed gc because the region will only be evacuated during a
// mixed gc.
double region_elapsed_time_ms =
g1p->predict_region_elapsed_time_ms(this, false /* for_young_gc */);
_gc_efficiency = (double) reclaimable_bytes() / region_elapsed_time_ms;
}
- gc_efficiency = reclaimable bytes / predicted evacuation time in milliseconds
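The effect of this ratio on candidate ordering can be shown with a few made-up Regions. This is a sketch only: `ToyRegion` and `sort_by_gc_efficiency` are hypothetical, the numbers are invented, and std::sort stands in for the HotSpot array sort with order_regions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap Region; the values are illustrative.
struct ToyRegion {
  unsigned index;
  std::size_t reclaimable_bytes;
  double predicted_evacuation_ms;

  // Bytes reclaimed per millisecond of predicted evacuation work.
  double gc_efficiency() const {
    return static_cast<double>(reclaimable_bytes) / predicted_evacuation_ms;
  }
};

// Orders Regions so that the most efficient candidates come first.
inline void sort_by_gc_efficiency(std::vector<ToyRegion>& regions) {
  std::sort(regions.begin(), regions.end(),
            [](const ToyRegion& a, const ToyRegion& b) {
              return a.gc_efficiency() > b.gc_efficiency();
            });
}
```

A Region with less garbage can still rank higher if it is predicted to be much cheaper to evacuate, which is the point of dividing by the predicted time.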
next_gc_should_be_mixed, called from record_concurrent_mark_cleanup_end
bool G1Policy::next_gc_should_be_mixed(const char* true_action_str,
const char* false_action_str) const {
if (cset_chooser()->is_empty()) {
log_debug(gc, ergo)("%s (candidate old regions not available)", false_action_str);
return false;
}
// Is the amount of uncollected reclaimable space above G1HeapWastePercent?
size_t reclaimable_bytes = cset_chooser()->remaining_reclaimable_bytes();
double reclaimable_percent = reclaimable_bytes_percent(reclaimable_bytes);
double threshold = (double) G1HeapWastePercent;
if (reclaimable_percent <= threshold) {
log_debug(gc, ergo)("%s (reclaimable percentage not over threshold). candidate old regions: %u reclaimable: " SIZE_FORMAT " (%1.2f) threshold: " UINTX_FORMAT,
false_action_str, cset_chooser()->remaining_regions(), reclaimable_bytes, reclaimable_percent, G1HeapWastePercent);
return false;
}
log_debug(gc, ergo)("%s (candidate old regions available). candidate old regions: %u reclaimable: " SIZE_FORMAT " (%1.2f) threshold: " UINTX_FORMAT,
true_action_str, cset_chooser()->remaining_regions(), reclaimable_bytes, reclaimable_percent, G1HeapWastePercent);
return true;
}
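- Checks whether the reclaimable space among the CSet candidates is at or below the threshold
- The threshold is controlled by the JVM flag G1HeapWastePercent (default 5). A mixed GC is started only when the reclaimable percentage is above the threshold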
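The decision reduces to a percentage comparison. The sketch below assumes that the reclaimable percentage is taken relative to the total heap capacity; `reclaimable_percent` and `should_start_mixed_gc` are hypothetical names, and logging is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Reclaimable space as a percentage of heap capacity (assumed denominator).
inline double reclaimable_percent(std::size_t reclaimable_bytes,
                                  std::size_t heap_capacity_bytes) {
  return 100.0 * static_cast<double>(reclaimable_bytes) /
         static_cast<double>(heap_capacity_bytes);
}

// Mixed GC is worthwhile only if candidates exist and the reclaimable share
// is strictly above the waste threshold.
inline bool should_start_mixed_gc(std::size_t candidate_regions,
                                  std::size_t reclaimable_bytes,
                                  std::size_t heap_capacity_bytes,
                                  double waste_percent_threshold) {
  if (candidate_regions == 0) {
    return false;
  }
  return reclaimable_percent(reclaimable_bytes, heap_capacity_bytes) >
         waste_percent_threshold;
}
```

With a threshold of 5, a heap in which exactly 5% is reclaimable does not trigger a mixed GC, because the comparison in next_gc_should_be_mixed rejects values that are less than or equal to the threshold.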
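4. A look at newer GC algorithms
JDK 11 and JDK 12 added ZGC and Shenandoah respectively, which reduce pause times further, down to the millisecond level.
In a G1 mixed GC, copying objects still requires STW, whereas Shenandoah and ZGC do extra work in read and write barriers so that object copying can also run concurrently with the Mutator.
Shenandoah and ZGC are still experimental, so it is too early to talk about them replacing G1.
5. References
JDK 12 source code [https://hg.openjdk.java.net/jdk/jdk12]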