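The post "Hotspot Java Object Creation and TLAB Source Code Analysis" explained how objects are created in the Java heap managed by CollectedHeap, and "Hotspot Memory Management: Universe Source Code Analysis" explained how CollectedHeap is initialized under different flag settings. Starting with this post, a series of articles will examine CollectedHeap and its subclasses in detail, that is, the implementation of the different garbage collection algorithms, the overall collection flow, and the supporting base classes. This post covers only the definition of CollectedHeap and the implementation of its non-virtual methods.
1. Definition
CollectedHeap is an abstract class that represents a Java heap. It defines the common interface that every garbage collection algorithm must implement, which is the uniform API upper layers use to allocate Java objects, allocate TLABs, query heap usage, and so on. CollectedHeap is defined in hotspot src/share/vm/gc_interface/collectedHeap.hpp and has the following fields: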
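- static size_t _filler_array_max_size; // maximum size, in words, of a filler array
- GCHeapLog* _gc_heap_log; // used to log GC events
- bool _defer_initial_card_mark; // used when the C2 compiler is enabled, to support ReduceInitialCardMarks
- MemRegion _reserved; // the contiguous memory range backing the Java heap
- BarrierSet* _barrier_set; // the barrier set; BarrierSet is the base class of the card table (CardTable) implementations
- bool _is_gc_active; // whether a GC is in progress
- uint _n_par_threads; // number of threads executing GC tasks in parallel
- unsigned int _total_collections; // number of GCs since JVM startup
- unsigned int _total_full_collections; // number of Full GCs since JVM startup
- GCCause::Cause _gc_cause; // why the current GC was triggered; Cause is an enum defined in GCCause
- GCCause::Cause _gc_lastcause; // why the previous GC was triggered
- PerfStringVariable* _perf_gc_cause; // holds _gc_cause when UsePerfData is enabled
- PerfStringVariable* _perf_gc_lastcause; // holds _gc_lastcause when UsePerfData is enabled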
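The class hierarchy of CollectedHeap is as follows:
ParallelScavengeHeap is the GC implementation used when UseParallelGC is enabled, G1CollectedHeap is the one used when UseG1GC is enabled, and GenCollectedHeap is the one used when UseSerialGC or UseConcMarkSweepGC is enabled; the latter two flags correspond to two different GenCollectorPolicy implementations.
This hierarchy corresponds to the enum Name that CollectedHeap defines to identify the GC implementation classes, as follows:
The virtual method kind returns the type of the current GC implementation, as follows: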
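The relationship between a type-tag enum and a virtual kind method can be sketched outside HotSpot as below. This is only an illustration of the pattern: the class names, the enum values, and the helper is_region_based are hypothetical and are not the HotSpot declarations.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the HotSpot declarations.
class HeapSketch {
 public:
  enum Name { Abstract, Generational, Parallel, RegionBased };
  virtual ~HeapSketch() {}
  // Each concrete heap overrides kind() to report its own type tag.
  virtual Name kind() const { return Abstract; }
};

class ParallelHeapSketch : public HeapSketch {
 public:
  virtual Name kind() const { return Parallel; }
};

class RegionHeapSketch : public HeapSketch {
 public:
  virtual Name kind() const { return RegionBased; }
};

// Callers can branch on kind() before downcasting to a concrete heap type.
inline bool is_region_based(const HeapSketch& heap) {
  assert(heap.kind() != HeapSketch::Abstract);
  return heap.kind() == HeapSketch::RegionBased;
}
```

Code that holds only a base-class reference can use the tag to decide which concrete heap it is dealing with, which is the role the text describes for kind.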
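Most methods defined by CollectedHeap are virtual and have to be read together with a concrete implementation class, so the focus here is on the non-virtual methods.
2. GCHeapLog
GCHeapLog is defined in the same file, collectedHeap.hpp, as follows:
GCHeapLog extends EventLogBase and is used to log GC events. EventLogBase is a template class that mainly defines the basic fields needed for logging. GCMessage extends FormatBuffer, which is essentially a fixed-length char array, and stores the text of a GC log record. It is defined as follows:
The log_heap method is implemented as follows: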
void GCHeapLog::log_heap(bool before) {
  if (!should_log()) {
    return;
  }
  double timestamp = fetch_timestamp();
  MutexLockerEx ml(&_mutex, Mutex::_no_safepoint_check_flag);
  int index = compute_log_index();
  // _records is an array holding multiple log records; each element contains a GCMessage instance
  _records[index].thread = NULL; // Its the GC thread so it's not that interesting.
  _records[index].timestamp = timestamp;
  // the data field is the GCMessage instance for this record
  _records[index].data.is_before = before;
  // stringStream writes the log text into the char array of the GCMessage instance
  stringStream st(_records[index].data.buffer(), _records[index].data.size());
  if (before) {
    Universe::print_heap_before_gc(&st, true);
  } else {
    Universe::print_heap_after_gc(&st, true);
  }
}

void Universe::print_heap_before_gc(outputStream* st, bool ignore_extended) {
  // Universe ultimately delegates to methods of the heap
  st->print_cr("{Heap before GC invocations=%u (full %u):",
               heap()->total_collections(),
               heap()->total_full_collections());
  // PrintHeapAtGCExtended controls whether extra, more detailed information about the
  // heap layout is printed; it applies when PrintHeapAtGC is true
  if (!PrintHeapAtGCExtended || ignore_extended) {
    heap()->print_on(st);
  } else {
    // print the extended GC information
    heap()->print_extended_on(st);
  }
}

void Universe::print_heap_after_gc(outputStream* st, bool ignore_extended) {
  st->print_cr("Heap after GC invocations=%u (full %u):",
               heap()->total_collections(),
               heap()->total_full_collections());
  if (!PrintHeapAtGCExtended || ignore_extended) {
    heap()->print_on(st);
  } else {
    heap()->print_extended_on(st);
  }
  st->print_cr("}");
}
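The way log_heap picks a slot with compute_log_index and formats text into a fixed-size record can be illustrated with a small standalone ring log. This is a sketch under assumptions: RingLogSketch, its capacity, and its record size are made up for the example, and locking and timestamps are left out.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical fixed-capacity event log; names and sizes are illustrative only.
class RingLogSketch {
 public:
  static const int kEntries = 4;      // number of retained records
  static const int kMessageLen = 64;  // bytes available per record

  RingLogSketch() : _next(0), _count(0) {}

  // Formats a record into the next slot, overwriting the oldest one when full.
  void log(bool before, unsigned collections) {
    int index = _next;
    _next = (_next + 1) % kEntries;
    if (_count < kEntries) _count++;
    std::snprintf(_records[index], kMessageLen, "%s GC invocations=%u",
                  before ? "before" : "after", collections);
  }

  int count() const { return _count; }

  // Returns a copy of the text stored in the given slot.
  std::string record(int index) const {
    assert(index >= 0 && index < _count);
    return std::string(_records[index]);
  }

 private:
  char _records[kEntries][kMessageLen];
  int _next;   // slot that the next call to log() will write
  int _count;  // number of slots written so far, capped at kEntries
};
```

Because the slot index wraps around, memory use stays bounded and only the most recent kEntries records survive.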
Its callers are as follows:
CollectedHeap::print_heap_before_gc and CollectedHeap::print_heap_after_gc are implemented as follows:
void CollectedHeap::print_heap_before_gc() {
  // PrintHeapAtGC controls whether the heap layout is printed before and after every GC; it defaults to false
  if (PrintHeapAtGC) {
    Universe::print_heap_before_gc();
  }
  if (_gc_heap_log != NULL) {
    _gc_heap_log->log_heap_before();
  }
}

void CollectedHeap::print_heap_after_gc() {
  if (PrintHeapAtGC) {
    Universe::print_heap_after_gc();
  }
  if (_gc_heap_log != NULL) {
    _gc_heap_log->log_heap_after();
  }
}

// These end up in the same methods that log_heap calls; only the outputStream passed in differs.
// The second argument, ignore_extended, is omitted, so its default value false is used.
static void print_heap_before_gc() { print_heap_before_gc(gclog_or_tty); }
static void print_heap_after_gc() { print_heap_after_gc(gclog_or_tty); }
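3. GCCause / GCCauseString
GCCause is defined in gcCause.hpp in the same directory and represents the reason a GC was triggered. GCCause defines an enum that lists all possible reasons, as shown in the figure below:
The matching to_string(GCCause::Cause cause) method returns the description string for a given Cause; see the implementation in gcCause.cpp.
GCCauseString is a utility class for printing log messages that include a GCCause::Cause. Its implementation is as follows: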
class GCCauseString : StackObj {
 private:
  static const int _length = 128;
  char _buffer[_length];
  int _position;

 public:
  GCCauseString(const char* prefix, GCCause::Cause cause) {
    // PrintGCCause controls whether the GC cause is printed; it defaults to true
    if (PrintGCCause) {
      // jio_snprintf returns the number of characters written; _buffer is where writing
      // starts and _length is the space available
      _position = jio_snprintf(_buffer, _length, "%s (%s) ", prefix, GCCause::to_string(cause));
    } else {
      _position = jio_snprintf(_buffer, _length, "%s ", prefix);
    }
    assert(_position >= 0 && _position <= _length,
           err_msg("Need to increase the buffer size in GCCauseString? %d", _position));
  }

  GCCauseString& append(const char* str) {
    // append str after the string already in _buffer
    int res = jio_snprintf(_buffer + _position, _length - _position, "%s", str);
    _position += res;
    assert(res >= 0 && _position <= _length,
           err_msg("Need to increase the buffer size in GCCauseString? %d", res));
    return *this;
  }

  // conversion operator: converting the object to const char* yields _buffer
  operator const char*() {
    return _buffer;
  }
};
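The append-and-chain pattern of GCCauseString can be tried in isolation with the sketch below. It is an assumption-laden stand-in: MessageSketch is a hypothetical class, the cause is passed as a plain string instead of a GCCause::Cause, and std::snprintf replaces jio_snprintf, so the overflow check is written against the would-be length that std::snprintf returns.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical fixed-buffer message builder modeled on the pattern above.
class MessageSketch {
 public:
  MessageSketch(const char* prefix, const char* cause) {
    _position = std::snprintf(_buffer, kLength, "%s (%s) ", prefix, cause);
    assert(_position >= 0 && _position < kLength);
  }

  // Writes str right after the text already in the buffer; returning *this
  // lets calls be chained.
  MessageSketch& append(const char* str) {
    int res = std::snprintf(_buffer + _position, kLength - _position, "%s", str);
    assert(res >= 0 && _position + res < kLength);
    _position += res;
    return *this;
  }

  // Lets the object be passed wherever a C string is expected.
  operator const char*() const { return _buffer; }

  // Convenience copy of the current contents.
  std::string str() const { return std::string(_buffer); }

 private:
  static const int kLength = 128;
  char _buffer[kLength];
  int _position;  // offset of the terminating NUL, i.e. where the next append starts
};
```

A caller would typically build the message in one expression, for example MessageSketch("GC pause", "some cause").append("(young)"), and hand the result to a printf-style logging call.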
Its use in G1CollectedHeap::log_gc_header() serves as an example, as follows:
4. collect_as_vm_thread
collect_as_vm_thread performs a garbage collection for a specific GCCause::Cause. Its implementation is as follows:
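The calling thread must be the VM thread and must already hold Heap_lock. GCCauseSetter is a helper class whose constructor temporarily changes the _gc_cause of the current CollectedHeap and whose destructor restores the original value. Its implementation is as follows: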
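The save-in-constructor, restore-in-destructor behavior described for GCCauseSetter is the usual scoped-guard idiom. The following is a minimal sketch of that idiom only: CauseSketch, HeapStateSketch, CauseSetterSketch, and cause_during_scope are hypothetical names, not HotSpot types.

```cpp
#include <cassert>

// Hypothetical minimal heap state with only the field the guard needs.
enum CauseSketch { NoGC, HeapInspection, LastDitch };

struct HeapStateSketch {
  CauseSketch gc_cause;
  HeapStateSketch() : gc_cause(NoGC) {}
};

// Scoped guard: the constructor installs a new cause and the destructor
// puts back whatever was there before.
class CauseSetterSketch {
 public:
  CauseSetterSketch(HeapStateSketch* heap, CauseSketch cause)
      : _heap(heap), _previous(heap->gc_cause) {
    _heap->gc_cause = cause;
  }
  ~CauseSetterSketch() { _heap->gc_cause = _previous; }

 private:
  HeapStateSketch* _heap;
  CauseSketch _previous;
  // Not copyable, so the previous cause is restored exactly once.
  CauseSetterSketch(const CauseSetterSketch&);
  CauseSetterSketch& operator=(const CauseSetterSketch&);
};

// Returns the cause that is visible while the guard is alive.
inline CauseSketch cause_during_scope(HeapStateSketch* heap, CauseSketch cause) {
  assert(heap != 0);
  CauseSetterSketch guard(heap, cause);
  return heap->gc_cause;
}
```

Tying the restore to the destructor means the old cause comes back on every exit path from the scope, including early returns.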
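do_full_collection is a virtual method for which CollectedHeap provides no default implementation. The call chain of this method is as follows:
The three classes whose names start with VM each represent a kind of operation executed by the VM thread.
5. fill_with_objects / fill_with_object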
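fill_with_objects and fill_with_object write filler objects into a range of heap memory that will not hold live data, such as the unused tail of a TLAB or alignment padding. The filler is a dummy int array or a java.lang.Object, which keeps the heap parsable so that code walking the heap object by object can step over the gap. fill_with_object covers the range with a single filler object, so the range may not exceed filler_array_max_size, whereas fill_with_objects splits a larger range into several filler objects. The two methods are implemented as follows: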
void CollectedHeap::fill_with_object(HeapWord* start, size_t words, bool zap)
{
  DEBUG_ONLY(fill_args_check(start, words);)
  HandleMark hm;  // Free handles before leaving.
  fill_with_object_impl(start, words, zap);
}

void
CollectedHeap::fill_with_object_impl(HeapWord* start, size_t words, bool zap)
{
  assert(words <= filler_array_max_size(), "too big for a single object");
  // at least the minimum size of an int array
  if (words >= filler_array_min_size()) {
    // fill with an int array
    fill_with_array(start, words, zap);
  } else if (words > 0) {
    // smaller than the minimum int array: fill with a java.lang.Object, which has no fields
    assert(words == min_fill_size(), "unaligned size");
    post_allocation_setup_common(SystemDictionary::Object_klass(), start);
  }
}

static inline size_t filler_array_max_size() {
  return _filler_array_max_size;
}

size_t CollectedHeap::filler_array_min_size() {
  return align_object_size(filler_array_hdr_size()); // align to MinObjAlignment
}

size_t CollectedHeap::filler_array_hdr_size() {
  return size_t(align_object_offset(arrayOopDesc::header_size(T_INT))); // align to Long
}

void
CollectedHeap::fill_with_array(HeapWord* start, size_t words, bool zap)
{
  assert(words >= filler_array_min_size(), "too small for an array");
  assert(words <= filler_array_max_size(), "too big for a single object");
  const size_t payload_size = words - filler_array_hdr_size();
  // compute the length of the int array
  const size_t len = payload_size * HeapWordSize / sizeof(jint);
  assert((int)len >= 0, err_msg("size too large " SIZE_FORMAT " becomes %d", words, (int)len));
  // set the array length
  ((arrayOop)start)->set_length((int)len);
  // common post-allocation work: set the object header and the klass
  post_allocation_setup_common(Universe::intArrayKlassObj(), start);
  DEBUG_ONLY(zap_filler_array(start, words, zap);)
}
void CollectedHeap::post_allocation_setup_common(KlassHandle klass,
                                                 HeapWord* obj) {
  post_allocation_setup_no_klass_install(klass, obj);
  post_allocation_install_obj_klass(klass, oop(obj));
}

void CollectedHeap::post_allocation_setup_no_klass_install(KlassHandle klass,
                                                           HeapWord* objPtr) {
  oop obj = (oop)objPtr;
  // set the object header (mark word)
  assert(obj != NULL, "NULL object pointer");
  if (UseBiasedLocking && (klass() != NULL)) {
    obj->set_mark(klass->prototype_header());
  } else {
    // May be bootstrapping
    obj->set_mark(markOopDesc::prototype());
  }
}

void CollectedHeap::post_allocation_install_obj_klass(KlassHandle klass,
                                                      oop obj) {
  assert(klass() != NULL || !Universe::is_fully_initialized(), "NULL klass");
  assert(klass() == NULL || klass()->is_klass(), "not a klass");
  assert(obj != NULL, "NULL object pointer");
  // set the klass the oop belongs to
  obj->set_klass(klass());
  assert(!Universe::is_fully_initialized() || obj->klass() != NULL,
         "missing klass");
}
void CollectedHeap::fill_with_objects(HeapWord* start, size_t words, bool zap)
{
  DEBUG_ONLY(fill_args_check(start, words);)
  HandleMark hm;  // Free handles before leaving.
#ifdef _LP64
  const size_t min = min_fill_size();
  const size_t max = filler_array_max_size();
  while (words > max) {
    // if what remains after a max-sized chunk is at least min, fill max words;
    // otherwise fill max - min words so that the remainder is still large enough to fill
    const size_t cur = words - max >= min ? max : max - min;
    fill_with_array(start, cur, zap);
    start += cur;
    words -= cur;
  }
#endif
  fill_with_object_impl(start, words, zap);
}
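The chunking rule in the loop above is easy to check with small numbers. The sketch below reproduces only the size arithmetic: split_fill_sketch is a hypothetical helper, and min_fill and max_fill are supplied by the caller instead of coming from min_fill_size() and filler_array_max_size().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits a region of `words` heap words into filler sizes using the same
// rule as the loop above. min_fill and max_fill are assumed inputs.
inline std::vector<std::size_t> split_fill_sketch(std::size_t words,
                                                  std::size_t min_fill,
                                                  std::size_t max_fill) {
  assert(min_fill > 0 && max_fill > 2 * min_fill);
  assert(words == 0 || words >= min_fill);
  std::vector<std::size_t> chunks;
  while (words > max_fill) {
    // Take a full chunk only if the remainder can still be filled;
    // otherwise take a shorter chunk and leave at least min_fill behind.
    const std::size_t cur =
        (words - max_fill >= min_fill) ? max_fill : max_fill - min_fill;
    chunks.push_back(cur);
    words -= cur;
  }
  if (words > 0) chunks.push_back(words);  // final filler object
  return chunks;
}
```

With min_fill = 2 and max_fill = 10, a region of 25 words is split into 10, 10, and 5, while a region of 11 words is split into 8 and 3, because taking 10 first would leave a 1-word gap that no filler object fits into.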
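_filler_array_max_size is the maximum size, in words, of an int array used as filler. It is initialized in the CollectedHeap constructor, as follows:
6. pre_full_gc_dump / post_full_gc_dump
pre_full_gc_dump and post_full_gc_dump are the actions performed before and after a Full GC respectively. Their source is as follows: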
void CollectedHeap::pre_full_gc_dump(GCTimer* timer) {
  // HeapDumpBeforeFullGC dumps the current heap to a file before a Full GC; it defaults to false
  if (HeapDumpBeforeFullGC) {
    GCTraceTime tt("Heap Dump (before full gc): ", PrintGCDetails, false, timer, GCId::create());
    // We are doing a "major" collection and a heap dump before
    // major collection has been requested.
    HeapDumper::dump_heap();
  }
  // PrintClassHistogramBeforeFullGC prints a class histogram of the heap before a Full GC,
  // that is, the number of oops per loaded class
  if (PrintClassHistogramBeforeFullGC) {
    GCTraceTime tt("Class Histogram (before full gc): ", PrintGCDetails, true, timer, GCId::create());
    VM_GC_HeapInspection inspector(gclog_or_tty, false /* ! full gc */);
    inspector.doit();
  }
}

void CollectedHeap::post_full_gc_dump(GCTimer* timer) {
  // same as HeapDumpBeforeFullGC, but runs after the GC
  if (HeapDumpAfterFullGC) {
    GCTraceTime tt("Heap Dump (after full gc): ", PrintGCDetails, false, timer, GCId::create());
    HeapDumper::dump_heap();
  }
  // same as PrintClassHistogramBeforeFullGC, but runs after the GC
  if (PrintClassHistogramAfterFullGC) {
    GCTraceTime tt("Class Histogram (after full gc): ", PrintGCDetails, true, timer, GCId::create());
    VM_GC_HeapInspection inspector(gclog_or_tty, false /* ! full gc */);
    inspector.doit();
  }
}
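HeapDumper implements the heap dump and VM_GC_HeapInspection implements the class histogram. The call chain is the part to focus on, as follows:
7. obj_allocate / array_allocate / array_allocate_nozero
obj_allocate allocates an oop of a given Klass. Its implementation was covered in detail in "Hotspot Java Object Creation and TLAB Source Code Analysis"; its call chain is added here, as shown in the figure below:
array_allocate is the counterpart that allocates an array oop of a given Klass. It is implemented as follows: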
// note that size is the already computed amount of memory the target array needs
oop CollectedHeap::array_allocate(KlassHandle klass,
                                  int size,
                                  int length,
                                  TRAPS) {
  debug_only(check_for_valid_allocation_state());
  assert(!Universe::heap()->is_gc_active(), "Allocation during gc not allowed");
  assert(size >= 0, "int won't convert to size_t");
  // requests memory of the given size through the same method obj_allocate uses,
  // which also zero-initializes the allocated memory
  HeapWord* obj = common_mem_allocate_init(klass, size, CHECK_NULL);
  // unlike obj_allocate, which calls post_allocation_setup_obj
  post_allocation_setup_array(klass, obj, length);
  NOT_PRODUCT(Universe::heap()->check_for_bad_heap_word_value(obj, size));
  return (oop)obj;
}

HeapWord* CollectedHeap::common_mem_allocate_init(KlassHandle klass, size_t size, TRAPS) {
  HeapWord* obj = common_mem_allocate_noinit(klass, size, CHECK_NULL);
  init_obj(obj, size);
  return obj;
}

void CollectedHeap::init_obj(HeapWord* obj, size_t size) {
  assert(obj != NULL, "cannot initialize NULL object");
  const size_t hs = oopDesc::header_size();
  assert(size >= hs, "unexpected object size");
  // zero the klass gap, the padding in the header next to the compressed klass pointer
  ((oop)obj)->set_klass_gap(0);
  // fill everything after the object header (with zeros, the default fill value)
  Copy::fill_to_aligned_words(obj + hs, size - hs);
}
void CollectedHeap::post_allocation_setup_array(KlassHandle klass,
                                                HeapWord* obj,
                                                int length) {
  assert(length >= 0, "length should be non-negative");
  // set the array length
  ((arrayOop)obj)->set_length(length);
  // set the object header and the klass
  post_allocation_setup_common(klass, obj);
  oop new_obj = (oop)obj;
  // verify that obj is an array
  assert(new_obj->is_array(), "must be an array");
  // post the JVMTI event and fire the DTrace probe
  post_allocation_notify(klass, new_obj, new_obj->size());
}

inline void post_allocation_notify(KlassHandle klass, oop obj, int size) {
  // support low memory notifications (no-op if not enabled)
  LowMemoryDetector::detect_low_memory_for_collected_pools();
  // support for JVMTI VMObjectAlloc event (no-op if not enabled)
  JvmtiExport::vm_object_alloc_event_collector(obj);
  if (DTraceAllocProbes) {
    // support for Dtrace object alloc event (no-op most of the time)
    if (klass() != NULL && klass()->name() != NULL) {
      SharedRuntime::dtrace_object_alloc(obj, size);
    }
  }
}
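The call chain of array_allocate is as follows:
The call from InstanceKlass::allocate_objArray serves as an example, as follows:
The parameter n of this method is the number of dimensions: 1 for an ordinary one-dimensional array, 2 for a two-dimensional array.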
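The implementation of array_allocate_nozero is largely the same as that of array_allocate. The main difference is that the memory returned by array_allocate_nozero is allocated but not zero-filled: it does not go through the zeroing done by init_obj, so the array body still holds whatever was in that memory before. This suits arrays whose elements the caller overwrites right after allocation, where zeroing first would be wasted work, which matters most for large arrays. Its implementation is as follows: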
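The distinction between the zeroing and non-zeroing paths can be shown with a toy bump allocator. Everything here is hypothetical (ArenaSketch and its methods are not HotSpot code); the arena is pre-filled with the byte 0xAB to stand in for stale memory contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical bump allocator over a byte arena; not HotSpot code.
class ArenaSketch {
 public:
  // The arena starts out filled with 0xAB to simulate leftover data.
  explicit ArenaSketch(std::size_t bytes) : _storage(bytes, 0xAB), _top(0) {
    assert(bytes > 0);
  }

  // Reserves memory without touching its contents.
  unsigned char* allocate_noinit(std::size_t bytes) {
    if (bytes > _storage.size() - _top) return 0;  // out of space
    unsigned char* result = &_storage[0] + _top;
    _top += bytes;
    return result;
  }

  // Reserves memory and zero-fills it before returning.
  unsigned char* allocate_init(std::size_t bytes) {
    unsigned char* result = allocate_noinit(bytes);
    if (result != 0) std::memset(result, 0, bytes);
    return result;
  }

 private:
  std::vector<unsigned char> _storage;
  std::size_t _top;  // offset of the first free byte
};
```

Both calls hand out real, usable memory; the only difference is whether the caller can rely on it reading as zero, which is why the non-zeroing variant is safe only when every element is written before it is read.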
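The call chain of this method is as follows:
8. align_allocation_or_fail
align_allocation_or_fail aligns an address upward to the requested allocation alignment. Its implementation is as follows: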
// addr is the address to align, alignment_in_bytes is the requested alignment,
// and end is the upper bound the aligned address must stay below
inline HeapWord* CollectedHeap::align_allocation_or_fail(HeapWord* addr,
                                                         HeapWord* end,
                                                         unsigned short alignment_in_bytes) {
  if (alignment_in_bytes <= ObjectAlignmentInBytes) {
    return addr;
  }
  // verify that addr is already aligned to HeapWordSize
  assert(is_ptr_aligned(addr, HeapWordSize),
         err_msg("Address " PTR_FORMAT " is not properly aligned.", p2i(addr)));
  // verify that alignment_in_bytes is a multiple of HeapWordSize
  assert(is_size_aligned(alignment_in_bytes, HeapWordSize),
         err_msg("Alignment size %u is incorrect.", alignment_in_bytes));
  // align addr upward to alignment_in_bytes, so the address grows
  HeapWord* new_addr = (HeapWord*) align_pointer_up(addr, alignment_in_bytes);
  // distance, in words, between the new address and the original one
  size_t padding = pointer_delta(new_addr, addr);
  // already aligned: return as is
  if (padding == 0) {
    return addr;
  }
  if (padding < CollectedHeap::min_fill_size()) {
    // the gap is too small to hold a filler object, so move on by one more
    // alignment unit to leave room for the fill below
    padding += alignment_in_bytes / HeapWordSize;
    assert(padding >= CollectedHeap::min_fill_size(),
           err_msg("alignment_in_bytes %u is expect to be larger "
                   "than the minimum object size", alignment_in_bytes));
    new_addr = addr + padding;
  }
  assert(new_addr > addr, err_msg("Unexpected arithmetic overflow "
         PTR_FORMAT " not greater than " PTR_FORMAT, p2i(new_addr), p2i(addr)));
  if(new_addr < end) {
    // the aligned address is below end: fill the gap; otherwise return NULL
    CollectedHeap::fill_with_object(addr, padding);
    return new_addr;
  } else {
    return NULL;
  }
}
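The padding arithmetic can be followed with plain integers. The sketch below works in word indices instead of pointers and is only an approximation of the method above: align_or_fail_sketch is a hypothetical function, the early return for small alignments is omitted, the alignment is assumed to be a power of two no smaller than the minimum filler size, and -1 stands in for NULL.

```cpp
#include <cassert>

// Returns the aligned word index, or -1 when the result would not lie below end.
// All quantities are in words. *padding_out receives the size of the gap that
// would be covered by a filler object (0 when no filler is needed).
inline long align_or_fail_sketch(long addr, long end, long alignment_words,
                                 long min_fill_words, long* padding_out) {
  assert(alignment_words > 0 && (alignment_words & (alignment_words - 1)) == 0);
  assert(alignment_words >= min_fill_words);
  *padding_out = 0;
  long new_addr = (addr + alignment_words - 1) & ~(alignment_words - 1);
  long padding = new_addr - addr;
  if (padding == 0) {
    return addr;                 // already aligned, nothing to fill
  }
  if (padding < min_fill_words) {
    padding += alignment_words;  // gap too small for a filler object
    new_addr = addr + padding;   // still aligned, one unit further on
  }
  if (new_addr >= end) {
    return -1;                   // no room below end
  }
  *padding_out = padding;
  return new_addr;
}
```

With an alignment of 4 words and a minimum filler size of 2 words, address 5 aligns to 8 with a 3-word gap, whereas address 7 would leave a 1-word gap and is therefore pushed on to 12 with a 5-word gap.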
The call chain of this method is as follows: