1. Introduction
When a developer instantiates an object in Java, that object is allocated in the Java heap (setting aside JIT optimizations such as scalar replacement).
To keep the JVM efficient, allocation should minimize the critical section and avoid global locks. In typical G1 workloads, many mutator threads allocate concurrently; to reduce lock contention, G1 uses the TLAB (Thread Local Allocation Buffer) mechanism.
G1 supports fast TLAB-based allocation; when fast TLAB allocation fails, it falls back to the slow path outside the TLAB.
2. Allocation Flow
The entry point for heap-space allocation is in instanceKlass.cpp:
instanceOop InstanceKlass::allocate_instance(TRAPS) {
  bool has_finalizer_flag = has_finalizer(); // Query before possible GC
  int size = size_helper();  // Query before forming handle.

  instanceOop i;

  i = (instanceOop)Universe::heap()->obj_allocate(this, size, CHECK_NULL);
  if (has_finalizer_flag && !RegisterFinalizersAtInit) {
    i = register_finalizer(i, CHECK_NULL);
  }
  return i;
}
- Check whether the class overrides finalize; if so, register a finalizer
- Call CollectedHeap::obj_allocate to allocate
collectedHeap.cpp
oop CollectedHeap::obj_allocate(Klass* klass, int size, TRAPS) {
  ObjAllocator allocator(klass, size, THREAD);
  return allocator.allocate();
}
- Call ObjAllocator::allocate to perform the allocation
memAllocator.cpp
HeapWord* MemAllocator::mem_allocate(Allocation& allocation) const {
  if (UseTLAB) {
    HeapWord* result = allocate_inside_tlab(allocation);
    if (result != NULL) {
      return result;
    }
  }

  return allocate_outside_tlab(allocation);
}
- If the UseTLAB option is enabled, try TLAB allocation first; the JVM flags -XX:+UseTLAB and -XX:-UseTLAB enable or disable TLABs (enabled by default)
- If TLAB allocation fails, allocate outside the TLAB
The overall flow of object heap-space allocation is shown in the figure below:
3. TLAB Allocation
Eden is a region that all threads can access. To speed up allocation, the TLAB mechanism gives each thread a thread-private buffer, reducing locking.
TLABs live inside Eden; every TLAB is visible to other threads, but only the owning thread may allocate inside its own TLAB.
Note that carving a new TLAB out of Eden still requires synchronization for thread safety; only allocation inside an existing TLAB is lock-free.
memAllocator.cpp
HeapWord* MemAllocator::allocate_inside_tlab(Allocation& allocation) const {
  assert(UseTLAB, "should use UseTLAB");

  // Try allocating from an existing TLAB.
  HeapWord* mem = _thread->tlab().allocate(_word_size);
  if (mem != NULL) {
    return mem;
  }

  // Try refilling the TLAB and allocating the object in it.
  return allocate_inside_tlab_slow(allocation);
}
- Allocating from an existing TLAB is straightforward: call the current thread's tlab().allocate
- If allocation in the existing TLAB fails, call allocate_inside_tlab_slow to run the TLAB slow path
threadLocalAllocBuffer.inline.hpp
inline HeapWord* ThreadLocalAllocBuffer::allocate(size_t size) {
  invariants();
  HeapWord* obj = top();
  if (pointer_delta(end(), obj) >= size) {
    // successful thread-local allocation
#ifdef ASSERT
    // Skip mangling the space corresponding to the object header to
    // ensure that the returned space is not considered parsable by
    // any concurrent GC thread.
    size_t hdr_size = oopDesc::header_size();
    Copy::fill_to_words(obj + hdr_size, size - hdr_size, badHeapWordVal);
#endif // ASSERT
    // This addition is safe because we know that top is
    // at least size below end, so the add can't wrap.
    set_top(obj + size);

    invariants();
    return obj;
  }
  return NULL;
}
- On success, the TLAB's top is advanced to the old top plus size
The TLAB slow-path logic lives mainly in allocate_inside_tlab_slow:
HeapWord* MemAllocator::allocate_inside_tlab_slow(Allocation& allocation) const {
  HeapWord* mem = NULL;
  ThreadLocalAllocBuffer& tlab = _thread->tlab();

  if (JvmtiExport::should_post_sampled_object_alloc()) {
    // Try to allocate the sampled object from TLAB, it is possible a sample
    // point was put and the TLAB still has space.
    tlab.set_back_allocation_end();
    mem = tlab.allocate(_word_size);
    if (mem != NULL) {
      allocation._tlab_end_reset_for_sample = true;
      return mem;
    }
  }

  // Retain tlab and allocate object in shared space if
  // the amount free in the tlab is too large to discard.
  if (tlab.free() > tlab.refill_waste_limit()) {
    tlab.record_slow_allocation(_word_size);
    return NULL;
  }

  // Discard tlab and allocate a new one.
  // To minimize fragmentation, the last TLAB may be smaller than the rest.
  size_t new_tlab_size = tlab.compute_size(_word_size);

  tlab.retire_before_allocation();
  if (new_tlab_size == 0) {
    return NULL;
  }

  // Allocate a new TLAB requesting new_tlab_size. Any size
  // between minimal and new_tlab_size is accepted.
  size_t min_tlab_size = ThreadLocalAllocBuffer::compute_min_size(_word_size);
  mem = _heap->allocate_new_tlab(min_tlab_size, new_tlab_size, &allocation._allocated_tlab_size);
  if (mem == NULL) {
    assert(allocation._allocated_tlab_size == 0,
           "Allocation failed, but actual size was updated. min: " SIZE_FORMAT
           ", desired: " SIZE_FORMAT ", actual: " SIZE_FORMAT,
           min_tlab_size, new_tlab_size, allocation._allocated_tlab_size);
    return NULL;
  }
  assert(allocation._allocated_tlab_size != 0, "Allocation succeeded but actual size not updated. mem at: "
         PTR_FORMAT " min: " SIZE_FORMAT ", desired: " SIZE_FORMAT,
         p2i(mem), min_tlab_size, new_tlab_size);

  if (ZeroTLAB) {
    // ..and clear it.
    Copy::zero_to_words(mem, allocation._allocated_tlab_size);
  } else {
    // ...and zap just allocated object.
#ifdef ASSERT
    // Skip mangling the space corresponding to the object header to
    // ensure that the returned space is not considered parsable by
    // any concurrent GC thread.
    size_t hdr_size = oopDesc::header_size();
    Copy::fill_to_words(mem + hdr_size, allocation._allocated_tlab_size - hdr_size, badHeapWordVal);
#endif // ASSERT
  }

  tlab.fill(mem, mem + _word_size, allocation._allocated_tlab_size);
  return mem;
}
- Check whether the TLAB's free space exceeds a threshold. The threshold is controlled by the JVM flag -XX:TLABRefillWasteFraction, which defaults to 64, i.e. 1/64 of the TLAB size. If the free space is above the threshold, the TLAB is kept and allocation falls through to the slow path outside the TLAB; if it is below, the TLAB is discarded and a new one is requested
- Compute the new TLAB size dynamically, and fill the TLAB about to be discarded with a dummy object
- Request a new TLAB
- Allocate the object in the new TLAB and return it
The TLAB size is computed as follows:
inline size_t ThreadLocalAllocBuffer::compute_size(size_t obj_size) {
  // Compute the size for the new TLAB.
  // The "last" tlab may be smaller to reduce fragmentation.
  // unsafe_max_tlab_alloc is just a hint.
  const size_t available_size = Universe::heap()->unsafe_max_tlab_alloc(thread()) / HeapWordSize;
  size_t new_tlab_size = MIN3(available_size, desired_size() + align_object_size(obj_size), max_size());

  // Make sure there's enough room for object and filler int[].
  if (new_tlab_size < compute_min_size(obj_size)) {
    // If there isn't enough room for the allocation, return failure.
    log_trace(gc, tlab)("ThreadLocalAllocBuffer::compute_size(" SIZE_FORMAT ") returns failure",
                        obj_size);
    return 0;
  }
  log_trace(gc, tlab)("ThreadLocalAllocBuffer::compute_size(" SIZE_FORMAT ") returns " SIZE_FORMAT,
                      obj_size, new_tlab_size);
  return new_tlab_size;
}
A fixed TLAB size can be set with the JVM flag -XX:TLABSize. If it is not set, the size is computed dynamically, influenced by the flag TLABWasteTargetPercent (default 1%) and the current number of allocating threads; it comes out to roughly Eden size * 2 * 1%, divided among the threads.
The dummy-object filling logic is as follows:
void ThreadLocalAllocBuffer::retire(ThreadLocalAllocStats* stats) {
  if (stats != NULL) {
    accumulate_and_reset_statistics(stats);
  }

  if (end() != NULL) {
    invariants();
    thread()->incr_allocated_bytes(used_bytes());
    insert_filler();
    initialize(NULL, NULL, NULL);
  }
}

void ThreadLocalAllocBuffer::insert_filler() {
  assert(end() != NULL, "Must not be retired");
  Universe::heap()->fill_with_dummy_object(top(), hard_end(), true);
}
The dummy object is filled in so that a walker traversing the HeapRegion can step over the retired space object by object instead of scanning it word by word; the dummy object is an int[].
The logic for allocating a new TLAB is in g1CollectedHeap.hpp and g1CollectedHeap.cpp:
static size_t humongous_threshold_for(size_t region_size) {
  return (region_size / 2);
}

HeapWord* G1CollectedHeap::allocate_new_tlab(size_t min_size,
                                             size_t requested_size,
                                             size_t* actual_size) {
  assert_heap_not_locked_and_not_at_safepoint();
  assert(!is_humongous(requested_size), "we do not allow humongous TLABs");
  return attempt_allocation(min_size, requested_size, actual_size);
}
- First check whether the request is humongous; if not, allocate
- A request with size > region_size / 2 is considered humongous
4. Slow-Path Allocation
When TLAB allocation fails, allocation enters the slow path, which allocates outside the TLAB in a young region (YHR) or, for humongous objects, a humongous region (HHR); the humongous path must first acquire the heap lock.
memAllocator.cpp
HeapWord* MemAllocator::allocate_outside_tlab(Allocation& allocation) const {
  allocation._allocated_outside_tlab = true;
  HeapWord* mem = _heap->mem_allocate(_word_size, &allocation._overhead_limit_exceeded);
  if (mem == NULL) {
    return mem;
  }

  NOT_PRODUCT(_heap->check_for_non_bad_heap_word_value(mem, _word_size));
  size_t size_in_bytes = _word_size * HeapWordSize;
  _thread->incr_allocated_bytes(size_in_bytes);
  return mem;
}
- Call mem_allocate; on success, update the thread's allocated-bytes counter
g1CollectedHeap.cpp
HeapWord* G1CollectedHeap::mem_allocate(size_t word_size,
                                        bool* gc_overhead_limit_was_exceeded) {
  assert_heap_not_locked_and_not_at_safepoint();

  if (is_humongous(word_size)) {
    return attempt_allocation_humongous(word_size);
  }
  size_t dummy = 0;
  return attempt_allocation(word_size, word_size, &dummy);
}
- Humongous objects go to attempt_allocation_humongous; everything else goes to attempt_allocation
The humongous allocation logic is shown below; the code is long, so only the key parts are kept:
HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) {
  if (g1_policy()->need_to_start_conc_mark("concurrent humongous allocation",
                                           word_size)) {
    collect(GCCause::_g1_humongous_allocation);
  }

  HeapWord* result = NULL;
  for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
    bool should_try_gc;
    uint gc_count_before;

    {
      MutexLockerEx x(Heap_lock);

      result = humongous_obj_allocate(word_size);
      if (result != NULL) {
        size_t size_in_regions = humongous_obj_size_in_regions(word_size);
        g1_policy()->add_bytes_allocated_in_old_since_last_gc(size_in_regions * HeapRegion::GrainBytes);
        return result;
      }

      should_try_gc = !GCLocker::needs_gc();
      gc_count_before = total_collections();
    }

    if (should_try_gc) {
      bool succeeded;
      result = do_collection_pause(word_size, gc_count_before, &succeeded,
                                   GCCause::_g1_humongous_allocation);
      if (result != NULL) {
        assert(succeeded, "only way to get back a non-NULL result");
        log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT,
                             Thread::current()->name(), p2i(result));
        return result;
      }

      if (succeeded) {
        log_trace(gc, alloc)("%s: Successfully scheduled collection failing to allocate "
                             SIZE_FORMAT " words", Thread::current()->name(), word_size);
        return NULL;
      }
      log_trace(gc, alloc)("%s: Unsuccessfully scheduled collection allocating " SIZE_FORMAT "",
                           Thread::current()->name(), word_size);
    } else {
      // Failed to schedule a collection.
      if (gclocker_retry_count > GCLockerRetryAllocationCount) {
        log_warning(gc, alloc)("%s: Retried waiting for GCLocker too often allocating "
                               SIZE_FORMAT " words", Thread::current()->name(), word_size);
        return NULL;
      }
      GCLocker::stall_until_clear();
      gclocker_retry_count += 1;
    }
  }
}
- Since a humongous allocation can grow heap occupancy rapidly, first check whether a concurrent-mark cycle needs to be started before allocating
- Take the Heap_lock
- Attempt the allocation; return on success
- If a GC is needed, trigger it; after the GC, go back to the locking step and retry the allocation until it succeeds, or until the GCLocker retry count exceeds GCLockerRetryAllocationCount
The core of attempt_allocation bottoms out in heapRegion.inline.hpp:
inline HeapWord* G1ContiguousSpace::par_allocate_impl(size_t min_word_size,
                                                      size_t desired_word_size,
                                                      size_t* actual_size) {
  do {
    HeapWord* obj = top();
    size_t available = pointer_delta(end(), obj);
    size_t want_to_allocate = MIN2(available, desired_word_size);
    if (want_to_allocate >= min_word_size) {
      HeapWord* new_top = obj + want_to_allocate;
      HeapWord* result = Atomic::cmpxchg(new_top, top_addr(), obj);
      // result can be one of two:
      //  the old top value: the exchange succeeded
      //  otherwise: the new value of the top is returned.
      if (result == obj) {
        assert(is_aligned(obj) && is_aligned(new_top), "checking alignment");
        *actual_size = want_to_allocate;
        return obj;
      }
    } else {
      return NULL;
    }
  } while (true);
}
- Allocation uses a CAS on top: if the CAS fails because another thread bumped top first, the loop retries; when the remaining space is smaller than the minimum request, it returns NULL so the caller can pick another region or trigger a GC.
5. Summary
In summary, G1 allocates an object's heap space first in the TLAB; when TLAB allocation fails, it allocates directly in a heap region (HR); if that still fails, a GC is triggered.
6. References
JDK 12 source code [https://hg.openjdk.java.net/jdk/jdk12]