MarkSweepPolicy, the GC Policy of GenCollectedHeap

Latest recommended article published on 2020-03-07 13:28:50

读程序的手艺人

Category column: Hotspot Source Code Analysis (Hotspot 源码解析)

Article link: https://blog.csdn.net/xhh198781/article/details/41909285

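When the heap manager cannot satisfy an allocation request from an application thread (that is, there is not enough free memory), it triggers a GC to reclaim some or all garbage objects and free enough space for the request; if that is still not enough, an OOM is thrown. The basic idea of MarkSweepPolicy is to mark the active (live) objects and then sweep the unmarked (non-active) ones.

MarkSweepPolicy is the default GC policy of the heap manager GenCollectedHeap. For a generational heap manager like this one, the optimization is to confine GC to the young generation as much as possible, because cleaning up non-active objects there is easy: copy all active objects in the young generation's Eden/From spaces to its To space or to the old generation, then clear the whole Eden/From spaces. Cleaning up the old generation is harder: there is no other memory to copy its active objects into, so free space can only be recovered by sliding and compacting the objects in place (the so-called defragmentation).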

Figure: young-generation GC (Minor GC)

Figure: old-generation GC
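
As a rough illustration of the copying idea described above, the following C++ sketch models a minor GC on a toy heap. `ToyObject`, `ToyYoungGen` and `minor_gc` are hypothetical names invented for this example and are not HotSpot code. The sketch assumes liveness is already known and the To space is empty on entry, and it ignores object ages and pointer updates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy object: only a liveness flag and a size in bytes.
struct ToyObject {
  bool        live;
  std::size_t size;
};

// Hypothetical young generation with Eden, From and To spaces.
struct ToyYoungGen {
  std::vector<ToyObject> eden;
  std::vector<ToyObject> from;
  std::vector<ToyObject> to;          // assumed empty before the GC
  std::size_t            to_capacity; // bytes available in the To space
};

// Copies live objects out of Eden/From into To; whatever does not fit is
// promoted to the old generation. Eden and From are then emptied and the
// survivor spaces swap roles. Returns the number of promoted objects.
std::size_t minor_gc(ToyYoungGen& young, std::vector<ToyObject>& old_gen) {
  std::size_t to_used = 0;
  std::size_t promoted = 0;
  auto evacuate = [&](const std::vector<ToyObject>& space) {
    for (const ToyObject& obj : space) {
      if (!obj.live) continue;                      // garbage is simply left behind
      if (to_used + obj.size <= young.to_capacity) {
        young.to.push_back(obj);                    // survivor stays in the young generation
        to_used += obj.size;
      } else {
        old_gen.push_back(obj);                     // To is full: promote
        ++promoted;
      }
    }
  };
  evacuate(young.eden);
  evacuate(young.from);
  young.eden.clear();          // the whole Eden space is free again
  young.from.clear();          // so is the old From space
  young.from.swap(young.to);   // survivors now live in From; To is empty
  return promoted;
}
```

The point of the sketch is that the cost depends only on the live objects: dead objects are never touched, and Eden/From are released wholesale.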

First, a recap of how GenCollectedHeap triggers a GC:

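    // Trigger a GC: enqueue a GC VM operation on the VMThread's operation queue.
    // The GC itself is executed by the VMThread or by a dedicated GC thread.
    VM_GenCollectForAllocation op(size, is_tlab, gc_count_before);
    VMThread::execute(&op);

    if (op.prologue_succeeded()) {  // the GC operation has completed
      result = op.result();
      if (op.gc_locked()) {  // this thread did not manage to trigger a GC (another thread may just have done so); retry the allocation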
         assert(result == NULL, "must be NULL if gc_locked() is true");
         continue;  // retry and/or stall as necessary
      }

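      // At this point a failed allocation has triggered a GC and the GC has finished. If the GC overhead
      // limit was exceeded (even if the allocation succeeded after the GC), report failure to the caller,
      // which throws an OOM to the application thread, and clear the limit-exceeded flag.

      // Did this GC exceed the configured GC overhead limit?
      const bool limit_exceeded = size_policy()->gc_overhead_limit_exceeded();
      const bool softrefs_clear = all_soft_refs_clear();

      // If the overhead limit was exceeded, soft references must already have been cleared
      assert(!limit_exceeded || softrefs_clear, "Should have been cleared");

      // GC overhead limit exceeded
      if (limit_exceeded && softrefs_clear) {
        *gc_overhead_limit_was_exceeded = true;
        // Clear the limit-exceeded flag
        size_policy()->set_gc_overhead_limit_exceeded(false);
        if (op.result() != NULL) {
          CollectedHeap::fill_with_object(op.result(), size);
        }
        // Limit exceeded: return NULL so that the caller throws an out-of-memory error
        return NULL;
      }

      // On success, the block must lie inside the heap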
      assert(result == NULL || gch->is_in_reserved(result), "result not in heap");
      return result;
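
The decision made after the GC operation returns can be condensed into a small function. The sketch below is only an illustration: `ToyOpState`, `Outcome` and `classify` are hypothetical names, the fields merely mirror the calls in the excerpt above, and the "prologue did not succeed" branch is an assumption (the excerpt does not show what happens in that case; the sketch simply retries).

```cpp
// Hypothetical snapshot of what the allocating thread sees once
// VMThread::execute(&op) has returned.
struct ToyOpState {
  bool prologue_succeeded;  // mirrors op.prologue_succeeded()
  bool gc_locked;           // mirrors op.gc_locked()
  bool limit_exceeded;      // mirrors size_policy()->gc_overhead_limit_exceeded()
  bool softrefs_clear;      // mirrors all_soft_refs_clear()
  bool has_result;          // mirrors op.result() != NULL
};

enum class Outcome { Retry, ReportOom, ReturnResult, ReturnNull };

Outcome classify(const ToyOpState& s) {
  if (!s.prologue_succeeded) return Outcome::Retry;  // assumption, not shown in the excerpt
  if (s.gc_locked) return Outcome::Retry;            // GC could not run; try the allocation again
  if (s.limit_exceeded && s.softrefs_clear) {
    // Even a successful allocation is discarded here: the caller gets NULL
    // and raises an out-of-memory error.
    return Outcome::ReportOom;
  }
  return s.has_result ? Outcome::ReturnResult : Outcome::ReturnNull;
}
```

Note that in the overhead-limit case the result is thrown away even when the allocation succeeded, which is why the original code fills the block with a dummy object before returning NULL.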

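It is worth noting that when any Java-level application thread fails to obtain memory from the heap manager, it creates a JVM operation, VM_GenCollectForAllocation, and hands it to the VMThread for scheduling and execution (how the VMThread schedules this operation is not covered in this article). The application thread does not run the GC itself; it only waits for the operation to complete. VM_GenCollectForAllocation stands for "an allocation failed and needs special handling": either run a GC, or expand the heap or one of its generations, whatever it takes to free enough space for the request. Its logic is as follows: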

void VM_GenCollectForAllocation::doit() {
  SvcGCMarker sgcm(SvcGCMarker::MINOR);

  GenCollectedHeap* gch = GenCollectedHeap::heap();
  GCCauseSetter gccs(gch, _gc_cause);

  // Ask the heap manager to handle the failed allocation
  _res = gch->satisfy_failed_allocation(_size, _tlab);

  // Make sure the allocated block (if any) lies inside the heap
  assert(gch->is_in_reserved_or_null(_res), "result not in heap");

  if (_res == NULL && GC_locker::is_active_and_needs_gc()) {
    set_gc_locked();
  }
}

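In the end, GenCollectedHeap still delegates the work to a garbage collector. MarkSweepPolicy's overall approach to a failed allocation is to pick a suitable kind of GC that frees enough memory for the application thread that triggered it; how the heap manager actually carries out that GC is not its concern. The core strategy of MarkSweepPolicy for handling a failed allocation is:

1. Choose the GC type
   1) If a GC has been requested but cannot run yet, skip the GC (the code below only tries to expand the heap and then returns)
   2) If an incremental GC (collecting only the young generation) is safe, run a minor GC
   3) Otherwise, run a full GC
2. Try to allocate the block from each generation in turn, from young to old
3. Try to allocate by expanding the capacity of each generation in turn, from old to young
4. Run one thorough full GC (clearing all soft references)
5. Try to allocate the block from each generation in turn, from young to old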
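
The steps above can be sketched as a small decision function over a toy heap model. Everything in this block is an assumption made for illustration: `ToyHeap`, its boolean fields and `plan_failed_allocation` are hypothetical and simply encode "what would happen" at each stage; the function returns the ordered list of actions taken.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the heap; each flag states the outcome of one stage.
struct ToyHeap {
  bool gc_locker_active;        // a GC was requested but cannot run yet
  bool heap_at_max;             // no generation can grow any further
  bool minor_gc_would_fail;     // an incremental collection is considered unsafe
  bool alloc_ok_after_gc;       // plain allocation succeeds after the first GC
  bool alloc_ok_after_expand;   // allocation succeeds once generations are expanded
  bool alloc_ok_after_last_gc;  // allocation succeeds after the last-ditch full GC
};

// Returns the actions in the order they are tried; `satisfied` reports whether
// the request could be served in the end.
std::vector<std::string> plan_failed_allocation(const ToyHeap& h, bool& satisfied) {
  std::vector<std::string> steps;
  satisfied = false;
  if (h.gc_locker_active) {               // step 1.1: no GC at all
    if (!h.heap_at_max) {
      steps.push_back("expand");
      satisfied = h.alloc_ok_after_expand;
    }
    return steps;
  }
  // steps 1.2 / 1.3: minor GC if it is safe, otherwise a full GC
  steps.push_back(h.minor_gc_would_fail ? "full-gc" : "minor-gc");
  steps.push_back("allocate");            // step 2: young to old
  if (h.alloc_ok_after_gc) { satisfied = true; return steps; }
  steps.push_back("expand");              // step 3: old to young
  if (h.alloc_ok_after_expand) { satisfied = true; return steps; }
  steps.push_back("full-gc-clear-soft-refs");  // step 4
  steps.push_back("allocate");                 // step 5
  satisfied = h.alloc_ok_after_last_gc;
  return steps;
}
```

In the worst case the function walks through all five stages and still reports failure, which corresponds to the final `return NULL` and the resulting OOM.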

See the code for the implementation details:

/**
 * Handle a failed memory allocation request from an application thread (Java-level thread)
 */
HeapWord* GenCollectorPolicy::satisfy_failed_allocation(size_t size,
                                                        bool   is_tlab) {
  GenCollectedHeap *gch = GenCollectedHeap::heap();
  GCCauseSetter x(gch, GCCause::_allocation_failure);
  HeapWord* result = NULL;

  assert(size != 0, "Precondition violated");

  if (GC_locker::is_active_and_needs_gc()) {  // a GC has been requested but cannot run yet

    if (!gch->is_maximal_no_gc()) { // some generation can still grow, so try to allocate by expanding a generation
      result = expand_heap_and_allocate(size, is_tlab);
    }

    return result;   // could be null if we are out of space

  } else if (!gch->incremental_collection_will_fail(false /* don't consult_young */)) { // an incremental collection is expected to succeed, so trigger only a minor GC
    // incremental GC
    gch->do_collection(false            /* full */,
                       false            /* clear_all_soft_refs */,
                       size             /* size */,
                       is_tlab          /* is_tlab */,
                       number_of_generations() - 1 /* max_level */);
  } else {  // run a full GC
    if (Verbose && PrintGCDetails) {
      gclog_or_tty->print(" :: Trying full because partial may fail :: ");
    }
    // Try a full collection; see delta for bug id 6266275
    // for the original code and why this has been simplified
    // with from-space allocation criteria modified and
    // such allocation moved out of the safepoint path.
    gch->do_collection(true             /* full */,
                       false            /* clear_all_soft_refs */,
                       size             /* size */,
                       is_tlab          /* is_tlab */,
                       number_of_generations() - 1 /* max_level */);
  }

  // After the GC, try again to allocate a block of the requested size from each generation in turn
  result = gch->attempt_allocation(size, is_tlab, false /*first_only*/);

  if (result != NULL) {
    assert(gch->is_in_reserved(result), "result not in heap");
    return result;
  }

  // The GC may have left room to expand the generations,
  // so try again to allocate the block while allowing the generations to grow
  result = expand_heap_and_allocate(size, is_tlab);
  if (result != NULL) {
    return result;
  }

  // If we reach this point, we're really out of memory. Try every trick
  // we can to reclaim memory. Force collection of soft references. Force
  // a complete compaction of the heap. Any additional methods for finding
  // free memory should be here, especially if they are expensive. If this
  // attempt fails, an OOM exception will be thrown.
  {
    IntFlagSetting flag_change(MarkSweepAlwaysCompactCount, 1); // Make sure the heap is fully compacted

    // One last, thorough GC: collect all generations and clear soft references
    gch->do_collection(true             /* full */,
                       true             /* clear_all_soft_refs */,
                       size             /* size */,
                       is_tlab          /* is_tlab */,
                       number_of_generations() - 1 /* max_level */);
  }

  // After the thorough GC, make a final attempt to allocate the block from each generation in turn
  result = gch->attempt_allocation(size, is_tlab, false /* first_only */);
  if (result != NULL) {
    assert(gch->is_in_reserved(result), "result not in heap");
    return result;
  }

  assert(!should_clear_all_soft_refs(), "Flag should have been handled and cleared prior to this point");

  // What else?  We might try synchronous finalization later.  If the total
  // space available is large enough for the allocation, then a more
  // complete compaction phase than we've tried so far might be
  // appropriate.
  return NULL;
}
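
The two allocation passes used in the function above walk the generations in opposite directions, as the strategy list states: plain allocation goes from young to old, expansion goes from old to young. The sketch below illustrates just that ordering. `ToyGenSpace`, `toy_attempt_allocation` and `toy_expand_and_allocate` are hypothetical simplifications, not the HotSpot functions of similar name, and sizes are plain word counts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-generation bookkeeping; index 0 is the youngest generation.
struct ToyGenSpace {
  std::size_t free_words;      // currently free
  std::size_t growable_words;  // how much the generation may still grow
};

// Young to old: the first generation with enough free space serves the request.
// Returns the generation level, or -1 if none can serve it.
int toy_attempt_allocation(std::vector<ToyGenSpace>& gens, std::size_t size) {
  for (std::size_t i = 0; i < gens.size(); i++) {
    if (gens[i].free_words >= size) {
      gens[i].free_words -= size;
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Old to young: grow a generation just enough to serve the request.
// Returns the generation level, or -1 if no generation can grow enough.
int toy_expand_and_allocate(std::vector<ToyGenSpace>& gens, std::size_t size) {
  for (int i = static_cast<int>(gens.size()) - 1; i >= 0; i--) {
    ToyGenSpace& g = gens[i];
    if (g.free_words + g.growable_words >= size) {
      if (g.free_words < size) {
        std::size_t grow = size - g.free_words;  // expand only by what is missing
        g.growable_words -= grow;
        g.free_words += grow;
      }
      g.free_words -= size;
      return i;
    }
  }
  return -1;
}
```

Starting the expansion at the old generation keeps the young generation small and cheap to collect, which fits the article's theme of confining GC work to the young generation.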

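For a generational heap manager such as GenCollectedHeap, the constant goal when reclaiming garbage is to keep collection in the young generations as much as possible, because that is cheap and effective; garbage in the young generation is therefore reclaimed first. The concrete strategy is:

1. Decide whether this GC clears all soft references (java.lang.ref package)
2. Decide which generations take part in this GC
3. Collect those generations in order from youngest to oldest (the loop runs from starting_level up to max_level)
4. Resize those generations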
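
The ordering in these steps can be illustrated with a toy planner. This is a sketch under stated assumptions: `ToyGen`, `ToyPlan` and `plan_collection` are hypothetical, the two boolean fields stand in for `full_collects_younger_generations()` and `should_collect(...)`, and no memory is actually collected. It only records which levels would be collected and resized, and in which order.

```cpp
#include <vector>

// Hypothetical generation descriptor; index 0 is the youngest generation.
struct ToyGen {
  bool full_collects_younger;  // stands in for full_collects_younger_generations()
  bool should_collect;         // stands in for should_collect(full, size, is_tlab)
};

struct ToyPlan {
  std::vector<int> collected;  // levels collected, in collection order
  std::vector<int> resized;    // levels resized, in resize order
  bool complete;               // true if this amounts to a full (major) collection
};

ToyPlan plan_collection(const std::vector<ToyGen>& gens, bool full, int max_level) {
  ToyPlan plan;
  plan.complete = false;
  const int n = static_cast<int>(gens.size());

  // For a full GC, start at the oldest generation that also collects all younger ones.
  int starting_level = 0;
  if (full) {
    for (int i = max_level; i >= 0; i--) {
      if (gens[i].full_collects_younger) { starting_level = i; break; }
    }
  }

  // Collect from starting_level upwards, i.e. from younger to older.
  int max_level_collected = starting_level;
  for (int i = starting_level; i <= max_level; i++) {
    if (gens[i].should_collect) {
      plan.collected.push_back(i);
      max_level_collected = i;
    }
  }

  plan.complete = (full && max_level == n - 1) || (max_level_collected == n - 1);

  // Resize from the oldest collected level down to the youngest.
  for (int j = max_level_collected; j >= 0; j--) {
    plan.resized.push_back(j);
  }
  return plan;
}
```

With two generations where the old one collects the young one as part of a full GC, a full request yields a single collection at level 1, while a minor request visits level 0 first and level 1 only if it asks to be collected.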

The details of GenCollectedHeap's do_collection implementation are as follows:

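/**
 * Perform one GC
 *
 * @param full whether to run a full GC or a minor (incremental) GC
 * @param clear_all_soft_refs whether this GC must clear all soft references (the collector policy can also require it)
 * @param size size of the block to be allocated after this GC
 * @param is_tlab whether the pending allocation is for a thread-local allocation buffer (TLAB)
 * @param max_level the oldest generation this GC is allowed to collect
 */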
void GenCollectedHeap::do_collection(bool  full,
                                     bool   clear_all_soft_refs,
                                     size_t size,
                                     bool   is_tlab,
                                     int    max_level) {
  bool prepared_for_verification = false;
  ResourceMark rm;
  DEBUG_ONLY(Thread* my_thread = Thread::current();)

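  /**
   * Four conditions must hold for the thread that performs the GC:
   *   1. it is at a safepoint
   *   2. it is the VM thread or a dedicated (concurrent) GC thread
   *   3. the heap's global lock (Heap_lock) is held by the thread that requested the GC
   *   4. no other thread has already started a GC (collection is not reentrant)
   */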
  assert(SafepointSynchronize::is_at_safepoint(), "should be at safepoint");
  assert(my_thread->is_VM_thread() || my_thread->is_ConcurrentGC_thread(), "incorrect thread type capability");
  assert(Heap_lock->is_locked(), "the requesting thread should have the Heap_lock");
  guarantee(!is_gc_active(), "collection is not reentrant");

  assert(max_level < n_gens(), "sanity check");

  // GC is currently disabled
  if (GC_locker::check_active_before_gc()) {
    return; // GC is disabled (e.g. JNI GetXXXCritical operation)
  }

  // Does this GC have to clear all soft references?
  const bool do_clear_all_soft_refs = clear_all_soft_refs || collector_policy()->should_clear_all_soft_refs();

  ClearedAllSoftRefs casr(do_clear_all_soft_refs, collector_policy());

  // Current usage of the permanent generation
  const size_t perm_prev_used = perm_gen()->used();

  print_heap_before_gc();
  if (Verbose) {
    gclog_or_tty->print_cr("GC Cause: %s", GCCause::to_string(gc_cause()));
  }

  {
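    FlagSetting fl(_is_gc_active, true);  // the current thread officially starts the GC

    // Is this a complete (full) GC, i.e. full and covering every generation?
    bool complete = full && (max_level == (n_gens()-1));

    // GC type label for the log (minor or full GC)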
    const char* gc_cause_str = "GC ";
    if (complete) {
      GCCause::Cause cause = gc_cause();
      if (cause == GCCause::_java_lang_system_gc) {  // triggered by the application calling System.gc()
        gc_cause_str = "Full GC (System) ";
      } else {
        gc_cause_str = "Full GC ";
      }
    }
    gclog_or_tty->date_stamp(PrintGC && PrintGCDateStamps);

    // Measure the CPU time of this GC
    TraceCPUTime tcpu(PrintGCDetails, true, gclog_or_tty);
    TraceTime t(gc_cause_str, PrintGCDetails, false, gclog_or_tty);

    // Pre-GC processing (prologue)
    gc_prologue(complete);

    increment_total_collections(complete);  // update the GC counters

    // Total heap usage before this GC
    size_t gch_prev_used = used();

    // Decide which generations to collect
    int starting_level = 0;
    if (full) {
      // Search for the oldest generation which will collect all younger
      // generations, and start collection loop there.
      // For a full GC, search backwards from the oldest generation for the first one that can collect all younger generations
      for (int i = max_level; i >= 0; i--) {
        if (_gens[i]->full_collects_younger_generations()) {
          starting_level = i;
          break;
        }
      }
    }

    bool must_restore_marks_for_biased_locking = false;

    // Highest (oldest) generation level collected in this GC
    int max_level_collected = starting_level;

    for (int i = starting_level; i <= max_level; i++) {
      if (_gens[i]->should_collect(full, size, is_tlab)) {  // should this generation be collected?
        // If the generation being collected is the oldest one, this GC counts as a full (major) GC
        if (i == n_gens() - 1) {  // a major collection is to happen
          if (!complete) {
            // The full_collections increment was missed above.
            increment_total_full_collections();
          }
          pre_full_gc_dump();    // do any pre full gc dumps
        }

        // Record this generation's GC time, GC count and memory changes
        TraceTime t1(_gens[i]->short_name(), PrintGCDetails, false, gclog_or_tty);
        TraceCollectorStats tcs(_gens[i]->counters());
        TraceMemoryManagerStats tmms(_gens[i]->kind(),gc_cause());

        // Usage of this generation before the GC
        size_t prev_used = _gens[i]->used();
        _gens[i]->stat_record()->invocations++;
        _gens[i]->stat_record()->accumulated_time.start();

        // Must be done a new before each collection because
        // a previous collection will do mangling and will
        // change top of some spaces.
        record_gen_tops_before_GC();

        if (PrintGC && Verbose) {
          gclog_or_tty->print("level=%d invoke=%d size=" SIZE_FORMAT,
                     i,
                     _gens[i]->stat_record()->invocations,
                     size*HeapWordSize);
        }

        if (VerifyBeforeGC && i >= VerifyGCLevel && total_collections() >= VerifyGCStartAt) {
          HandleMark hm;  // Discard invalid handles created during verification
          if (!prepared_for_verification) {
            prepare_for_verify();
            prepared_for_verification = true;
          }
          gclog_or_tty->print(" VerifyBeforeGC:");
          Universe::verify(true);
        }
        COMPILER2_PRESENT(DerivedPointerTable::clear());

        if (!must_restore_marks_for_biased_locking &&
            _gens[i]->performs_in_place_marking()) {
          // We perform this mark word preservation work lazily
          // because it's only at this point that we know whether we
          // absolutely have to do it; we want to avoid doing it for
          // scavenge-only collections where it's unnecessary
          must_restore_marks_for_biased_locking = true;
          BiasedLocking::preserve_marks();
        }

        // Start collecting the current generation
        {
          // Note on ref discovery: For what appear to be historical reasons,
          // GCH enables and disabled (by enqueing) refs discovery.
          // In the future this should be moved into the generation's
          // collect method so that ref discovery and enqueueing concerns
          // are local to a generation. The collect method could return
          // an appropriate indication in the case that notification on
          // the ref lock was needed. This will make the treatment of
          // weak refs more uniform (and indeed remove such concerns
          // from GCH). XXX

          printf("%s[%d] [tid: %lu]: start collecting generation [%d: %s]...\n", __FILE__, __LINE__, pthread_self(), i, _gens[i]->name());

          HandleMark hm;  // Discard invalid handles created during gc
          save_marks();   // save marks for all gens

          // We want to discover references, but not process them yet.
          // This mode is disabled in process_discovered_references if the
          // generation does some collection work, or in
          // enqueue_discovered_references if the generation returns
          // without doing any work.
          ReferenceProcessor* rp = _gens[i]->ref_processor();

          // If the discovery of ("weak") refs in this generation is
          // atomic wrt other collectors in this configuration, we
          // are guaranteed to have empty discovered ref lists.
          if (rp->discovery_is_atomic()) {
            rp->enable_discovery(true /*verify_disabled*/, true /*verify_no_refs*/);
            rp->setup_policy(do_clear_all_soft_refs);
          } else {
            // collect() below will enable discovery as appropriate
          }

          printf("%s[%d] [tid: %lu]: start reclaiming garbage in generation [%d: %s] (full=%s, do_clear_all_soft_refs=%s)...\n", __FILE__, __LINE__, pthread_self(),
                 i, _gens[i]->name(), full? "true":"false", do_clear_all_soft_refs? "true":"false");

          // Actually collect the current generation
          _gens[i]->collect(full, do_clear_all_soft_refs, size, is_tlab);

          if (!rp->enqueuing_is_done()) {
            rp->enqueue_discovered_references();
          } else {
            rp->set_enqueuing_is_done(false);
          }
          rp->verify_no_references_recorded();
        }

        max_level_collected = i;

        // After collecting this generation, can the allocation request now be satisfied?
        if (size > 0) {
          if (!is_tlab || _gens[i]->supports_tlab_allocation()) {
            if (size*HeapWordSize <= _gens[i]->unsafe_max_alloc_nogc()) {
              size = 0;
            }
          }
        }

        COMPILER2_PRESENT(DerivedPointerTable::update_pointers());

        _gens[i]->stat_record()->accumulated_time.stop();

        update_gc_stats(i, full);

        if (VerifyAfterGC && i >= VerifyGCLevel && total_collections() >= VerifyGCStartAt) {
          HandleMark hm;  // Discard invalid handles created during verification
          gclog_or_tty->print(" VerifyAfterGC:");
          Universe::verify(false);
        }

        if (PrintGCDetails) {
          gclog_or_tty->print(":");
          _gens[i]->print_heap_change(prev_used);
        }
      }
    }//for

    // Did this turn out to be a full GC?
    complete = complete || (max_level_collected == n_gens() - 1);

    if (complete) { // We did a "major" collection
      post_full_gc_dump();   // do any post full gc dumps
    }

    // Print how the heap changed in this GC; for a full GC, also print the change in the permanent generation
    if (PrintGCDetails) {
      print_heap_change(gch_prev_used);

      if (complete) {
        print_perm_heap_change(perm_prev_used);
      }
    }

    // After the GC, resize the generations, from the oldest collected one down to the youngest
    for(int j = max_level_collected; j >= 0; j -= 1) {
      // Adjust generation sizes.
      printf("%s[%d] [tid: %lu]: trying to resize generation [%d: %s].\n", __FILE__, __LINE__, pthread_self(), j, _gens[j]->name());
      _gens[j]->compute_new_size();
    }

    // After a full GC, resize the permanent generation
    if (complete) {
      // Ask the permanent generation to adjust size for full collections
      printf("%s[%d] [tid: %lu]: trying to resize the permanent generation [%s] (this happens only after a full GC).\n", __FILE__, __LINE__, pthread_self(), perm()->as_gen()->name());
      perm()->compute_new_size();
      update_full_collections_completed();
    }

    // Track memory usage and detect low memory after GC finishes
    MemoryService::track_memory_usage();

    // Post-GC processing (epilogue)
    gc_epilogue(complete);

    if (must_restore_marks_for_biased_locking) {
      BiasedLocking::restore_marks();
    }
  }

  // Print the size adjustments of the generations
  AdaptiveSizePolicy* sp = gen_policy()->size_policy();
  AdaptiveSizePolicyOutput(sp, total_collections());

  print_heap_after_gc();

#ifdef TRACESPINNING
  ParallelTaskTerminator::print_termination_counts();
#endif

  // Exit the JVM once the total number of collections reaches ExitAfterGCNum
  if (ExitAfterGCNum > 0 && total_collections() == ExitAfterGCNum) {
    tty->print_cr("Stopping after GC #%d", ExitAfterGCNum);
    vm_exit(-1);
  }
}

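How each generation manager reclaims the garbage objects in its own generation, and how it resizes that generation, will be covered in detail in later articles.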