ART视角 | 如何在native内存增长过多时自动触发GC？如何在Java对象回收时触发native内存回收？

最新推荐文章于 2024-03-08 17:13:09 发布

沈页

最新推荐文章于 2024-03-08 17:13:09 发布

阅读量1.2k

点赞数

分类专栏： android Android进阶文章标签： android

本文链接：https://blog.csdn.net/Androiddddd/article/details/109678554

版权

android 同时被 2 个专栏收录

164 篇文章 13 订阅

订阅专栏

Android进阶

77 篇文章 1 订阅

订阅专栏

本文分析基于Android R(11)

前言

GC用于Java堆内存的回收，这是人尽皆知的事实。然而现在有些Java类被设计成牵线木偶，Java对象只存储一些“线”，其真实的内存消耗全都放到了native内存中。譬如Bitmap。对它们而言，如何自动回收操纵的native内存成为一个亟须解决的问题。

想要自动回收，必须依赖GC机制。但仅仅依靠现有的GC机制还不够。我们还需要考虑以下两点：

如何在native内存增长过多的时候自动触发GC
如何在GC回收Java对象时 同步回收 native资源

Android从N开始引入了NativeAllocationRegistry类。早期的版本可以保证在GC回收Java对象时同步回收native资源（上述第2点），其内部用到的正是上一篇博客介绍过的Cleaner机制。

利用早期版本的NativeAllocationRegistry，native资源虽然可以回收，但仍然有些缺陷。譬如被设计成牵线木偶的Java类所占空间很小，但其间接引用的native资源占用很大。因此就会导致Java堆的增长很慢，而native堆的增长很快。在某些场景下，Java堆的增长还没有达到下一次GC触发的水位，而native堆中的垃圾已经堆积成山。由程序主动调用System.gc()当然可以缓解这个问题，但开发者如何控制这个频率？频繁的话就会降低运行性能，稀疏的话就会导致native垃圾无法及时释放。因此新版本的NativeAllocationRegistry连同GC一起做了调整，使得进程在native内存增长过多的时候可以自动触发GC，也即上述的第1点。相当于以前的GC触发只考虑Java堆的使用大小，现在连同native堆一起考虑进去了。

native垃圾堆积成山的问题会导致一些严重的问题，譬如最近国内很多32位APK上碰到过的native内存OOM问题，其中字节跳动专门发过博客介绍他们的解决方案。在链接的博客里，字节跳动团队提供了应用层的解决方案，由应用层来主动释放native资源。但这个问题的根本解决还得依赖底层设计的修改。看完字节跳动的博客后，我专门联系过Android团队，建议他们在CameraMetadataNative类中使用NativeAllocationRegistry。他们很快接受了这个提议，并提供了新的实现。相信字节跳动遇到的这个问题在S上将不会存在。

1. 如何在native内存增长过多时自动触发GC

当Java类被设计成牵线木偶时，其native内存的分配通常有两种方式。一种是malloc(new的内部通常也是调用malloc)分配堆内存，另一种是mmap分配匿名页。二者最大的区别是malloc通常用于小内存分配，而mmap通常用于大内存分配。

当我们使用NativeAllocationRegistry为该Java对象自动释放native内存时，首先需要调用registerNativeAllocation，一方面告知GC本次native分配的资源大小，另一方面检测是否达到GC的触发条件。根据内存分配方式的不同，处理方式也不太一样。

libcore/luni/src/main/java/libcore/util/NativeAllocationRegistry.java

     // Inform the garbage collector of the allocation. We do this differently for
     // malloc-based allocations.
     private static void registerNativeAllocation(long size) {
         VMRuntime runtime = VMRuntime.getRuntime();
         if ((size & IS_MALLOCED) != 0) {     <==================如果native内存是通过malloc方式分配的，则走这个if分支
             final long notifyImmediateThreshold = 300000;
             if (size >= notifyImmediateThreshold) {   <=========如果native内存大于等于300000bytes(~300KB),则走这个分支
                 runtime.notifyNativeAllocationsInternal();
             } else {                         <==================如果native内存小于300000bytes，则走这个分支
                 runtime.notifyNativeAllocation();
             }
         } else {
             runtime.registerNativeAllocation(size);
         }
     }

1.1 Malloc内存

Malloc分配的内存会有两个判断条件。

此次分配是否大于等于300,000bytes。大于的话则走VIP通道直接执行CheckGCForNative函数。该函数内部会统计native内存分配的总量，判断其是否达到GC触发的阈值。如果达到的话则触发一次GC。
此次分配是否是300次分配的整数倍。这个判定条件用于限定CheckGCForNative的执行次数，每300次malloc才去执行一次检测。

接下来看看CheckGCForNative函数内部的逻辑。

首先计算当前native内存的总大小，然后计算当前内存大小和阈值之间的比值，如果比值≥1，则请求一次新的GC。

art/runtime/gc/heap.cc

 inline void Heap::CheckGCForNative(Thread* self) {
   bool is_gc_concurrent = IsGcConcurrent();
   size_t current_native_bytes = GetNativeBytes();    <================获取native内存的总大小
   float gc_urgency = NativeMemoryOverTarget(current_native_bytes, is_gc_concurrent); <============计算当前内存大小和阈值之间的比值，大于等于1则表明需要一次新的GC
   if (UNLIKELY(gc_urgency >= 1.0)) {
     if (is_gc_concurrent) {
       RequestConcurrentGC(self, kGcCauseForNativeAlloc, /*force_full=*/true);   <=================请求一次新的GC
       if (gc_urgency > kStopForNativeFactor
           && current_native_bytes > stop_for_native_allocs_) {
         // We're in danger of running out of memory due to rampant native allocation.
         if (VLOG_IS_ON(heap) || VLOG_IS_ON(startup)) {
           LOG(INFO) << "Stopping for native allocation, urgency: " << gc_urgency;
         }
         WaitForGcToComplete(kGcCauseForNativeAlloc, self);
       }
     } else {
       CollectGarbageInternal(NonStickyGcType(), kGcCauseForNativeAlloc, false);
     }
   }
 }

获取当前native内存的总大小需要调用GetNativeBytes函数。其内部统计也分为两部分，一部分是通过mallinfo获取的当前malloc的总大小。由于系统有专门的API获取这个信息，所以在NativeAllocationRegistry.registerNativeAllocation的时候不需要专门去存储单次malloc的大小。另一部分是native_bytes_registered_字段记录的所有注册过的mmap大小。二者相加，基本上反映了当前进程native内存的整体消耗。

art/runtime/gc/heap.cc

 size_t Heap::GetNativeBytes() {
   size_t malloc_bytes;
 #if defined(__BIONIC__) || defined(__GLIBC__)
   IF_GLIBC(size_t mmapped_bytes;)
   struct mallinfo mi = mallinfo();
   // In spite of the documentation, the jemalloc version of this call seems to do what we want,
   // and it is thread-safe.
   if (sizeof(size_t) > sizeof(mi.uordblks) && sizeof(size_t) > sizeof(mi.hblkhd)) {
     // Shouldn't happen, but glibc declares uordblks as int.
     // Avoiding sign extension gets us correct behavior for another 2 GB.
     malloc_bytes = (unsigned int)mi.uordblks;
     IF_GLIBC(mmapped_bytes = (unsigned int)mi.hblkhd;)
   } else {
     malloc_bytes = mi.uordblks;
     IF_GLIBC(mmapped_bytes = mi.hblkhd;)
   }
   // From the spec, it appeared mmapped_bytes <= malloc_bytes. Reality was sometimes
   // dramatically different. (b/119580449 was an early bug.) If so, we try to fudge it.
   // However, malloc implementations seem to interpret hblkhd differently, namely as
   // mapped blocks backing the entire heap (e.g. jemalloc) vs. large objects directly
   // allocated via mmap (e.g. glibc). Thus we now only do this for glibc, where it
   // previously helped, and which appears to use a reading of the spec compatible
   // with our adjustment.
 #if defined(__GLIBC__)
   if (mmapped_bytes > malloc_bytes) {
     malloc_bytes = mmapped_bytes;
   }
 #endif  // GLIBC
 #else  // Neither Bionic nor Glibc
   // We should hit this case only in contexts in which GC triggering is not critical. Effectively
   // disable GC triggering based on malloc().
   malloc_bytes = 1000;
 #endif
   return malloc_bytes + native_bytes_registered_.load(std::memory_order_relaxed);
   // An alternative would be to get RSS from /proc/self/statm. Empirically, that's no
   // more expensive, and it would allow us to count memory allocated by means other than malloc.
   // However it would change as pages are unmapped and remapped due to memory pressure, among
   // other things. It seems risky to trigger GCs as a result of such changes.
 }

得到当前进程native内存的总大小之后，便需要抉择是否需要一次新的GC。

决策的过程如下，源码下面是详细解释。

art/runtime/gc/heap.cc

 // Return the ratio of the weighted native + java allocated bytes to its target value.
 // A return value > 1.0 means we should collect. Significantly larger values mean we're falling
 // behind.
 inline float Heap::NativeMemoryOverTarget(size_t current_native_bytes, bool is_gc_concurrent) {
   // Collection check for native allocation. Does not enforce Java heap bounds.
   // With adj_start_bytes defined below, effectively checks
   // <java bytes allocd> + c1*<old native allocd> + c2*<new native allocd) >= adj_start_bytes,
   // where c3 > 1, and currently c1 and c2 are 1 divided by the values defined above.
   size_t old_native_bytes = old_native_bytes_allocated_.load(std::memory_order_relaxed);
   if (old_native_bytes > current_native_bytes) {
     // Net decrease; skip the check, but update old value.
     // It's OK to lose an update if two stores race.
     old_native_bytes_allocated_.store(current_native_bytes, std::memory_order_relaxed);
     return 0.0;
   } else {
     size_t new_native_bytes = UnsignedDifference(current_native_bytes, old_native_bytes);   <=======(1)
     size_t weighted_native_bytes = new_native_bytes / kNewNativeDiscountFactor              <=======(2)
         + old_native_bytes / kOldNativeDiscountFactor;
     size_t add_bytes_allowed = static_cast<size_t>(                                         <=======(3)
         NativeAllocationGcWatermark() * HeapGrowthMultiplier());
     size_t java_gc_start_bytes = is_gc_concurrent                                           <=======(4)
         ? concurrent_start_bytes_
         : target_footprint_.load(std::memory_order_relaxed);
     size_t adj_start_bytes = UnsignedSum(java_gc_start_bytes,                               <=======(5)
                                          add_bytes_allowed / kNewNativeDiscountFactor);
     return static_cast<float>(GetBytesAllocated() + weighted_native_bytes)                  <=======(6)
          / static_cast<float>(adj_start_bytes);
   }
 }

首先将本次native内存总大小和上一次GC完成后的native内存总大小进行比较。如果小于上次的总大小，则表明native内存的使用水平降低了，因此完全没有必要进行一次新的GC。

但如果这次native内存使用增长的话，则需要进一步计算当前值和阈值之间的比例关系，大于等于1的话就需要进行GC。下面详细介绍源码中的(1)~(6)。

（1）计算本次native内存和上次之间的差值，这个差值反映了native内存中新增长部分的大小。

（2）给不同部分的native内存以不同的权重，新增长部分除以2，旧的部分除以65536。之所以给旧的部分权重如此之低，是因为native堆本身是没有上限的。这套机制的初衷并不是限制native堆的大小，而只是防止两次GC间native内存垃圾积累过多。

（3）所谓的阈值并不是为native内存单独设立的，而是为(Java堆大小+native内存大小)整体设立的。add_bytes_allowed表示在原有Java堆阈值的基础上，还可以允许的native内存大小。NativeAllocationGcWatermark根据Java堆阈值计算出允许的native内存大小，Java堆阈值越大，允许的值也越大。HeapGrowthMultipiler对于前台应用是2，表明前台应用的内存管控更松，GC触发频率更低。

（4）同等条件下，同步GC的触发水位要低于非同步GC，原因是同步GC在垃圾回收时也会有新的对象分配，因此加上这些新分配的对象最好也不要超过阈值。

（5）将Java堆阈值和允许的native内存相加，作为新的阈值。

（6）将Java堆已分配的大小和调整权重后的native内存大小相加，并将相加后的结果除以阈值，得到一个比值来判定是否需要GC。

通过如下代码可知，当比值≥1时，将请求一次新的GC。

art/runtime/gc/heap.cc

   if (UNLIKELY(gc_urgency >= 1.0)) {
     if (is_gc_concurrent) {
       RequestConcurrentGC(self, kGcCauseForNativeAlloc, /*force_full=*/true);   <=================请求一次新的GC

1.2 MMap内存

mmap的处理方式和malloc基本相当，大于300,000 bytes或mmap三百次都执行CheckGCForNative。唯一的区别在于mmap需要将每一次的大小都计入native_bytes_registered中，因为mallinfo中并不会记录这个信息（针对bionic库而言）。

art/runtime/gc/heap.cc

 void Heap::RegisterNativeAllocation(JNIEnv* env, size_t bytes) {
   // Cautiously check for a wrapped negative bytes argument.
   DCHECK(sizeof(size_t) < 8 || bytes < (std::numeric_limits<size_t>::max() / 2));
   native_bytes_registered_.fetch_add(bytes, std::memory_order_relaxed);
   uint32_t objects_notified =
       native_objects_notified_.fetch_add(1, std::memory_order_relaxed);
   if (objects_notified % kNotifyNativeInterval == kNotifyNativeInterval - 1
       || bytes > kCheckImmediatelyThreshold) {
     CheckGCForNative(ThreadForEnv(env));
   }
 }

2. 如何在Java对象回收时触发native内存回收

NativeAllocationRegistry中主要依靠Cleaner机制完成了这个过程。关于Cleaner的细节，可以参考我的上篇博客。

3. 实际案例

Bitmap类就是通过NativeAllocationRegistry来实现native资源自动释放的。以下是Bitmap构造方法的一部分。

frameworks/base/graphics/java/android/graphics/Bitmap.java

         mNativePtr = nativeBitmap;         <=========================== 通过指针值间接持有native资源
 
         final int allocationByteCount = getAllocationByteCount(); <==== 获取native资源的大小，如果是mmap方式，这个大小最终会计入native_bytes_registered中
         NativeAllocationRegistry registry;
         if (fromMalloc) {
             registry = NativeAllocationRegistry.createMalloced(   <==== 根据native资源分配方式的不同，构造不同的NativeAllocationRegistry对象，nativeGetNativeFinalizer()返回的是native资源释放函数的函数指针
                     Bitmap.class.getClassLoader(), nativeGetNativeFinalizer(), allocationByteCount);
         } else {
             registry = NativeAllocationRegistry.createNonmalloced(
                     Bitmap.class.getClassLoader(), nativeGetNativeFinalizer(), allocationByteCount);
         }
         registry.registerNativeAllocation(this, nativeBitmap);   <===== 检测是否需要GC

通过上述案例可知，当我们使用NativeAllocationRegistry来为Java类自动释放native内存资源时，首先需要创建NativeAllocationRegistry对象，接着调用registerNativeAllocation方法。只此两步，便可实现native资源的自动释放。

既然需要两步，那为什么registerNativeAllocation不放进NativeAllocationRegistry的构造方法，这样一步岂不是更好？原因是registerNativeAllocation独立出来，便可以在native资源真正申请后再去告知GC，灵活性更高。此外，NativeAllocationRegistry中还有一个registerNativeFree方法与之对应，可以让应用层在自己提前释放native资源后告知GC。

笔记

【360°全方位性能调优】

这份笔记我将Android-360°全方位性能优化知识点，以及微信、淘宝、抖音、头条、高德地图、优酷等等亿万级用户APP在性能优化方面的实践经验，整合成了一套系统的知识笔记PDF，从理论到实践，涉及Android性能优化的所有知识点，长达721页电子书！相信看完这份文档，你会对Android性能调优知识体系及各种方案有更系统、更深入的理解。

需要的小伙伴点赞+关注后在我的GitHub即可直接免费下载获取~

在这里插入图片描述

文末

感谢大家关注我，分享Android干货，交流Android技术。
对文章有何见解，或者有何技术问题，都可以在评论区一起留言讨论，我会虔诚为你解答。
也欢迎大家来我的B站找我玩，有各类Android架构师进阶技术难点的视频讲解，助你早日升职加薪。
B站直通车：https://space.bilibili.com/544650554

作者：芦半山
链接：https://juejin.im/post/6894153239907237902