Guava Cache: An Analysis of How It Works

清风镰月

Published 2024-08-15 20:26:36

Category: Java source code analysis. Tags: guava, spring, java

Article link: https://blog.csdn.net/cj19852005/article/details/141230945

Background

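When building and maintaining information systems, a cache is usually introduced to keep response times low. A local (in-process) cache is faster to read from and cheaper to maintain than a remote one, so it is often the first choice. Guava, Google's open-source general-purpose library, ships with a cache component, which is a major reason many projects adopt Guava. As with any core component of a system, it pays to understand how it works internally.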

Usage

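Using Guava Cache is straightforward. First build a Cache object with CacheBuilder, here specifying a maximum number of entries and expiry one hour after write. Then store data with cache.put and read it with get or getIfPresent. When the key is absent, get loads the value (through the Callable passed as its second argument, or through the CacheLoader in the case of a LoadingCache), whereas getIfPresent simply returns null.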

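// Imports needed by this snippet
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

// Build the cache with CacheBuilder: at most 100000 entries, each expiring 1 hour after it is written
Cache<String, Object> cache = CacheBuilder.newBuilder()
        .maximumSize(100000)
        .expireAfterWrite(1, TimeUnit.HOURS)
        .build();
// Put an entry with key=key1, value=122
cache.put("key1", 122);
// getIfPresent: returns the cached value, or null if the key is absent
Object value = cache.getIfPresent("key1");
try {
    // get: if the key is absent, the Callable passed as the second argument loads the value
    value = cache.get("key2", () -> {
        // Load the value; it is put into the cache and returned.
        // The loader must not return null (Guava throws InvalidCacheLoadException),
        // so a placeholder value is returned here.
        return "value-for-key2";
    });
} catch (ExecutionException e) {
    e.printStackTrace();
}
// Remove key1 from the cache
cache.invalidate("key1");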
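
The difference between getIfPresent and get(key, loader) can be illustrated without Guava on the classpath. The following is a minimal sketch using only java.util.concurrent; TinyManualCache and its methods are hypothetical names chosen for illustration and are not Guava's implementation. In particular, this sketch does not guarantee that concurrent callers run the loader only once per key.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;

// Hypothetical stand-in for illustration only; this is NOT Guava's implementation.
class TinyManualCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();

    void put(K key, V value) {
        store.put(key, value);
    }

    // Like getIfPresent: a pure lookup that never loads anything.
    V getIfPresent(K key) {
        return store.get(key);
    }

    // Like get(key, loader): on a miss, run the loader and cache its result.
    V get(K key, Callable<? extends V> loader) throws ExecutionException {
        V existing = store.get(key);
        if (existing != null) {
            return existing;
        }
        V loaded;
        try {
            loaded = loader.call();
        } catch (Exception e) {
            throw new ExecutionException(e);
        }
        if (loaded == null) {
            // A cache of this kind cannot store null, so a null load is an error.
            throw new IllegalStateException("loader returned null for key " + key);
        }
        // If another thread stored a value in the meantime, keep that one.
        V raced = store.putIfAbsent(key, loaded);
        return raced != null ? raced : loaded;
    }

    void invalidate(K key) {
        store.remove(key);
    }

    public static void main(String[] args) throws ExecutionException {
        TinyManualCache<String, Integer> cache = new TinyManualCache<>();
        System.out.println(cache.getIfPresent("key2")); // null: a miss, nothing is loaded
        System.out.println(cache.get("key2", () -> 7)); // 7: loaded and cached
        System.out.println(cache.getIfPresent("key2")); // 7: now present
    }
}
```

The point of the sketch is only the contract: a lookup that never loads versus a lookup that loads on a miss and stores the result.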

How It Works

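As the code above shows, Guava Cache uses the builder pattern to construct the cache object. But Cache is only an interface, so which concrete class is actually instantiated?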

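The no-argument build() method returns a LocalManualCache; the build method that takes a CacheLoader returns a LocalLoadingCache.

[Figure: Guava Cache class diagram]

The diagram above shows the relationships between the core classes of Guava Cache. Both LocalManualCache and LocalLoadingCache are built on LocalCache: LocalLoadingCache extends LocalManualCache, which holds a LocalCache instance and delegates to it. The core logic of the cache can therefore be expected to live in the LocalCache class.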

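To understand the implementation, take LocalManualCache as the example and start from cache.put. As the source below shows, LocalManualCache.put calls LocalCache.put. LocalCache.put first computes a hash from the key, uses the hash to locate a Segment, and then calls that segment's put method. The design resembles ConcurrentHashMap (the segmented version used before Java 8): the map is partitioned into segments by key, and put and get run within a single segment, which limits lock contention under high concurrency. Inside a Segment the data structure is similar to HashMap, an array of linked lists: entries whose hashes collide go into the same list, and a new entry is always inserted at the head. For thread safety, however, Segment stores the buckets in an AtomicReferenceArray instead of a plain object array as HashMap does.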
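
The segmenting idea described in this paragraph can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not Guava's code: StripedMapSketch, Node, spread and the fixed sizes (4 segments, 16 buckets, no resizing) are all hypothetical, and the hash spreading function is a placeholder. What it does show is the shape of the design: the high bits of the hash pick a segment, the low bits pick a bucket, writes take the segment's lock, new entries go to the head of the chain, and reads take no lock.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch of a segmented hash table. All names here are hypothetical.
class StripedMapSketch<K, V> {
    static final int SEGMENT_COUNT = 4;                 // must be a power of two
    static final int SEGMENT_MASK = SEGMENT_COUNT - 1;
    static final int SEGMENT_SHIFT = 30;                // 32 - log2(SEGMENT_COUNT)

    // One node of a bucket chain. next is final, so new nodes are linked in at the head.
    static final class Node<K, V> {
        final K key;
        final int hash;
        volatile V value;
        final Node<K, V> next;

        Node(K key, int hash, V value, Node<K, V> next) {
            this.key = key;
            this.hash = hash;
            this.value = value;
            this.next = next;
        }
    }

    // Each segment is a lock plus its own bucket array.
    static final class Segment<K, V> extends ReentrantLock {
        final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

        V put(K key, int hash, V value) {
            lock();
            try {
                int index = hash & (table.length() - 1);
                Node<K, V> first = table.get(index);
                for (Node<K, V> e = first; e != null; e = e.next) {
                    if (e.hash == hash && e.key.equals(key)) {
                        V old = e.value;
                        e.value = value;                // replace the existing value
                        return old;
                    }
                }
                // Not found: the new node becomes the head of the chain.
                table.set(index, new Node<>(key, hash, value, first));
                return null;
            } finally {
                unlock();
            }
        }

        V get(Object key, int hash) {
            // No lock here: visibility comes from AtomicReferenceArray and the volatile value.
            for (Node<K, V> e = table.get(hash & (table.length() - 1)); e != null; e = e.next) {
                if (e.hash == hash && e.key.equals(key)) {
                    return e.value;
                }
            }
            return null;
        }
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedMapSketch() {
        segments = (Segment<K, V>[]) new Segment[SEGMENT_COUNT];
        for (int i = 0; i < segments.length; i++) {
            segments[i] = new Segment<>();
        }
    }

    // Placeholder spreading function (an assumption, not Guava's rehash).
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // High bits choose the segment; the low bits are left for the bucket index.
    int segmentIndex(int hash) {
        return (hash >>> SEGMENT_SHIFT) & SEGMENT_MASK;
    }

    V put(K key, V value) {
        int hash = spread(key);
        return segments[segmentIndex(hash)].put(key, hash, value);
    }

    V get(Object key) {
        int hash = spread(key);
        return segments[segmentIndex(hash)].get(key, hash);
    }

    public static void main(String[] args) {
        StripedMapSketch<String, Integer> map = new StripedMapSketch<>();
        map.put("key1", 122);
        System.out.println(map.get("key1")); // 122
    }
}
```

Two threads writing keys that land in different segments never contend for the same lock, which is the benefit the paragraph above attributes to the segmented layout.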

//=======================LocalManualCache===============
public void put(K key, V value) {
      localCache.put(key, value);
}

//====================LocalCache========================
public V put(K key, V value) {
    checkNotNull(key);
    checkNotNull(value);
    int hash = hash(key);
    return segmentFor(hash).put(key, hash, value, false);
}
//===================Segment===========================
V put(K key, int hash, V value, boolean onlyIfAbsent) {
      lock();
      try {
        long now = map.ticker.read();
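        /**
        * preWriteCleanup drains the key and value reference queues and also removes entries that have
        * expired after write (when expireAfterWrite is set) or after access (when expireAfterAccess is set).
        * Background: a SoftReference or WeakReference can be created with a ReferenceQueue, which is the
        * second argument of the SoftReference constructor shown below. When the garbage collector
        * reclaims the referent of such a reference, the reference itself is placed on the ReferenceQueue.
        * Cleaning up keyReference and valueReference, as mentioned above, means removing the entries
        * whose references have been placed on those queues.
        * public SoftReference(T referent, ReferenceQueue<? super T> q) {
        *    super(referent, q);
        *    this.timestamp = clock;
        * }
        **/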
        preWriteCleanup(now);

        // If the new count exceeds threshold, expand() doubles the table size (unless it is already at maximum capacity), much like HashMap.
        int newCount = this.count + 1;
        if (newCount > this.threshold) { // ensure capacity
          expand();
          newCount = this.count + 1;
        }

        AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
        int index = hash & (table.length() - 1);
        // Use the hash to fetch the first ReferenceEntry of the bucket from the AtomicReferenceArray
        ReferenceEntry<K, V> first = table.get(index);
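        /**
         * Then walk the ReferenceEntry chain looking for an entry with the same key and update its value.
         * Note that updating the value also calls evictEntries, which removes entries once the maximum weight is exceeded.
         */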
        for (ReferenceEntry<K, V> e = first; e != null; e = e.getNext()) {
          K entryKey = e.getKey();
          if (e.getHash() == hash
              && entryKey != null
              && map.keyEquivalence.equivalent(key, entryKey)) {
            // We found an existing entry.

            ValueReference<K, V> valueReference = e.getValueReference();
            V entryValue = valueReference.get();

            if (entryValue == null) {
              ++modCount;
              if (valueReference.isActive()) {
                enqueueNotification(
                    key, hash, entryValue, valueReference.getWeight(), RemovalCause.COLLECTED);
                setValue(e, key, value, now);
                newCount = this.count; // count remains unchanged
              } else {
                setValue(e, key, value, now);
                newCount = this.count + 1;
              }
              this.count = newCount; // write-volatile
              evictEntries(e);
              return null;
            } else if (onlyIfAbsent) {
              // Mimic
              // "if (!map.containsKey(key)) ...
              // else return map.get(key);
              recordLockedRead(e, now);
              return entryValue;
            } else {
              // clobber existing entry, count remains unchanged
              ++modCount;
              enqueueNotification(
                  key, hash, entryValue, valueReference.getWeight(), RemovalCause.REPLACED);
              setValue(e, key, value, now);
              evictEntries(e);
              return entryValue;
            }
          }
        }

        // Create a new entry.
        ++modCount;
        ReferenceEntry<K, V> newEntry = newEntry(key, hash, first);
        setValue(newEntry, key, value, now);
        table.set(index, newEntry);
        newCount = this.count + 1;
        this.count = newCount; // write-volatile
        evictEntries(newEntry);
        return null;
      } finally {
        unlock();
        postWriteCleanup();
      }
    }
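
The ReferenceQueue mechanism mentioned in the comment on preWriteCleanup can be tried in isolation. The sketch below is an illustration only, with hypothetical names (WeakValueMapSketch, KeyedRef, drainQueue, simulateCollection), and it is not thread-safe. Because real garbage collection is not deterministic, it calls Reference.enqueue() to simulate the collector placing a reference on the queue; that shortcut is an assumption made for the demo, not something the cache itself does.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a map with weakly referenced values that are cleaned up via a ReferenceQueue.
class WeakValueMapSketch<K, V> {
    // A weak reference that remembers which key it belongs to.
    static final class KeyedRef<K, V> extends WeakReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue); // register the reference with the queue
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        drainQueue(); // clean up before writing, in the spirit of a pre-write cleanup
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    V get(K key) {
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    // Removes every entry whose reference has been placed on the queue; returns how many were removed.
    int drainQueue() {
        int removed = 0;
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedRef<K, V> keyed = (KeyedRef<K, V>) ref;
            // Remove only if the key is still mapped to this exact reference.
            if (map.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return map.size();
    }

    // Demo hook: pretend the collector reclaimed the value by enqueueing its reference by hand.
    boolean simulateCollection(K key) {
        KeyedRef<K, V> ref = map.get(key);
        if (ref == null) {
            return false;
        }
        ref.enqueue();
        return true;
    }

    public static void main(String[] args) {
        WeakValueMapSketch<String, Object> map = new WeakValueMapSketch<>();
        map.put("key1", new Object());
        map.simulateCollection("key1");
        System.out.println(map.drainQueue()); // 1
        System.out.println(map.size());       // 0
    }
}
```

The entry stays in the map until some later operation drains the queue, which mirrors why the cleanup is attached to put in the listing above.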

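As the code above shows, expired entries are not removed by a scheduled task; the cleanup happens inside put and get. This avoids both a timer that runs when there is nothing to clean and the problem of a timer not removing expired entries promptly, at the cost of a small overhead on get and put.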
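
Cleanup that piggybacks on reads and writes, rather than on a timer, can be sketched as follows. This is a minimal illustration and not Guava's code: LazyExpiryCacheSketch, its coarse synchronized methods, and the LongSupplier used as a clock are assumptions made so the behaviour can be tested deterministically. It assumes the clock never goes backwards.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch of expire-after-write without any background thread.
class LazyExpiryCacheSketch<K, V> {
    private static final class Timed<V> {
        final V value;
        final long writeNanos;

        Timed(V value, long writeNanos) {
            this.value = value;
            this.writeNanos = writeNanos;
        }
    }

    // Iteration order is oldest write first, because put removes a key before re-inserting it.
    private final Map<K, Timed<V>> entries = new LinkedHashMap<>();
    private final long ttlNanos;
    private final LongSupplier ticker;

    LazyExpiryCacheSketch(long ttlNanos, LongSupplier ticker) {
        this.ttlNanos = ttlNanos;
        this.ticker = ticker;
    }

    synchronized void put(K key, V value) {
        long now = ticker.getAsLong();
        expireEntries(now);   // cleanup happens as part of the write
        entries.remove(key);  // keep the oldest-write-first order
        entries.put(key, new Timed<>(value, now));
    }

    synchronized V getIfPresent(K key) {
        long now = ticker.getAsLong();
        Timed<V> timed = entries.get(key);
        if (timed == null) {
            return null;
        }
        if (now - timed.writeNanos >= ttlNanos) {
            expireEntries(now); // cleanup happens as part of the read
            return null;
        }
        return timed.value;
    }

    // Drops entries from the oldest end until the first one that is still fresh.
    private void expireEntries(long now) {
        Iterator<Timed<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            if (now - it.next().writeNanos >= ttlNanos) {
                it.remove();
            } else {
                break;
            }
        }
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        long[] clock = {0L};
        LazyExpiryCacheSketch<String, Integer> cache =
                new LazyExpiryCacheSketch<>(100L, () -> clock[0]);
        cache.put("key1", 122);
        clock[0] = 100L;
        System.out.println(cache.size());               // 1: expired, but nobody has touched the cache yet
        System.out.println(cache.getIfPresent("key1")); // null: the read notices the expiry
        System.out.println(cache.size());               // 0: and removes the entry
    }
}
```

The first size() call in main makes the trade-off visible: an expired entry keeps occupying memory until the next read or write reaches it.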

Next, the getIfPresent method. As the code below shows, it first uses the hash to locate the Segment and then calls the Segment's get method to fetch the value.

//===========================LocalCache====================================
@CheckForNull
  public V getIfPresent(Object key) {
    int hash = hash(checkNotNull(key));
    V value = segmentFor(hash).get(key, hash);
    if (value == null) {
      globalStatsCounter.recordMisses(1);
    } else {
      globalStatsCounter.recordHits(1);
    }
    return value;
  }
//==========================Segment==========================================
V get(Object key, int hash) {
      try {
        if (count != 0) { // read-volatile
          long now = map.ticker.read();
          ReferenceEntry<K, V> e = getLiveEntry(key, hash, now);
          if (e == null) {
            return null;
          }

          V value = e.getValueReference().get();
          if (value != null) {
            recordRead(e, now);
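            // If refreshAfterWrite is configured (refreshNanos) and the entry was written longer ago than
            // refreshNanos, scheduleRefresh reloads the value through the CacheLoader and tracks the load in a
            // LoadingValueReference. If there is no old value, or the reload yields null, the result comes
            // back directly as an immediate future; otherwise the new value is stored through the
            // LoadingValueReference. With the default CacheLoader.reload, which calls load on the calling
            // thread, the reload is synchronous: the caller gets the latest value, but the read is slower.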
            return scheduleRefresh(e, e.getKey(), hash, value, now, map.defaultLoader);
          }
          tryDrainReferenceQueues();
        }
        return null;
      } finally {
        postReadCleanup();
      }
    }

@CheckForNull
    ReferenceEntry<K, V> getLiveEntry(Object key, int hash, long now) {
      ReferenceEntry<K, V> e = getEntry(key, hash);
      if (e == null) {
        return null;
      } else if (map.isExpired(e, now)) {
        // Expired entries are cleaned up here
        tryExpireEntries(now);
        return null;
      }
      return e;
    }

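// Use the hash to find the index in the AtomicReferenceArray, then walk the chain at that index looking
// for the entry with the given key. If an entry with a null key is encountered along the way, the
// reference queues and stale entries are cleaned up. Why would a key be null? A key cannot be null when
// it is put in, so a null key means the entry holds its key through a weak reference (weakKeys) and the
// key has been garbage collected.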
@CheckForNull
    ReferenceEntry<K, V> getEntry(Object key, int hash) {
      for (ReferenceEntry<K, V> e = getFirst(hash); e != null; e = e.getNext()) {
        if (e.getHash() != hash) {
          continue;
        }

        K entryKey = e.getKey();
        if (entryKey == null) {
          tryDrainReferenceQueues();
          continue;
        }

        if (map.keyEquivalence.equivalent(key, entryKey)) {
          return e;
        }
      }

      return null;
    }

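Judging from the get and put code, Guava Cache takes the segment lock only for operations that change data (insert, update, remove), which keeps writes thread-safe. Reads take no lock, so a read that overlaps a write is not isolated from it and may observe slightly stale data, which can show up as something like a dirty read or a phantom read.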

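Finally, the invalidate method. It first locates the Segment by hash and then calls the Segment's remove method. That method performs the same pre-write cleanup as put, uses the hash to find the array index, and walks the chain to find and delete the entry whose key equals the given key.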

//==================================LocalManualCache=================================
public void invalidate(Object key) {
      checkNotNull(key);
      localCache.remove(key);
}

//==================================LocalCache=======================================
@Override
public V remove(@CheckForNull Object key) {
    if (key == null) {
      return null;
    }
    int hash = hash(key);
    return segmentFor(hash).remove(key, hash);
}

//==================================Segment=======================================
V remove(Object key, int hash) {
      lock();
      try {
        long now = map.ticker.read();
        // As in put, pre-clean the key and value reference queues and expired entries.
        preWriteCleanup(now);

        int newCount = this.count - 1;
        AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
        int index = hash & (table.length() - 1);
        ReferenceEntry<K, V> first = table.get(index);

        for (ReferenceEntry<K, V> e = first; e != null; e = e.getNext()) {
          K entryKey = e.getKey();
          if (e.getHash() == hash
              && entryKey != null
              && map.keyEquivalence.equivalent(key, entryKey)) {
            ValueReference<K, V> valueReference = e.getValueReference();
            V entryValue = valueReference.get();

            RemovalCause cause;
            if (entryValue != null) {
              cause = RemovalCause.EXPLICIT;
            } else if (valueReference.isActive()) {
              cause = RemovalCause.COLLECTED;
            } else {
              // currently loading
              return null;
            }

            ++modCount;
            ReferenceEntry<K, V> newFirst =
                removeValueFromChain(first, e, entryKey, hash, entryValue, valueReference, cause);
            newCount = this.count - 1;
            table.set(index, newFirst);
            this.count = newCount; // write-volatile
            return entryValue;
          }
        }

        return null;
      } finally {
        unlock();
        postWriteCleanup();
      }
    }
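
The listing above delegates the actual unlinking to removeValueFromChain, whose body is not shown. One common way to remove a node from a chain whose next pointers are immutable is to copy the nodes that precede it. The sketch below illustrates that technique under the assumption that next is final; ChainRemovalSketch and its Node class are hypothetical, and the code should be read as an illustration of the idea, not as the body of Guava's method.

```java
// Hypothetical sketch: removing a node from a singly linked chain with immutable next pointers.
class ChainRemovalSketch {
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Returns the head of a new chain without target. target must be part of the chain.
    // Nodes after target are shared as-is; nodes before it are copied, which reverses their order.
    static Node removeFromChain(Node first, Node target) {
        Node newFirst = target.next;
        for (Node e = first; e != target; e = e.next) {
            newFirst = new Node(e.key, newFirst);
        }
        return newFirst;
    }

    static String render(Node first) {
        StringBuilder sb = new StringBuilder();
        for (Node e = first; e != null; e = e.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node d = new Node("d", null);
        Node c = new Node("c", d);
        Node b = new Node("b", c);
        Node a = new Node("a", b);
        System.out.println(render(removeFromChain(a, c))); // b->a->d
        System.out.println(render(a));                     // a->b->c->d (the old chain is untouched)
    }
}
```

Because the old chain is never modified, a reader that is still walking it without a lock keeps seeing a consistent list, which fits the lock-free read path discussed earlier.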