A Small Case Study in Building an Efficient, Scalable Cache

lang20150928

Article link: https://blog.csdn.net/m0_37607945/article/details/124093545

Almost every application uses a cache. Reusing the results of earlier computations can raise throughput and reduce latency.

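In the code below, the Computable interface abstracts a computation task, and ExpensiveFunction is a concrete implementation. Its compute method stands in for a long, expensive computation and returns the result.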

public interface Computable<A, V> {
	V compute(A arg) throws InterruptedException;
}

import java.math.BigInteger;

public class ExpensiveFunction implements Computable<String, BigInteger> {
	@Override
	public BigInteger compute(String arg) {
		// Placeholder for a long-running computation ("after deep thought...").
		return new BigInteger(arg);
	}
}

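To reuse results that have already been computed, we create a wrapper around Computable that returns the previously computed result whenever one is available.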

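import java.util.HashMap;
import java.util.Map;

public class Memoizer1<A, V> implements Computable<A, V> {
	// HashMap is not thread-safe, so every access happens inside the synchronized method.
	private final Map<A, V> cache = new HashMap<A, V>();
	private final Computable<A, V> c;

	public Memoizer1(Computable<A, V> c) {
		this.c = c;
	}

	@Override
	public synchronized V compute(A arg) throws InterruptedException {
		V result = cache.get(arg);
		if (result == null) {
			// Cache miss: compute while holding the lock, then store the result.
			result = c.compute(arg);
			cache.put(arg, result);
		}
		return result;
	}
}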

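This implementation stores results in a HashMap. compute first looks up the cache and returns the cached value if there is one; otherwise it computes the result and stores it. This shortens the response time for repeated requests with the same argument. Because HashMap is not thread-safe, the method is declared synchronized, so only one thread at a time can execute it. If every caller passed the same argument, this would be fine. In practice arguments differ, and because the whole method is synchronized, response times can end up worse than with no cache at all.

Consider the following scenario. Thread A computes the result for 1 and caches it. Thread B then requests 2 while thread C requests 1. Because the entire method is synchronized, thread C could reuse the cached value but must wait until thread B finishes. Without the cache, thread C would need only the time to compute 1; with it, thread C needs the time to compute 2 plus the time to read 1 from the cache. If computing 2 takes longer than computing 1, thread C gets a slower response than before.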

HashMap is not thread-safe, so to ensure that two threads do not access the
HashMap at the same time, Memoizer1 takes the conservative approach of synchronizing the entire compute method. This ensures thread safety but has an obvious
scalability problem: only one thread at a time can execute compute at all. If another thread is busy computing a result, other threads calling compute may be
blocked for a long time.
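
The blocking described above can be reproduced with a small sketch. Everything in it is hypothetical and not part of the original article: the class name SynchronizedBlockingDemo, the two latches, and the static helper methods. It keeps the same synchronized check-then-compute logic as Memoizer1, holds thread B inside the computation for 2, and then checks whether thread C, which only needs the cached value for 1, is stuck waiting for the lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; it mirrors Memoizer1's locking with static methods.
class SynchronizedBlockingDemo {
    static final Map<String, Integer> cache = new HashMap<>();
    static CountDownLatch entered;  // signals that thread B is inside the computation
    static CountDownLatch release;  // lets thread B finish

    // Stand-in for the expensive function: computing "2" stalls until released.
    static Integer expensive(String arg) throws InterruptedException {
        if (arg.equals("2")) {
            entered.countDown();
            release.await();
        }
        return Integer.valueOf(arg);
    }

    // Same shape as Memoizer1.compute: the whole method holds one lock.
    static synchronized Integer compute(String arg) throws InterruptedException {
        Integer result = cache.get(arg);
        if (result == null) {
            result = expensive(arg);
            cache.put(arg, result);
        }
        return result;
    }

    static void quietly(String arg) {
        try {
            compute(arg);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if thread C was observed BLOCKED while B held the lock.
    static boolean run() {
        entered = new CountDownLatch(1);
        release = new CountDownLatch(1);
        synchronized (SynchronizedBlockingDemo.class) { cache.clear(); }
        quietly("1");                               // thread A's role: cache the value for 1
        Thread b = new Thread(() -> quietly("2"));  // long computation
        Thread c = new Thread(() -> quietly("1"));  // cache hit, but needs the lock
        boolean blocked = false;
        try {
            b.start();
            entered.await();                        // B now holds the lock
            c.start();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (!blocked && System.nanoTime() < deadline) {
                blocked = c.getState() == Thread.State.BLOCKED;
                Thread.sleep(1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();                    // always let B finish
        }
        try {
            b.join();
            c.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("thread C blocked behind thread B: " + run());
    }
}
```

In this sketch thread C cannot return the cached value until thread B leaves the synchronized method, which is the scalability problem the passage describes.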

Because HashMap is not thread-safe, we replace it with the thread-safe ConcurrentHashMap and remove the synchronized modifier from the method, as shown below.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Memoizer2<A, V> implements Computable<A, V> {
	private final Map<A, V> cache = new ConcurrentHashMap<A, V>();
	private final Computable<A, V> c;
	public Memoizer2(Computable<A, V> c) { this.c = c; }
	public V compute(A arg) throws InterruptedException {
		// get and put are each thread-safe, but the check-then-put sequence is not atomic.
		V result = cache.get(arg);
		if (result == null) {
			result = c.compute(arg);
			cache.put(arg, result);
		}
		return result;
	}
}

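At first glance this looks correct, but a race condition remains. Checking that the cache has no entry and then storing the result is a compound check-then-act operation, and the long running time of compute makes the race more likely.

Suppose thread A is computing the result for 1 when thread B requests the same value. B finds nothing in the cache and starts the same computation, so the task runs twice. If the computation is idempotent the final result is unaffected, but this is still not what we want from a cache. The underlying issue is that the window between the check and the final put spans the whole computation. In Java, this window can be shortened by caching a Future for the computation instead of its value, so that a later thread finds the in-progress task and waits for it. The Memoizer3 class does this.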
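
The duplicate computation in the Memoizer2 scenario can be made reproducible with a sketch like the one below. The class name DuplicateComputeDemo, the CyclicBarrier, and the call counter are assumptions added for illustration; only the check-then-put logic comes from Memoizer2. The barrier forces both threads to be inside the expensive function at the same time, so neither can have stored a result before the other checks the cache.

```java
import java.util.Map;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it mirrors Memoizer2's logic with static methods.
class DuplicateComputeDemo {
    static final Map<String, Integer> cache = new ConcurrentHashMap<>();
    static final AtomicInteger calls = new AtomicInteger();
    // Both threads must be inside expensive() before either may leave it.
    static final CyclicBarrier bothInside = new CyclicBarrier(2);

    static Integer expensive(String arg) {
        calls.incrementAndGet();
        try {
            bothInside.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException | BrokenBarrierException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
        return arg.length();
    }

    // Same check-then-act sequence as Memoizer2.compute.
    static Integer compute(String arg) {
        Integer result = cache.get(arg);   // check
        if (result == null) {
            result = expensive(arg);       // the slow step keeps the race window open
            cache.put(arg, result);        // act
        }
        return result;
    }

    // Runs two threads on the same key and returns how often expensive() ran.
    static int run() {
        cache.clear();
        calls.set(0);
        bothInside.reset();
        Thread a = new Thread(() -> compute("1"));
        Thread b = new Thread(() -> compute("1"));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("expensive() calls: " + run());
    }
}
```

Under these assumptions the counter ends at 2: each thread saw an empty cache and ran the computation itself.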

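import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class Memoizer3<A, V> implements Computable<A, V> {
	// The cache holds the Future of a computation, which may still be in progress.
	private final Map<A, Future<V>> cache
			= new ConcurrentHashMap<A, Future<V>>();
	private final Computable<A, V> c;
	public Memoizer3(Computable<A, V> c) { this.c = c; }
	public V compute(final A arg) throws InterruptedException {
		Future<V> f = cache.get(arg);
		if (f == null) {
			Callable<V> eval = new Callable<V>() {
				public V call() throws InterruptedException {
					return c.compute(arg);
				}
			};
			FutureTask<V> ft = new FutureTask<V>(eval);
			f = ft;
			cache.put(arg, ft);
			ft.run(); // call to c.compute happens here
		}
		try {
			// Blocks until the result is available, whichever thread is computing it.
			return f.get();
		} catch (ExecutionException e) {
			// launderThrowable is a helper that is not defined in this article.
			throw launderThrowable(e.getCause());
		}
	}
}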

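Memoizer3 works well in the vast majority of scenarios where concurrency is not very high. Because the compound action (checking the cache and then putting the Future into it) now takes very little time, the race window is much smaller. It is not closed entirely, though: two threads can still both see an empty entry and both start the computation.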
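
Memoizer3 calls launderThrowable, which this article never defines. The sketch below is an assumption about what that helper does, modelled on the utility of the same name in Java Concurrency in Practice; the wrapper class name LaunderThrowable is hypothetical. It turns the cause of an ExecutionException into something that can be rethrown without a checked exception.

```java
// Hypothetical holder class for the helper; the article does not show where it lives.
class LaunderThrowable {
    /**
     * Returns the throwable if it is a RuntimeException, rethrows it if it is an
     * Error, and otherwise throws IllegalStateException because a checked
     * exception was not expected here.
     */
    static RuntimeException launderThrowable(Throwable t) {
        if (t instanceof RuntimeException) {
            return (RuntimeException) t;
        } else if (t instanceof Error) {
            throw (Error) t;
        } else {
            throw new IllegalStateException("Not unchecked", t);
        }
    }

    public static void main(String[] args) {
        RuntimeException e = launderThrowable(new IllegalArgumentException("bad input"));
        System.out.println(e.getClass().getSimpleName());
    }
}
```

With this shape, `throw launderThrowable(e.getCause())` compiles because the method returns a RuntimeException. One consequence under this assumption is that an InterruptedException thrown by the wrapped computation would surface as an IllegalStateException.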

Memoizer3 is vulnerable to this problem because a compound action (put-if-absent) is performed on the backing map that cannot be made atomic using locking.

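If the compound action is replaced with an atomic one, the remaining problem goes away. As shown below, the putIfAbsent method of ConcurrentMap makes the check and the put a single atomic operation.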

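import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class Memoizer<A, V> implements Computable<A, V> {
	private final ConcurrentMap<A, Future<V>> cache
			= new ConcurrentHashMap<A, Future<V>>();
	private final Computable<A, V> c;
	public Memoizer(Computable<A, V> c) { this.c = c; }
	public V compute(final A arg) throws InterruptedException {
		while (true) {
			Future<V> f = cache.get(arg);
			if (f == null) {
				Callable<V> eval = new Callable<V>() {
					public V call() throws InterruptedException {
						return c.compute(arg);
					}
				};
				FutureTask<V> ft = new FutureTask<V>(eval);
				// Atomic check-and-insert: only one thread gets null back and runs the task.
				f = cache.putIfAbsent(arg, ft);
				if (f == null) { f = ft; ft.run(); }
			}
			try {
				return f.get();
			} catch (CancellationException e) {
				// Remove the cancelled Future and loop to retry the computation.
				cache.remove(arg, f);
			} catch (ExecutionException e) {
				// launderThrowable is a helper that is not defined in this article.
				throw launderThrowable(e.getCause());
			}
		}
	}
}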
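
To check the effect of putIfAbsent under contention, the following sketch starts several threads on the same key and counts how often the underlying computation runs. The class name MemoizerOnceDemo, the call counter, and the thread-pool harness are assumptions for illustration. The caching logic follows the final Memoizer, except that it rethrows failures as IllegalStateException so the example does not depend on launderThrowable.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it mirrors the final Memoizer with static methods.
class MemoizerOnceDemo {
    static final ConcurrentMap<String, Future<Integer>> cache = new ConcurrentHashMap<>();
    static final AtomicInteger calls = new AtomicInteger();

    // Stand-in for the expensive function; it counts its own invocations.
    static Integer expensive(String arg) throws InterruptedException {
        calls.incrementAndGet();
        Thread.sleep(50);
        return arg.length();
    }

    static Integer compute(String arg) throws InterruptedException {
        while (true) {
            Future<Integer> f = cache.get(arg);
            if (f == null) {
                FutureTask<Integer> ft = new FutureTask<>(() -> expensive(arg));
                f = cache.putIfAbsent(arg, ft);      // atomic check-and-insert
                if (f == null) { f = ft; ft.run(); } // only the winning thread computes
            }
            try {
                return f.get();                      // everyone else waits here
            } catch (CancellationException e) {
                cache.remove(arg, f);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
    }

    // Releases all threads at once and returns how often expensive() ran.
    static int run(int threads) {
        cache.clear();
        calls.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();
                    compute("hello");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("expensive() calls with 8 threads: " + run(8));
    }
}
```

Since only the thread whose putIfAbsent returns null runs the task, the counter should stay at 1 no matter how many threads ask for the same key.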

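Caching a Future instead of a value creates the possibility of cache pollution: if a computation is cancelled or fails, later attempts to compute the result will also indicate cancellation or failure. To avoid this, Memoizer removes the Future from the cache if it detects that the computation was cancelled. If the computation might succeed on a later attempt, it may also be desirable to remove the Future when a RuntimeException is detected.

Memoizer also does not address cache expiration, but this could be done with a subclass of FutureTask that associates an expiration time with each result, combined with a periodic scan of the cache for expired entries. (Similarly, it does not address cache eviction, where old entries are removed to make room for new ones so that the cache does not consume too much memory.)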
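
The idea of evicting a failed Future can be sketched as follows. The class name EvictOnFailureDemo and the flaky function, which fails only on its first attempt, are hypothetical. The one change relative to Memoizer is the extra cache.remove(arg, f) in the ExecutionException branch; the rethrow logic is inlined rather than relying on launderThrowable.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it mirrors Memoizer with one extra eviction step.
class EvictOnFailureDemo {
    static final ConcurrentMap<String, Future<Integer>> cache = new ConcurrentHashMap<>();
    static final AtomicInteger attempts = new AtomicInteger();

    // Hypothetical flaky computation: fails on the first attempt, succeeds afterwards.
    static Integer flaky(String arg) {
        if (attempts.incrementAndGet() == 1) {
            throw new IllegalStateException("transient failure");
        }
        return Integer.valueOf(arg);
    }

    static Integer compute(String arg) throws InterruptedException {
        while (true) {
            Future<Integer> f = cache.get(arg);
            if (f == null) {
                FutureTask<Integer> ft = new FutureTask<>(() -> flaky(arg));
                f = cache.putIfAbsent(arg, ft);
                if (f == null) { f = ft; ft.run(); }
            }
            try {
                return f.get();
            } catch (CancellationException e) {
                cache.remove(arg, f);   // cancelled: drop the entry and retry
            } catch (ExecutionException e) {
                cache.remove(arg, f);   // failed: drop the entry so a later call recomputes
                Throwable cause = e.getCause();
                if (cause instanceof RuntimeException) throw (RuntimeException) cause;
                if (cause instanceof Error) throw (Error) cause;
                throw new IllegalStateException("Not unchecked", cause);
            }
        }
    }

    // Converts one call into a printable outcome.
    static String attempt(String arg) {
        try {
            return String.valueOf(compute(arg));
        } catch (RuntimeException e) {
            return "failed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    // Calls compute twice for the same key and joins the two outcomes.
    static String run() {
        cache.clear();
        attempts.set(0);
        return attempt("42") + "," + attempt("42");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Without the extra remove call, the second request would find the failed Future still cached and fail again, which is the cache pollution the paragraph above warns about. Whether eviction on failure is appropriate depends on whether the failure is transient, so treat this as one possible policy.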
