Has anyone else been bitten by this @CacheEvict pitfall?
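Background
A few days ago, around midday each day, a large number of production API calls started timing out, and the spike was clearly visible in our ELK monitoring.
Over the same time window, Grafana showed a burst of Redis KEYS command calls.
Redis timed out once every half hour, and API calls timed out along with it. Worse, this was right at the production peak, so someone was going to take the blame again. A crowd of senior engineers debated it in the group chat. Following the half-hour pattern, each team went through its scheduled tasks, and that finally turned up the cause: a colleague had a scheduled task calling a method annotated with **@CacheEvict** to clear a cache, and that cache happened to be a hot one. As an emergency fix, the scheduled task was switched off.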
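Analysis
Why would a single @CacheEvict issue a KEYS command? It is common knowledge that KEYS blocks: it walks the entire keyspace, and since Redis executes commands on a single thread, nothing else is served until it finishes. One KEYS call against a large keyspace can stall the whole Redis instance, which is why SCAN is normally used instead. So let's dig in and see what is going on.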
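To make the KEYS versus SCAN difference concrete, here is a toy in-memory model in plain Java, with no Redis involved. `ToyKeyspace`, `keys`, `scan`, and `scanAll` are hypothetical names for illustration only. Real Redis SCAN uses a different cursor scheme and treats COUNT as a hint, so this is a sketch of the idea, not of Redis internals: KEYS inspects everything in one uninterrupted call, while SCAN does a bounded amount of work per call.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: an in-memory list standing in for a Redis keyspace.
class ToyKeyspace {
    private final List<String> allKeys = new ArrayList<>();

    void add(String key) {
        allKeys.add(key);
    }

    // KEYS-style lookup: a single call inspects every key before returning.
    List<String> keys(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String key : allKeys) {
            if (key.startsWith(prefix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    // Result of one SCAN-style step: the matches found and the next cursor (0 means done).
    static final class ScanResult {
        final int nextCursor;
        final List<String> matches;

        ScanResult(int nextCursor, List<String> matches) {
            this.nextCursor = nextCursor;
            this.matches = matches;
        }
    }

    // SCAN-style lookup: inspects at most `count` keys per call.
    ScanResult scan(int cursor, String prefix, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        int end = Math.min(cursor + count, allKeys.size());
        List<String> matches = new ArrayList<>();
        for (int i = cursor; i < end; i++) {
            if (allKeys.get(i).startsWith(prefix)) {
                matches.add(allKeys.get(i));
            }
        }
        int next = end >= allKeys.size() ? 0 : end;
        return new ScanResult(next, matches);
    }

    // Drives scan() until the cursor returns to 0 and collects every match.
    static List<String> scanAll(ToyKeyspace ks, String prefix, int count) {
        List<String> all = new ArrayList<>();
        int cursor = 0;
        do {
            ScanResult step = ks.scan(cursor, prefix, count);
            all.addAll(step.matches);
            cursor = step.nextCursor;
        } while (cursor != 0);
        return all;
    }
}
```

Both approaches find the same keys. The difference is that between two `scan` calls a real server is free to handle other clients, whereas a single `keys` call holds it for the full pass.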
I started from @Cacheable and set up a cache. The caching annotations are implemented with AOP, so every annotated call enters through an interceptor, which for the cache abstraction is CacheInterceptor:

    public Object invoke(final MethodInvocation invocation) throws Throwable {
        Method method = invocation.getMethod();
        CacheOperationInvoker aopAllianceInvoker = () -> {
            try {
                return invocation.proceed();
            }
            catch (Throwable ex) {
                throw new CacheOperationInvoker.ThrowableWrapper(ex);
            }
        };
        Object target = invocation.getThis();
        Assert.state(target != null, "Target must not be null");
        try {
            return execute(aopAllianceInvoker, target, method, invocation.getArguments());
        }
        catch (CacheOperationInvoker.ThrowableWrapper th) {
            throw th.getOriginal();
        }
    }
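The wrap-and-unwrap step in invoke can look odd at first. The lambda is not allowed to throw checked exceptions, so the interceptor wraps whatever the target throws in an unchecked wrapper and unwraps it again once the call returns. Below is a minimal stand-alone sketch of that pattern. `Invoker` and `Wrapper` are hypothetical stand-ins for CacheOperationInvoker and ThrowableWrapper, and the sketch narrows Throwable to Exception to stay short.

```java
import java.util.concurrent.Callable;

class WrapUnwrapSketch {
    // Hypothetical stand-in for CacheOperationInvoker: declares no checked exceptions.
    interface Invoker {
        Object invoke();
    }

    // Hypothetical stand-in for CacheOperationInvoker.ThrowableWrapper.
    static final class Wrapper extends RuntimeException {
        private final Exception original;

        Wrapper(Exception original) {
            super(original.getMessage(), original);
            this.original = original;
        }

        Exception getOriginal() {
            return original;
        }
    }

    // Same shape as CacheInterceptor#invoke: wrap on the way in, unwrap on the way out.
    static Object call(Callable<Object> target) throws Exception {
        Invoker invoker = () -> {
            try {
                return target.call();
            } catch (Exception ex) {
                throw new Wrapper(ex);
            }
        };
        try {
            return invoker.invoke();
        } catch (Wrapper w) {
            // The caller sees the exception the target originally threw.
            throw w.getOriginal();
        }
    }
}
```

The effect is that callers of a cached method still see the exception type the method itself throws, rather than an internal wrapper.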
invoke calls execute to run the actual method. That execute first checks whether any cache operations are defined for the method, and only then hands off to org.springframework.cache.interceptor.CacheAspectSupport#execute(org.springframework.cache.interceptor.CacheOperationInvoker, java.lang.reflect.Method, org.springframework.cache.interceptor.CacheAspectSupport.CacheOperationContexts).
The key code is here:

    @Nullable
    private Object execute(final CacheOperationInvoker invoker, Method method, CacheOperationContexts contexts) {
        // Special handling when sync = true is configured
        if (contexts.isSynchronized()) {
            CacheOperationContext context = contexts.get(CacheableOperation.class).iterator().next();
            if (isConditionPassing(context, CacheOperationExpressionEvaluator.NO_RESULT)) {
                Object key = generateKey(context, CacheOperationExpressionEvaluator.NO_RESULT);
                Cache cache = context.getCaches().iterator().next();
                try {
                    return wrapCacheValue(method, handleSynchronizedGet(invoker, key, cache));
                }
                catch (Cache.ValueRetrievalException ex) {
                    ReflectionUtils.rethrowRuntimeException(ex.getCause());
                }
            }
            else {
                return invokeOperation(invoker);
            }
        }
        // Process evictions flagged beforeInvocation = true (the default is false)
        processCacheEvicts(contexts.get(CacheEvictOperation.class), true,
                CacheOperationExpressionEvaluator.NO_RESULT);
        // Check whether there is a cached item matching the conditions
        Cache.ValueWrapper cacheHit = findCachedItem(contexts.get(CacheableOperation.class));
        // If no cached item was found, collect put requests for the @Cacheable miss
        List<CachePutRequest> cachePutRequests = new ArrayList<>();
        if (cacheHit == null) {
            collectPutRequests(contexts.get(CacheableOperation.class),
                    CacheOperationExpressionEvaluator.NO_RESULT, cachePutRequests);
        }
        Object cacheValue;
        Object returnValue;
        if (cacheHit != null && !hasCachePut(contexts)) {
            // There is a hit and no @CachePut request: just use the cached value
            cacheValue = cacheHit.get();
            returnValue = wrapCacheValue(method, cacheValue);
        }
        else {
            // No cache hit (or a @CachePut is present): invoke the underlying method
            returnValue = invokeOperation(invoker);
            cacheValue = unwrapReturnValue(returnValue);
        }
        // Collect put requests from any explicit @CachePut
        collectPutRequests(contexts.get(CachePutOperation.class), cacheValue, cachePutRequests);
        // Process the collected put requests
        for (CachePutRequest cachePutRequest : cachePutRequests) {
            cachePutRequest.apply(cacheValue);
        }
        // Process evictions that run after the invocation
        processCacheEvicts(contexts.get(CacheEvictOperation.class), false, cacheValue);
        return returnValue;
    }
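The ordering in execute is easier to see in a stripped-down model. The sketch below is plain Java over a HashMap and only mirrors the non-sync path for a method that carries @Cacheable: evict before, look up, invoke on a miss, store, evict after. `CacheFlowSketch` and its boolean flags are hypothetical. Conditions, key generation, and multiple caches are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified model of the non-sync path in CacheAspectSupport#execute.
class CacheFlowSketch {
    final Map<Object, Object> cache = new HashMap<>();
    int invocations = 0;

    // evictBefore / evictAfter stand in for @CacheEvict with beforeInvocation = true / false.
    // cachePut stands in for the presence of @CachePut.
    Object execute(Object key, Supplier<Object> method,
                   boolean evictBefore, boolean cachePut, boolean evictAfter) {
        if (evictBefore) {
            cache.remove(key);                 // 1. evictions before the invocation
        }
        boolean hit = cache.containsKey(key);  // 2. look for a cache hit
        Object value;
        if (hit && !cachePut) {
            value = cache.get(key);            // 3a. hit and no @CachePut: skip the method
        } else {
            invocations++;
            value = method.get();              // 3b. miss or @CachePut: run the method
            cache.put(key, value);             // 4. store the result
        }
        if (evictAfter) {
            cache.remove(key);                 // 5. evictions after the invocation
        }
        return value;
    }
}
```

Note that the after-invocation eviction is the last step, so with the default beforeInvocation = false the eviction only happens once the method has returned normally.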
For a method annotated with @CacheEvict, the work is done by org.springframework.cache.interceptor.CacheAspectSupport#processCacheEvicts:

    private void processCacheEvicts(
            Collection<CacheOperationContext> contexts, boolean beforeInvocation, @Nullable Object result) {
        for (CacheOperationContext context : contexts) {
            CacheEvictOperation operation = (CacheEvictOperation) context.metadata.operation;
            if (beforeInvocation == operation.isBeforeInvocation() && isConditionPassing(context, result)) {
                performCacheEvict(context, operation, result);
            }
        }
    }
Following the call one level down:

    private void performCacheEvict(
            CacheOperationContext context, CacheEvictOperation operation, @Nullable Object result) {
        Object key = null;
        for (Cache cache : context.getCaches()) {
            // This branch is taken when allEntries = true
            if (operation.isCacheWide()) {
                logInvalidating(context, operation, null);
                doClear(cache, operation.isBeforeInvocation());
            }
            else {
                if (key == null) {
                    key = generateKey(context, result);
                }
                logInvalidating(context, operation, key);
                doEvict(cache, key, operation.isBeforeInvocation());
            }
        }
    }
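The two branches of performCacheEvict differ a lot in cost once the cache lives in a keyspace shared with other caches. Here is a toy sketch of that difference in plain Java. `EvictBranchSketch` is hypothetical, and the `cacheName::key` layout is an assumption that matches the default key prefix of spring-data-redis as far as I know. A cache-wide evict has to look at every key to find the ones belonging to the cache, while a key-based evict addresses one key directly.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class EvictBranchSketch {
    // Shared keyspace holding several caches, keyed as "<cacheName>::<key>" (assumed layout).
    final Map<String, Object> keyspace = new HashMap<>();
    long keysInspected = 0;

    void evict(String cacheName, boolean allEntries, String key) {
        if (allEntries) {
            // Cache-wide branch (doClear): every key is inspected to find this cache's entries.
            String prefix = cacheName + "::";
            Iterator<String> it = keyspace.keySet().iterator();
            while (it.hasNext()) {
                keysInspected++;
                if (it.next().startsWith(prefix)) {
                    it.remove();
                }
            }
        } else {
            // Key-based branch (doEvict): exactly one key is addressed.
            keysInspected++;
            keyspace.remove(cacheName + "::" + key);
        }
    }
}
```

In this model the cost of the cache-wide branch grows with the size of the whole keyspace, not with the size of the cache being cleared.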
That leads to doClear, and the mysterious KEYS command is hiding behind it:

    protected void doClear(Cache cache, boolean immediate) {
        try {
            // immediate is true when beforeInvocation = true
            if (immediate) {
                cache.invalidate();
            }
            else {
                cache.clear();
            }
        }
        catch (RuntimeException ex) {
            getErrorHandler().handleCacheClearError(ex, cache);
        }
    }
Cache#clear has several implementations. We cache in Redis, so ours is RedisCache, and from there the call continues into the spring-data-redis source, ending up in org.springframework.data.redis.cache.DefaultRedisCacheWriter#clean:

    @Override
    public void clean(String name, byte[] pattern) {
        Assert.notNull(name, "Name must not be null!");
        Assert.notNull(pattern, "Pattern must not be null!");
        execute(name, connection -> {
            boolean wasLocked = false;
            try {
                if (isLockingCacheWriter()) {
                    doLock(name, connection);
                    wasLocked = true;
                }
                // This is the step that runs the KEYS command
                long deleteCount = batchStrategy.cleanCache(connection, name, pattern);
                while (deleteCount > Integer.MAX_VALUE) {
                    statistics.incDeletesBy(name, Integer.MAX_VALUE);
                    deleteCount -= Integer.MAX_VALUE;
                }
                statistics.incDeletesBy(name, (int) deleteCount);
            } finally {
                if (wasLocked && isLockingCacheWriter()) {
                    doUnlock(name, connection);
                }
            }
            return "OK";
        });
    }
This is where KEYS is finally issued. The culprit turns out to be org.springframework.data.redis.cache.BatchStrategies.Keys:

    static class Keys implements BatchStrategy {
        static Keys INSTANCE = new Keys();
        @Override
        public long cleanCache(RedisConnection connection, String name, byte[] pattern) {
            byte[][] keys = Optional.ofNullable(connection.keys(pattern)).orElse(Collections.emptySet())
                    .toArray(new byte[0][]);
            if (keys.length > 0) {
                connection.del(keys);
            }
            return keys.length;
        }
    }
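If cache-wide eviction is really needed, one option is to swap the batch strategy so that clearing uses SCAN instead of KEYS. The following is a configuration sketch, not something taken from the incident. It assumes spring-data-redis 2.6 or later, where `BatchStrategies.scan(int)` and the two-argument `RedisCacheWriter.nonLockingRedisCacheWriter` are available, and it assumes caching is already enabled elsewhere. The class name, the batch size of 1000, and the 30 minute TTL are made-up example values. SCAN may not be supported by every driver and Redis operation mode, so check that for your setup.

```java
import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.BatchStrategies;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.cache.RedisCacheWriter;
import org.springframework.data.redis.connection.RedisConnectionFactory;

// Hypothetical configuration class; assumes spring-data-redis 2.6+.
@Configuration
class ScanBasedCacheConfig {

    @Bean
    RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        // Use SCAN-based batches for clear() instead of the KEYS-based strategy shown above.
        // The batch size of 1000 is an arbitrary example value.
        RedisCacheWriter writer = RedisCacheWriter.nonLockingRedisCacheWriter(
                connectionFactory, BatchStrategies.scan(1000));

        // Hypothetical default TTL so entries also expire on their own.
        RedisCacheConfiguration defaults = RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(30));

        return RedisCacheManager.builder(writer)
                .cacheDefaults(defaults)
                .build();
    }
}
```

Even with SCAN, clearing a cache still iterates the whole keyspace, just in smaller steps, so relying on TTLs or key-based eviction for hot caches is usually the gentler choice.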
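The overall call chain looks roughly like this:
CacheInterceptor#invoke -> CacheAspectSupport#execute -> processCacheEvicts -> performCacheEvict -> doClear -> Cache#clear -> DefaultRedisCacheWriter#clean -> BatchStrategies.Keys#cleanCache -> KEYS followed by DEL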
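Summary
When using **@CacheEvict** in an environment that depends heavily on the Redis cluster and holds a lot of hot keys, don't add allEntries = true casually, unless you are ready to pack your things and run. If allEntries is omitted or set to false, KEYS is not triggered: performCacheEvict takes the doEvict branch instead, which removes only the single generated key.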
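For illustration, here is what the two usages might look like side by side. This is a hypothetical reconstruction, not the actual code from the incident: the class name, the cache name "hotItems", the method names, and the 30 minute schedule are all made up. It assumes Spring's caching and scheduling support are enabled.

```java
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

// Hypothetical service; every name here is invented for illustration.
@Service
class HotItemCacheMaintenance {

    // Risky on a large keyspace: allEntries = true takes the doClear path,
    // which ends in a KEYS call with the KEYS-based batch strategy.
    @Scheduled(fixedRate = 30 * 60 * 1000)
    @CacheEvict(cacheNames = "hotItems", allEntries = true)
    public void evictEverything() {
        // Intentionally empty: the annotation does the work.
    }

    // Gentler: takes the doEvict path and removes one entry.
    // "#p0" refers to the first method argument.
    @CacheEvict(cacheNames = "hotItems", key = "#p0")
    public void evictOne(Long itemId) {
        // Intentionally empty: the annotation does the work.
    }
}
```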
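That is my personal record of this incident. If I got anything wrong, corrections are welcome. Finally, may your work go smoothly and your code stay bug-free.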