Handling retries gracefully
1. When curator issues a data request, the call goes through callWithRetry in the StandardConnectionHandlingPolicy class.
@Override
public <T> T callWithRetry(CuratorZookeeperClient client, Callable<T> proc) throws Exception
{
    T result = null;
    // Create a retry loop
    RetryLoop retryLoop = client.newRetryLoop();
    // Check whether the loop should continue (i.e. isDone is still false)
    while ( retryLoop.shouldContinue() )
    {
        try
        {
            result = proc.call();
            // The call succeeded, so mark the retry loop as complete (isDone = true)
            retryLoop.markComplete();
        }
        catch ( Exception e )
        {
            ThreadUtils.checkInterrupted(e);
            // Hand the exception to the retry loop, which decides whether to retry or rethrow
            retryLoop.takeException(e);
        }
    }
    return result;
}
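The control flow above can be tried without a ZooKeeper cluster. The following is a minimal sketch in plain Java, not curator's code: SimpleRetryLoop, TransientException and attemptsNeeded are hypothetical names invented for illustration, and the retry policy is reduced to a fixed retry budget with no sleeping.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

final class RetryLoopSketch {
    /** Hypothetical stand-in for a retry-able failure such as a lost connection. */
    static final class TransientException extends Exception {
        TransientException(String message) { super(message); }
    }

    /** Simplified loop state: a done flag, a retry counter and a retry budget. */
    static final class SimpleRetryLoop {
        private final int maxRetries;
        private boolean isDone = false;
        private int retryCount = 0;

        SimpleRetryLoop(int maxRetries) { this.maxRetries = maxRetries; }

        boolean shouldContinue() { return !isDone; }

        void markComplete() { isDone = true; }

        /** Swallows the exception if a retry is allowed, otherwise rethrows it. */
        void takeException(Exception e) throws Exception {
            boolean retryable = e instanceof TransientException;
            if (retryable && retryCount++ < maxRetries) {
                return; // the caller's while loop runs the operation again
            }
            throw e;
        }
    }

    /** Same shape as the callWithRetry method shown above. */
    static <T> T callWithRetry(int maxRetries, Callable<T> proc) throws Exception {
        T result = null;
        SimpleRetryLoop loop = new SimpleRetryLoop(maxRetries);
        while (loop.shouldContinue()) {
            try {
                result = proc.call();
                loop.markComplete();
            } catch (Exception e) {
                loop.takeException(e);
            }
        }
        return result;
    }

    /**
     * Runs an operation that fails `failures` times before succeeding.
     * Returns the total number of attempts, or -1 if the retries ran out.
     */
    static int attemptsNeeded(int failures, int maxRetries) {
        AtomicInteger attempts = new AtomicInteger();
        try {
            callWithRetry(maxRetries, () -> {
                if (attempts.incrementAndGet() <= failures) {
                    throw new TransientException("simulated failure");
                }
                return "ok";
            });
            return attempts.get();
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(attemptsNeeded(2, 3)); // succeeds on the third attempt
        System.out.println(attemptsNeeded(5, 3)); // gives up once the budget is spent
    }
}
```

The point of the pattern is that the caller's loop contains no retry decisions at all: success flips the flag, and every failure is delegated to takeException, which either returns quietly (retry) or rethrows (give up).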
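2. Most of the retry logic lives in RetryLoop. The class holds the state a retry needs: isDone records whether the loop has finished, retryCount counts the retries made so far, and retryPolicy is the retry strategy. curator ships several retry policies, such as retrying once, retrying a fixed number of times, and exponential backoff.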
public class RetryLoop
{
    // Whether the retry loop has finished
    private boolean isDone = false;
    // Number of retries made so far
    private int retryCount = 0;
    private final Logger log = LoggerFactory.getLogger(getClass());
    // Time at which the retry loop was created
    private final long startTimeMs = System.currentTimeMillis();
    // Retry policy
    private final RetryPolicy retryPolicy;

    /**
     * Returns the default retry sleeper
     *
     * @return sleeper
     */
    public static RetrySleeper getDefaultRetrySleeper()
    {
        return sleeper;
    }

    /**
     * Runs the given callback with retries. Parameters: the curator client and the callback.
     * The work is delegated to the ConnectionHandlingPolicy interface,
     * implemented by StandardConnectionHandlingPolicy.
     */
    public static<T> T callWithRetry(CuratorZookeeperClient client, Callable<T> proc) throws Exception
    {
        return client.getConnectionHandlingPolicy().callWithRetry(client, proc);
    }
curator decides whether an operation is worth retrying from the KeeperException status code returned by the server:
public static boolean shouldRetry(int rc)
{
    return (rc == KeeperException.Code.CONNECTIONLOSS.intValue()) ||
        (rc == KeeperException.Code.OPERATIONTIMEOUT.intValue()) ||
        (rc == KeeperException.Code.SESSIONMOVED.intValue()) ||
        (rc == KeeperException.Code.SESSIONEXPIRED.intValue()) ||
        (rc == -13); // KeeperException.Code.NEWCONFIGNOQUORUM.intValue()) - using hard coded value for ZK 3.4.x compatibility
}
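To see the classification in isolation, here is a small sketch that compiles without the ZooKeeper jar. The numeric constants are an assumption: they are hard-coded from what ZooKeeper's KeeperException.Code enum is understood to define (only -13 appears in the excerpt above) and should be checked against the ZooKeeper version in use. The class name RetryableCodesSketch is made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class RetryableCodesSketch {
    // Assumed numeric values of KeeperException.Code constants (verify against your ZooKeeper version).
    static final int CONNECTIONLOSS = -4;
    static final int OPERATIONTIMEOUT = -7;
    static final int NEWCONFIGNOQUORUM = -13;
    static final int NONODE = -101;
    static final int SESSIONEXPIRED = -112;
    static final int SESSIONMOVED = -118;

    // Codes that describe connection or session trouble rather than a logical error.
    private static final Set<Integer> RETRYABLE = new HashSet<>(Arrays.asList(
        CONNECTIONLOSS, OPERATIONTIMEOUT, SESSIONMOVED, SESSIONEXPIRED, NEWCONFIGNOQUORUM));

    /** Same decision as the shouldRetry method above, expressed as a set lookup. */
    static boolean shouldRetry(int rc) {
        return RETRYABLE.contains(rc);
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(CONNECTIONLOSS)); // connection problem: worth retrying
        System.out.println(shouldRetry(NONODE));         // missing node: retrying will not help
    }
}
```

Only transient, connection-level codes are retried. A logical error such as a missing node would fail the same way on every attempt, so it is surfaced to the caller immediately.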
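3. When the client catches an exception, it passes it to takeException(Exception) in RetryLoop. takeException first checks whether the exception is retry-able. If it is, retryPolicy.allowRetry, using the retry policy the client was configured with, decides whether another attempt is allowed. If the exception is not retry-able, or the policy refuses, the exception is rethrown.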
public void takeException(Exception exception) throws Exception
{
    boolean rethrow = true;
    if ( isRetryException(exception) )
    {
        if ( !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
        {
            log.debug("Retry-able exception received", exception);
        }
        if ( retryPolicy.allowRetry(retryCount++, System.currentTimeMillis() - startTimeMs, sleeper) )
        {
            new EventTrace("retries-allowed", tracer.get()).commit();
            if ( !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
            {
                log.debug("Retrying operation");
            }
            rethrow = false;
        }
        else
        {
            new EventTrace("retries-disallowed", tracer.get()).commit();
            if ( !Boolean.getBoolean(DebugUtils.PROPERTY_DONT_LOG_CONNECTION_ISSUES) )
            {
                log.debug("Retry policy not allowing retry");
            }
        }
    }
    if ( rethrow )
    {
        throw exception;
    }
}
curator retry policies
curator provides a retry policy interface:
public interface RetryPolicy
{
    /**
     * retryCount    -> number of retries made so far
     * elapsedTimeMs -> time in milliseconds elapsed since the operation was first attempted
     * sleeper       -> used to sleep between retries
     */
    public boolean allowRetry(int retryCount, long elapsedTimeMs, RetrySleeper sleeper);
}
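Two direct implementations of this interface are covered here:
- RetryForever: keeps retrying until the operation succeeds (or the thread is interrupted)
- SleepingRetry: sleeps between retries, up to a maximum number of retries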
public class RetryForever implements RetryPolicy
{
    private static final Logger log = LoggerFactory.getLogger(RetryForever.class);
    // Interval between retries, in milliseconds
    private final int retryIntervalMs;

    // The only constructor takes a single argument: the retry interval
    public RetryForever(int retryIntervalMs)
    {
        checkArgument(retryIntervalMs > 0);
        this.retryIntervalMs = retryIntervalMs;
    }

    @Override
    public boolean allowRetry(int retryCount, long elapsedTimeMs, RetrySleeper sleeper)
    {
        try
        {
            sleeper.sleepFor(retryIntervalMs, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            log.warn("Error occurred while sleeping", e);
            return false;
        }
        return true;
    }
}
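SleepingRetry is an abstract class with several subclasses, for example:
RetryNTimes (retries up to a given number of times; the retry count is passed to the constructor)
ExponentialBackoffRetry (the sleep time between retries grows exponentially)
RetryUntilElapsed (keeps retrying until a maximum elapsed time is reached)
In addition, RetryOneTime extends RetryNTimes and retries exactly once.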
abstract class SleepingRetry implements RetryPolicy
{
    // Maximum number of retries
    private final int n;

    protected SleepingRetry(int n)
    {
        this.n = n;
    }

    // made public for testing
    public int getN()
    {
        return n;
    }

    public boolean allowRetry(int retryCount, long elapsedTimeMs, RetrySleeper sleeper)
    {
        if ( retryCount < n )
        {
            try
            {
                sleeper.sleepFor(getSleepTimeMs(retryCount, elapsedTimeMs), TimeUnit.MILLISECONDS);
            }
            catch ( InterruptedException e )
            {
                Thread.currentThread().interrupt();
                return false;
            }
            return true;
        }
        return false;
    }

    // Abstract hook: each subclass decides how long to sleep before the next retry
    protected abstract long getSleepTimeMs(int retryCount, long elapsedTimeMs);
}
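SleepingRetry is a template method: the base class owns the retry budget and the sleeping, and a subclass only supplies getSleepTimeMs. The sketch below reproduces that shape in plain Java so it runs without curator on the classpath. Sleeper, SleepingPolicy, DoublingBackoff and recordedDelays are hypothetical names for this example, and the doubling formula is a deliberately simple, deterministic one. It is not the formula curator's ExponentialBackoffRetry uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class SleepingRetrySketch {
    /** Local stand-in with the same shape as the sleeper used in the excerpts above. */
    interface Sleeper {
        void sleepFor(long time, TimeUnit unit) throws InterruptedException;
    }

    /** Local copy of the template: subclasses only decide how long to sleep. */
    abstract static class SleepingPolicy {
        private final int n;

        SleepingPolicy(int n) { this.n = n; }

        boolean allowRetry(int retryCount, long elapsedTimeMs, Sleeper sleeper) {
            if (retryCount >= n) {
                return false; // retry budget spent
            }
            try {
                sleeper.sleepFor(getSleepTimeMs(retryCount, elapsedTimeMs), TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            return true;
        }

        abstract long getSleepTimeMs(int retryCount, long elapsedTimeMs);
    }

    /** Hypothetical policy: doubles the delay on every retry, up to a cap. */
    static final class DoublingBackoff extends SleepingPolicy {
        private final long baseMs;
        private final long maxMs;

        DoublingBackoff(long baseMs, long maxMs, int maxRetries) {
            super(maxRetries);
            this.baseMs = baseMs;
            this.maxMs = maxMs;
        }

        @Override
        long getSleepTimeMs(int retryCount, long elapsedTimeMs) {
            long delay = baseMs << Math.min(retryCount, 20); // bound the shift to avoid overflow
            return Math.min(delay, maxMs);
        }
    }

    /** Drives a policy with a sleeper that records delays instead of sleeping. */
    static List<Long> recordedDelays(SleepingPolicy policy) {
        List<Long> delays = new ArrayList<>();
        Sleeper recorder = (time, unit) -> delays.add(unit.toMillis(time));
        int retryCount = 0;
        while (policy.allowRetry(retryCount++, 0L, recorder)) {
            // keep asking until the policy refuses another retry
        }
        return delays;
    }

    public static void main(String[] args) {
        // Five retries starting at 100 ms, capped at 1000 ms.
        System.out.println(recordedDelays(new DoublingBackoff(100, 1000, 5)));
    }
}
```

Passing the sleeper in as a parameter, as allowRetry does, is what makes a policy testable: a recording sleeper can stand in for a real one, so the schedule can be checked without waiting.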