1.Introduction
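The previous post covered the caching pattern from the cloud design patterns series; this post looks at the Circuit Breaker pattern. A circuit breaker is an electrical component. When a fault such as a short circuit or a sudden current surge occurs, the breaker trips and cuts the circuit. Once the fault has been repaired, the breaker must be switched back to the connected position by hand before power is restored. Cloud applications are usually built as microservices. The services tend to run in separate processes, quite possibly on different servers or even in different data centers, so calls between them carry latency. If the delay at one step keeps propagating to the steps that depend on it, the whole system can collapse.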
2.The Circuit Breaker Pattern
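A typical service call in a cloud environment behaves as follows.
When a service call fails, the calls that follow it fail as well, and every one of those failures adds a long delay until the service recovers. So could we stop calling the failed service while it is down and return an error immediately, let something in between probe the service on its own, and go back to calling the service directly and returning real results once a probe succeeds? That is exactly what the Circuit Breaker pattern does.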
3.How the Circuit Breaker Works
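The Circuit Breaker pattern stops an application from repeatedly calling a service that has already failed: the breaker returns an error right away instead. At the same time, the breaker checks internally whether the service has become available again, and once it has, the application goes back to calling the service directly.
The breaker acts as a proxy between the application and the service, and this proxy monitors whether the service is available. It does so by counting failed calls. Once the count exceeds a configured threshold, the proxy stops forwarding calls and returns an error immediately. After a waiting period it tries the service again: if that call succeeds, the application is allowed to call the service directly again; otherwise the proxy keeps returning errors.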
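The proxy idea can be sketched in a few lines. This is a minimal illustration and not the implementation shown later in this post: the class name `SimpleBreakerProxy`, its constructor parameter and the choice of `InvalidOperationException` are all assumptions, recovery probing is left out, and the class is not thread-safe.

```csharp
using System;

// Hypothetical minimal proxy: counts consecutive failures and fails fast
// once the threshold is reached. Recovery probing is deliberately omitted.
public sealed class SimpleBreakerProxy
{
    private readonly int failureThreshold;
    private int consecutiveFailures;

    public SimpleBreakerProxy(int failureThreshold)
    {
        this.failureThreshold = failureThreshold;
    }

    // True once the proxy has stopped forwarding calls to the service.
    public bool IsTripped
    {
        get { return consecutiveFailures >= failureThreshold; }
    }

    public T Call<T>(Func<T> service)
    {
        if (IsTripped)
        {
            // Fail fast: the service is not touched at all.
            throw new InvalidOperationException("Service unavailable (breaker tripped).");
        }

        try
        {
            T result = service();
            consecutiveFailures = 0; // a success clears the count
            return result;
        }
        catch (Exception)
        {
            consecutiveFailures++;
            throw; // the caller still sees the original exception
        }
    }
}
```

With a threshold of 2, the third call is rejected without the delay of another failed remote call.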
4.Circuit Breaker States
Closed: The request from the application is routed through to the operation. The proxy maintains a count of the number of recent failures, and if the call to the operation is unsuccessful the proxy increments this count. If the number of recent failures exceeds a specified threshold within a given time period, the proxy is placed into the Open state. At this point the proxy starts a timeout timer, and when this timer expires the proxy is placed into the Half-Open state.
Open: The request from the application fails immediately and an exception is returned to the application.
Half-Open: A limited number of requests from the application are allowed to pass through and invoke the operation. If these requests are successful, it is assumed that the fault that was previously causing the failure has been fixed and the circuit breaker switches to the Closed state (the failure counter is reset). If any request fails, the circuit breaker assumes that the fault is still present so it reverts back to the Open state and restarts the timeout timer to give the system a further period of time to recover from the failure.
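The three states and their transitions can be written down as a small state machine. The sketch below is an illustration under simplifying assumptions: all names (`BreakerState`, `BreakerStateMachine`, `AllowRequest`, and so on) are hypothetical, failures are counted without a time window, a single success in Half-Open closes the breaker, and the number of Half-Open requests is not limited. The current time is passed in explicitly so the transitions can be followed without real timers.

```csharp
using System;

public enum BreakerState { Closed, Open, HalfOpen }

// Hypothetical state machine for the Closed / Open / Half-Open transitions.
public sealed class BreakerStateMachine
{
    private readonly int failureThreshold;
    private readonly TimeSpan openTimeout;
    private int failureCount;
    private DateTime openedAtUtc;

    public BreakerState State { get; private set; }

    public BreakerStateMachine(int failureThreshold, TimeSpan openTimeout)
    {
        this.failureThreshold = failureThreshold;
        this.openTimeout = openTimeout;
        State = BreakerState.Closed;
    }

    // Returns true if a request arriving at 'nowUtc' may reach the operation.
    public bool AllowRequest(DateTime nowUtc)
    {
        if (State == BreakerState.Open && nowUtc - openedAtUtc >= openTimeout)
        {
            State = BreakerState.HalfOpen; // timeout expired: allow a trial call
        }
        return State != BreakerState.Open;
    }

    public void RecordSuccess()
    {
        // Simplification: any success resets the counter and closes the breaker.
        failureCount = 0;
        State = BreakerState.Closed;
    }

    public void RecordFailure(DateTime nowUtc)
    {
        failureCount++;
        if (State == BreakerState.HalfOpen || failureCount >= failureThreshold)
        {
            State = BreakerState.Open;
            openedAtUtc = nowUtc; // restart the timeout timer
        }
    }
}
```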
5.Example
First, we need a type that records the state of the service. It is declared as an interface:
interface ICircuitBreakerStateStore
{
    CircuitBreakerStateEnum State { get; }
    Exception LastException { get; }
    DateTime LastStateChangedDateUtc { get; }
    void Trip(Exception ex);
    void Reset();
    void HalfOpen();
    bool IsClosed { get; }
}
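The post does not show an implementation of this interface, nor the definition of `CircuitBreakerStateEnum`. One possible in-memory version is sketched below. The enum members, the class name `InMemoryCircuitBreakerStateStore` and the locking strategy are assumptions made for illustration; the interface is repeated only so that the sketch compiles on its own.

```csharp
using System;

// Assumed member names; the post uses this enum without defining it.
public enum CircuitBreakerStateEnum { Closed, Open, HalfOpen }

// Repeated from the listing above so that this sketch compiles on its own.
public interface ICircuitBreakerStateStore
{
    CircuitBreakerStateEnum State { get; }
    Exception LastException { get; }
    DateTime LastStateChangedDateUtc { get; }
    void Trip(Exception ex);
    void Reset();
    void HalfOpen();
    bool IsClosed { get; }
}

// Hypothetical in-memory store; its state is lost when the process exits.
public sealed class InMemoryCircuitBreakerStateStore : ICircuitBreakerStateStore
{
    private readonly object sync = new object();
    private CircuitBreakerStateEnum state = CircuitBreakerStateEnum.Closed;
    private Exception lastException;
    private DateTime lastStateChangedDateUtc = DateTime.UtcNow;

    public CircuitBreakerStateEnum State
    {
        get { lock (sync) { return state; } }
    }

    public Exception LastException
    {
        get { lock (sync) { return lastException; } }
    }

    public DateTime LastStateChangedDateUtc
    {
        get { lock (sync) { return lastStateChangedDateUtc; } }
    }

    public bool IsClosed
    {
        get { return State == CircuitBreakerStateEnum.Closed; }
    }

    // Open the breaker and remember the exception that caused it.
    public void Trip(Exception ex)
    {
        lock (sync)
        {
            lastException = ex;
            ChangeState(CircuitBreakerStateEnum.Open);
        }
    }

    // Return to normal operation.
    public void Reset()
    {
        lock (sync) { ChangeState(CircuitBreakerStateEnum.Closed); }
    }

    // Allow trial calls.
    public void HalfOpen()
    {
        lock (sync) { ChangeState(CircuitBreakerStateEnum.HalfOpen); }
    }

    // Callers must hold 'sync'.
    private void ChangeState(CircuitBreakerStateEnum newState)
    {
        state = newState;
        lastStateChangedDateUtc = DateTime.UtcNow;
    }
}
```

A store like this only protects a single process. If several instances of the application should share one breaker, the state would have to live in shared storage instead.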
Next, we define the CircuitBreaker class:
public class CircuitBreaker
{
    private readonly ICircuitBreakerStateStore stateStore =
        CircuitBreakerStateStoreFactory.GetCircuitBreakerStateStore();
    private readonly object halfOpenSyncObject = new object();
    ...
    public bool IsClosed { get { return stateStore.IsClosed; } }
    public bool IsOpen { get { return !IsClosed; } }

    public void ExecuteAction(Action action)
    {
        ...
        if (IsOpen)
        {
            // The circuit breaker is Open.
            // ... (see code sample below for details)
        }

        // The circuit breaker is Closed, execute the action.
        try
        {
            action();
        }
        catch (Exception ex)
        {
            // If an exception still occurs here, simply
            // re-trip the breaker immediately.
            this.TrackException(ex);
            // Throw the exception so that the caller can tell
            // the type of exception that was thrown.
            throw;
        }
    }

    private void TrackException(Exception ex)
    {
        // For simplicity in this example, open the circuit breaker on the first exception.
        // In reality this would be more complex. A certain type of exception, such as one
        // that indicates a service is offline, might trip the circuit breaker immediately.
        // Alternatively it may count exceptions locally or across multiple instances and
        // use this value over time, or the exception/success ratio based on the exception
        // types, to open the circuit breaker.
        this.stateStore.Trip(ex);
    }
}
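The comment in TrackException mentions counting exceptions over time instead of tripping on the first one. A sliding time window is one way to do that. The helper below is a sketch under assumptions: the class `FailureWindow` and its members are hypothetical and not part of the original sample, and the time is passed in by the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reports when enough failures have occurred within a
// sliding time window to justify tripping the breaker.
public sealed class FailureWindow
{
    private readonly int threshold;
    private readonly TimeSpan window;
    private readonly Queue<DateTime> failures = new Queue<DateTime>();
    private readonly object sync = new object();

    public FailureWindow(int threshold, TimeSpan window)
    {
        this.threshold = threshold;
        this.window = window;
    }

    // Records a failure at 'nowUtc'; returns true when the breaker should trip.
    public bool RecordFailure(DateTime nowUtc)
    {
        lock (sync)
        {
            failures.Enqueue(nowUtc);
            // Drop failures that have fallen out of the window.
            while (failures.Count > 0 && nowUtc - failures.Peek() > window)
            {
                failures.Dequeue();
            }
            return failures.Count >= threshold;
        }
    }

    // Forget all recorded failures, for example after the breaker closes again.
    public void Clear()
    {
        lock (sync) { failures.Clear(); }
    }
}
```

TrackException could then call `stateStore.Trip(ex)` only when `RecordFailure(DateTime.UtcNow)` returns true, so that isolated failures do not open the breaker.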
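Several threads may reach the breaker at the same time while it is moving to the Half-Open state, so a lock is used to let only one of them run the trial call.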
...
if (IsOpen)
{
    // The circuit breaker is Open. Check if the Open timeout has expired.
    // If it has, set the state to HalfOpen. Another approach may be to simply
    // check for the HalfOpen state that had been set by some other operation.
    if (stateStore.LastStateChangedDateUtc + OpenToHalfOpenWaitTime < DateTime.UtcNow)
    {
        // The Open timeout has expired. Allow one operation to execute. Note that, in
        // this example, the circuit breaker is simply set to HalfOpen after being
        // in the Open state for some period of time. An alternative would be to set
        // this using some other approach such as a timer, test method, manually, and
        // so on, and simply check the state here to determine how to handle execution
        // of the action.
        // Limit the number of threads to be executed when the breaker is HalfOpen.
        // An alternative would be to use a more complex approach to determine which
        // threads or how many are allowed to execute, or to execute a simple test
        // method instead.
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(halfOpenSyncObject, ref lockTaken);
            if (lockTaken)
            {
                // Set the circuit breaker state to HalfOpen.
                stateStore.HalfOpen();
                // Attempt the operation.
                action();
                // If this action succeeds, reset the state and allow other operations.
                // In reality, instead of immediately returning to the Closed state, a counter
                // here would record the number of successful operations and return the
                // circuit breaker to the Closed state only after a specified number succeed.
                this.stateStore.Reset();
                return;
            }
        }
        catch (Exception ex)
        {
            // If there is still an exception, trip the breaker again immediately.
            this.stateStore.Trip(ex);
            // Throw the exception so that the caller knows which exception occurred.
            throw;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(halfOpenSyncObject);
            }
        }
    }
    // The Open timeout has not yet expired. Throw a CircuitBreakerOpen exception to
    // inform the caller that the call was not actually attempted,
    // and return the most recent exception received.
    throw new CircuitBreakerOpenException(stateStore.LastException);
}
...
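The comments in the Half-Open branch suggest closing the breaker only after a specified number of trial calls succeed. A small thread-safe counter is one way to track that. This is a sketch: `HalfOpenSuccessCounter` and its members are hypothetical names, and how it is wired into the breaker depends on how the state store handles the Half-Open state.

```csharp
using System.Threading;

// Hypothetical helper: counts successful trial calls in the Half-Open state.
public sealed class HalfOpenSuccessCounter
{
    private readonly int requiredSuccesses;
    private int successes;

    public HalfOpenSuccessCounter(int requiredSuccesses)
    {
        this.requiredSuccesses = requiredSuccesses;
    }

    // Records one success; returns true once enough trial calls have
    // succeeded for the breaker to be closed.
    public bool RecordSuccess()
    {
        return Interlocked.Increment(ref successes) >= requiredSuccesses;
    }

    // Starts counting from zero again, for example after a trial call fails.
    public void Reset()
    {
        Interlocked.Exchange(ref successes, 0);
    }
}
```

In the branch above, the call to `stateStore.Reset()` would then run only when `RecordSuccess()` returns true, and the counter would be reset whenever the breaker trips.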
Below is an example of calling CircuitBreaker:
var breaker = new CircuitBreaker();
try
{
    breaker.ExecuteAction(() =>
    {
        // Operation protected by the circuit breaker.
        ...
    });
}
catch (CircuitBreakerOpenException ex)
{
    // Perform some different action when the breaker is open.
    // Last exception details are in the inner exception.
    ...
}
catch (Exception ex)
{
    ...
}
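The sample throws and catches `CircuitBreakerOpenException` without showing its definition. Going by the comment that the last exception details are in the inner exception, it could look roughly like the sketch below; the message text and the exact shape of the class are assumptions.

```csharp
using System;

// Assumed shape: the post uses this type without defining it.
public class CircuitBreakerOpenException : Exception
{
    // 'lastException' is the failure that tripped the breaker; it is exposed
    // to the caller through the standard InnerException property.
    public CircuitBreakerOpenException(Exception lastException)
        : base("The circuit breaker is open; the call was not attempted.", lastException)
    {
    }
}
```

A caller that catches this exception can inspect `ex.InnerException` to see why the breaker opened, and fall back to a cached or default result instead of waiting on the failed service.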
6.Related Reading
The following patterns may also be relevant when implementing this pattern:
Retry Pattern. The Retry pattern is a useful adjunct to the Circuit Breaker pattern. It describes how an application can handle anticipated temporary failures when it attempts to connect to a service or network resource by transparently retrying an operation that has previously failed in the expectation that the cause of the failure is transient.
Health Endpoint Monitoring Pattern. A circuit breaker may be able to test the health of a service by sending a request to an endpoint exposed by the service. The service should return information indicating its status.
MSDN:
Circuit Breaker Pattern
https://msdn.microsoft.com/en-us/library/dn589784.aspx