1. Background
1.1 Zuul and Ribbon
Zuul and Ribbon are two frameworks in the Spring Cloud microservice architecture, used respectively to build a microservice gateway and to implement a client-side software load balancer. This article describes how software load balancing over the backend cluster is implemented on a Zuul microservice gateway. The examples are based on the Spring Cloud Dalston.RELEASE version.
1.2 An SSO gateway based on Zuul
Our internal unified single sign-on (SSO) gateway is built with Zuul. The SSO gateway proxies the backend websites: it forwards front-end page requests to the backend and relays the backend's responses to the front end. A request without an identity token is routed to the unified login page served by the SSO server; a request that already carries a token is sent by the gateway straight to the backend site it proxies. The SSO gateway can proxy several backend systems at once; internally Zuul maintains a routing table with one route entry per backend, as shown in the figure below:
The URL root in path is the key, and each key maps to one backend website's HTTP URL. Zuul's routing table supports both single-instance configuration (the url style) and multi-instance configuration (the serviceId style). A single-instance backend is simple: just configure an ordinary url. Multi-instance configuration brings in load balancing, implemented in the Spring Cloud stack by Ribbon. Because this article focuses on Ribbon's load balancing, Zuul itself is covered only in this brief background; the rest of the article concentrates on how Ribbon is used and how it works internally.
2. Single-instance route configuration and its internal workings
2.1 Route configuration
Take the HR website as an example. The configuration is as follows:
The HR website's URL root is hr, so every HTTP request arriving at the SSO gateway whose URL root is hr is recognized as a request for the HR website. The gateway matches /hr/** in the routing table's path (here /hr/** matches every URL rooted at hr), finds the route configured for /hr/, namely http://sy-suz-srv31.suyiyi.com.cn:9007/hr, and forwards the request there.
2.2 Internal implementation
Single-instance routing does not actually involve the Ribbon framework, so its internals are only sketched here. From the moment the gateway receives a request to the moment it is forwarded to a backend site, there are three main steps: authentication, route matching, and forwarding. This logic is packaged mainly in the zuul-core and spring-cloud-netflix-core artifacts and belongs to Zuul's core.
Zuul mainly extends the HttpServlet base class of the embedded Tomcat container: a ZuulServlet class inherits from HttpServlet and encapsulates the main logic. Its most important mechanism is the filter chain. Zuul writes its various processing steps as filters, one class per filter, each focused on a single concern: pre-filters (PreFilter) handle request authentication, routing filters (RoutingFilter) forward the request, post-filters (PostFilter) collect per-request metrics, and so on. ZuulServlet's service method looks like this:
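The original configuration screenshot is not reproduced here; based on the path and URL described below, a minimal sketch of such a route entry (assuming standard Zuul YAML property names) might look like:

```yaml
zuul:
  routes:
    hr:
      path: /hr/**
      url: http://sy-suz-srv31.suyiyi.com.cn:9007/hr
```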
@Override
public void service(javax.servlet.ServletRequest servletRequest, javax.servlet.ServletResponse servletResponse) throws ServletException, IOException {
try {
init((HttpServletRequest) servletRequest, (HttpServletResponse) servletResponse);
// Marks this request as having passed through the "Zuul engine", as opposed to servlets
// explicitly bound in web.xml, for which requests will not have the same data attached
RequestContext context = RequestContext.getCurrentContext();
context.setZuulEngineRan();
try {
preRoute();
} catch (ZuulException e) {
error(e);
postRoute();
return;
}
try {
route();
} catch (ZuulException e) {
error(e);
postRoute();
return;
}
try {
postRoute();
} catch (ZuulException e) {
error(e);
return;
}
} catch (Throwable e) {
error(new ZuulException(e, 500, "UNHANDLED_EXCEPTION_" + e.getClass().getName()));
} finally {
RequestContext.getCurrentContext().unset();
}
}
The order is easy to see: the preRoute() pre-filters run first, then route() handles routing, then the postRoute() post-filters run.
For a single-instance service, suppose the request has passed identity authentication in the pre-filter (usually implemented by the developer as the core authentication logic of the SSO gateway). At the routing stage, if the route produced by the matching algorithm is of URL type, the framework uses SimpleHostRoutingFilter to forward the request. The core code:
@Override
public Object run() {
RequestContext context = RequestContext.getCurrentContext();
HttpServletRequest request = context.getRequest();
MultiValueMap<String, String> headers = this.helper
.buildZuulRequestHeaders(request);
MultiValueMap<String, String> params = this.helper
.buildZuulRequestQueryParams(request);
String verb = getVerb(request);
InputStream requestEntity = getRequestBody(request);
if (request.getContentLength() < 0) {
context.setChunkedRequestBody();
}
String uri = this.helper.buildZuulRequestURI(request);
this.helper.addIgnoredHeaders();
try {
CloseableHttpResponse response = forward(this.httpClient, verb, uri, request,
headers, params, requestEntity);
setResponse(response);
}
catch (Exception ex) {
throw new ZuulRuntimeException(ex);
}
return null;
}
Internally the framework uses Apache HttpClient: it parses the HTTP request object into the original URI, headers, query-string parameters, and POST body, reassembles the url, headers, and body, and resends them to the backend route. The whole process, from receiving the request through authentication filtering to forwarding, happens on a single thread, which is the simplest possible arrangement.
3. Multi-instance software load balancing with Ribbon
3.1 Route configuration
Take the UC website as an example; the configuration is as follows:
Part 1:
Part 2:
Reading the configuration: Part 1 still lives in the routing table, except the url key is replaced by serviceId, whose value matches the name of the Part 2 block, userService (at the top level, one level above routes). Under it comes ribbon, marking this as load-balancer configuration, and under that listOfServers configures the multi-node routes, separated by commas. In Part 1, stripPrefix controls whether the /uc root is stripped before hitting the addresses in listOfServers, and retryable controls whether the retry mechanism is enabled.
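The screenshots for Parts 1 and 2 are not reproduced in this copy; a hedged sketch of what they describe (the two server addresses and the stripPrefix/retryable values are placeholders, not the original values) might be:

```yaml
# Part 1: route entry keyed by serviceId instead of url
zuul:
  routes:
    uc:
      path: /uc/**
      serviceId: userService
      stripPrefix: true   # illustrative; controls stripping of the /uc root
      retryable: true     # illustrative; enables the retry mechanism

# Part 2: top-level block named after the serviceId
userService:
  ribbon:
    listOfServers: http://host-a:9000,http://host-b:9000   # placeholder hosts
```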
Part 3:
Part 4:
Reading the configuration: Part 3 sets the isolation strategy Ribbon uses when forwarding requests, in short, whether the forward runs on the same thread as the request handling; since this ties into Hystrix's circuit-breaker mechanism, it is explained in detail below. Part 4 is simply Ribbon's performance-parameter configuration, also covered in detail in the performance-tuning section below.
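Parts 3 and 4 likewise exist only as screenshots in the original; a sketch under assumed standard property names (all values illustrative, not the original settings) could be:

```yaml
# Part 3: isolation strategy used when Zuul's Ribbon commands execute
zuul:
  ribbon-isolation-strategy: SEMAPHORE   # or THREAD

# Part 4: Ribbon performance parameters for the userService backend
userService:
  ribbon:
    ServerListRefreshInterval: 3000   # heartbeat/refresh interval, ms
    ConnectTimeout: 10000             # connect timeout, ms
    ReadTimeout: 30000                # response-wait timeout, ms
```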
3.2 The load-balancing configuration class
After configuring the Ribbon routing table and performance parameters, you still need a load-balancing configuration class in code that sets the cluster's heartbeat-check strategy and balancing strategy. Taking UC as the example, the configuration class is implemented as follows:
@Configuration
public class UCLoadBalanceConfiguration {
/**
* Cluster heartbeat-check strategy
* @return
*/
@Bean
public IPing ribbonPing() {
return new PingUrl(false, "/uc/api/health/heartbeat");
}
/**
* Cluster load-balancing strategy
* @return
*/
@Bean
public IRule ribbonRule() {
return new RoundRobinRule(); // round-robin selection of servers in turn
}
}
The heartbeat-check strategy here uses Ribbon's PingUrl class: the first argument says whether to use https, and the second is UC's heartbeat endpoint. PingUrl decides whether an instance is healthy by whether a request to the heartbeat endpoint returns HTTP status code 200; the check interval is governed by ServerListRefreshInterval (in milliseconds) from Part 4 of section 3.1. As soon as a check fails to get a 200, the instance is marked dead in Ribbon's routing table and receives no forwarded requests; it is marked alive again once a later heartbeat check gets a 200. The load-balancing class uses Ribbon's RoundRobinRule, which implements a round-robin policy, selecting instances in order from the route list. Ribbon's built-in balancing classes are fairly complete, so you rarely need to write your own; they include RandomRule (random), RetryRule (retry), WeightedResponseTimeRule (response-time-weighted), and so on.
With the load-balancing configuration class written, add the @RibbonClients annotation on the gateway's main class, as shown below:
@RibbonClients(value = {
@RibbonClient(name = "userService", configuration = UCLoadBalanceConfiguration.class)
})
public class GatewayProxyApplication {
public static void main(String[] args) {
SpringApplication.run(GatewayProxyApplication.class, args);
}
}
4. How Ribbon works internally
By function, Ribbon's internals can be divided into: the routing filter, load balancers and strategies, node-instance maintenance (heartbeat checks and instance refresh), and request forwarding (retry mechanism and circuit protection).
4.1 The routing filter
The routing filter class RibbonRoutingFilter also extends ZuulFilter. As noted earlier, when a route is configured with serviceId, the HTTP request is handled by the RibbonRoutingFilter class. Core code:
@Override
public Object run() {
RequestContext context = RequestContext.getCurrentContext();
this.helper.addIgnoredHeaders();
try {
RibbonCommandContext commandContext = buildCommandContext(context);
ClientHttpResponse response = forward(commandContext);
setResponse(response);
return response;
}
catch (ZuulException ex) {
throw new ZuulRuntimeException(ex);
}
catch (Exception ex) {
throw new ZuulRuntimeException(ex);
}
}
protected ClientHttpResponse forward(RibbonCommandContext context) throws Exception {
Map<String, Object> info = this.helper.debug(context.getMethod(),
context.getUri(), context.getHeaders(), context.getParams(),
context.getRequestEntity());
RibbonCommand command = this.ribbonCommandFactory.create(context);
try {
ClientHttpResponse response = command.execute();
this.helper.appendDebug(info, response.getStatusCode().value(),
response.getHeaders());
return response;
}
catch (HystrixRuntimeException ex) {
return handleException(info, ex);
}
}
The current HTTP context is wrapped into a RibbonCommandContext object, from which the RibbonCommandFactory produces a RibbonCommand; the RibbonCommand sends the HTTP request and obtains the response, a ClientHttpResponse. A RibbonCommandContext contains the following:
RibbonCommandFactory is a factory interface with three implementations, RestClientRibbonCommandFactory, OkHttpRibbonCommandFactory, and HttpClientRibbonCommandFactory, producing RestClientRibbonCommand, OkHttpRibbonCommand, and HttpClientRibbonCommand objects respectively.
By default, the Zuul framework auto-wires the HttpClientRibbonCommandFactory factory object, which means that at runtime it produces HttpClientRibbonCommand objects.
HttpClientRibbonCommand has two main jobs: obtain a currently available service instance, and forward the request. To do this it collaborates with several other classes, whose relationships are shown in the diagram below:
Among these, the two abstract base classes LoadBalancerContext and AbstractLoadBalancerAwareClient implement the load-balancing side, including the algorithm for obtaining an available instance, while the abstract base class HystrixCommand is in charge of executing the forward, covering forwarding, retries, circuit protection, and fallback, all detailed below. Two key methods deserve close attention. The first is LoadBalancerContext.getServerFromLoadBalancer, implemented as follows:
public Server getServerFromLoadBalancer(@Nullable URI original, @Nullable Object loadBalancerKey) throws ClientException {
String host = null;
int port = -1;
if (original != null) {
host = original.getHost();
}
if (original != null) {
Pair<String, Integer> schemeAndPort = deriveSchemeAndPortFromPartialUri(original);
port = schemeAndPort.second();
}
// Various Supported Cases
// The loadbalancer to use and the instances it has is based on how it was registered
// In each of these cases, the client might come in using Full Url or Partial URL
ILoadBalancer lb = getLoadBalancer();
if (host == null) {
// Partial URI or no URI Case
// well we have to just get the right instances from lb - or we fall back
if (lb != null){
Server svc = lb.chooseServer(loadBalancerKey);
if (svc == null){
throw new ClientException(ClientException.ErrorType.GENERAL,
"Load balancer does not have available server for client: "
+ clientName);
}
host = svc.getHost();
if (host == null){
throw new ClientException(ClientException.ErrorType.GENERAL,
"Invalid Server for :" + svc);
}
logger.debug("{} using LB returned Server: {} for request {}", new Object[]{clientName, svc, original});
return svc;
} else {
// No Full URL - and we dont have a LoadBalancer registered to
// obtain a server
// if we have a vipAddress that came with the registration, we
// can use that else we
// bail out
if (vipAddresses != null && vipAddresses.contains(",")) {
throw new ClientException(
ClientException.ErrorType.GENERAL,
"Method is invoked for client " + clientName + " with partial URI of ("
+ original
+ ") with no load balancer configured."
+ " Also, there are multiple vipAddresses and hence no vip address can be chosen"
+ " to complete this partial uri");
} else if (vipAddresses != null) {
try {
Pair<String,Integer> hostAndPort = deriveHostAndPortFromVipAddress(vipAddresses);
host = hostAndPort.first();
port = hostAndPort.second();
} catch (URISyntaxException e) {
throw new ClientException(
ClientException.ErrorType.GENERAL,
"Method is invoked for client " + clientName + " with partial URI of ("
+ original
+ ") with no load balancer configured. "
+ " Also, the configured/registered vipAddress is unparseable (to determine host and port)");
}
} else {
throw new ClientException(
ClientException.ErrorType.GENERAL,
this.clientName
+ " has no LoadBalancer registered and passed in a partial URL request (with no host:port)."
+ " Also has no vipAddress registered");
}
}
} else {
// Full URL Case
// This could either be a vipAddress or a hostAndPort or a real DNS
// if vipAddress or hostAndPort, we just have to consult the loadbalancer
// but if it does not return a server, we should just proceed anyways
// and assume its a DNS
// For restClients registered using a vipAddress AND executing a request
// by passing in the full URL (including host and port), we should only
// consult lb IFF the URL passed is registered as vipAddress in Discovery
boolean shouldInterpretAsVip = false;
if (lb != null) {
shouldInterpretAsVip = isVipRecognized(original.getAuthority());
}
if (shouldInterpretAsVip) {
Server svc = lb.chooseServer(loadBalancerKey);
if (svc != null){
host = svc.getHost();
if (host == null){
throw new ClientException(ClientException.ErrorType.GENERAL,
"Invalid Server for :" + svc);
}
logger.debug("using LB returned Server: {} for request: {}", svc, original);
return svc;
} else {
// just fall back as real DNS
logger.debug("{}:{} assumed to be a valid VIP address or exists in the DNS", host, port);
}
} else {
// consult LB to obtain vipAddress backed instance given full URL
//Full URL execute request - where url!=vipAddress
logger.debug("Using full URL passed in by caller (not using load balancer): {}", original);
}
}
// end of creating final URL
if (host == null){
throw new ClientException(ClientException.ErrorType.GENERAL,"Request contains no HOST to talk to");
}
// just verify that at this point we have a full URL
return new Server(host, port);
}
The line Server svc = lb.chooseServer(loadBalancerKey); in the method is where the load-balancer object lb applies its balancing algorithm to obtain an available service instance. The second key method: once an instance is obtained, AbstractLoadBalancerAwareClient's executeWithLoadBalancer() method forwards the request (via a LoadBalancerCommand object) and obtains a response object. executeWithLoadBalancer is implemented as follows:
public T executeWithLoadBalancer(final S request, final IClientConfig requestConfig) throws ClientException {
RequestSpecificRetryHandler handler = getRequestSpecificRetryHandler(request, requestConfig);
LoadBalancerCommand<T> command = LoadBalancerCommand.<T>builder()
.withLoadBalancerContext(this)
.withRetryHandler(handler)
.withLoadBalancerURI(request.getUri())
.build();
try {
return command.submit(
new ServerOperation<T>() {
@Override
public Observable<T> call(Server server) {
URI finalUri = reconstructURIWithServer(server, request.getUri());
S requestForServer = (S) request.replaceUri(finalUri);
try {
return Observable.just(AbstractLoadBalancerAwareClient.this.execute(requestForServer, requestConfig));
}
catch (Exception e) {
return Observable.error(e);
}
}
})
.toBlocking()
.single();
} catch (Exception e) {
Throwable t = e.getCause();
if (t instanceof ClientException) {
throw (ClientException) t;
} else {
throw new ClientException(e);
}
}
}
The call chain across the key classes is as follows:
4.2 Load balancers and strategies
Ribbon's load-balancing module has two parts: the load-balancer classes, which maintain the service-node instance lists and their state, and the load-balancing strategies, which implement the node-selection algorithms. Both are implemented in the ribbon-loadbalancer package.
4.2.1 Load-balancer class structure
Class diagram:
At the top sits the ILoadBalancer interface, which defines a set of operations: add a node, mark a node unavailable, choose an available node, fetch the list of available nodes, and so on.
Next is the abstract base class AbstractLoadBalancer, which implements ILoadBalancer and mainly defines the ServerGroup enum for grouping node instances into three categories:
ALL: all node instances;
STATUS_UP: healthy node instances;
STATUS_NOT_UP: unavailable node instances.
It also declares a getLoadBalancerStats() method for obtaining per-node attributes and statistics; we can use this information to observe the balancer's behavior in real time, and it is also an important input when designing balancing strategies.
BaseLoadBalancer is the foundation of Ribbon's load-balancer classes; the important node-management logic lives here: it maintains two instance lists, one storing all node instances and the other only the healthy ones, plus the IPing object that checks whether instances are alive and the IRule object that selects nodes. Heartbeat checking and selection strategies are discussed in the next two sections. The core logic for choosing an available node and marking a node down:
/*
* Get the alive server dedicated to key
*
* @return the dedicated server
*/
public Server chooseServer(Object key) {
if (counter == null) {
counter = createCounter();
}
counter.increment();
if (rule == null) {
return null;
} else {
try {
return rule.choose(key);
} catch (Exception e) {
logger.warn("LoadBalancer [{}]: Error choosing server for key {}", name, key, e);
return null;
}
}
}
public void markServerDown(Server server) {
if (server == null || !server.isAlive()) {
return;
}
logger.error("LoadBalancer [{}]: markServerDown called on [{}]", name, server.getId());
server.setAlive(false);
// forceQuickPing();
notifyServerStatusChangeListener(singleton(server));
}
private void notifyServerStatusChangeListener(final Collection<Server> changedServers) {
if (changedServers != null && !changedServers.isEmpty() && !serverStatusListeners.isEmpty()) {
for (ServerStatusChangeListener listener : serverStatusListeners) {
try {
listener.serverStatusChanged(changedServers);
} catch (Exception e) {
logger.error("LoadBalancer [{}]: Error invoking server status change listener", name, e);
}
}
}
}
There is also an inner class for checking node health; its main job is to iterate over the node instances and ping them:
class Pinger {
private final IPingStrategy pingerStrategy;
public Pinger(IPingStrategy pingerStrategy) {
this.pingerStrategy = pingerStrategy;
}
public void runPinger() throws Exception {
if (!pingInProgress.compareAndSet(false, true)) {
return; // Ping in progress - nothing to do
}
// we are "in" - we get to Ping
Server[] allServers = null;
boolean[] results = null;
Lock allLock = null;
Lock upLock = null;
try {
/*
* The readLock should be free unless an addServer operation is
* going on...
*/
allLock = allServerLock.readLock();
allLock.lock();
allServers = allServerList.toArray(new Server[allServerList.size()]);
allLock.unlock();
int numCandidates = allServers.length;
results = pingerStrategy.pingServers(ping, allServers);
final List<Server> newUpList = new ArrayList<Server>();
final List<Server> changedServers = new ArrayList<Server>();
for (int i = 0; i < numCandidates; i++) {
boolean isAlive = results[i];
Server svr = allServers[i];
boolean oldIsAlive = svr.isAlive();
svr.setAlive(isAlive);
if (oldIsAlive != isAlive) {
changedServers.add(svr);
logger.debug("LoadBalancer [{}]: Server [{}] status changed to {}",
name, svr.getId(), (isAlive ? "ALIVE" : "DEAD"));
}
if (isAlive) {
newUpList.add(svr);
}
}
upLock = upServerLock.writeLock();
upLock.lock();
upServerList = newUpList;
upLock.unlock();
notifyServerStatusChangeListener(changedServers);
} finally {
pingInProgress.set(false);
}
}
}
DynamicServerListLoadBalancer extends BaseLoadBalancer, adding to the base class the ability to refresh node instances dynamically at runtime, as well as instance filtering: a ServerListFilter can be used to obtain node instances selectively. Core logic:
public void updateListOfServers() {
List<T> servers = new ArrayList<T>();
if (serverListImpl != null) {
servers = serverListImpl.getUpdatedListOfServers();
LOGGER.debug("List of Servers for {} obtained from Discovery client: {}",
getIdentifier(), servers);
if (filter != null) {
servers = filter.getFilteredListOfServers(servers);
LOGGER.debug("Filtered List of Servers for {} obtained from Discovery client: {}",
getIdentifier(), servers);
}
}
updateAllServerList(servers);
}
/**
* Update the AllServer list in the LoadBalancer if necessary and enabled
*
* @param ls
*/
protected void updateAllServerList(List<T> ls) {
// other threads might be doing this - in which case, we pass
if (serverListUpdateInProgress.compareAndSet(false, true)) {
try {
for (T s : ls) {
s.setAlive(true); // set so that clients can start using these
// servers right away instead
// of having to wait out the ping cycle.
}
setServersList(ls);
super.forceQuickPing();
} finally {
serverListUpdateInProgress.set(false);
}
}
}
Here serverListImpl is an injected object through which the latest node instances are fetched; its concrete implementation is covered in the node-instance management section.
ZoneAwareLoadBalancer further extends DynamicServerListLoadBalancer, adding zone-aware storage of node instances and preferring nodes in the local zone; it is also the class Ribbon uses by default. Its zone-aware node-selection code:
public Server chooseServer(Object key) {
if (!ENABLED.get() || getLoadBalancerStats().getAvailableZones().size() <= 1) {
logger.debug("Zone aware logic disabled or there is only one zone");
return super.chooseServer(key);
}
Server server = null;
try {
LoadBalancerStats lbStats = getLoadBalancerStats();
Map<String, ZoneSnapshot> zoneSnapshot = ZoneAvoidanceRule.createSnapshot(lbStats);
logger.debug("Zone snapshots: {}", zoneSnapshot);
if (triggeringLoad == null) {
triggeringLoad = DynamicPropertyFactory.getInstance().getDoubleProperty(
"ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName() + ".triggeringLoadPerServerThreshold", 0.2d);
}
if (triggeringBlackoutPercentage == null) {
triggeringBlackoutPercentage = DynamicPropertyFactory.getInstance().getDoubleProperty(
"ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName() + ".avoidZoneWithBlackoutPercetage", 0.99999d);
}
Set<String> availableZones = ZoneAvoidanceRule.getAvailableZones(zoneSnapshot, triggeringLoad.get(), triggeringBlackoutPercentage.get());
logger.debug("Available zones: {}", availableZones);
if (availableZones != null && availableZones.size() < zoneSnapshot.keySet().size()) {
String zone = ZoneAvoidanceRule.randomChooseZone(zoneSnapshot, availableZones);
logger.debug("Zone chosen: {}", zone);
if (zone != null) {
BaseLoadBalancer zoneLoadBalancer = getLoadBalancer(zone);
server = zoneLoadBalancer.chooseServer(key);
}
}
} catch (Exception e) {
logger.error("Error choosing server using zone aware logic for load balancer={}", name, e);
}
if (server != null) {
return server;
} else {
logger.debug("Zone avoidance logic is not invoked.");
return super.chooseServer(key);
}
}
Internally the class maintains a ConcurrentHashMap&lt;String, BaseLoadBalancer&gt;, one load balancer per zone. Zone selection relies mainly on the load-balancing statistics in each zone's ZoneSnapshot: zones with lower failure rates and call latency are chosen preferentially.
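As a rough illustration of the zone-preference idea only (not Ribbon's actual ZoneAvoidanceRule, whose snapshot logic is richer), a self-contained sketch that picks the zone with the lowest failure rate from a stats map:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ZonePicker {
    // Simplified stand-in for zone selection: prefer the zone whose
    // snapshot reports the lowest failure rate. Class and method names
    // are ours, not Ribbon's API.
    public static String pickZone(Map<String, Double> failureRateByZone) {
        String best = null;
        double bestRate = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : failureRateByZone.entrySet()) {
            if (e.getValue() < bestRate) {
                bestRate = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> snapshot = new LinkedHashMap<>();
        snapshot.put("zone-a", 0.10);  // 10% failures
        snapshot.put("zone-b", 0.02);  // 2% failures: should win
        System.out.println(pickZone(snapshot));
    }
}
```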
4.2.2 Node-instance management
Node-instance management covers instance maintenance, heartbeat checks, and instance filtering and refresh. The core classes involved:
Node maintenance: ServerList, ConfigurationBasedServerList, StaticServerList
Heartbeat checks: PingUrl
List refresh: ServerListUpdater, PollingServerListUpdater
List filtering: ZonePreferenceServerListFilter, ServerListSubsetFilter
The class diagram for node maintenance:
ConfigurationBasedServerList is the class Ribbon wires in by default; its code:
public class ConfigurationBasedServerList extends AbstractServerList<Server> {
private IClientConfig clientConfig;
@Override
public List<Server> getInitialListOfServers() {
return getUpdatedListOfServers();
}
@Override
public List<Server> getUpdatedListOfServers() {
String listOfServers = clientConfig.get(CommonClientConfigKey.ListOfServers);
return derive(listOfServers);
}
@Override
public void initWithNiwsConfig(IClientConfig clientConfig) {
this.clientConfig = clientConfig;
}
private List<Server> derive(String value) {
List<Server> list = Lists.newArrayList();
if (!Strings.isNullOrEmpty(value)) {
for (String s: value.split(",")) {
list.add(new Server(s.trim()));
}
}
return list;
}
}
The key method, getUpdatedListOfServers(), reads the listOfServers value from the configuration file: a comma-separated string of node route addresses, for example http://sy-suz-srv31.suiyi.com.cn:9000,http://sy-suz-dev-ci.suiyi.com.cn:9000
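The derive logic above is easy to exercise in isolation; this standalone sketch (our own class, mirroring the parsing done by ConfigurationBasedServerList.derive) turns such a comma-separated value into trimmed server addresses:

```java
import java.util.ArrayList;
import java.util.List;

public class ServerListParser {
    // Split a listOfServers value on commas and trim each entry,
    // as ConfigurationBasedServerList does internally.
    public static List<String> derive(String value) {
        List<String> list = new ArrayList<>();
        if (value != null && !value.isEmpty()) {
            for (String s : value.split(",")) {
                list.add(s.trim());
            }
        }
        return list;
    }

    public static void main(String[] args) {
        // Whitespace around the comma is tolerated.
        System.out.println(derive("http://host-a:9000, http://host-b:9000"));
    }
}
```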
The heartbeat-check class PingUrl implements the IPing interface; it is the IPing object used by the BaseLoadBalancer base class mentioned in the previous section. Its code:
public class PingUrl implements IPing {
private static final Logger LOGGER = LoggerFactory.getLogger(PingUrl.class);
String pingAppendString = "";
boolean isSecure = false;
String expectedContent = null;
/*
*
* Send one ping only.
*
* Well, send what you need to determine whether or not the
* server is still alive. Should return within a "reasonable"
* time.
*/
public PingUrl() {
}
public PingUrl(boolean isSecure, String pingAppendString) {
this.isSecure = isSecure;
this.pingAppendString = (pingAppendString != null) ? pingAppendString : "";
}
public void setPingAppendString(String pingAppendString) {
this.pingAppendString = (pingAppendString != null) ? pingAppendString : "";
}
public String getPingAppendString() {
return pingAppendString;
}
public boolean isSecure() {
return isSecure;
}
/**
* Should the Secure protocol be used to Ping
* @param isSecure
*/
public void setSecure(boolean isSecure) {
this.isSecure = isSecure;
}
public String getExpectedContent() {
return expectedContent;
}
/**
* Is there a particular content you are hoping to see?
* If so -set this here.
* for e.g. the WCS server sets the content body to be 'true'
* Please be advised that this content should match the actual
* content exactly for this to work. Else yo may get false status.
* @param expectedContent
*/
public void setExpectedContent(String expectedContent) {
this.expectedContent = expectedContent;
}
public boolean isAlive(Server server) {
String urlStr = "";
if (isSecure){
urlStr = "https://";
}else{
urlStr = "http://";
}
urlStr += server.getId();
urlStr += getPingAppendString();
boolean isAlive = false;
HttpClient httpClient = new DefaultHttpClient();
HttpUriRequest getRequest = new HttpGet(urlStr);
String content=null;
try {
HttpResponse response = httpClient.execute(getRequest);
content = EntityUtils.toString(response.getEntity());
isAlive = (response.getStatusLine().getStatusCode() == 200);
if (getExpectedContent()!=null){
LOGGER.debug("content:" + content);
if (content == null){
isAlive = false;
}else{
if (content.equals(getExpectedContent())){
isAlive = true;
}else{
isAlive = false;
}
}
}
} catch (IOException e) {
e.printStackTrace();
}finally{
// Release the connection.
getRequest.abort();
}
return isAlive;
}
public static void main(String[] args){
PingUrl p = new PingUrl(false,"/cs/hostRunning");
p.setExpectedContent("true");
Server s = new Server("ec2-75-101-231-85.compute-1.amazonaws.com", 7101);
boolean isAlive = p.isAlive(s);
System.out.println("isAlive:" + isAlive);
}
}
Its main job is to request the address formed from the pingAppendString passed into the constructor; if the call returns 200, the node is judged healthy.
Class diagram for server-list refresh:
Ribbon uses the PollingServerListUpdater class by default; its bean is wired into the ZoneAwareLoadBalancer balancer at program initialization. Key code:
@Override
public synchronized void start(final UpdateAction updateAction) {
if (isActive.compareAndSet(false, true)) {
final Runnable wrapperRunnable = new Runnable() {
@Override
public void run() {
if (!isActive.get()) {
if (scheduledFuture != null) {
scheduledFuture.cancel(true);
}
return;
}
try {
updateAction.doUpdate();
lastUpdated = System.currentTimeMillis();
} catch (Exception e) {
logger.warn("Failed one update cycle", e);
}
}
};
scheduledFuture = getRefreshExecutor().scheduleWithFixedDelay(
wrapperRunnable,
initialDelayMs,
refreshIntervalMs,
TimeUnit.MILLISECONDS
);
} else {
logger.info("Already active, no-op");
}
}
The start() method launches a task that executes updateAction.doUpdate() at a fixed interval (refreshIntervalMs); the UpdateAction passed in is in fact a delegate to the updateListOfServers() core method of the DynamicServerListLoadBalancer class from the previous section.
public class DynamicServerListLoadBalancer<T extends Server> extends BaseLoadBalancer {
    ...
    protected final ServerListUpdater.UpdateAction updateAction = new ServerListUpdater.UpdateAction() {
        @Override
        public void doUpdate() {
            updateListOfServers();
        }
    };
    ...
}
Class diagram of the server-list filter classes:
The filter classes take a list of node instances and return a filtered list according to some rule. ZonePreferenceServerListFilter, the one Ribbon enables by default, filters by zone awareness: it compares the zone each service instance lives in with the consumer's own zone and drops instances that are not co-located. ServerListSubsetFilter can filter out insufficiently "healthy" instances by comparing per-instance communication-failure counts and concurrent connection counts.
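A minimal standalone sketch of the zone-preference idea (our own simplification, not the actual ZonePreferenceServerListFilter API): keep same-zone instances, falling back to the full list when none match, which is also what the real filter does when the zone-filtered list comes back empty:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ZonePreferenceFilter {
    // Keep only servers whose zone matches the consumer's zone;
    // if none match, return the original list unchanged.
    public static List<String> filter(Map<String, String> zoneByServer,
                                      List<String> servers, String myZone) {
        List<String> sameZone = servers.stream()
                .filter(s -> myZone.equals(zoneByServer.get(s)))
                .collect(Collectors.toList());
        return sameZone.isEmpty() ? servers : sameZone;
    }

    public static void main(String[] args) {
        Map<String, String> zones = new HashMap<>();
        zones.put("server-1", "zone-a");
        zones.put("server-2", "zone-b");
        List<String> all = Arrays.asList("server-1", "server-2");
        System.out.println(filter(zones, all, "zone-a")); // same-zone only
        System.out.println(filter(zones, all, "zone-c")); // fallback: all
    }
}
```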
4.2.3 Node-selection strategies
Load-balancing strategy class diagram:
Ribbon's balancing strategies offer quite a few choices: round-robin, random, retry, weighted, predicate-based, filtering, zone-aware, and so on. All of these classes implement the IRule interface (via the AbstractLoadBalancerRule abstract class) and have a single core method, choose(), which selects an available node from the node list. Ribbon's default strategy is RoundRobinRule, i.e. round-robin.
The round-robin strategy selects each available node in turn, linearly; its core code:
public Server choose(ILoadBalancer lb, Object key) {
if (lb == null) {
log.warn("no load balancer");
return null;
}
Server server = null;
int count = 0;
while (server == null && count++ < 10) {
List<Server> reachableServers = lb.getReachableServers();
List<Server> allServers = lb.getAllServers();
int upCount = reachableServers.size();
int serverCount = allServers.size();
if ((upCount == 0) || (serverCount == 0)) {
log.warn("No up servers available from load balancer: " + lb);
return null;
}
int nextServerIndex = incrementAndGetModulo(serverCount);
server = allServers.get(nextServerIndex);
if (server == null) {
/* Transient. */
Thread.yield();
continue;
}
if (server.isAlive() && (server.isReadyToServe())) {
return (server);
}
// Next.
server = null;
}
if (count >= 10) {
log.warn("No available alive servers after 10 tries from load balancer: "
+ lb);
}
return server;
}
The round-robin algorithm is just a loop that fetches the next server in turn from the node list; if it fails to choose one more than 10 times, it stops trying and logs a warning.
The RoundRobinRule round-robin algorithm in the official Ribbon release (2.2.2) has a bug: it picks the server out of the allServers list,

int nextServerIndex = incrementAndGetModulo(serverCount);
server = allServers.get(nextServerIndex);

so when allServers contains unavailable nodes, a request can be sent to a dead node. The two lines above should instead read:

int nextServerIndex = incrementAndGetModulo(upCount);
server = reachableServers.get(nextServerIndex);
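The corrected selection can be illustrated with a self-contained sketch: a CAS-based incrementAndGetModulo, as in RoundRobinRule, applied to the reachable-server list only (class and method names here are ours):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinIndex {
    private final AtomicInteger nextIndex = new AtomicInteger(0);

    // Thread-safe "increment and get modulo", mirroring RoundRobinRule's
    // CAS loop: advance the counter and wrap around at `modulo`.
    public int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = nextIndex.get();
            int next = (current + 1) % modulo;
            if (nextIndex.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Choose from the reachable list only, so a dead node in the
    // full server list can never be selected.
    public String choose(List<String> reachableServers) {
        if (reachableServers.isEmpty()) {
            return null;
        }
        int idx = incrementAndGetModulo(reachableServers.size());
        return reachableServers.get(idx);
    }

    public static void main(String[] args) {
        RoundRobinIndex rr = new RoundRobinIndex();
        List<String> up = Arrays.asList("node-a", "node-b", "node-c");
        // Successive calls walk the list in order, wrapping around.
        System.out.println(rr.choose(up));
        System.out.println(rr.choose(up));
        System.out.println(rr.choose(up));
    }
}
```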
4.3 The circuit breaker
4.3.1 Introduction to Hystrix
When microservices call one another, a service consumer may end up waiting on a provider's interface. Under high concurrency, slow responses cause subsequent requests to back up until the service becomes unavailable, which in turn takes down other distributed replicas of the same service, then the consumers of those services as well, and so on until the entire system collapses. This is the "avalanche" (cascading failure) effect.
To solve this problem, Martin Fowler proposed the "circuit breaker" concept: when a service unit fails, consumer-side call monitoring detects it promptly and cuts off the wait, so threads are not held for long periods by calls to the failed service, preventing the failure from spreading through the distributed system.
In the Spring Cloud suite, Hystrix is the framework that implements this circuit-breaker function for microservice calls, providing service degradation, circuit breaking, thread and semaphore isolation, request caching, request collapsing, and service monitoring.
4.3.2 How it works
As mentioned in section 4.1, RibbonRoutingFilter's forward() method first wraps the forwarding action in a HystrixCommand object, placing it under the circuit breaker's end-to-end monitoring. The Hystrix core class diagram:
At execution time, Hystrix selects one of two execution styles, both implemented by the HystrixCommand abstract base class, according to the concrete command object created:
1) execute(): synchronous execution, returning a result object from the dependent service, or throwing an exception on error;
2) queue(): asynchronous execution, immediately returning a Future holding the result object to be produced when the service call completes. Internally, Hystrix creates a dedicated thread pool per dependent service, so calls to different dependencies are isolated from one another.
By default the framework instantiates HttpClientRibbonCommand.
4.3.3 Isolation strategies and circuit protection
HystrixCommand supports two isolation strategies at execution time, thread-pool isolation and semaphore isolation, selectable flexibly via configuration:
1) THREAD: thread-pool isolation. The command executes on a separate thread, and its concurrency is bounded by the number of threads in the pool.
2) SEMAPHORE: semaphore isolation. The command executes on the calling thread, and its concurrency is bounded by the semaphore count.
In short, in THREAD mode the Hystrix command runs on its own pool thread (not the request's thread), and any timeout while calling the backend trips the Hystrix breaker immediately, whether or not a response eventually comes back; in SEMAPHORE mode the command runs on the same thread as the request, so even after the deadline has passed, the timeout can only be triggered once the backend call returns.
4.3.4 Retry mechanism
Ribbon has the following retry settings:
MaxAutoRetries: the number of retries against the same instance;
MaxAutoRetriesNextServer: the number of retries against other instances after one instance fails.
The Hystrix framework monitors the entire forwarding process, retries included; once the HystrixCommand timeout is reached, the request and any retry requests are aborted, achieving the goal of circuit protection.
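A sketch of the retry settings together with the Hystrix command timeout that bounds them (all values illustrative, and the property names assume standard Ribbon/Hystrix keys):

```yaml
userService:
  ribbon:
    MaxAutoRetries: 0            # retries against the same instance
    MaxAutoRetriesNextServer: 1  # retries against other instances

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 10000  # aborts the forward, retries included
```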
5. Ribbon performance tuning
5.1 Default configuration
Ribbon's default configuration class is RibbonClientConfiguration, which provides a series of auto-configured beans so that Ribbon starts successfully even with no explicit configuration. The code:
@SuppressWarnings("deprecation")
@Configuration
@EnableConfigurationProperties
public class RibbonClientConfiguration {
@Value("${ribbon.client.name}")
private String name = "client";
// TODO: maybe re-instate autowired load balancers: identified by name they could be
// associated with ribbon clients
@Autowired
private PropertiesFactory propertiesFactory;
@Bean
@ConditionalOnMissingBean
public IClientConfig ribbonClientConfig() {
DefaultClientConfigImpl config = new DefaultClientConfigImpl();
config.loadProperties(this.name);
return config;
}
@Bean
@ConditionalOnMissingBean
public IRule ribbonRule(IClientConfig config) {
if (this.propertiesFactory.isSet(IRule.class, name)) {
return this.propertiesFactory.get(IRule.class, config, name);
}
ZoneAvoidanceRule rule = new ZoneAvoidanceRule();
rule.initWithNiwsConfig(config);
return rule;
}
@Bean
@ConditionalOnMissingBean
public IPing ribbonPing(IClientConfig config) {
if (this.propertiesFactory.isSet(IPing.class, name)) {
return this.propertiesFactory.get(IPing.class, config, name);
}
return new DummyPing();
}
@Bean
@ConditionalOnMissingBean
@SuppressWarnings("unchecked")
public ServerList<Server> ribbonServerList(IClientConfig config) {
if (this.propertiesFactory.isSet(ServerList.class, name)) {
return this.propertiesFactory.get(ServerList.class, config, name);
}
ConfigurationBasedServerList serverList = new ConfigurationBasedServerList();
serverList.initWithNiwsConfig(config);
return serverList;
}
@Configuration
@ConditionalOnProperty(name = "ribbon.httpclient.enabled", matchIfMissing = true)
protected static class HttpClientRibbonConfiguration {
@Value("${ribbon.client.name}")
private String name = "client";
@Bean
@ConditionalOnMissingBean(AbstractLoadBalancerAwareClient.class)
@ConditionalOnMissingClass(value = "org.springframework.retry.support.RetryTemplate")
public RibbonLoadBalancingHttpClient ribbonLoadBalancingHttpClient(
IClientConfig config, ServerIntrospector serverIntrospector,
ILoadBalancer loadBalancer, RetryHandler retryHandler) {
RibbonLoadBalancingHttpClient client = new RibbonLoadBalancingHttpClient(
config, serverIntrospector);
client.setLoadBalancer(loadBalancer);
client.setRetryHandler(retryHandler);
Monitors.registerObject("Client_" + this.name, client);
return client;
}
@Bean
@ConditionalOnMissingBean(AbstractLoadBalancerAwareClient.class)
@ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
public RetryableRibbonLoadBalancingHttpClient retryableRibbonLoadBalancingHttpClient(
IClientConfig config, ServerIntrospector serverIntrospector,
ILoadBalancer loadBalancer, RetryHandler retryHandler,
LoadBalancedRetryPolicyFactory loadBalancedRetryPolicyFactory) {
RetryableRibbonLoadBalancingHttpClient client = new RetryableRibbonLoadBalancingHttpClient(
config, serverIntrospector, loadBalancedRetryPolicyFactory);
client.setLoadBalancer(loadBalancer);
client.setRetryHandler(retryHandler);
Monitors.registerObject("Client_" + this.name, client);
return client;
}
}
@Configuration
@ConditionalOnProperty("ribbon.okhttp.enabled")
@ConditionalOnClass(name = "okhttp3.OkHttpClient")
protected static class OkHttpRibbonConfiguration {
@Value("${ribbon.client.name}")
private String name = "client";
@Bean
@ConditionalOnMissingBean(AbstractLoadBalancerAwareClient.class)
@ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
public RetryableOkHttpLoadBalancingClient okHttpLoadBalancingClient(IClientConfig config,
ServerIntrospector serverIntrospector,
ILoadBalancer loadBalancer,
RetryHandler retryHandler,
LoadBalancedRetryPolicyFactory loadBalancedRetryPolicyFactory) {
RetryableOkHttpLoadBalancingClient client = new RetryableOkHttpLoadBalancingClient(config,
serverIntrospector, loadBalancedRetryPolicyFactory);
client.setLoadBalancer(loadBalancer);
client.setRetryHandler(retryHandler);
Monitors.registerObject("Client_" + this.name, client);
return client;
}
@Bean
@ConditionalOnMissingBean(AbstractLoadBalancerAwareClient.class)
@ConditionalOnMissingClass(value = "org.springframework.retry.support.RetryTemplate")
public OkHttpLoadBalancingClient retryableOkHttpLoadBalancingClient(IClientConfig config,
ServerIntrospector serverIntrospector, ILoadBalancer loadBalancer,
RetryHandler retryHandler) {
OkHttpLoadBalancingClient client = new OkHttpLoadBalancingClient(config,
serverIntrospector);
client.setLoadBalancer(loadBalancer);
client.setRetryHandler(retryHandler);
Monitors.registerObject("Client_" + this.name, client);
return client;
}
}
@Configuration
@RibbonAutoConfiguration.ConditionalOnRibbonRestClient
protected static class RestClientRibbonConfiguration {
@Value("${ribbon.client.name}")
private String name = "client";
/**
* Create a Netflix {@link RestClient} integrated with Ribbon if none already exists
* in the application context. It is not required for Ribbon to work properly and is
* therefore created lazily if ever another component requires it.
*
* @param config the configuration to use by the underlying Ribbon instance
* @param loadBalancer the load balancer to use by the underlying Ribbon instance
* @param serverIntrospector server introspector to use by the underlying Ribbon instance
* @param retryHandler retry handler to use by the underlying Ribbon instance
* @return a {@link RestClient} instances backed by Ribbon
*/
@Bean
@Lazy
@ConditionalOnMissingBean(AbstractLoadBalancerAwareClient.class)
public RestClient ribbonRestClient(IClientConfig config, ILoadBalancer loadBalancer,
ServerIntrospector serverIntrospector, RetryHandler retryHandler) {
RestClient client = new OverrideRestClient(config, serverIntrospector);
client.setLoadBalancer(loadBalancer);
client.setRetryHandler(retryHandler);
Monitors.registerObject("Client_" + this.name, client);
return client;
}
}
@Bean
@ConditionalOnMissingBean
public ServerListUpdater ribbonServerListUpdater(IClientConfig config) {
return new PollingServerListUpdater(config);
}
@Bean
@ConditionalOnMissingBean
public ILoadBalancer ribbonLoadBalancer(IClientConfig config,
ServerList<Server> serverList, ServerListFilter<Server> serverListFilter,
IRule rule, IPing ping, ServerListUpdater serverListUpdater) {
if (this.propertiesFactory.isSet(ILoadBalancer.class, name)) {
return this.propertiesFactory.get(ILoadBalancer.class, config, name);
}
return new ZoneAwareLoadBalancer<>(config, rule, ping, serverList,
serverListFilter, serverListUpdater);
}
@Bean
@ConditionalOnMissingBean
@SuppressWarnings("unchecked")
public ServerListFilter<Server> ribbonServerListFilter(IClientConfig config) {
if (this.propertiesFactory.isSet(ServerListFilter.class, name)) {
return this.propertiesFactory.get(ServerListFilter.class, config, name);
}
ZonePreferenceServerListFilter filter = new ZonePreferenceServerListFilter();
filter.initWithNiwsConfig(config);
return filter;
}
@Bean
@ConditionalOnMissingBean
public RibbonLoadBalancerContext ribbonLoadBalancerContext(
ILoadBalancer loadBalancer, IClientConfig config, RetryHandler retryHandler) {
return new RibbonLoadBalancerContext(loadBalancer, config, retryHandler);
}
@Bean
@ConditionalOnMissingBean
public RetryHandler retryHandler(IClientConfig config) {
return new DefaultLoadBalancerRetryHandler(config);
}
@Bean
@ConditionalOnMissingBean
public ServerIntrospector serverIntrospector() {
return new DefaultServerIntrospector();
}
@PostConstruct
public void preprocess() {
setRibbonProperty(name, DeploymentContextBasedVipAddresses.key(), name);
}
static class OverrideRestClient extends RestClient {
private IClientConfig config;
private ServerIntrospector serverIntrospector;
protected OverrideRestClient(IClientConfig config,
ServerIntrospector serverIntrospector) {
super();
this.config = config;
this.serverIntrospector = serverIntrospector;
initWithNiwsConfig(this.config);
}
@Override
public URI reconstructURIWithServer(Server server, URI original) {
URI uri = updateToHttpsIfNeeded(original, this.config, this.serverIntrospector, server);
return super.reconstructURIWithServer(server, uri);
}
@Override
protected Client apacheHttpClientSpecificInitialization() {
ApacheHttpClient4 apache = (ApacheHttpClient4) super
.apacheHttpClientSpecificInitialization();
apache.getClientHandler()
.getHttpClient()
.getParams()
.setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.IGNORE_COOKIES);
return apache;
}
}
}
The default class for reading configuration items is DefaultClientConfigImpl, which implements IClientConfig and reads Ribbon's settings from the yml configuration file using the CommonClientConfigKey fields. The settings are listed below:
5.2 Tuning via configuration
The performance-relevant settings are mainly the following, mostly bounds on connection counts, thread-pool capacity, and timeouts:
The total connection count can be raised to 2000 and the per-node maximum to 200. The 10s connect timeout can stay as is, while the response-wait timeout can be lengthened somewhat. Generally, if a machine has gone a long time without answering, it is better to switch quickly to another instance, so set MaxAutoRetries to 0 and MaxAutoRetriesNextServer to the number of service instances in the cluster.
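Collected into one hedged sketch (the numbers follow the recommendations above; ReadTimeout and the instance count are illustrative):

```yaml
userService:
  ribbon:
    MaxTotalConnections: 2000      # raised total connection pool
    MaxConnectionsPerHost: 200     # raised per-node cap
    ConnectTimeout: 10000          # keep the 10s connect timeout
    ReadTimeout: 60000             # illustrative: lengthened response wait
    MaxAutoRetries: 0              # fail over instead of retrying in place
    MaxAutoRetriesNextServer: 2    # set to the cluster's instance count (placeholder)
```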