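Background
The test environment already supports zero-downtime releases, but micro-applications that sit behind Spring Cloud Gateway return 500 errors for a short time after each release. The Spring Cloud registry in this setup is Zookeeper.
Investigation turned up two causes:
- During a release the instance is stopped with docker stop. The Dubbo services and the Spring Cloud application instance node then take a while to go offline on their own, because traffic is not drained before the application is stopped.
- The Spring Cloud LoadBalancer caches the instance list (the default cache TTL is 35s), so requests keep being routed to an instance after it has gone offline.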
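The effect of the 35s cache can be illustrated with a small simulation. This is only a sketch in plain Java: TtlCache is a hypothetical stand-in for the load balancer cache, the instance names "A" and "B" are made up, and the clock is advanced by hand rather than by real time.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class StaleCacheDemo {

    /** Hypothetical stand-in for the load balancer's instance cache. */
    static final class TtlCache {
        private final long ttlMillis;
        private final LongSupplier clock;
        private List<String> cached;
        private long loadedAt;

        TtlCache(long ttlMillis, LongSupplier clock) {
            this.ttlMillis = ttlMillis;
            this.clock = clock;
        }

        /** Returns the cached list, reloading from the registry only after the TTL has passed. */
        List<String> get(Supplier<List<String>> registry) {
            long now = clock.getAsLong();
            if (cached == null || now - loadedAt >= ttlMillis) {
                cached = List.copyOf(registry.get());
                loadedAt = now;
            }
            return cached;
        }
    }

    /** Instance B stops at t=5s; the cache is read at t=0s, t=10s and t=36s. */
    static List<List<String>> simulate() {
        AtomicLong now = new AtomicLong(0);
        AtomicReference<List<String>> registry = new AtomicReference<>(List.of("A", "B"));
        TtlCache cache = new TtlCache(35_000, now::get);

        List<String> atStart = cache.get(registry::get);   // first load
        now.set(5_000);
        registry.set(List.of("A"));                        // B is stopped
        now.set(10_000);
        List<String> duringTtl = cache.get(registry::get); // served from cache
        now.set(36_000);
        List<String> afterTtl = cache.get(registry::get);  // TTL expired, reloaded
        return List.of(atStart, duringTtl, afterTtl);
    }

    public static void main(String[] args) {
        simulate().forEach(System.out::println);
    }
}
```

Between t=5s and t=35s the cache still hands out B, which is the window in which the gateway can route a request to a stopped instance.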
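Problem Analysis
Fixing this takes two changes:
- Each micro-application needs to expose a single endpoint that drains traffic, covering both the Dubbo service offline step and the removal of the Spring Cloud application instance node. The Dubbo side can be handled through QoS. Note that Dubbo QoS is disabled by default in Spring Boot applications, unlike SSM applications where it is enabled by default, so it has to be switched on through configuration; the defaults are assigned in DubboDefaultPropertiesEnvironmentPostProcessor.
- Spring Cloud Gateway needs logic that listens for instances going online or offline and clears the cached application instances.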
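Source Code Walkthrough: Taking the Application Instance Node Offline
ZookeeperRegistration
Holds the Zookeeper service registration information; the implementation class is ServiceInstanceRegistration.
In Spring Boot, if the application does not define one itself, the auto-configuration class ZookeeperAutoServiceRegistrationAutoConfiguration creates it automatically.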
ZookeeperServiceRegistry
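Main methods:
- void register(ZookeeperRegistration registration): registers the service (in Spring Boot it is called once the ServletWebServerInitializedEvent has been received)
- ServiceDiscovery getServiceDiscovery(): returns the ServiceDiscovery instance
- void deregister(ZookeeperRegistration registration): deregisters the service
- void close(): closes the registry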
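Solution
Add an endpoint to the service application, and have the release process call /service/offline.do to drain traffic before the instance is stopped.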
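import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.cloud.zookeeper.serviceregistry.ZookeeperRegistration;
import org.springframework.cloud.zookeeper.serviceregistry.ZookeeperServiceRegistry;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Only active when auto-registration is enabled (the default)
@ConditionalOnProperty(value = "spring.cloud.service-registry.auto-registration.enabled", matchIfMissing = true)
@RestController
@RequestMapping(value = "/service")
public class ServiceController {

    @Autowired
    private ZookeeperServiceRegistry zookeeperServiceRegistry;

    @Autowired
    private ZookeeperRegistration registration;

    @RequestMapping(value = "/offline.do")
    public String offline() {
        // 1. Take the Dubbo services offline: either the release script calls
        //    the Dubbo QoS offline command, or it is invoked over HTTP from here
        // 2. Deregister the service instance from Zookeeper
        zookeeperServiceRegistry.deregister(registration);
        // 3. Close the registry
        zookeeperServiceRegistry.close();
        return "OK";
    }
}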
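Step 1 of the handler, invoking the Dubbo QoS offline command over HTTP, could look roughly like the sketch below. It assumes QoS has been enabled and listens on 127.0.0.1:22222 (22222 is Dubbo's documented default QoS port; adjust if configured otherwise). The class name DubboQosClient and its methods are hypothetical helpers and not part of Dubbo.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class DubboQosClient {

    /** Builds the QoS HTTP URI for a command such as "offline" or "online". */
    static URI qosUri(String host, int port, String command) {
        return URI.create("http://" + host + ":" + port + "/" + command);
    }

    /** Sends the command to the QoS port and returns the response body. */
    static String execute(String host, int port, String command)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(qosUri(host, port, command))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed host and port; requires a running Dubbo application with QoS enabled
        System.out.println(execute("127.0.0.1", 22222, "offline"));
    }
}
```

Calling QoS from inside the application keeps the release script down to a single HTTP request, at the cost of coupling the endpoint to the QoS port configuration.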
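Source Code Walkthrough: Listening for Changes and Clearing the Instance Cache
ZookeeperLoadBalancerConfiguration
This is where the ServiceInstanceListSupplier is instantiated. ServiceInstanceListSupplier is one of the key interfaces in Spring Cloud LoadBalancer; it defines how available service instances are found.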
@Bean
@ConditionalOnBean(DiscoveryClient.class)
@ConditionalOnMissingBean
public ServiceInstanceListSupplier zookeeperDiscoveryClientServiceInstanceListSupplier(
        DiscoveryClient discoveryClient, Environment env,
        ApplicationContext context,
        ZookeeperDependencies zookeeperDependencies) {
    // First delegate: fetches instances through the DiscoveryClient
    DiscoveryClientServiceInstanceListSupplier firstDelegate = new DiscoveryClientServiceInstanceListSupplier(
            discoveryClient, env);
    // Second delegate: filters the instances by their Zookeeper status
    ZookeeperServiceInstanceListSupplier secondDelegate = new ZookeeperServiceInstanceListSupplier(firstDelegate,
            zookeeperDependencies);
    ObjectProvider<LoadBalancerCacheManager> cacheManagerProvider = context
            .getBeanProvider(LoadBalancerCacheManager.class);
    // If a LoadBalancerCacheManager is available in the context, wrap the
    // second delegate in a caching supplier
    if (cacheManagerProvider.getIfAvailable() != null) {
        return new CachingServiceInstanceListSupplier(secondDelegate,
                cacheManagerProvider.getIfAvailable());
    }
    return secondDelegate;
}
CachingServiceInstanceListSupplier
// At instantiation the delegate here is the ZookeeperServiceInstanceListSupplier
public CachingServiceInstanceListSupplier(ServiceInstanceListSupplier delegate, CacheManager cacheManager) {
    super(delegate);
    this.serviceInstances = CacheFlux.lookup(key -> {
        // TODO: configurable cache name
        Cache cache = cacheManager.getCache(SERVICE_INSTANCE_CACHE_NAME);
        if (cache == null) {
            if (log.isErrorEnabled()) {
                log.error("Unable to find cache: " + SERVICE_INSTANCE_CACHE_NAME);
            }
            return Mono.empty();
        }
        // Read the service instances from the cache
        List<ServiceInstance> list = cache.get(key, List.class);
        if (list == null || list.isEmpty()) {
            return Mono.empty();
        }
        return Flux.just(list).materialize().collectList();
    }, delegate.getServiceId())
            // On a cache miss, fall back to ZookeeperServiceInstanceListSupplier#get
            .onCacheMissResume(delegate.get().take(1))
            .andWriteWith((key, signals) -> Flux.fromIterable(signals).dematerialize().doOnNext(instances -> {
                Cache cache = cacheManager.getCache(SERVICE_INSTANCE_CACHE_NAME);
                if (cache == null) {
                    if (log.isErrorEnabled()) {
                        log.error("Unable to find cache for writing: " + SERVICE_INSTANCE_CACHE_NAME);
                    }
                }
                else {
                    cache.put(key, instances);
                }
            }).then());
}
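Stripped of the reactive plumbing, the constructor implements a read-through cache: look the key up, ask the delegate on a miss, then write the result back. The sketch below models that flow in plain Java. InstanceCache, its delegateCalls counter and the instance names are hypothetical and exist only for illustration; the real class works with Flux and a Spring CacheManager.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ReadThroughCacheDemo {

    /** Simplified, non-reactive model of the lookup / miss / write-back flow. */
    static final class InstanceCache {
        private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
        private final Function<String, List<String>> delegate;
        int delegateCalls; // counts how often the delegate was consulted

        InstanceCache(Function<String, List<String>> delegate) {
            this.delegate = delegate;
        }

        List<String> get(String serviceId) {
            List<String> cached = cache.get(serviceId);
            // A missing or empty entry counts as a miss, as in the original lookup
            if (cached != null && !cached.isEmpty()) {
                return cached;
            }
            delegateCalls++;
            List<String> fresh = delegate.apply(serviceId); // miss: ask the delegate
            cache.put(serviceId, fresh);                    // write the result back
            return fresh;
        }

        /** Dropping the entry forces the next get() to go back to the delegate. */
        void evict(String serviceId) {
            cache.remove(serviceId);
        }
    }

    public static void main(String[] args) {
        InstanceCache cache = new InstanceCache(id -> List.of(id + "-1", id + "-2"));
        cache.get("demo");
        cache.get("demo");   // served from the cache
        cache.evict("demo");
        cache.get("demo");   // delegate is asked again
        System.out.println("delegate calls: " + cache.delegateCalls);
    }
}
```

The evict method is the hook the fix relies on: once an entry is dropped, the next request falls through to the delegate and picks up the current instance list.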
ZookeeperServiceInstanceListSupplier
Fetches the service instances and filters them by status.
@Override
public Flux<List<ServiceInstance>> get() {
    return delegate.get().map(this::filteredByZookeeperStatusUp);
}

// Keeps the instances whose node data has payload.metadata.instance_status=UP
private List<ServiceInstance> filteredByZookeeperStatusUp(List<ServiceInstance> serviceInstances) {
    ArrayList<ServiceInstance> filteredInstances = new ArrayList<>();
    for (ServiceInstance serviceInstance : serviceInstances) {
        if (serviceInstance instanceof ZookeeperServiceInstance) {
            org.apache.curator.x.discovery.ServiceInstance<ZookeeperInstance> zookeeperServiceInstance = ((ZookeeperServiceInstance) serviceInstance)
                    .getServiceInstance();
            String instanceStatus = null;
            if (zookeeperServiceInstance.getPayload() != null
                    && zookeeperServiceInstance.getPayload().getMetadata() != null) {
                instanceStatus = zookeeperServiceInstance.getPayload().getMetadata()
                        .get(INSTANCE_STATUS_KEY);
            }
            if (!StringUtils.hasText(instanceStatus) // backwards compatibility
                    || instanceStatus.equalsIgnoreCase(STATUS_UP)) {
                filteredInstances.add(serviceInstance);
            }
        }
    }
    return filteredInstances;
}
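The filter rule is easy to misread: an instance is kept when its status is UP (case-insensitive) or when it has no status at all. A minimal sketch of the same rule in plain Java follows, with each instance reduced to an id and a metadata map. The instance ids and the OUT_OF_SERVICE value are illustrative assumptions; the key and the UP value mirror the instance_status=UP convention mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StatusFilterDemo {

    static final String INSTANCE_STATUS_KEY = "instance_status";
    static final String STATUS_UP = "UP";

    /** Returns the ids of the instances that would survive the status filter. */
    static List<String> filterUp(Map<String, Map<String, String>> metadataById) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : metadataById.entrySet()) {
            Map<String, String> metadata = entry.getValue();
            String status = metadata == null ? null : metadata.get(INSTANCE_STATUS_KEY);
            // A missing or blank status is kept for backwards compatibility
            if (status == null || status.isBlank() || status.equalsIgnoreCase(STATUS_UP)) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> instances = new LinkedHashMap<>();
        instances.put("a", Map.of(INSTANCE_STATUS_KEY, "UP"));
        instances.put("b", Map.of());                                      // no status
        instances.put("c", Map.of(INSTANCE_STATUS_KEY, "OUT_OF_SERVICE")); // filtered out
        instances.put("d", Map.of(INSTANCE_STATUS_KEY, "up"));             // case-insensitive
        System.out.println(filterUp(instances)); // prints [a, b, d]
    }
}
```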
DiscoveryClientServiceInstanceListSupplier
public DiscoveryClientServiceInstanceListSupplier(DiscoveryClient delegate, Environment environment) {
    this.serviceId = environment.getProperty(PROPERTY_NAME);
    resolveTimeout(environment);
    // On each subscription the delegate calls ZookeeperDiscoveryClient#getInstances
    // again to fetch the current service instances
    this.serviceInstances = Flux.defer(() -> Mono.fromCallable(() -> delegate.getInstances(serviceId)))
            .timeout(timeout, Flux.defer(() -> {
                logTimeout();
                return Flux.just(new ArrayList<>());
            }), Schedulers.boundedElastic()).onErrorResume(error -> {
                logException(error);
                return Flux.just(new ArrayList<>());
            });
}
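The constructor wraps the blocking lookup so that both a timeout and an error degrade to an empty instance list instead of failing the request. The same pattern can be sketched without Reactor, here with CompletableFuture as a stand-in for Flux.timeout and onErrorResume. The class name, the slowLookup helper and the address string are hypothetical.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class TimeoutFallbackDemo {

    /** Runs the lookup asynchronously; a timeout or an exception yields an empty list. */
    static List<String> getInstances(Supplier<List<String>> lookup, Duration timeout) {
        List<String> empty = List.of();
        return CompletableFuture.supplyAsync(lookup)
                .completeOnTimeout(empty, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(error -> empty)
                .join();
    }

    /** Hypothetical lookup that blocks for the given time before answering. */
    static Supplier<List<String>> slowLookup(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return List.of("10.0.0.1:8080");
        };
    }

    public static void main(String[] args) {
        // Answers in time: the instance list is returned
        System.out.println(getInstances(slowLookup(10), Duration.ofSeconds(2)));
        // Too slow: the fallback (an empty list) is returned instead
        System.out.println(getInstances(slowLookup(2_000), Duration.ofMillis(100)));
    }
}
```

An empty list is a deliberate choice: the caller sees "no instances available" rather than an exception, and the next call simply tries the registry again.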
ZookeeperDiscoveryClient
The Zookeeper implementation of DiscoveryClient. Service discovery fetches the instances as follows:
public List<org.springframework.cloud.client.ServiceInstance> getInstances(
        final String serviceId) {
    try {
        if (getServiceDiscovery() == null) {
            return Collections.EMPTY_LIST;
        }
        // Resolve the service id to use for the query
        String serviceIdToQuery = getServiceIdToQuery(serviceId);
        // Curator reads all child nodes under the service id node,
        // i.e. the service instance entries
        Collection<ServiceInstance<ZookeeperInstance>> zkInstances = getServiceDiscovery()
                .queryForInstances(serviceIdToQuery);
        List<org.springframework.cloud.client.ServiceInstance> instances = new ArrayList<>();
        // Convert to the Spring Cloud service instance type
        for (ServiceInstance<ZookeeperInstance> instance : zkInstances) {
            instances.add(createServiceInstance(serviceIdToQuery, instance));
        }
        return instances;
    }
    catch (KeeperException.NoNodeException e) {
        // code omitted
    }
    catch (Exception exception) {
        rethrowRuntimeException(exception);
    }
    return new ArrayList<>();
}
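Solution
In Spring Cloud Gateway, add a Zookeeper listener that reacts to nodes being added or removed, extracts the service id from the node path, and evicts that service's cache entry.
The defaultLoadBalancerCacheManager dependency is already instantiated in LoadBalancerCacheAutoConfiguration.DefaultLoadBalancerCacheManagerConfiguration, but that bean is declared with autowireCandidate = false, so it cannot be injected and has to be instantiated by the application itself.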
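import static org.apache.curator.framework.recipes.cache.TreeCacheEvent.Type.NODE_ADDED;
import static org.apache.curator.framework.recipes.cache.TreeCacheEvent.Type.NODE_REMOVED;
import static org.springframework.cloud.loadbalancer.core.CachingServiceInstanceListSupplier.SERVICE_INSTANCE_CACHE_NAME;

import java.util.Objects;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+
import lombok.extern.log4j.Log4j2;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.cache.TreeCache;
import org.apache.curator.framework.recipes.cache.TreeCacheListener;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.Cache;
import org.springframework.cloud.loadbalancer.cache.LoadBalancerCacheManager;
import org.springframework.stereotype.Component;

@Component
@Log4j2
public class DeployListener {

    private static final String path = "/services";

    @Autowired
    private CuratorFramework curatorClient;

    @Autowired
    private LoadBalancerCacheManager defaultLoadBalancerCacheManager;

    @PostConstruct
    public void init() {
        try {
            TreeCacheListener listener = (client, event) -> {
                if (NODE_REMOVED.equals(event.getType()) || NODE_ADDED.equals(event.getType())) {
                    String nodePath = event.getData().getPath();
                    // nodePath looks like /services/demo/ewqeqwewqewq;
                    // CommonConstant.SLASH is the project's own "/" constant
                    String[] segments = nodePath.split(CommonConstant.SLASH);
                    // The root node (/services) has no service segment, so skip it
                    if (segments.length > 2) {
                        String serviceName = segments[2];
                        Cache cache = defaultLoadBalancerCacheManager.getCache(SERVICE_INSTANCE_CACHE_NAME);
                        if (Objects.nonNull(cache)) {
                            cache.evict(serviceName);
                        }
                    }
                }
            };
            // The listener only needs paths, so node data is not cached
            TreeCache treeCache = TreeCache.newBuilder(curatorClient, path).setCacheData(false).build();
            treeCache.getListenable().addListener(listener);
            treeCache.start();
        } catch (Exception e) {
            throw new RuntimeException("Error when initializing deployListener", e);
        }
    }
}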
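The listener derives the service name from the node path by position, which only holds for instance-level nodes of the form /services/<service>/<instanceId>. A stricter parser could be sketched as follows. ServicePathParser and serviceNameOf are hypothetical names, and the rule that only instance-level nodes should trigger an eviction is an assumption of this sketch.

```java
import java.util.Optional;

public class ServicePathParser {

    /**
     * Returns the service name for an instance-level node path such as
     * /services/demo/abc123, or empty for any other path.
     */
    static Optional<String> serviceNameOf(String root, String nodePath) {
        if (nodePath == null || !nodePath.startsWith(root + "/")) {
            return Optional.empty(); // the root itself or an unrelated path
        }
        String[] parts = nodePath.substring(root.length() + 1).split("/");
        // Expect exactly <service>/<instanceId> below the root
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(parts[0]);
    }

    public static void main(String[] args) {
        System.out.println(serviceNameOf("/services", "/services/demo/abc123")); // Optional[demo]
        System.out.println(serviceNameOf("/services", "/services/demo"));        // Optional.empty
        System.out.println(serviceNameOf("/services", "/services"));             // Optional.empty
    }
}
```

Inside the listener the result could be used as serviceNameOf(path, nodePath).ifPresent(cache::evict), so events for the root or for service-level nodes are ignored rather than parsed by index.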