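Overview
Problems facing distributed systems: an application in a complex distributed architecture has dozens of dependencies, and each of them will inevitably fail at some point.
Service avalanche: one failing dependency drags down its callers in turn, and the high availability of the whole service is destroyed.
Hystrix: an open-source library for handling latency and fault tolerance in distributed systems. In a distributed system many dependency calls will inevitably fail (timeouts, exceptions, and so on). Hystrix ensures that a problem in one dependency does not bring down the whole service, which avoids cascading failures and improves the resilience of the system.
"Circuit breaker": returns to the caller an expected, manageable alternative response (fallback) instead of making it wait for a long time or throwing an exception it cannot handle.
Uses: service degradation, circuit breaking, near-real-time monitoring, rate limiting, isolation, ...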
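Key Hystrix concepts
Service degradation (fallback): when the server is busy, do not keep the client waiting; return a friendly message such as "Server busy, please try again later" right away.
- What triggers degradation:
- A runtime exception in the program
- A timeout
- An open circuit breaker
- A full thread pool or an exhausted semaphore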
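The first trigger in the list above (a runtime exception) can be sketched in plain Java. This is only an illustration of the idea, not how Hystrix is implemented; the class FallbackSketch and its methods are hypothetical names invented for this example.

```java
import java.util.function.Function;

// Hypothetical helper (not part of Hystrix): run a call and degrade to a
// fallback when the primary call throws a runtime exception.
final class FallbackSketch {

    static <T, R> R callWithFallback(T arg, Function<T, R> primary, Function<T, R> fallback) {
        try {
            return primary.apply(arg);
        } catch (RuntimeException e) {
            // Degrade: answer immediately instead of propagating the failure.
            return fallback.apply(arg);
        }
    }

    // Stand-in for a business method that fails on bad input.
    static String paymentInfo(Integer id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must not be negative");
        }
        return "payment ok, id: " + id;
    }

    // Stand-in for the fallback method: same parameter, friendly message.
    static String paymentInfoFallback(Integer id) {
        return "service busy, please try again later, id: " + id;
    }

    public static void main(String[] args) {
        System.out.println(callWithFallback(1, FallbackSketch::paymentInfo, FallbackSketch::paymentInfoFallback));
        System.out.println(callWithFallback(-1, FallbackSketch::paymentInfo, FallbackSketch::paymentInfoFallback));
    }
}
```

The fallback takes the same argument as the primary call, which mirrors the way a fallback method is written next to the business method later in these notes.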
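Circuit breaking (break): works like a fuse. When the maximum load is reached, access is refused outright (the switch is pulled), and the fallback method is then called to return a friendly message.
- Service degradation -> circuit opens -> call chain is restored
Rate limiting (flowlimit): for flash sales and other high-concurrency operations, requests must not all rush in at once; they queue up and are processed in order, N per second.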
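The "N per second" idea can be sketched as a fixed-window counter. This is a simplified assumption for illustration: it rejects excess requests instead of queueing them, and it is not how Hystrix or Sentinel implement limiting. The class FixedWindowLimiter and the injected clock are hypothetical.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window limiter: at most permitsPerSecond calls are
// admitted in each one-second window; the rest are refused.
final class FixedWindowLimiter {
    private final int permitsPerSecond;
    private final LongSupplier clockMillis; // injected so the sketch can be tested without sleeping
    private long windowStart;
    private int used;

    FixedWindowLimiter(int permitsPerSecond, LongSupplier clockMillis) {
        this.permitsPerSecond = permitsPerSecond;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= 1000) {
            // A new one-second window begins: reset the counter.
            windowStart = now;
            used = 0;
        }
        if (used < permitsPerSecond) {
            used++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(2, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            System.out.println("request " + i + " admitted: " + limiter.tryAcquire());
        }
    }
}
```

A caller that receives false would typically return the degraded response rather than block.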
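Hystrix example
I. Build: first configure 7001 as a standalone server by changing application.yml to defaultZone: http://eureka7001.com:7001/eureka/
(1) Create cloud-provider-hystrix-payment8001; pom.xml (add to the dependencies of the existing 8001 module)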
<!-- hystrix: service degradation -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
(2)application.yml
server:
  port: 8001
spring:
  application:
    name: cloud-provider-hystrix-payment
eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka
(3)PaymentHystrixMain8001.java
package com.jiao.springcloud;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
/**
* @author jyl
* @create 2021-2-2
*/
@SpringBootApplication
@EnableEurekaClient
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class,args);
    }
}
(4) Business classes: service and controller
PaymentService
package com.jiao.springcloud.service;
import org.springframework.stereotype.Service;
import java.util.concurrent.TimeUnit;
@Service
public class PaymentService {
    // Normal access: OK
    public String paymentInfo_OK(Integer id){
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_OK, id: "+id+" succeeded ^_^";
    }
    // Delay to simulate a fault
    public String paymentInfo_TimeOut(Integer id){
        try {
            TimeUnit.SECONDS.sleep(3);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_TimeOut, id: "+id+" timed out ^_^ 3 seconds";
    }
}
PaymentController:
package com.jiao.springcloud.controller;
import com.jiao.springcloud.service.PaymentService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import javax.annotation.Resource;
// @RestController writes the returned String to the response body;
// a plain @Controller would try to resolve it as a view name.
@RestController
@Slf4j
public class PaymentController {
    @Resource
    private PaymentService paymentService;
    @Value("${server.port}")
    private String serverPort;
    @GetMapping(value = "/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id){
        String result = paymentService.paymentInfo_OK(id);
        log.info("*****result:"+result);
        return result;
    }
    @GetMapping(value = "/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        String result = paymentService.paymentInfo_TimeOut(id);
        log.info("*****result:"+result);
        return result;
    }
}
(5) Test (start 7001 and the Hystrix 8001):
http://localhost:8001/payment/hystrix/ok/2
http://localhost:8001/payment/hystrix/timeout/2
II. High-concurrency test
JMeter: 20,000 concurrent requests against the timeout endpoint
Create the cloud-consumer-feign-hystrix-order80 module
(1)pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud2020</artifactId>
        <groupId>com.jiao.springcloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>
    <artifactId>cloud-consumer-feign-hystrix-order80</artifactId>
    <dependencies>
        <!-- openfeign -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-openfeign</artifactId>
        </dependency>
        <!-- hystrix: service degradation -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!-- eureka-client -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <!-- our own shared api package, which provides the Payment entity -->
        <dependency>
            <groupId>com.jiao.springcloud</groupId>
            <artifactId>cloud-api-commons</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <scope>compile</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
(2)application.yml
server:
  port: 80
eureka:
  client:
    register-with-eureka: false
    #fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka/
(3)OrderHystrixMain80.java
package com.jiao.springcloud;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.openfeign.EnableFeignClients;
/**
* @author jyl
* @create 2021-2-28
*/
@SpringBootApplication
@EnableFeignClients
public class OrderHystrixMain80 {
    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class,args);
    }
}
(4)PaymentHystrixService.java
package com.jiao.springcloud.service;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
@Component
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT")
public interface PaymentHystrixService {
    @GetMapping(value = "/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id);
    @GetMapping(value = "/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id);
}
(5)OrderHystrixController.java
package com.jiao.springcloud.controller;
import com.jiao.springcloud.service.PaymentHystrixService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import javax.annotation.Resource;
@RestController
//@Slf4j
public class OrderHystrixController {
    @Resource
    private PaymentHystrixService paymentHystrixService;
    @GetMapping(value = "/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Integer id){
        String result = paymentHystrixService.paymentInfo_OK(id);
        return result;
    }
    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }
}
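(6) Test
III. Symptoms and causes
Other endpoints on the same tier of 8001 are starved, because the worker threads in the Tomcat thread pool have all been taken.
When 80 calls 8001 at this point, the client gets a slow response and the page keeps spinning.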
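The starvation described above can be reproduced in miniature with a small bounded pool. The sketch below is an assumption-laden illustration, not Tomcat's or Hystrix's code: the class BulkheadSketch is hypothetical, and it uses a pool without a waiting queue so that a saturated pool refuses new work at once and the caller gets a fallback instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical bulkhead: a fixed number of workers and no waiting line.
final class BulkheadSketch {
    private final ThreadPoolExecutor pool;

    BulkheadSketch(int workers) {
        // SynchronousQueue holds no tasks, so a full pool rejects immediately.
        this.pool = new ThreadPoolExecutor(workers, workers, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>());
    }

    // Runs the call on the pool, or answers with the fallback when every worker is busy.
    Future<String> submit(Callable<String> call, String fallback) {
        try {
            return pool.submit(call);
        } catch (RejectedExecutionException e) {
            return CompletableFuture.completedFuture(fallback);
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        BulkheadSketch bulkhead = new BulkheadSketch(2);
        CountDownLatch release = new CountDownLatch(1);
        Callable<String> slow = () -> { release.await(); return "slow done"; };
        bulkhead.submit(slow, "busy");
        bulkhead.submit(slow, "busy");
        // Both workers are occupied by slow calls, so this one is degraded.
        System.out.println(bulkhead.submit(() -> "fast done", "busy").get());
        release.countDown();
        bulkhead.shutdown();
    }
}
```

With an unbounded queue in place of the SynchronousQueue, the third call would wait behind the slow ones, which is the "spinning" behaviour seen in the test.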
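IV. Conclusion
Problems like these are the reason degradation, fault-tolerance, and rate-limiting techniques came about.
V. How to solve it? Requirements
(1) Timeouts make the server slow (spinning) --- stop waiting once the timeout is reached
(2) Errors (an outage or a runtime error) --- errors need a fallback
(3) Solutions
- The other service (8001) has timed out: the caller (80) must not hang waiting; it needs service degradation
- The other service (8001) is down: the caller (80) must not hang waiting; it needs service degradation
- The other service (8001) is fine, but the caller (80) has failed itself or has its own requirement (its own wait time is shorter than the provider's): the caller handles the degradation itself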
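The last case above (the caller is willing to wait less time than the provider takes) can be sketched with a timed wait on a future. This is only a plain-Java illustration under assumed names and durations; TimeoutFallbackSketch is hypothetical and is not the mechanism Hystrix uses internally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical caller-side guard: wait at most timeoutMillis, then degrade.
final class TimeoutFallbackSketch {

    static String callWithTimeout(Supplier<String> remoteCall, long timeoutMillis, String fallback) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Too slow or failed: answer with the fallback.
            // Note that the slow call itself keeps running in the background.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    public static void main(String[] args) {
        Supplier<String> slowProvider = () -> {
            try {
                Thread.sleep(400); // assumed provider latency for the demo
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "provider answer";
        };
        System.out.println(callWithTimeout(slowProvider, 50, "consumer fallback"));
        System.out.println(callWithTimeout(() -> "provider answer", 2000, "consumer fallback"));
    }
}
```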
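VI. Service degradation
- Degradation configuration
@HystrixCommand
- 8001 looks for problems in itself first
Set an upper bound on the call's own timeout. Within the bound the call runs normally; beyond it, a fallback method must take over as the service degradation.
- 8001 fallback
(1) Enable in the business class: @HystrixCommand defines what happens after a failure
Once the service method call fails and throws an error, the method named by fallbackMethod in the @HystrixCommand annotation, defined in the same class, is called automatically.
service\PaymentService.java in cloud-provider-hystrix-payment8001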
...
    // Delay to simulate a fault: the 5-second sleep exceeds the 3000 ms timeout, so the fallback runs
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler",commandProperties = {
            @HystrixProperty(name="execution.isolation.thread.timeoutInMilliseconds",value = "3000")
    })
    public String paymentInfo_TimeOut(Integer id){
        try {
            TimeUnit.SECONDS.sleep(5);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_TimeOut, id: "+id+" timed out ^_^ 5 seconds";
    }
    public String paymentInfo_TimeOutHandler(Integer id){
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_TimeOutHandler, id: "+id+" 8001fallback";
    }
...
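(2) Activate in the main class: add the new annotation @EnableCircuitBreaker
- 80 fallback
The 80 order microservice can protect itself better too, by adding client-side degradation in the same way.
Aside: the hot deployment we configured picks up changes to Java code, but after changing attributes inside @HystrixCommand it is advisable to restart the microservice.
YML (in cloud-consumer-feign-hystrix-order80)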
...
feign:
  hystrix:
    enabled: true
Main class: @EnableHystrix
Business class (OrderHystrixController.java)
...
    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler",commandProperties = {
            @HystrixProperty(name="execution.isolation.thread.timeoutInMilliseconds",value = "1500")
    })
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }
    public String paymentInfo_TimeOutHandler(Integer id){
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_TimeOutHandler, id: "+id+" 80fallback";
    }
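- Current problems
Every business method has its own fallback method, which bloats the code; shared and custom fallbacks should be kept separate.
- Solutions
(1) One fallback per method? Code bloat:
For the Feign interface family: @DefaultProperties(defaultFallback="")
Port 80: OrderHystrixController.java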
...
@DefaultProperties(defaultFallback = "payment_Global_FallbackMethod")
public class OrderHystrixController {
    ...
    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
    /*@HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler",commandProperties = {
            @HystrixProperty(name="execution.isolation.thread.timeoutInMilliseconds",value = "1500")
    })*/
    @HystrixCommand
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }
    public String paymentInfo_TimeOutHandler(Integer id){
        return "Thread pool: "+Thread.currentThread().getName()+" paymentInfo_TimeOutHandler, id: "+id+" 80fallback";
    }
    // Global fallback
    public String payment_Global_FallbackMethod(){
        return "Global exception handler: please try again later.";
    }
}
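(2) Mixed in with the business logic? Messy:
Service degradation: the client calls the server and finds it down or shut off
In this example the degradation is handled entirely in client 80 and has nothing to do with server 8001
Adding a fallback implementation class for the interface defined by the Feign client is enough to decouple the two
Failures we will have to face: runtime exceptions, timeouts, outages
Look again at our business class PaymentController.java: [figure]
Modify cloud-consumer-feign-hystrix-order80:
Based on the PaymentHystrixService.java interface that 80 already has, create a new class (PaymentFallbackService) that implements it and handles failures for all of the interface's methods in one place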
package com.jiao.springcloud.service;
import org.springframework.stereotype.Component;
@Component
public class PaymentFallbackService implements PaymentHystrixService{
    @Override
    public String paymentInfo_OK(Integer id) {
        return "-----PaymentFallbackService fall back-paymentInfo_ok.";
    }
    @Override
    public String paymentInfo_TimeOut(Integer id) {
        return "-----PaymentFallbackService fall back-paymentInfo_timeout.";
    }
}
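PaymentFallbackService implements PaymentHystrixService
YML: feign: hystrix: enabled: true
PaymentHystrixService interface: point @FeignClient at the fallback class with fallback = PaymentFallbackService.class
Test: start 7001, 8001, and 80. The consumer's ok endpoint works; shut down 8001 and call it again, and the request lands in the methods of PaymentFallbackService.java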
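The decoupling idea above (a second implementation of the same interface takes over when the real call fails) can be sketched without Spring by using a JDK dynamic proxy. This is not how OpenFeign wires its fallback; PaymentApiSketch and FallbackProxySketch are hypothetical names used only for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for the Feign interface in these notes.
interface PaymentApiSketch {
    String paymentInfoOk(Integer id);
}

// Hypothetical wiring: every call goes to the primary implementation first,
// and to the fallback implementation of the same interface if the primary throws.
final class FallbackProxySketch {

    static <T> T withFallback(Class<T> api, T primary, T fallback) {
        Object proxy = Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[]{api},
                (self, method, args) -> {
                    try {
                        return method.invoke(primary, args);
                    } catch (InvocationTargetException e) {
                        // The primary threw: route the same call to the fallback.
                        return method.invoke(fallback, args);
                    }
                });
        return api.cast(proxy);
    }

    public static void main(String[] args) {
        PaymentApiSketch down = id -> { throw new IllegalStateException("provider is down"); };
        PaymentApiSketch fallback = id -> "fallback for id " + id;
        PaymentApiSketch client = withFallback(PaymentApiSketch.class, down, fallback);
        System.out.println(client.paymentInfoOk(7));
    }
}
```

The business code only ever sees the interface, so the fallback logic stays out of the controller, which is the point of the fallback class in this section.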
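VII. Circuit breaking
Circuit breaker: https://martinfowler.com/bliki/CircuitBreaker.html
A link-protection mechanism for microservices against the avalanche effect: when a microservice on a fan-out link fails, becomes unavailable, or responds too slowly, the call is degraded and an error response is returned. Once the service is healthy again, the call chain is restored.
Circuit-breaking annotation: @HystrixCommand
Hands-on: modify cloud-provider-hystrix-payment8001
(1) pom (in the api module)
<!-- IdUtil from the hutool toolkit -->
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>4.5.15</version>
</dependency>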
(2)PaymentService
...
    //------------- Circuit breaking: trips once at least 10 requests fall in the statistics window and 60% of them fail; probes again after a 10-second sleep window
    @HystrixCommand(fallbackMethod = "paymentCircuitBreaker_fallback",commandProperties = {
            @HystrixProperty(name = "circuitBreaker.enabled",value = "true"),// whether the circuit breaker is enabled
            @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold",value = "10"),// minimum number of requests
            @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds",value = "10000"),// sleep window before a probe
            @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage",value = "60")// failure rate at which the breaker trips
    })
    public String paymentCircuitBreaker(Integer id){
        if (id <0 ){
            throw new RuntimeException("********id must not be negative.");
        }
        String serialNumber = IdUtil.simpleUUID(); //UUID.randomUUID().toString();
        return Thread.currentThread().getName()+"\t"+"call succeeded, serial number: "+serialNumber;
    }
    public String paymentCircuitBreaker_fallback(Integer id){
        return "id must not be negative, please try again later~~~";
    }
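1. circuitBreaker.enabled
Whether the circuit breaker is enabled. Default: true.
2. circuitBreaker.forceOpen
Forces the circuit breaker open; it stays open permanently. Default: false.
3. circuitBreaker.forceClosed
Forces the circuit breaker closed; it stays closed permanently. Default: false.
4. circuitBreaker.errorThresholdPercentage
Sets the error percentage threshold. Default: 50%. For example, if 100 requests arrive within a period (10s) and 55 of them time out or return an exception, the error percentage for that period is 55%, which exceeds the default 50%, so the circuit breaker opens.
5. circuitBreaker.requestVolumeThreshold
Default: 20. At least 20 requests are needed before the errorThresholdPercentage check is applied. For example, if 19 requests arrive within a period (10s) and all of them fail, the error percentage is 100%, but the circuit breaker does not open because requestVolumeThreshold is 20.
6. circuitBreaker.sleepWindowInMilliseconds
Sleep time before the half-open probe. Default: 5000 ms. After the breaker has been open for this long (for example 5000 ms), it lets some traffic through as a probe to find out whether the dependency has recovered.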
————————————————
Author: RogerXue12345
Original link: https://blog.csdn.net/rogerxue12345/article/details/105559871
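The two worked examples in the property list (55 failures out of 100, and 19 failures out of 19) come down to a short calculation. The sketch below only restates that arithmetic; TripDecisionSketch is a hypothetical helper, and the real library evaluates rolling-window statistics rather than two plain integers.

```java
// Hypothetical helper that mirrors the examples in the property list above.
final class TripDecisionSketch {

    static boolean shouldOpen(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests < requestVolumeThreshold) {
            // Too few requests in the window: the error percentage is not evaluated.
            return false;
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // Defaults from the list: threshold 20 requests, 50% errors.
        System.out.println(shouldOpen(100, 55, 20, 50));
        System.out.println(shouldOpen(19, 19, 20, 50));
    }
}
```

With the values configured in PaymentService (10 requests, 60%), six failures out of ten requests in the window would be enough for shouldOpen to return true.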
(3) PaymentController
...
    //------ Circuit breaking
    @GetMapping(value = "/consumer/circuit/{id}")
    public String paymentCircuitBreaker(@PathVariable("id") Integer id){
        String result = paymentService.paymentCircuitBreaker(id);
        //log.info("****result:"+result);
        return result;
    }
(4) Test (http://localhost:8001/consumer/circuit/1)
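How it works (summary)
Breaker closed -> breaker open -> breaker half-open -> closed again if the probe succeeds (open again if it fails)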
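The three breaker states can be sketched as a small state machine. This is a simplified model under stated assumptions, not Hystrix's implementation: it opens after a fixed number of consecutive failures instead of an error percentage, and the class CircuitBreakerSketch and its injected clock are hypothetical.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified circuit breaker with the three states from the summary.
final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures before opening (simplification)
    private final long sleepWindowMillis;   // how long to stay open before probing
    private final LongSupplier clockMillis; // injected so the sketch can be tested without sleeping
    private State state = State.CLOSED;
    private int failures;
    private long openedAt;

    CircuitBreakerSketch(int failureThreshold, long sleepWindowMillis, LongSupplier clockMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clockMillis = clockMillis;
    }

    // Callers ask before each request; false means "go straight to the fallback".
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clockMillis.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // sleep window elapsed: let probe traffic through
        }
        return state != State.OPEN;
    }

    synchronized void recordSuccess() {
        failures = 0;
        state = State.CLOSED; // a successful probe (or normal call) closes the breaker
    }

    synchronized void recordFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN; // trip, or re-trip after a failed probe
            openedAt = clockMillis.getAsLong();
        }
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(3, 10_000L, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            breaker.recordFailure();
        }
        System.out.println(breaker.state() + ", request allowed: " + breaker.allowRequest());
    }
}
```

This matches the behaviour seen when testing the endpoint: after enough calls with a negative id, even a valid id gets the fallback for a while, and normal responses return once a probe succeeds.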
VIII. Rate limiting
Covered in the advanced part, under Alibaba's Sentinel
Hystrix workflow
https://github.com/Netflix/Hystrix/wiki/How-it-Works
Service monitoring: HystrixDashboard
Near-real-time call monitoring
If application.yml does not turn into the green-leaf icon:
Dashboard 9001: (create the cloud-consumer-hystrix-dashboard9001 module)
(1)pom
<!-- graphical Hystrix dashboard -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
</dependency>
(2)application
server:
  port: 9001
(3)HystrixDashboardMain9001
package com.jiao.springcloud;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;
/**
* @author jyl
* @create 2021-3-4
*/
@SpringBootApplication
@EnableHystrixDashboard
public class HystrixDashboardMain9001 {
    public static void main(String[] args) {
        SpringApplication.run(HystrixDashboardMain9001.class,args);
    }
}
(4) Every provider microservice to be monitored (8001/8002/8003) needs the actuator dependency!
(5) Test: http://localhost:9001/hystrix
Hands-on: 9001 monitors 8001:
(1) pom of 8001
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
(2) Main class of 8001
...
    /*
    This configuration exists only for service monitoring and has nothing to do with fault tolerance itself.
    It works around a pitfall introduced by a Spring Cloud upgrade: Spring Boot's default path is not "/hystrix.stream",
    so register the servlet below through ServletRegistrationBean in your own project.
    */
    @Bean
    public ServletRegistrationBean getServlet(){
        HystrixMetricsStreamServlet streamServlet = new HystrixMetricsStreamServlet();
        ServletRegistrationBean registrationBean = new ServletRegistrationBean(streamServlet);
        registrationBean.setLoadOnStartup(1);
        registrationBean.addUrlMappings("/hystrix.stream");
        registrationBean.setName("HystrixMetricsStreamServlet");
        return registrationBean;
    }
(3) Test: start 7001, 9001, and 8001
1) On http://localhost:9001/hystrix, enter http://localhost:8001/hystrix.stream to open the monitor
2) Request http://localhost:8001/consumer/circuit/2 and http://localhost:8001/consumer/circuit/-2 several times
3) Result: [figure]