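Introduction
Zipkin is a distributed tracing system. It collects the timing data needed to troubleshoot latency problems in service architectures, which makes it very effective for pinpointing where a request slows down or fails. Zipkin consists of a server and clients: the server provides a UI for browsing traces, and the clients are the individual services being traced.
Zipkin official site: https://zipkin.io/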
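To make "timing data" more concrete, here is a minimal sketch of what a single span could look like in Zipkin's v2 JSON format. The class name `SpanSketch` and all values are hypothetical; in the setup described in this post the spans are created and reported by Spring Cloud Sleuth, not by hand.

```java
import java.util.Locale;

// Illustrative only: a hand-built Zipkin v2 span. Real services normally let an
// instrumentation library (for example Spring Cloud Sleuth) create and report spans.
final class SpanSketch {

    /**
     * Formats one span as Zipkin v2 JSON.
     * Assumes the string arguments contain no characters that need JSON escaping.
     * timestampMicros is the start time in epoch microseconds,
     * durationMicros is the elapsed time in microseconds.
     */
    static String toJson(String traceId, String spanId, String name,
                         String serviceName, long timestampMicros, long durationMicros) {
        return String.format(Locale.ROOT,
            "{\"traceId\":\"%s\",\"id\":\"%s\",\"name\":\"%s\","
                + "\"timestamp\":%d,\"duration\":%d,"
                + "\"localEndpoint\":{\"serviceName\":\"%s\"}}",
            traceId, spanId, name, timestampMicros, durationMicros, serviceName);
    }
}
```

The Zipkin server accepts a JSON array of such spans via POST on /api/v2/spans, which is what the client libraries do for you in the background.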
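Installation and running
1. The official site offers three ways to run Zipkin: Docker, the executable jar, and building from source.
(1) Run with Docker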
docker run -d -p 9411:9411 openzipkin/zipkin
(2) Run the executable jar
curl -sSL https://zipkin.io/quickstart.sh | bash -s
java -jar zipkin.jar >> out.log &
(3) Run from source
# get the latest source
git clone https://github.com/openzipkin/zipkin
cd zipkin
# Build the server and also make its dependencies
./mvnw -DskipTests --also-make -pl zipkin-server clean install
# Run the server
java -jar ./zipkin-server/target/zipkin-server-*exec.jar
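Note: the official docs recommend Docker, and so do I, because it is simple and convenient. The examples below use Docker.
Running Zipkin with Docker
1. Pull the image
# Pull a specific version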
docker pull openzipkin/zipkin:2.21.7
# Pull the latest version
docker pull openzipkin/zipkin
# List local images
docker images
2. Start the container
# --restart=always restarts the container automatically whenever the Docker daemon starts
docker run -d --restart=always -p 9411:9411 openzipkin/zipkin
3. Open the Zipkin page at http://<server-ip>:9411/ (use 127.0.0.1 when running locally).
If the page loads, the Zipkin server is up. Next, create a few microservices to test it.
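Besides opening the page in a browser, the server can also be checked from code. The following is a minimal sketch that calls the Zipkin server's /health endpoint with the JDK's built-in HTTP client; it assumes Java 11 or later, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Illustrative health probe for a Zipkin server (names are made up for this sketch).
final class ZipkinHealth {

    /** Builds the health-check URL for the given host and port. */
    static URI healthUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/health");
    }

    /** Returns true when the server answers the health check with HTTP 200. */
    static boolean isUp(String host, int port) {
        HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();
        HttpRequest request = HttpRequest.newBuilder(healthUri(host, port))
            .timeout(Duration.ofSeconds(2))
            .GET()
            .build();
        try {
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            // Connection refused, timeout, and similar failures count as "not up".
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With the container from step 2 running locally, `ZipkinHealth.isUp("127.0.0.1", 9411)` would be expected to return true.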
Building the microservices
1. This test uses Spring Cloud. The pom for the feign-client service is as follows:
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-openfeign</artifactId>
        <version>2.1.0.RELEASE</version>
    </dependency>
    <!-- Enable the monitoring endpoints -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- Hystrix dashboard for monitoring circuit-breaker state; used together with actuator above -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
    </dependency>
    <!-- The Hystrix bundled with Feign is not a starter dependency -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        <version>2.1.0.RELEASE</version>
    </dependency>
    <!-- Eureka client -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        <version>2.1.0.RELEASE</version>
    </dependency>
    <!-- Starter that lets this service be traced by Zipkin -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-zipkin</artifactId>
    </dependency>
</dependencies>
Configuration file for the feign-client service:
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka/
server:
  port: 8762
spring:
  application:
    name: feign-client
  zipkin:
    base-url: http://192.168.32.193:9411
  sleuth:
    sampler:
      probability: 1.0
feign:
  hystrix:
    enabled: true
  httpclient:
    connection-timeout: 12000
    connection-timer-repeat: 12000
  client:
    config:
      default:
        connectTimeout: 12000
        readTimeout: 12000
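The sampler probability of 1.0 in the configuration above means that every request is traced, which is convenient for a demo. As a rough illustration of what a probability sampler does, here is a minimal sketch; it is not Sleuth's actual implementation, and the class name and the bucket count of 10,000 are assumptions made for this example.

```java
// Illustrative probability sampler: keeps roughly `probability` of all traces.
// This is NOT Spring Cloud Sleuth's code, only a sketch of the idea.
final class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000L;
    private final long threshold;

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        // Number of buckets (out of 10,000) whose traces are kept.
        this.threshold = Math.round(probability * BUCKETS);
    }

    /**
     * Decides from the trace ID alone, so the same trace ID always
     * yields the same decision.
     */
    boolean isSampled(long traceId) {
        long bucket = Math.floorMod(traceId, BUCKETS); // always in [0, 9999]
        return bucket < threshold;
    }
}
```

With probability 1.0 every bucket is below the threshold, so nothing is dropped; in production a lower value is commonly chosen to reduce overhead.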
2. The pom for the user-service service
<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        <version>2.1.0.RELEASE</version>
    </dependency>
    <!-- Starter that lets this service be traced by Zipkin -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-zipkin</artifactId>
    </dependency>
    <!-- Web application -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
Configuration file for the user-service service:
server:
  port: 8763
eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
spring:
  application:
    name: user-service
  sleuth:
    sampler:
      probability: 1.0
  zipkin:
    base-url: http://192.168.32.193:9411
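Note: you may wonder why the two services have different poms and configuration files. The reason is that feign-client calls user-service, and to make the effect visible in the traces I added a delay to the user-service endpoint; without the Feign-related configuration the call would fail with an error. A gateway service is optional.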
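Why the delay and the timeout settings interact can be shown with a plain-Java analogy: a caller that waits only a limited time for a reply gives up when the callee is slower than that limit. This is a sketch of the general idea, not of Feign's or Hystrix's internals; the class name and return values are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-Java analogy for a read timeout on a slow downstream call.
final class TimeoutSketch {

    /**
     * Simulates a remote call that takes delayMillis to answer.
     * Returns "hi" when the answer arrives within readTimeoutMillis,
     * otherwise "timeout".
     */
    static String call(long delayMillis, long readTimeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> reply = executor.submit(() -> {
                Thread.sleep(delayMillis); // stands in for the artificial delay in user-service
                return "hi";
            });
            return reply.get(readTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow(); // also interrupts a call that is still sleeping
        }
    }
}
```

In this analogy, raising the timeout above the delay (as the 12000 ms values in the feign-client configuration are meant to do) lets the slow call complete instead of failing.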
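3. Write the test classes and start the services
// UserController.java (in feign-client): exposes /user/hi and delegates to UserService
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/user")
public class UserController {
    @Autowired
    private UserService userService;

    @GetMapping("/hi")
    public String hi(@RequestParam("name") String name) {
        return userService.sayHi(name);
    }
}

// UserService.java (in feign-client): forwards the call to user-service through
// UserFeignClient, the Feign interface for user-service (not shown here)
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class UserService {
    @Autowired
    private UserFeignClient userFeignClient;

    public String sayHi(String name) {
        return userFeignClient.sayHi(name);
    }
}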
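4. Test
Start the services and open http://localhost:5000/feign-client/user/hi?name=ibai
Once the request succeeds, refresh it a few times, then go back to the Zipkin page.
The serviceName dropdown on the search page now lists our microservice names, and the Dependencies page shows the call relationships between the services.
The trace view shows the full lifecycle of each request and how long each step took.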
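The timings in the trace view come from two numbers that each span carries: a start timestamp and a duration, both in microseconds. As a minimal sketch of how an overall request time can be derived from them, consider the following; the `Span` class here is a simplified stand-in defined for this example, not Zipkin's own model class.

```java
import java.util.List;

// Illustrative calculation of a trace's total duration from its spans.
final class TraceTimingSketch {

    /** Simplified span: a name, a start time (epoch micros), and a duration (micros). */
    static final class Span {
        final String name;
        final long timestampMicros;
        final long durationMicros;

        Span(String name, long timestampMicros, long durationMicros) {
            this.name = name;
            this.timestampMicros = timestampMicros;
            this.durationMicros = durationMicros;
        }
    }

    /** Total trace time: latest span end minus earliest span start, in microseconds. */
    static long traceDurationMicros(List<Span> spans) {
        if (spans.isEmpty()) {
            throw new IllegalArgumentException("a trace needs at least one span");
        }
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        for (Span span : spans) {
            start = Math.min(start, span.timestampMicros);
            end = Math.max(end, span.timestampMicros + span.durationMicros);
        }
        return end - start;
    }
}
```

For a request that enters feign-client and then calls user-service, the user-service span typically starts later and ends earlier than the feign-client span, so the total equals the duration of the outermost span.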
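Now stop user-service and send the request again.
The request fails; look at the trace.
The failed call is shown in red, and the span details state the error type.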
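Zipkin persistence
Persistence options
1. The start command above has a problem: when the container restarts, the trace data is lost, because by default it is stored in the server's memory. Zipkin supports persistent storage backends such as MySQL and Elasticsearch (Cassandra is also supported).
MySQL persistence
1. In my testing the database had to be MySQL 5.7: Zipkin reported an error at startup against MySQL 8.0, and what I found when searching says 5.7 is required. Because I already had 8.0 installed, I ran a 5.7 instance on port 3309 (the port mapped on the virtual machine):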
docker run --name zipkin -d --restart=always -p 9411:9411 -e STORAGE_TYPE=mysql -e MYSQL_HOST=172.17.0.1 -e MYSQL_TCP_PORT=3309 -e MYSQL_DB=zipkin -e MYSQL_USER=root -e MYSQL_PASS=123456 openzipkin/zipkin
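MYSQL_HOST is the IP of the database server. Both my database and Zipkin run in Docker. You can use your virtual machine's IP here, but unless you have configured a static IP it may change when the VM restarts and reconnects to the network; run ip addr to look up the current address.
In my tests the 172.17.0.1 address used above never changed, no matter how often I restarted or switched hotspot connections. That is consistent with 172.17.0.1 being the default gateway address of Docker's default bridge network (docker0), through which containers can reach ports published on the host; if anyone knows more, please leave a comment.
2. Create the database tables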
--
-- Copyright 2015-2019 The OpenZipkin Authors
--
-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
-- in compliance with the License. You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software distributed under the License
-- is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
-- or implied. See the License for the specific language governing permissions and limitations under
-- the License.
--
CREATE TABLE IF NOT EXISTS zipkin_spans (
`trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
`trace_id` BIGINT NOT NULL,
`id` BIGINT NOT NULL,
`name` VARCHAR(255) NOT NULL,
`remote_service_name` VARCHAR(255),
`parent_id` BIGINT,
`debug` BIT(1),
`start_ts` BIGINT COMMENT 'Span.timestamp(): epoch micros used for endTs query and to implement TTL',
`duration` BIGINT COMMENT 'Span.duration(): micros used for minDuration and maxDuration query',
PRIMARY KEY (`trace_id_high`, `trace_id`, `id`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;
ALTER TABLE zipkin_spans ADD INDEX(`trace_id_high`, `trace_id`) COMMENT 'for getTracesByIds';
ALTER TABLE zipkin_spans ADD INDEX(`name`) COMMENT 'for getTraces and getSpanNames';
ALTER TABLE zipkin_spans ADD INDEX(`remote_service_name`) COMMENT 'for getTraces and getRemoteServiceNames';
ALTER TABLE zipkin_spans ADD INDEX(`start_ts`) COMMENT 'for getTraces ordering and range';
CREATE TABLE IF NOT EXISTS zipkin_annotations (
`trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
`trace_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.trace_id',
`span_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.id',
`a_key` VARCHAR(255) NOT NULL COMMENT 'BinaryAnnotation.key or Annotation.value if type == -1',
`a_value` BLOB COMMENT 'BinaryAnnotation.value(), which must be smaller than 64KB',
`a_type` INT NOT NULL COMMENT 'BinaryAnnotation.type() or -1 if Annotation',
`a_timestamp` BIGINT COMMENT 'Used to implement TTL; Annotation.timestamp or zipkin_spans.timestamp',
`endpoint_ipv4` INT COMMENT 'Null when Binary/Annotation.endpoint is null',
`endpoint_ipv6` BINARY(16) COMMENT 'Null when Binary/Annotation.endpoint is null, or no IPv6 address',
`endpoint_port` SMALLINT COMMENT 'Null when Binary/Annotation.endpoint is null',
`endpoint_service_name` VARCHAR(255) COMMENT 'Null when Binary/Annotation.endpoint is null'
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;
ALTER TABLE zipkin_annotations ADD UNIQUE KEY(`trace_id_high`, `trace_id`, `span_id`, `a_key`, `a_timestamp`) COMMENT 'Ignore insert on duplicate';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id_high`, `trace_id`, `span_id`) COMMENT 'for joining with zipkin_spans';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id_high`, `trace_id`) COMMENT 'for getTraces/ByIds';
ALTER TABLE zipkin_annotations ADD INDEX(`endpoint_service_name`) COMMENT 'for getTraces and getServiceNames';
ALTER TABLE zipkin_annotations ADD INDEX(`a_type`) COMMENT 'for getTraces and autocomplete values';
ALTER TABLE zipkin_annotations ADD INDEX(`a_key`) COMMENT 'for getTraces and autocomplete values';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id`, `span_id`, `a_key`) COMMENT 'for dependencies job';
CREATE TABLE IF NOT EXISTS zipkin_dependencies (
`day` DATE NOT NULL,
`parent` VARCHAR(255) NOT NULL,
`child` VARCHAR(255) NOT NULL,
`call_count` BIGINT,
`error_count` BIGINT,
PRIMARY KEY (`day`, `parent`, `child`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;
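The SQL comes from the Zipkin source code; have a look at the repository if you are interested.
GitHub: https://github.com/openzipkin/zipkin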
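One detail of the schema that is easy to miss: trace IDs appear as hex strings in the UI, but the tables store them in the signed BIGINT columns trace_id_high and trace_id. The following minimal sketch shows how such a conversion can be done in Java; the class and method names are hypothetical and this is not the storage module's own code.

```java
// Illustrative conversion between a hex trace ID and the two BIGINT columns.
final class TraceIdColumns {

    /**
     * Returns {trace_id_high, trace_id} for a 16- or 32-character hex trace ID.
     * A 64-bit (16-character) ID yields trace_id_high = 0.
     */
    static long[] fromHex(String traceId) {
        int length = traceId.length();
        if (length != 16 && length != 32) {
            throw new IllegalArgumentException("expected 16 or 32 hex characters");
        }
        long high = length == 32
            ? Long.parseUnsignedLong(traceId.substring(0, 16), 16)
            : 0L;
        long low = Long.parseUnsignedLong(traceId.substring(length - 16), 16);
        return new long[] {high, low};
    }

    /** Rebuilds the lower-hex trace ID from the two column values. */
    static String toHex(long high, long low) {
        String lowHex = String.format("%016x", low);
        return high == 0L ? lowHex : String.format("%016x", high) + lowHex;
    }
}
```

Because the columns are signed, an ID whose top bit is set shows up as a negative number in MySQL; converting back with `toHex` gives the value shown in the Zipkin UI.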
Send the request a few more times.
Restart the Zipkin container and open the page again: the data is still there.
docker restart d805ff90461b
Summary
1. The above was tested with microservices and Docker. Corrections and suggestions are welcome.