解决SkyWalking在gateway组件中logback日志链路丢失

解决SkyWalking在gateway组件中logback日志链路丢失

契机

吴晟大佬:Why ping me? The plugin is not written by me. I don’t know anything about agent running in Spring cloud.

一年半以前,公司就从单体应用逐步迁移到微服务管理。奈何业务绑定太深,一直在两边混合开发。由于迁移到微服务的进程完成了一大半了,之前在单体项目中打的链路追踪就不太好用了。随即想到了SkyWalking,随即发现了一个大坑,那就是在gateway组件中logback日志会丢失TraceId,网上搜了一圈都没有解决问题,我又跑到SkyWalking的issue中搜了一大圈,发现大佬以下回答,还是得自己想办法。

Untitled

场景复现

版本

  • spring-cloud-gateway-dependencies = 3.1.3
  • skywalking-oap-server:9.3.0
  • skywalking-ui:9.3.0
  • apache-skywalking-java-agent:9.0.0
  • 假设有gateway+user两个模块

skyWalking环境搭建

#先搞个elasticsearch 略

#docker搭建skywalking-oap-server
docker run -d \
--name skywalking-oap \
--privileged=true \
-p 11800:11800 \
-p 12800:12800 \
--restart always \
-e SW_CORE_RECORD_DATA_TTL=15 \
-e SW_CORE_METRICS_DATA_TTL=15 \
-e SW_ENABLE_UPDATE_UI_TEMPLATE="true" \
-e SW_HEALTH_CHECKER=default \
-e TZ=Asia/Shanghai \
-e SW_STORAGE=elasticsearch \
-e JAVA_OPTS="-Xms512m -Xmx1024m" \
-e SW_STORAGE_ES_CLUSTER_NODES=host.docker.internal:9200 \
-e SW_ES_PASSWORD=guest \
-e  SW_ES_USER=elastic [apache/skywalking-oap-server:9.3.0](http://registry.docker-cn.com/apache/skywalking-oap-server:9.3.0) 

#docker搭建skywalking-ui
docker run -d \
--name skywalking-ui \
--link skywalking-oap  \
--privileged=true \
-p 18080:8080  \
--restart always \
-e TZ=Asia/Shanghai \
-e SW_HEALTH_CHECKER=default \
-e SW_OAP_ADDRESS=http://host.docker.internal:12800 apache/skywalking-ui:9.3.0

#下载agent
wget [https://dlcdn.apache.org/skywalking/java-agent/9.0.0/apache-skywalking-java-agent-9.0.0.tgz](https://dlcdn.apache.org/skywalking/java-agent/9.0.0/apache-skywalking-java-agent-9.0.0.tgz)

agent目录配置

#进入skywalking-agent目录
cd skywalking-agent

#拷贝optional-plugins里面的必要插件到activations
apm-toolkit-logback-1.x-activation-9.0.0.jar
apm-toolkit-logging-common-9.0.0.jar
apm-toolkit-trace-activation-9.0.0.jar
apm-toolkit-webflux-activation-9.0.0.jar

项目运行配置

gateway和user模块都配上一下参数,注意SW_AGENT_NAME要变一下

#vm
-javaagent:/Users/y/Documents/development/skywalking-agent/skywalking-agent.jar
#env
MYSQL_HOST=127.0.0.1;SW_AGENT_NAME=gw;SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800;SW_AGENT_SPAN_LIMIT=2000

Untitled

Maven依赖

<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-trace</artifactId>
    <version>9.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-logback-1.x</artifactId>
    <version>9.0.0</version>
</dependency>

logback配置

<?xml version="1.0" encoding="UTF-8"?>
<configuration debug="false" scan="false">

	<!-- 彩色日志格式   -->
	<property name="CONSOLE_LOG_PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%highlight(%-5level)]  [%tid] [%thread] %cyan(%logger{50}:%L) - %msg%n"/>
	
  <!-- 彩色日志依赖的渲染类 -->
	<conversionRule conversionWord="clr" converterClass="org.springframework.boot.logging.logback.ColorConverter"/>
	<conversionRule conversionWord="wex" converterClass="org.springframework.boot.logging.logback.WhitespaceThrowableProxyConverter"/>
	<conversionRule conversionWord="wEx" converterClass="org.springframework.boot.logging.logback.ExtendedWhitespaceThrowableProxyConverter"/>

	<appender name="console" class="ch.qos.logback.core.ConsoleAppender">
		<encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
			<layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout"> 
				<pattern>${CONSOLE_LOG_PATTERN}</pattern>
			</layout>
		</encoder>
	</appender>

	<root level="INFO">
		<appender-ref ref="console"/>
	</root>

</configuration>

gateway模块请求日志打印代码

就是打印四条信息:请求开始,请求头,请求结束信息,请求结束返回

@Slf4j
@Component
public class ApiLogFilter implements GlobalFilter, Ordered {

    private static final String START_TIME = "startTime";

    private static final String X_REAL_IP = "X-Real-IP";

    private final DataBuffer emptyBuffer = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT).allocateBuffer(0);

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {

        //请求的基本信息
        List<String> ips = exchange.getRequest().getHeaders().get(X_REAL_IP);
        String ip = ips != null ? ips.get(0) : null;
        String method = exchange.getRequest().getMethodValue();
        String path = exchange.getRequest().getURI().getPath();

        exchange.getAttributes().put(START_TIME, System.currentTimeMillis());
        ServerHttpRequest request = exchange.getRequest();

        if (path.contains("api-docs")) {
            log.warn("{}正在请求接口文档{}", ip, path);
            return chain.filter(exchange.mutate().request(request).build());
        }

        //打印请求开始日志
        log.info("请求开始:\n\n方法:{}\nHost:{}\nip:{}\n请求接口:{}\n", method, exchange.getRequest().getURI().getHost(), ip,
            path);

        //打印header头
        log.info("请求头:\n\n{}\n", JSONUtil.toJsonStr(exchange.getRequest().getHeaders()));

        //包装response,在任意地方返回打印结束信息等
        ServerHttpResponse originalResponse = exchange.getResponse();
        DataBufferFactory bufferFactory = originalResponse.bufferFactory();
        ServerHttpResponseDecorator decoratedResponse = new ServerHttpResponseDecorator(originalResponse) {
            @Override
            @NotNull
            public Mono<Void> writeWith(@NotNull Publisher<? extends DataBuffer> body) {
                Long startTime = exchange.getAttribute(START_TIME);
                if (startTime != null) {
                    Long executeTime = (System.currentTimeMillis() - startTime);
                    int code = 500;
                    if (exchange.getResponse().getStatusCode() != null) {
                        code = exchange.getResponse().getStatusCode().value();
                    }
                    log.info("请求结束信息:\n\n响应状态码:{}\n请求耗时:{}ms\n", code, executeTime);
                }

                if (body instanceof Flux) {
                    Flux<? extends DataBuffer> fluxBody = Flux.from(body);
                    return super.writeWith(fluxBody.map(dataBuffer -> {
                        byte[] content = new byte[dataBuffer.readableByteCount()];
                        dataBuffer.read(content);
                        DataBufferUtils.release(dataBuffer);
                        String data = new String(content, StandardCharsets.UTF_8);//data
                        log.info("请求结束返回: \n\n{}\n", data);
                        return bufferFactory.wrap(content);
                    }));
                } else {
                    log.error("<--- {} 响应code异常", getStatusCode());
                }
                MDC.clear();
                return super.writeWith(body);
            }
        };
        //打印请求参数
        MultiValueMap<String, String> queryParams = request.getQueryParams();
        if (!queryParams.isEmpty()) {
            log.info("请求参数(url):\n\n{}\n", queryParams);
        }

        if (HttpMethod.POST.name().equals(method)) {
            return DataBufferUtils.join(request.getBody()).defaultIfEmpty(emptyBuffer).flatMap(dataBuffer -> {
                byte[] bytes = new byte[dataBuffer.readableByteCount()];
                if (PrimitiveArrayUtil.isNotEmpty(bytes)) {
                    dataBuffer.read(bytes);
                    String bodyString = new String(bytes, StandardCharsets.UTF_8);
                    try {
                        bodyString = UriUtils.decode(bodyString, "UTF-8");
                    } catch (Exception e) {
                        log.error("请求参数编码错误有特殊符号", e);
                    }
                    log.info("请求参数(form):\n\n{}\n", bodyString);
                    exchange.getAttributes().put("POST_BODY", bodyString);
                    DataBufferUtils.release(dataBuffer);
                    Flux<DataBuffer> cachedFlux = Flux.defer(() -> {
                        DataBuffer buffer = exchange.getResponse().bufferFactory().wrap(bytes);
                        return Mono.just(buffer);
                    });
                    ServerHttpRequest mutatedRequest = new ServerHttpRequestDecorator(request) {
                        @Override
                        public Flux<DataBuffer> getBody() {
                            return cachedFlux;
                        }
                    };
                    return chain.filter(exchange.mutate().request(mutatedRequest).response(decoratedResponse).build());
                } else {
                    return chain.filter(exchange.mutate().request(request).response(decoratedResponse).build());
                }
            });
        }
        return chain.filter(exchange.mutate().request(request).response(decoratedResponse).build());
    }

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE;
    }

}

user模块测试代码


@Slf4j
@RestController
@AllArgsConstructor
@Tag(name = "Test-测试通用")
@RequestMapping("/user/xxx/test")
public class TestController {

    @NoLoginRequired
    @Operation(summary = "testSkywalking")
    @GetMapping(value = "/testSkywalking")
    public R<String> testSkywalking() {

        log.info("user-testSkywalking");

        return R.ok();

    }
}
 

访问测试

通过网关访问testSkywalking测试结果如下

gateway日志

[TID:N/A] [reactor-http-nio-4] ApiLogFilter:77 - 请求开始:

方法:GET
Host:localhost
ip:null
请求接口:/user/xxx/test/testSkywalking

[TID:N/A] [reactor-http-nio-4] com.tommi.pet.gateway.filter.ApiLogFilter:81 - 请求头:

{"Host":["localhost:9999"]}

[TID:N/A] [reactor-http-nio-6] com.tommi.pet.gateway.filter.ApiLogFilter:101 - 请求结束信息:

响应状态码:200
请求耗时:1062ms

[TID:N/A] [reactor-http-nio-6] com.tommi.pet.gateway.filter.ApiLogFilter:111 - 请求结束返回: 

{"code":0,"message":null,"data":"1","success":true,"timestamp":1}

user日志

[TID:xx.17057720223290001] [XNIO-1 task-1] TestController:54 - user-testSkywalking

可以明显看到user模块的tid是正常输出的,可是gateway的日志死活都是N/A

尝试解决

分析问题

首先去去看看哪里输出的这个追踪码,没有找到9.0.0的源码,只找到8.7.0的源码,先搞下来研究研究:https://github.com/apache/skywalking/blob/v8.7.0/。然后定位到logback插件,去寻找tid在哪里输出的,最终找到如下:

apm-sniffer/apm-toolkit-activation/apm-toolkit-logback-1.x-activation/src/main/java/org/apache/skywalking/apm/toolkit/activation/log/logback/v1/x/PrintTraceIdInterceptor.java


public class PrintTraceIdInterceptor implements InstanceMethodsAroundInterceptor {

    @Override
    public void beforeMethod(EnhancedInstance objInst, Method method, Object[] allArguments, Class<?>[] argumentsTypes,
        MethodInterceptResult result) throws Throwable {

    }

    @Override
    public Object afterMethod(EnhancedInstance objInst, Method method, Object[] allArguments, Class<?>[] argumentsTypes,
        Object ret) throws Throwable {
        if (!ContextManager.isActive()) {
            if (allArguments[0] instanceof EnhancedInstance) {
                SkyWalkingContext skyWalkingContext = (SkyWalkingContext) ((EnhancedInstance) allArguments[0]).getSkyWalkingDynamicField();
                if (skyWalkingContext != null) {
                    return "TID:" + skyWalkingContext.getTraceId();
                }
            }
        }
        return "TID:" + ContextManager.getGlobalTraceId();
    }

    @Override
    public void handleMethodException(EnhancedInstance objInst, Method method, Object[] allArguments,
        Class<?>[] argumentsTypes, Throwable t) {

    }
}

观察代码无非就是skyWalkingContext.getTraceId()和ContextManager.xx输出的N/A。

Reator模式又有点陌生,无从下手。无意间从另外一篇博客中发现,其实exchange中是有这个tid的

Untitled

曲线救国

gateway也就拦截器打印点日志,要不然就是exceptionHandler中打印。我要不然直接把这个值传给skywalking-logback插件就完了嘛。拿什么传呢,对的,使用MDC来传,同一个线程都可以传,并且改动最小。直接修改插件中PrintTraceIdInterceptor的源码如下,然后直接打包这个8.7的skywalking-logback插件替换掉原本apm-toolkit-logback-1.x-activation-9.0.0.jar插件(完全可用)

@Override
    public Object afterMethod(EnhancedInstance objInst, Method method, Object[] allArguments, Class<?>[] argumentsTypes,
        Object ret) throws Throwable {

        String result;

        if (!ContextManager.isActive()) {
            if (allArguments[0] instanceof EnhancedInstance) {
                SkyWalkingContext skyWalkingContext = (SkyWalkingContext) ((EnhancedInstance) allArguments[0]).getSkyWalkingDynamicField();
                if (skyWalkingContext != null) {
                    result = "TID:" + skyWalkingContext.getTraceId();
                    return result;
                } else {
                    String tid = MDC.get("tid");
                    if (tid != null) {
                        return "TID:" + tid;
                    }
                }
            }
        }
        return "TID:" + ContextManager.getGlobalTraceId();
    }

然后再搞一个工具类


@Slf4j
public class SkywalkingUtil {

    /**
     * tid放入MDC
     *
     * @param exchange
     */
    public static void putTidIntoMdc(ServerWebExchange exchange) {
        try {
            Object entrySpanInstance = exchange.getAttributes().get("SKYWALKING_SPAN");
            if (ObjectUtil.isEmpty(entrySpanInstance)) {
                return;
            }
            Class<?> entrySpanClazz = entrySpanInstance.getClass().getSuperclass().getSuperclass();
            Field field = entrySpanClazz.getDeclaredField("owner");
            field.setAccessible(true);
            Object ownerInstance = field.get(entrySpanInstance);
            Class<?> ownerClazz = ownerInstance.getClass();
            Method getTraceId = ownerClazz.getMethod("getReadablePrimaryTraceId");
            String traceId = (String) getTraceId.invoke(ownerInstance);
            MDC.put("tid", traceId);
        } catch (Exception e) {
            log.error("gateway追踪码获取失败", e);
        }
    }

}

然后在gateway拦截器日志打印的地方加上以下代码,但是请求开始和请求结束是两个不同的线程啊,无法使用MDC来传递,架不住打印日志的地方都使用的一个exchange,当然为了保险你可以在每一个gatway模块的log之前加上以下语句


//打印开始日志的地方
SkywalkingUtil.putTidIntoMdc(exchange);

......

//打印结束日志的地方
SkywalkingUtil.putTidIntoMdc(exchange);

验证

gateway日志

[TID:xx.17057741377320011] [reactor-http-nio-2] ApiLogFilter:77 - 请求开始:

方法:GET
Host:localhost
ip:null
请求接口:/user/user/pet/test/testSkywalking

[TID:xx.17057741377320011] [reactor-http-nio-2] ApiLogFilter:81 - 请求头:

{"Host":["localhost:9999"]}

[TID:xx.17057741377320011] [reactor-http-nio-7] ApiLogFilter:101 - 请求结束信息:

响应状态码:200
请求耗时:25ms

[TID:xx.17057741377320011] [reactor-http-nio-7] ApiLogFilter:111 - 请求结束返回: 

{"code":0,"message":null,"data":"1748769599551352832","success":true,"timestamp":1705774137758,"extraResult":""}

user日志

[TID:xx.17057741377320011] [XNIO-1 task-1] TestController:54 - user-testSkywalking

确实可用,完美解决

总结

  • 就是把apm-toolkit-logback-1.x-activation插件改了点代码替换掉
  • 然后在gatway模块的日志前面用MDC传了点值进到插件里面
  • 相当取巧,因为gatway也没啥日志
  • 目前看来没什么问题

写到最后

Untitled

  • 55
    点赞
  • 40
    收藏
    觉得还不错? 一键收藏
  • 4
    评论
在 Spring Cloud Gateway ,可以使用 SkyWalking 提供的插件来实现链路追踪。具体步骤如下: 1. 添加 SkyWalking 插件的依赖: ```xml <dependency> <groupId>org.apache.skywalking</groupId> <artifactId>apm-toolkit-trace</artifactId> <version>${skywalking.version}</version> </dependency> <dependency> <groupId>org.apache.skywalking</groupId> <artifactId>apm-toolkit-opentracing</artifactId> <version>${skywalking.version}</version> </dependency> <dependency> <groupId>org.apache.skywalking</groupId> <artifactId>apm-toolkit-log4j-1.x</artifactId> <version>${skywalking.version}</version> </dependency> ``` 其 `${skywalking.version}` 是 SkyWalking 版本号。 2. 创建一个 `GlobalFilter`,在该过滤器添加 SkyWalking 的 `TracingContext`: ```java @Component public class SkyWalkingFilter implements GlobalFilter, Ordered { @Override public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) { Tracer tracer = TracingContext.Tracing().tracer(); Span span = tracer.createEntrySpan(exchange.getRequest().getURI().getPath()); try { // 在这里处理逻辑 return chain.filter(exchange); } finally { span.asyncFinish(); } } @Override public int getOrder() { return Ordered.LOWEST_PRECEDENCE; } } ``` 在过滤器使用 `TracingContext.Tracing().tracer()` 获取当前线程的 `Tracer`,并使用 `tracer.createEntrySpan()` 创建一个入口 Span。在请求处理完成后,需要在 `finally` 块调用 `span.asyncFinish()` 来结束该 Span。 3. 在 `application.yml` 配置 SkyWalking: ```yaml skywalking: trace: ignore-path: /actuator/** ignored-tracers: org.springframework.cloud.gateway.filter.factory.rewrite.CachedBodyOutputMessage ``` 其 `ignore-path` 表示不进行链路追踪的路径,`ignored-tracers` 表示需要忽略的 Tracer。 4. 启动 SkyWalking Agent,就可以在 SkyWalking 的控制台看到链路追踪信息了。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值