Is full-link tracing really this simple? Building a full-link tracing demo with bytebuddy (code included)

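Hi everyone, I'm 烤鸭:
    I have been looking into full-link tracing lately, for example cat, skywalking, and zipkin.
    skywalking turns out to be built on bytebuddy, so I wanted to try writing a demo myself.
    The demo's git address is below if you want to try it. The code runs in idea; other setups you will need to work out yourself (for example, running from cmd or on linux may throw NoClassDefFoundError).

Demo address (only HTTP call chains are implemented; add whatever else you need, such as interception for dubbo or other RPC mechanisms):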

https://gitee.com/fireduck_admin/link-trace-demo
    Environment:
    JDK 8


1.    Design goal


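    Monitor how long each endpoint (method) takes and how calls relate to one another across HTTP requests. In contrast to an AOP-based approach, zipkin and cat work by interception.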

2.  bytebuddy


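    There is not much material on bytebuddy online, but its API is fairly simple and easy to pick up by reading through it, so I won't introduce it here. See the official site for details.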
    https://bytebuddy.net/

3.  Endpoint timing: pseudocode walkthrough


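    To test a call chain we need an agent project and a demo project (used to forward requests).
    The agent project sets up the interception and holds the interception logic. The example here intercepts classes carrying Spring's @Service annotation. This code lives in the agent project.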

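import java.lang.instrument.Instrumentation;
import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.description.method.MethodDescription;
import net.bytebuddy.description.type.TypeDescription;
import net.bytebuddy.dynamic.DynamicType;
import net.bytebuddy.implementation.MethodDelegation;
import net.bytebuddy.matcher.ElementMatchers;
import net.bytebuddy.utility.JavaModule;

public static void premain(String agentArgs, Instrumentation inst) {
    System.out.println("==============Client=============== premain =============start============");

    // Four-argument transform as in the demo; newer Byte Buddy releases may expect an extra ProtectionDomain parameter.
    AgentBuilder.Transformer transformerService = new AgentBuilder.Transformer() {
        @Override
        public DynamicType.Builder<?> transform(DynamicType.Builder<?> builder, TypeDescription typeDescription, ClassLoader classLoader, JavaModule module) {
            return builder
                    .method(ElementMatchers.<MethodDescription>any()) // intercept every method
                    .intercept(MethodDelegation.to(MyServiceAdvice.class)); // delegate to the advice class
        }
    };
    // Intercept classes annotated with @Service (start from a fresh default builder)
    AgentBuilder agentBuilder = new AgentBuilder.Default()
            .type(ElementMatchers.isAnnotatedWith(ElementMatchers.named("org.springframework.stereotype.Service"))) // classes to intercept
            .transform(transformerService);
    // Install on the Instrumentation instance
    agentBuilder.installOn(inst);
    System.out.println("================Client============ premain ================finish===========");
}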

  The demo project needs a VM option configured in idea when it starts. If you are not sure how to set it, see the screenshot.

 -javaagent:\xx\xx\target\link-trace-demo-agent-1.0-SNAPSHOT.jar

The screenshot shows the interception result. With this in place, endpoint (method) call durations are recorded.
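
The delegate class MyServiceAdvice is not shown in this post. As a rough idea of the timing logic such a class might contain, here is a minimal sketch written against plain JDK types so that it runs without bytebuddy. The class name TimingAdvice, the method signature, and the LAST_ELAPSED_MS field are all hypothetical; in a real agent the two parameters would typically be supplied by Byte Buddy's @Origin and @SuperCall annotations on a method marked @RuntimeType.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the timing logic inside MyServiceAdvice (not the demo's actual code).
final class TimingAdvice {
    // Elapsed time of the latest call on this thread; kept only so the sketch is easy to inspect.
    static final ThreadLocal<Long> LAST_ELAPSED_MS = new ThreadLocal<>();

    // Assumption: in the agent, methodName would come from @Origin and original from @SuperCall.
    static Object intercept(String methodName, Callable<?> original) throws Exception {
        long start = System.nanoTime();
        try {
            return original.call(); // run the intercepted method body
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            LAST_ELAPSED_MS.set(elapsedMs);
            System.out.println(methodName + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = intercept("DemoService.hello", () -> {
            Thread.sleep(20); // simulate some work
            return "hello";
        });
        System.out.println(result);
    }
}
```

The try/finally means the duration is recorded even when the intercepted method throws, which is usually what you want for latency statistics.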

4.  Full-link tracing: pseudocode walkthrough


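    Ever since google published the dapper paper in 2010, later tracing systems have largely followed the same approach; mine is a simplified version.
    Intercepting the controller annotation in the agent works much like the service case above, so I won't paste that code.
    We need a span object here. The current request's information is recorded in the span (mainly who called you) and pushed onto a call stack held in a threadlocal, which binds the request to the current thread (this makes it easy to follow calls within a single service, such as controller calling service).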
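
To make the span-plus-threadlocal idea concrete, here is a minimal sketch. The field names and the Span and TraceContext classes are assumptions for illustration and are not taken from the demo repository.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.UUID;

// Hypothetical span: one intercepted call, plus who called it.
final class Span {
    final String traceId; // shared by every span of one request
    final String id;      // this span's own id
    final String pid;     // parent span id, null for a root span
    final String name;    // for example the intercepted method
    final long startMs = System.currentTimeMillis();

    Span(String traceId, String pid, String name) {
        this.traceId = traceId;
        this.id = UUID.randomUUID().toString();
        this.pid = pid;
        this.name = name;
    }
}

// Hypothetical per-thread call stack of spans.
final class TraceContext {
    private static final ThreadLocal<Deque<Span>> STACK = ThreadLocal.withInitial(ArrayDeque::new);

    // Called before an intercepted method runs: the new span's parent is whatever is on top of the stack.
    static Span enter(String name) {
        Deque<Span> stack = STACK.get();
        Span parent = stack.peek();
        Span span = (parent == null)
                ? new Span(UUID.randomUUID().toString(), null, name)
                : new Span(parent.traceId, parent.id, name);
        stack.push(span);
        return span;
    }

    // Called after the method returns; clears the ThreadLocal once the stack is empty.
    static Span exit() {
        Deque<Span> stack = STACK.get();
        Span span = stack.pop();
        if (stack.isEmpty()) {
            STACK.remove();
        }
        return span;
    }

    static Span current() {
        return STACK.get().peek();
    }
}
```

With this shape, a controller interceptor calls enter("controller"), the service interceptor on the same thread calls enter("service"), and the service span's pid ends up equal to the controller span's id. Removing the ThreadLocal when the stack empties matters on pooled threads, where a leftover stack would otherwise leak into the next request.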
    The screenshot shows the case where only web is intercepted: with no upstream information, a new span is generated with seq 1.

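    Next, simulate the chain information when an intercepted web endpoint calls another web method, as shown in the screenshot. pid (parentid) is the id of the previous hop, which makes it possible to recover the full information for the entire call chain (input and output parameters, time, method, and so on).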
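
One common way to carry the pid from one web service to the next is through HTTP request headers. The following is a minimal sketch of that idea using a plain Map in place of a real HTTP client or servlet request. The header names, the HopSpan and HeaderPropagation classes, and the rule that seq grows by one per hop are all assumptions; the demo may do this differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical span for one hop of a cross-service call.
final class HopSpan {
    final String traceId;
    final String id;
    final String pid; // id of the upstream span, null when there is none
    final int seq;    // assumption: 1 for the first hop, +1 for each further hop

    HopSpan(String traceId, String pid, int seq) {
        this.traceId = traceId;
        this.id = UUID.randomUUID().toString();
        this.pid = pid;
        this.seq = seq;
    }
}

final class HeaderPropagation {
    // Hypothetical header names, chosen for this sketch only.
    static final String TRACE_ID = "X-Trace-Id";
    static final String PARENT_ID = "X-Parent-Span-Id";
    static final String SEQ = "X-Span-Seq";

    // Caller side: copy the current span into the outbound request headers.
    static void inject(HopSpan current, Map<String, String> headers) {
        headers.put(TRACE_ID, current.traceId);
        headers.put(PARENT_ID, current.id);
        headers.put(SEQ, String.valueOf(current.seq));
    }

    // Callee side: continue the caller's trace, or start a new one with seq 1 when nothing came in.
    static HopSpan extract(Map<String, String> headers) {
        String traceId = headers.get(TRACE_ID);
        if (traceId == null) {
            return new HopSpan(UUID.randomUUID().toString(), null, 1);
        }
        int parentSeq = Integer.parseInt(headers.getOrDefault(SEQ, "1"));
        return new HopSpan(traceId, headers.get(PARENT_ID), parentSeq + 1);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        HopSpan first = extract(headers);  // no upstream information: new span, seq 1
        inject(first, headers);            // the first service calls the second
        HopSpan second = extract(headers); // the second service picks up the trace
        System.out.println(second.pid.equals(first.id) + " seq=" + second.seq);
    }
}
```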


    For longer chains, such as web-service-web or more services, try it yourself; the idea is the same.
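
Because every span points at its parent through pid, a chain of any length can be rebuilt from a flat list of collected spans. A minimal sketch of that reconstruction follows; SpanRecord and ChainPrinter are hypothetical names, and how the spans get collected in one place is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flat record of a finished span.
final class SpanRecord {
    final String id;
    final String pid; // null for the entry point of the chain
    final String name;

    SpanRecord(String id, String pid, String name) {
        this.id = id;
        this.pid = pid;
        this.name = name;
    }
}

final class ChainPrinter {
    // Turns an unordered list of spans into indented lines, parents before children.
    static List<String> render(List<SpanRecord> spans) {
        Map<String, List<SpanRecord>> childrenByPid = new HashMap<>();
        List<SpanRecord> roots = new ArrayList<>();
        for (SpanRecord s : spans) {
            if (s.pid == null) {
                roots.add(s);
            } else {
                childrenByPid.computeIfAbsent(s.pid, k -> new ArrayList<>()).add(s);
            }
        }
        List<String> lines = new ArrayList<>();
        for (SpanRecord root : roots) {
            walk(root, 0, childrenByPid, lines);
        }
        return lines;
    }

    private static void walk(SpanRecord span, int depth,
                             Map<String, List<SpanRecord>> childrenByPid, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  "); // two spaces per level of nesting
        }
        out.add(line.append(span.name).toString());
        List<SpanRecord> children = childrenByPid.getOrDefault(span.id, Collections.<SpanRecord>emptyList());
        for (SpanRecord child : children) {
            walk(child, depth + 1, childrenByPid, out);
        }
    }
}
```

For a web-service-web chain, the three spans come out as three lines, each indented one level deeper than its caller, regardless of the order in which they were reported.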

5.  AOP vs. agent


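    This only compares the AOP approach with bytebuddy. Under the hood, AOP uses a jdk proxy when the target has an interface and cglib (built on asm) when it does not. bytebuddy is also built on asm.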
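
For reference, the jdk proxy half of that statement can be sketched with java.lang.reflect.Proxy alone. This is a simplified illustration of the mechanism, not Spring's actual AOP code; GreetingService, GreetingServiceImpl, and TimingProxy are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical service interface and implementation.
interface GreetingService {
    String greet(String name);
}

final class GreetingServiceImpl implements GreetingService {
    @Override
    public String greet(String name) {
        return "hello " + name;
    }
}

final class TimingProxy {
    // Wraps target in a jdk dynamic proxy that times every call made through the interface.
    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args); // reflective call into the real object
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(method.getName() + " took " + micros + " us");
            }
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        GreetingService service = wrap(new GreetingServiceImpl(), GreetingService.class);
        System.out.println(service.greet("duck"));
    }
}
```

The proxy approach adds a wrapper object and a reflective call around the target, whereas an agent rewrites the class's bytecode at load time, which is one plausible reason the two can differ in overhead.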
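    First, the official comparison chart.

    As for whether it is actually faster, let me try. I commented out the tracing code first and simply called the endpoint. (The test methods are in the test package.)
    First, a single call: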


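    When calling the controller alone, bytebuddy is clearly faster, with almost no overhead.
    When calling web + service, the times are about the same.
    Below are the two cases called separately, averaged over 500 calls:
    Controller only:
    AOP: 5.7 ms, with most of the overhead in the first call.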

    [329, 12, 9, 6, 7, 5, 6, 5, 6, 7, 9, 8, 6, 5, 8, 8, 9, 22, 7, 11, 7, 5, 7, 6, 6, 6, 5, 6, 7, 10, 9, 9, 6, 8, 8, 13, 20, 10, 38, 7, 9, 8, 7, 10, 12, 9, 7, 7, 8, 9, 11, 11, 6, 7, 8, 5, 5, 5, 4, 5, 4, 3, 3, 4, 4, 4, 6, 4, 4, 4, 5, 6, 6, 7, 7, 6, 7, 5, 5, 8, 6, 7, 4, 7, 7, 6, 5, 5, 4, 4, 3, 4, 4, 4, 4, 5, 9, 4, 5, 5, 7, 4, 5, 11, 7, 7, 6, 9, 7, 22, 8, 14, 8, 4, 3, 4, 3, 3, 3, 4, 3, 4, 4, 5, 4, 5, 5, 11, 4, 4, 4, 4, 4, 7, 5, 8, 8, 7, 6, 6, 6, 7, 7, 4, 5, 3, 4, 4, 3, 4, 3, 4, 3, 4, 4, 5, 4, 4, 4, 4, 6, 10, 4, 6, 8, 7, 11, 12, 5, 8, 7, 6, 5, 4, 4, 5, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 8, 4, 5, 8, 6, 12, 4, 8, 5, 8, 6, 7, 6, 9, 3, 4, 6, 4, 3, 4, 3, 3, 3, 3, 3, 3, 4, 5, 5, 4, 3, 3, 3, 4, 4, 5, 4, 6, 6, 5, 4, 7, 4, 9, 4, 5, 6, 5, 8, 5, 6, 4, 4, 4, 3, 3, 4, 4, 2, 3, 3, 3, 2, 3, 3, 3, 4, 3, 4, 5, 4, 3, 3, 3, 4, 4, 5, 4, 5, 4, 7, 4, 6, 5, 5, 6, 3, 3, 4, 4, 5, 5, 5, 4, 4, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 3, 6, 4, 4, 4, 4, 4, 7, 3, 4, 6, 3, 5, 7, 5, 6, 4, 6, 7, 5, 4, 7, 6, 3, 3, 3, 4, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 4, 4, 3, 4, 3, 4, 5, 5, 4, 10, 12, 7, 7, 5, 10, 4, 6, 6, 5, 4, 6, 6, 5, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 13, 6, 11, 4, 6, 4, 4, 6, 4, 4, 5, 4, 4, 6, 4, 3, 4, 3, 4, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 2, 2, 2, 3, 2, 3, 4, 3, 4, 5, 4, 4, 6, 5, 4, 6, 4, 4, 6, 5, 4, 7, 3, 4, 4, 5, 3, 4, 4, 3, 3, 12, 5, 3, 3, 3, 4, 3, 2, 2, 2, 3, 2, 3, 3, 4, 3, 4, 4, 2, 3, 5, 4, 3, 7, 4, 4, 5, 3, 12, 21, 15, 16, 7, 2, 3, 2, 3, 2, 3, 3, 3, 3, 3, 4, 2, 2, 3, 3, 2, 2, 2, 2, 2]
    avg = OptionalDouble[5.706]
    Agent: 5.4 ms, with most of the overhead in the first call.
    [251, 9, 11, 12, 9, 9, 9, 9, 8, 11, 13, 6, 5, 5, 5, 6, 5, 5, 4, 6, 6, 6, 4, 10, 7, 6, 8, 7, 8, 6, 7, 7, 6, 6, 7, 6, 8, 5, 6, 5, 4, 4, 5, 5, 4, 4, 4, 4, 3, 4, 3, 5, 4, 3, 4, 4, 5, 5, 6, 5, 7, 6, 11, 10, 7, 9, 6, 5, 7, 5, 8, 6, 10, 7, 7, 7, 5, 5, 5, 5, 5, 5, 7, 5, 9, 19, 10, 7, 4, 3, 4, 4, 4, 3, 3, 4, 4, 6, 5, 4, 3, 5, 6, 4, 4, 5, 9, 10, 5, 3, 4, 5, 7, 4, 5, 10, 5, 8, 5, 11, 6, 6, 10, 4, 4, 5, 5, 7, 5, 4, 4, 7, 4, 4, 4, 5, 6, 4, 4, 5, 6, 6, 4, 6, 4, 4, 5, 8, 4, 5, 5, 4, 4, 5, 4, 4, 4, 7, 7, 10, 4, 4, 4, 3, 4, 3, 3, 4, 5, 4, 5, 8, 4, 5, 6, 7, 4, 4, 6, 5, 4, 7, 4, 4, 5, 5, 8, 5, 4, 6, 4, 9, 4, 5, 3, 3, 3, 4, 4, 3, 3, 4, 4, 4, 6, 5, 12, 6, 5, 5, 6, 8, 7, 14, 8, 5, 6, 6, 5, 5, 10, 5, 6, 5, 4, 4, 3, 4, 4, 4, 4, 5, 4, 6, 7, 4, 8, 5, 6, 4, 5, 7, 4, 6, 5, 5, 6, 3, 4, 3, 3, 4, 4, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 4, 4, 5, 3, 5, 7, 3, 4, 5, 4, 4, 7, 4, 5, 6, 5, 8, 4, 4, 5, 4, 3, 3, 3, 3, 4, 2, 3, 6, 4, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4, 5, 8, 5, 6, 5, 10, 3, 4, 14, 5, 7, 3, 7, 3, 3, 8, 3, 4, 5, 3, 4, 4, 3, 3, 4, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 6, 6, 6, 7, 5, 7, 5, 5, 4, 4, 4, 3, 5, 5, 6, 4, 6, 3, 2, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 5, 4, 3, 3, 4, 4, 5, 8, 4, 6, 3, 5, 5, 4, 6, 8, 5, 3, 3, 4, 6, 7, 5, 8, 4, 4, 4, 5, 6, 5, 4, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 4, 7, 3, 7, 4, 6, 6, 3, 5, 7, 4, 5, 3, 5, 5, 4, 4, 5, 3, 3, 2, 5, 3, 26, 17, 8, 20, 9, 10, 3, 3, 4, 6, 3, 5, 6, 3, 5, 5, 4, 4, 5, 4, 4, 5, 3, 3, 3, 3, 2, 3, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 3, 2]
    avg = OptionalDouble[5.438]

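    Conclusion: with only the cglib proxy in play (controller called alone), the times are about the same, and bytebuddy is slightly faster.
    Controller + Service:
    AOP: 6.2 ms, with most of the overhead in the first call.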

    [328, 14, 8, 12, 11, 9, 8, 7, 8, 8, 13, 6, 6, 6, 7, 7, 6, 6, 8, 5, 7, 8, 11, 11, 9, 8, 9, 5, 12, 8, 8, 5, 5, 8, 6, 9, 6, 8, 6, 4, 4, 5, 5, 4, 5, 7, 7, 11, 6, 9, 5, 8, 10, 5, 10, 6, 6, 6, 7, 6, 5, 4, 4, 4, 4, 3, 5, 4, 5, 5, 5, 5, 6, 7, 9, 5, 8, 6, 7, 7, 6, 4, 5, 7, 5, 8, 5, 5, 5, 4, 5, 4, 5, 4, 6, 9, 8, 9, 6, 6, 5, 4, 7, 9, 4, 4, 10, 8, 25, 30, 29, 4, 5, 6, 6, 7, 4, 6, 5, 5, 6, 6, 9, 6, 10, 12, 8, 13, 8, 6, 6, 4, 5, 4, 3, 5, 3, 3, 3, 4, 3, 3, 4, 3, 3, 3, 3, 4, 4, 4, 6, 5, 5, 6, 4, 6, 4, 4, 6, 5, 4, 5, 6, 6, 5, 5, 5, 4, 4, 4, 6, 12, 7, 5, 6, 5, 4, 5, 4, 6, 9, 8, 10, 4, 3, 6, 4, 4, 4, 4, 11, 8, 5, 4, 14, 6, 7, 5, 6, 6, 5, 5, 9, 4, 6, 8, 5, 5, 5, 6, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 2, 3, 4, 4, 6, 14, 8, 6, 6, 4, 4, 9, 4, 6, 5, 6, 4, 4, 5, 6, 4, 4, 3, 4, 4, 4, 3, 5, 4, 3, 3, 5, 15, 11, 11, 6, 9, 5, 6, 5, 5, 7, 5, 8, 6, 7, 5, 4, 5, 5, 7, 4, 4, 4, 5, 5, 4, 4, 5, 9, 4, 6, 5, 4, 7, 5, 7, 6, 9, 7, 5, 7, 5, 5, 5, 5, 6, 6, 5, 6, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 3, 5, 3, 5, 5, 8, 5, 5, 6, 8, 5, 5, 4, 4, 6, 4, 4, 5, 4, 4, 4, 3, 4, 3, 4, 4, 3, 3, 3, 3, 2, 3, 3, 3, 4, 4, 3, 7, 4, 4, 8, 4, 7, 4, 3, 8, 4, 4, 6, 5, 3, 7, 3, 4, 5, 3, 3, 2, 3, 2, 3, 3, 3, 2, 3, 3, 3, 4, 7, 4, 6, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 5, 4, 4, 4, 6, 4, 7, 4, 5, 6, 4, 4, 6, 5, 7, 4, 3, 6, 3, 3, 3, 3, 3, 2, 3, 2, 3, 3, 3, 3, 3, 3, 4, 3, 3, 2, 3, 4, 4, 4, 4, 4, 5, 4, 5, 3, 5, 4, 7, 4, 31, 5, 4, 5, 6, 5, 6, 5, 4, 5, 8, 4, 4, 4, 4, 4, 7, 4, 4, 4, 4, 4, 5, 4, 3, 8, 4, 32, 13, 37, 28, 5, 4, 3, 3, 3, 3, 2, 2, 3, 3, 3, 5, 6, 3, 3, 3, 4, 5, 3, 3, 3, 4]
    avg = OptionalDouble[6.186]
    Agent: 5.9 ms, with most of the overhead in the first call.
    [306, 16, 7, 9, 10, 11, 9, 7, 8, 11, 18, 11, 8, 9, 11, 9, 6, 6, 6, 11, 7, 8, 10, 9, 15, 9, 7, 12, 22, 8, 6, 6, 5, 5, 5, 5, 4, 4, 5, 5, 5, 4, 5, 4, 8, 7, 11, 15, 17, 8, 9, 5, 6, 8, 7, 6, 9, 5, 4, 5, 5, 4, 4, 5, 4, 5, 4, 5, 4, 4, 6, 6, 10, 6, 6, 8, 5, 8, 6, 10, 6, 9, 7, 7, 6, 8, 6, 5, 6, 5, 5, 6, 7, 7, 6, 8, 6, 10, 8, 9, 6, 4, 5, 8, 4, 5, 6, 5, 33, 19, 19, 6, 5, 4, 6, 5, 6, 5, 7, 6, 6, 7, 7, 5, 6, 6, 5, 8, 5, 4, 4, 5, 4, 3, 3, 4, 3, 4, 3, 3, 3, 3, 3, 3, 4, 3, 5, 4, 4, 6, 6, 5, 7, 4, 5, 6, 4, 6, 4, 7, 6, 3, 6, 3, 3, 3, 4, 3, 4, 3, 5, 3, 4, 5, 4, 4, 3, 3, 4, 3, 8, 4, 11, 5, 6, 6, 9, 4, 5, 7, 5, 16, 6, 4, 6, 4, 4, 5, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 6, 5, 6, 4, 5, 9, 5, 8, 4, 4, 5, 4, 8, 7, 8, 5, 5, 4, 5, 5, 7, 4, 6, 4, 4, 6, 4, 4, 4, 5, 4, 4, 5, 8, 18, 12, 5, 7, 5, 5, 5, 4, 4, 7, 5, 3, 3, 4, 3, 4, 3, 3, 3, 4, 3, 4, 3, 4, 3, 9, 4, 4, 4, 3, 3, 3, 5, 5, 4, 4, 5, 4, 5, 6, 4, 3, 4, 6, 4, 5, 7, 3, 5, 4, 4, 6, 4, 3, 4, 3, 3, 3, 3, 3, 2, 2, 3, 2, 3, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 7, 4, 5, 4, 4, 4, 4, 5, 4, 6, 4, 5, 7, 5, 6, 5, 3, 4, 4, 4, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 2, 3, 2, 2, 3, 3, 3, 2, 3, 4, 4, 3, 4, 2, 6, 4, 4, 5, 3, 3, 5, 4, 5, 7, 7, 5, 12, 4, 5, 4, 3, 4, 5, 5, 3, 4, 4, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 4, 4, 4, 6, 5, 6, 5, 5, 6, 4, 5, 6, 4, 7, 5, 5, 7, 6, 6, 4, 4, 5, 4, 5, 3, 3, 5, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 4, 4, 3, 4, 4, 5, 4, 4, 4, 5, 4, 6, 4, 4, 6, 4, 4, 7, 4, 4, 6, 2, 2, 5, 3, 4, 4, 3, 3, 18, 7, 9, 10, 21, 28, 20, 12, 5, 4, 4, 3, 3, 5, 2, 3, 3, 2, 3, 2, 2, 2, 3, 3, 3]
    avg = OptionalDouble[5.896]
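    Conclusion: with the cglib proxy plus the jdk proxy in play (controller + service), the times are about the same, and bytebuddy is slightly faster.

    So the results bear this out. Keep in mind that I only tested one simple endpoint; with a longer call chain I would expect the gap to become more noticeable.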
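
    Since the first call dominates every run above, it can help to look at the average both with and without the warm-up call. Here is a minimal sketch of that calculation; LatencyStats is a hypothetical helper and the sample values are made up for illustration, not taken from the measurements above.

```java
import java.util.Arrays;
import java.util.OptionalDouble;

// Hypothetical helper for summarizing per-call latencies in milliseconds.
final class LatencyStats {
    // Plain average over every sample, including the slow first call.
    static OptionalDouble average(long[] samplesMs) {
        return Arrays.stream(samplesMs).average();
    }

    // Average after dropping the first warmup samples; empty if nothing is left.
    static OptionalDouble averageSkippingWarmup(long[] samplesMs, int warmup) {
        return Arrays.stream(samplesMs).skip(warmup).average();
    }

    public static void main(String[] args) {
        long[] samples = {300, 10, 8, 6}; // illustrative values only
        System.out.println("avg = " + average(samples));                         // avg = OptionalDouble[81.0]
        System.out.println("avg (no warm-up) = " + averageSkippingWarmup(samples, 1)); // avg (no warm-up) = OptionalDouble[8.0]
    }
}
```

    With samples this small the difference is exaggerated, but the same effect applies to the 500-call runs: one call of a few hundred milliseconds shifts the mean noticeably, so the steady-state comparison is clearer once it is excluded.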

6.  Conclusion

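    There are already plenty of tracing frameworks, so pick the one that suits you. If your business has many custom requirements, building your own is also a reasonable option; as you have seen, writing one is not that complicated. If you go with an open-source framework, I recommend skywalking: its community and ecosystem are in good shape, it is easy to extend, and there are plenty of articles and docs online, so I won't go into it further.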

 

A few recommended articles on javaagent:
https://www.jianshu.com/p/5c62b71fd882

https://www.jianshu.com/p/b72f66da679f

https://www.jianshu.com/p/7b2072513819
