SkyWalking in Practice: Monitoring Agent Error Logs

The source code in this post is based on skywalking-agent version 8.9.0.

Background

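Because the skywalking-agent client is now deployed as a sidecar, every agent update is rolled out to all projects at once. Many of these business services carry heavy request volume and complex logic, so we cannot test every business flow each time the client is updated. We worried that a client update might break the test environment, or even production. If we catch a problem quickly we can still remediate it; if we cannot detect it ourselves and the business teams have to come to us, the incident becomes much more serious. So we asked: is there a mechanism that lets us detect skywalking-agent problems proactively? Would it work to raise an alert whenever skywalking-agent logs an error?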

How we got there

Our first idea was to record a Prometheus metric directly in the error method of AbstractLogger. Sample code:

@Override
    public void error(String message, Throwable throwable) {
        if (this.isErrorEnable()) {
            // Count the error log. Tags must be key/value pairs, so none are passed here.
            Metrics.counter("skywalking-agent_error_log").increment();
            this.logger(LogLevel.ERROR, message, throwable);
        }
    }
    

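This implementation ran fine in local tests but failed in the container environment. We read the error as Prometheus registration being triggered before the Tomcat container had finished initializing. The log itself reports that io.micrometer.prometheus.PrometheusConfig.requireValid() could not be found and asks for a single, compatible version of that class on the classpath.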


12/09 14:28:19 {"instant":{"epochSecond":1670567299,"nanoOfSecond":713000000},"thread":"main","level":"ERROR","loggerName":"org.springframework.boot.diagnostics.LoggingFailureAnalysisReporter","message":"\n\n***************************\nAPPLICATION FAILED TO START\n***************************\n\nDescription:\n\nAn attempt was made to call a method that does not exist. The attempt was made from the following location:\n\n    io.micrometer.prometheus.PrometheusMeterRegistry.<init>(PrometheusMeterRegistry.java:67)\n\nThe following method did not exist:\n\n    io.micrometer.prometheus.PrometheusConfig.requireValid()V\n\nThe method's class, io.micrometer.prometheus.PrometheusConfig, is available from the following locations:\n\n    jar:file:/www/root-spring-boot.jar!/BOOT-INF/lib/micrometer-registry-prometheus-1.6.3.jar!/io/micrometer/prometheus/PrometheusConfig.class\n\nThe class hierarchy was loaded from the following locations:\n\n    io.micrometer.prometheus.PrometheusConfig: jar:file:/www/root-spring-boot.jar!/BOOT-INF/lib/micrometer-registry-prometheus-1.6.3.jar!/\n\n\nAction:\n\nCorrect the classpath of your application so that it contains a single, compatible version of io.micrometer.prometheus.PrometheusConfig\n","endOfBatch":false,"loggerFqcn":"org.apache.commons.logging.LogAdapter$Log4jLog","skyWalkingDynamicField":{"traceId":"N/A"},"threadId":1,"threadPriority":5,"requestId":"${ctx:requestId}","traceId":"${ctx:traceId}"}
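When a log like the one above says a method "did not exist", it can help to check at runtime where a class was actually loaded from and whether it exposes the method in question. The following is a minimal diagnostic sketch using only JDK reflection; ClasspathProbe and its method names are illustrative and are not part of SkyWalking or Micrometer.

```java
import java.security.CodeSource;
import java.util.Arrays;

/** Illustrative helper for inspecting a loaded class; the names are hypothetical. */
final class ClasspathProbe {
    private ClasspathProbe() { }

    /** Returns the jar or directory the class came from, or "<bootstrap>" for JDK core classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<bootstrap>";
        }
        return source.getLocation().toString();
    }

    /** True if the class, or one of its supertypes, exposes a public method with this name. */
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        return Arrays.stream(type.getMethods())
                .anyMatch(m -> m.getName().equals(methodName));
    }
}
```

Called with the class named in the failure report, locationOf shows which jar won on the classpath and hasPublicMethod shows whether that version carries the expected method.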

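Our next thought was to register with Prometheus only after Tomcat had been instantiated, but that is not a sound solution either, because some projects may not use Tomcat as their container.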

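The approach we settled on: the error method of AbstractLogger does no Prometheus instrumentation itself and only counts error logs in an AtomicLong. We then intercept the point where the Prometheus registry is registered and expose the error-log count as a metric there.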
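The idea can be sketched without any metrics library: the logging path only bumps a counter, and the counter is handed to the metrics backend later, once a registry exists. In this sketch GaugeSink is a hypothetical stand-in for the real registry (it is not Micrometer's API), and AgentErrorStats and InMemorySink are illustrative names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a metrics registry; not Micrometer's API. */
interface GaugeSink {
    void registerGauge(String name, LongSupplier value);
}

/** Counts error logs without touching any metrics library. */
final class AgentErrorStats {
    // Safe to update from any thread, even before the application has started.
    static final AtomicLong ERROR_COUNT = new AtomicLong();

    private AgentErrorStats() { }

    /** Called from the logger's error path. */
    static void onErrorLogged() {
        ERROR_COUNT.incrementAndGet();
    }

    /** Called later, once a registry is available; the gauge reads the live value. */
    static void exposeTo(GaugeSink sink, String metricName) {
        sink.registerGauge(metricName, ERROR_COUNT::get);
    }
}

/** Minimal in-memory sink used only to demonstrate the flow. */
final class InMemorySink implements GaugeSink {
    final Map<String, LongSupplier> gauges = new LinkedHashMap<>();

    @Override
    public void registerGauge(String name, LongSupplier value) {
        gauges.put(name, value);
    }

    long read(String name) {
        return gauges.get(name).getAsLong();
    }
}
```

Errors logged before exposeTo is called are not lost, because the gauge reads the same counter that the logger has been incrementing all along.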

The code is as follows.

Below is the rough shape of AbstractLogger:

public abstract class AbstractLogger implements ILog {

    // Number of error logs written by the agent; exposed as a gauge later on.
    public static final AtomicLong incr = new AtomicLong(0);

    @Override
    public void error(Throwable throwable, String message, Object... objects) {
        if (this.isErrorEnable()) {
            incr.incrementAndGet();
            this.logger(LogLevel.ERROR, replaceParam(message, objects), throwable);
        }
    }

}


The plugin definition is as follows:

public class MetricsInstrumentation extends ClassStaticMethodsEnhancePluginDefine {
    /**
     * Enhance class.
     */
    private static final String ENHANCE_CLASS = "io.micrometer.core.instrument.Metrics";

    /**
     * The interceptor class for the static "addRegistry" method of the class "io.micrometer.core.instrument.Metrics"
     */
    private static final String INTERCEPT_CLASS = "org.apache.skywalking.apm.plugin.metrics.v1.MetricsAddRegistryInterceptor";

    @Override
    public ConstructorInterceptPoint[] getConstructorsInterceptPoints() {
        return null;
    }

    @Override
    public StaticMethodsInterceptPoint[] getStaticMethodsInterceptPoints() {
        return new StaticMethodsInterceptPoint[]{
                new StaticMethodsInterceptPoint() {
                    @Override
                    public ElementMatcher<MethodDescription> getMethodsMatcher() {
                        return named("addRegistry");
                    }

                    @Override
                    public String getMethodsInterceptor() {
                        return INTERCEPT_CLASS;
                    }

                    @Override
                    public boolean isOverrideArgs() {
                        return false;
                    }
                }
        };
    }


    @Override
    protected ClassMatch enhanceClass() {
        return NameMatch.byName(ENHANCE_CLASS);
    }
}


The interceptor (enhancement) code is as follows:

public class MetricsAddRegistryInterceptor implements StaticMethodsAroundInterceptor {
    private static ILog LOGGER = LogManager.getLogger(MetricsAddRegistryInterceptor.class);
    private int init = 0;

    @Override
    public void beforeMethod(Class clazz, Method method, Object[] allArguments, Class<?>[] parameterTypes, MethodInterceptResult result) {

    }

    @Override
    public Object afterMethod(Class clazz, Method method, Object[] allArguments, Class<?>[] parameterTypes, Object ret) {
        // Pass the original return value through unchanged.
        return ret;
    }

    @Override
    public void handleMethodException(Class clazz, Method method, Object[] allArguments, Class<?>[] parameterTypes, Throwable t) {

    }

    @Override
    public void onAfterMethod(Long startTime, Class clazz, Method method, Object[] allArguments, Class<?>[] parameterTypes, Object ret) {
        if (init != 0) {
            return;
        }
        MeterRegistry registry = (MeterRegistry) allArguments[0];
        try {
            Field field = registry.getClass().getDeclaredField("collectorMap");
            field.setAccessible(true);
            Object o = field.get(registry);
            if (null != o) {
                ConcurrentHashMap map = (ConcurrentHashMap) o;
                if (map.size() > 0) {
                    Metrics.gauge(MetricsConfig.SERVICE_ERROR_CODE, AbstractLogger.incr);
                    init = 1;
                }
            }
        } catch (Exception e) {
            LOGGER.error(e,"MetricsAddRegistryInterceptor error");
        }
    }
}
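The interceptor only needs to register the gauge once, and it should retry while the registry is not yet ready. The init field above is a plain int, so two threads calling addRegistry at the same time could both pass the check. One possible alternative, shown here as a standalone sketch, is a guard built on AtomicBoolean.compareAndSet; OnceGuard and runOnce are illustrative names and are not part of SkyWalking.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Illustrative guard: lets an action succeed at most once across threads. */
final class OnceGuard {
    private final AtomicBoolean claimed = new AtomicBoolean(false);

    /**
     * Runs the action unless it has already succeeded or is running on another thread.
     * The action returns true when it completed its work (for example, the gauge was registered).
     * Returns true only for the call whose action succeeded.
     */
    boolean runOnce(BooleanSupplier action) {
        // Only one caller can flip the flag from false to true.
        if (!claimed.compareAndSet(false, true)) {
            return false;
        }
        boolean succeeded = false;
        try {
            succeeded = action.getAsBoolean();
            return succeeded;
        } finally {
            // Release the claim if the action failed or threw, so a later call can retry.
            if (!succeeded) {
                claimed.set(false);
            }
        }
    }
}
```

In the interceptor, the action would be the reflective check of collectorMap followed by the Metrics.gauge call, returning true only when the gauge was actually registered.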


Resulting dashboard

[Screenshot: resulting dashboard]

Combined with an alerting system, this lets us raise an alert promptly when errors occur.

[Screenshot: alert]

With this in place, when skywalking-agent has a problem we learn about it early and can roll back, resolving the issue before the business teams notice.
