Efficient Real-Time Monitoring: Challenges and Solutions for Asynchronous Computing Tasks (Part 1)

This article describes how to monitor node status efficiently when the workload consists of asynchronous computing tasks.

Background

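  1. One of our businesses runs its historical computations across multiple business lines as a large number of asynchronous computing tasks. Real-time monitoring matters for the subsequent DevOps work, development and testing, task tracking, performance tuning, and data visualization;
  2. The monitoring components currently provided by the architecture department all rely on nginx interception, so they are incompatible with the large number of non-HTTP asynchronous computing tasks;
  3. Those components also have poor timeliness: they only provide statistics at a granularity of one minute or coarser.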

EIS

EIS is short for "eye in sky". It provides flexible, lightweight real-time monitoring with zero intrusion into business code.

Implementation

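  • A self-developed AOP plugin collects data in real time and pushes it to remote storage (push mode);
  • The underlying storage is the time-series database influxdb, with customized aggregation/compression and expiration policies;
  • The UI is grafana, which provides the monitoring dashboards;
  • The whole stack is deployed with docker, so the maintenance cost is close to zero;
  • Multiple metrics are supported, including latency, qps, and p99.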
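
The influxdb write path is not covered in this article. As a rough illustration of what one pushed sample could look like, the sketch below encodes a single timing sample in InfluxDB line protocol. The measurement name `eis_timer`, the tag and field names, and the `LineProtocolSketch` class are assumptions made for the example, not the plugin's actual schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: formats one timing sample as an InfluxDB line-protocol record.
final class LineProtocolSketch {

    // Escapes the characters that are special in tag keys and tag values: comma, equals, space.
    static String escapeTag(String s) {
        return s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
    }

    // Layout: measurement,tag=value,... field=value timestamp
    static String toLine(String measurement, Map<String, String> tags, long costMs, long timestampNanos) {
        StringBuilder sb = new StringBuilder(measurement);
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            sb.append(',').append(escapeTag(tag.getKey())).append('=').append(escapeTag(tag.getValue()));
        }
        // The trailing "i" marks an integer field; the timestamp is in nanoseconds by default.
        sb.append(" cost_ms=").append(costMs).append('i').append(' ').append(timestampNanos);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("app", "xxx-xxx");
        tags.put("node", "load data");
        System.out.println(toLine("eis_timer", tags, 120, 1_700_000_000_000_000_000L));
    }
}
```

Metrics such as qps and p99 would then be derived on the query side from samples like this one, rather than computed in the business process.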

EISDingMonitor

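EISDingMonitor monitors asynchronous node tasks through AOP and pushes DingTalk notifications underneath, so that tasks can be followed in real time.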

Custom annotation

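import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD,ElementType.PARAMETER})
@Inherited
public @interface EISDingMonitor {
    String node();                          // node name (required)
    String remark() default "";             // optional note shown in the message
    boolean input() default false;          // include the method arguments in the message
    boolean output() default false;         // include the return value in the message
    String token() default "";              // DingTalk group token; empty falls back to ding.token
    String secret() default "";             // DingTalk group secret; empty falls back to ding.secret
    boolean log() default true;             // also write a log line
    boolean exceptionOnly() default false;  // push only when the method throws
}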

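Attribute      Meaning                   Required  Default
node           Node name                 Yes       (none)
remark         Node remark               No        ""
input          Include node input        No        false
output         Include node output       No        false
token          DingTalk group token      No        ""
secret         DingTalk group secret     No        ""
log            Print a log line          No        true
exceptionOnly  Push only on exception    No        false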
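
A usage sketch may help here. The block below re-declares the annotation so that it compiles on its own, applies it to a hypothetical task method (`SettlementTask.recalculate` is invented for illustration), and reads the attributes back through reflection, which is how the aspect obtains them at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Re-declared here only so the sketch is self-contained; same attributes as the article's annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@interface EISDingMonitor {
    String node();
    String remark() default "";
    boolean input() default false;
    boolean output() default false;
    String token() default "";
    String secret() default "";
    boolean log() default true;
    boolean exceptionOnly() default false;
}

// Hypothetical business class; in a real project this would be a Spring bean.
class SettlementTask {
    @EISDingMonitor(node = "recalculate", remark = "nightly batch", input = true, exceptionOnly = true)
    public String recalculate(String batchId) {
        return "done:" + batchId;
    }
}

final class AnnotationUsageSketch {

    // Looks up a public method and summarizes its EISDingMonitor attributes.
    static String describe(Class<?> type, String methodName, Class<?>... parameterTypes) {
        try {
            Method method = type.getMethod(methodName, parameterTypes);
            EISDingMonitor ding = method.getAnnotation(EISDingMonitor.class);
            return ding.node() + "|" + ding.remark() + "|input=" + ding.input()
                    + "|output=" + ding.output() + "|exceptionOnly=" + ding.exceptionOnly();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: recalculate|nightly batch|input=true|output=false|exceptionOnly=true
        System.out.println(describe(SettlementTask.class, "recalculate", String.class));
    }
}
```

Note that Spring AOP is proxy-based: the aspect only sees calls that go through the bean proxy, so an annotated method invoked from another method of the same class is not intercepted.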

Aspect class

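import java.lang.reflect.Method;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.Collectors;

import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.Signature;
import org.aspectj.lang.annotation.AfterThrowing;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.aspectj.lang.reflect.MethodSignature;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

// DingTalkService, DingTalkCardMsg and DateUtil are project-internal classes; their imports are omitted.
@Slf4j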
@Aspect
@Component
public class EISDingMonitorAspect {

    @Autowired
    private DingTalkService dingTalkService;

    private static final String appName = "xxx-xxx";

    @Value("${ding.token}")
    private String robotToken;

    @Value("${ding.secret}")
    private String robotSecret;

    private static String hostName;

    static {
        try {
            hostName = InetAddress.getLocalHost().getHostName();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    @Pointcut("@annotation(com.xxx.aspect.EISDingMonitor)")
    private void pointCut(){}


    @AfterThrowing(pointcut = "pointCut()", throwing = "e")
    public void afterThrowable(JoinPoint joinPoint, Throwable e){
        try{
            Signature signature = joinPoint.getSignature();
            MethodSignature methodSignature = (MethodSignature) signature;
            Method method = methodSignature.getMethod();
            EISDingMonitor ding = method.getAnnotation(EISDingMonitor.class);
            String token = ding.token().trim();
            String secret = ding.secret().trim();
            token = (token == null || token.equals("")) ? robotToken : token;
            secret = (secret == null || secret.equals("")) ? robotSecret : secret;
            if(token == null || token.equals("") || secret == null || secret.equals("")){
                return;
            }

            String node = ding.node();
            String remark = ding.remark();
            remark = remark.equals("") ? null : remark;
            long currentTime = System.currentTimeMillis();
            String inputStr;

            Object[] args = joinPoint.getArgs();

            inputStr = ding.input()
                    && Objects.nonNull(args)
                    && args.length > 0
                    // String::valueOf avoids a NullPointerException when an argument is null
                    ? Arrays.stream(args).map(String::valueOf).collect(Collectors.toList()).toString()
                    : null;

            DingTalkCardMsg failedMsg = DingTalkCardMsg.builder()
                    .reporter(appName)
                    .host(hostName)
                    .node(node)
                    .remark(remark)
                    .status("【failed】")
                    .current(DateUtil.timeStamp2Date(currentTime, "yyyy-MM-dd HH:mm:ss"))
                    .input(inputStr)
                    .msg("节点失败, e = " + e.getMessage())
                    .build();
            dingTalkService.ding(failedMsg, token, secret);

            if(ding.log()){
                log.error("EIS ding logging, threadId = {}, method = {}, input = {}, remark = {}",
                        Thread.currentThread().getId(),
                        method.getName(),
                        inputStr,
                        remark,
                        e);
            }
        } catch (Exception localException){
            log.error("failed to push eis ding msg, e = {}", localException.getMessage());
        }
    }

    @Around("pointCut()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {

        MethodSignature methodSignature = (MethodSignature) joinPoint.getSignature();

        Method method = methodSignature.getMethod();

        EISDingMonitor ding = method.getAnnotation(EISDingMonitor.class);

        String token = ding.token().trim();

        String secret = ding.secret().trim();

        token = (token == null || token.equals("")) ? robotToken : token;

        secret = (secret == null || secret.equals("")) ? robotSecret : secret;

        if (token == null || token.equals("") || secret == null || secret.equals("")) {
            return joinPoint.proceed();
        }

        if (ding.exceptionOnly()) {
            return joinPoint.proceed();
        }

        String node = ding.node();
        String remark = ding.remark();

        Object[] args = joinPoint.getArgs();

        String inputStr = null;

        try {
            inputStr = ding.input()
                    && Objects.nonNull(args)
                    && args.length > 0
                    // String::valueOf avoids a NullPointerException when an argument is null
                    ? Arrays.stream(args).map(String::valueOf).collect(Collectors.toList()).toString()
                    : null;

            remark = remark.equals("") ? null : remark;

            DingTalkCardMsg startMsg = DingTalkCardMsg.builder()
                    .reporter(appName)
                    .host(hostName)
                    .current(DateUtil.timeStamp2Date(System.currentTimeMillis(), "yyyy-MM-dd HH:mm:ss"))
                    .node(node)
                    .remark(remark)
                    .status("start")
                    .msg("进入节点")
                    .input(inputStr)
                    .build();
            dingTalkService.ding(startMsg, token, secret);
        } catch (Exception e){
            log.error("failed to push ding msg, e = {}", e.getMessage());
        }

        long startTime = System.currentTimeMillis();
        Object result = joinPoint.proceed();
        long currentTime = System.currentTimeMillis();

        try{
            String cost = DateUtil.duration(startTime, currentTime);

            String outputStr = ding.output() && !Objects.isNull(result) ? result.toString() : null;

            DingTalkCardMsg successMsg = DingTalkCardMsg.builder()
                    .reporter(appName)
                    .host(hostName)
                    .current(DateUtil.timeStamp2Date(currentTime, "yyyy-MM-dd HH:mm:ss"))
                    .node(node)
                    .remark(remark)
                    .status("success")
                    .msg("节点完成")
                    .cost(cost)
                    .input(inputStr)
                    .output(outputStr)
                    .build();
            dingTalkService.ding(successMsg, token, secret);

            if(ding.log()){
                log.info("monitor ding logging, threadId = {}, method = {}, cost = {}, input = {}, output = {}, remark = {}",
                        Thread.currentThread().getId(),
                        method.getName(),
                        cost,
                        inputStr,
                        outputStr,
                        remark);
            }
        }catch (Exception e){
            log.error("failed to push monitor ding msg, e = {}", e.getMessage());
        }
        return result;
    }
}
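
The aspect calls `DateUtil.timeStamp2Date` and `DateUtil.duration`, which the article does not show. Below is a minimal sketch of what such helpers might look like. The output format of `duration` and the extra time-zone overload are assumptions, and `DateUtilSketch` is a hypothetical stand-in rather than the project's actual class.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for the project's DateUtil; not the author's implementation.
final class DateUtilSketch {

    // Formats an epoch-millisecond timestamp with the given pattern in the given zone.
    static String timeStamp2Date(long epochMillis, String pattern, ZoneId zone) {
        return DateTimeFormatter.ofPattern(pattern).withZone(zone).format(Instant.ofEpochMilli(epochMillis));
    }

    // Convenience overload that uses the JVM's default time zone.
    static String timeStamp2Date(long epochMillis, String pattern) {
        return timeStamp2Date(epochMillis, pattern, ZoneId.systemDefault());
    }

    // Renders the elapsed time between two epoch-millisecond instants, e.g. "1m2s345ms".
    static String duration(long startMillis, long endMillis) {
        long elapsed = Math.max(0, endMillis - startMillis);
        long minutes = elapsed / 60_000;
        long seconds = (elapsed % 60_000) / 1_000;
        long millis = elapsed % 1_000;
        StringBuilder sb = new StringBuilder();
        if (minutes > 0) sb.append(minutes).append('m');
        if (minutes > 0 || seconds > 0) sb.append(seconds).append('s');
        return sb.append(millis).append("ms").toString();
    }

    public static void main(String[] args) {
        System.out.println(timeStamp2Date(0L, "yyyy-MM-dd HH:mm:ss", ZoneId.of("UTC"))); // 1970-01-01 00:00:00
        System.out.println(duration(0L, 62_345L)); // 1m2s345ms
    }
}
```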

DingTalk notification
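
`DingTalkService` is not shown in the article. For a custom group robot with signing enabled, DingTalk documents a scheme in which the request carries a millisecond timestamp and an HMAC-SHA256 signature. The sketch below builds such a webhook URL, assuming `token` is the robot's `access_token` and `secret` is its signing secret. The class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper illustrating the signed-webhook URL for a DingTalk custom robot.
final class DingSignSketch {

    // sign = urlEncode(base64(HmacSHA256(key = secret, data = timestamp + "\n" + secret)))
    static String sign(long timestampMillis, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] raw = mac.doFinal((timestampMillis + "\n" + secret).getBytes(StandardCharsets.UTF_8));
            return URLEncoder.encode(Base64.getEncoder().encodeToString(raw), "UTF-8");
        } catch (Exception e) {
            throw new IllegalStateException("failed to sign DingTalk request", e);
        }
    }

    // Appends the token, timestamp and signature as query parameters.
    static String webhookUrl(String token, String secret, long timestampMillis) {
        return "https://oapi.dingtalk.com/robot/send?access_token=" + token
                + "&timestamp=" + timestampMillis
                + "&sign=" + sign(timestampMillis, secret);
    }

    public static void main(String[] args) {
        System.out.println(webhookUrl("my-token", "my-secret", System.currentTimeMillis()));
    }
}
```

The message body itself is then POSTed as JSON to this URL. Because the push in the aspect runs on the business thread, a slow or failing webhook call is worth keeping behind a short timeout.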

Summary

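With the AOP aspect in place, the state of each task can be observed in real time: which step it has reached, its input and output, whether it succeeded or failed, how long it took, and so on.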

The influxdb storage part is handled by another annotation, EISTimerMonitor. Its implementation will be covered in the next article.
