Monitoring Java Spring Applications with Prometheus and Grafana

Recently I needed end-to-end monitoring for some business flows. These flows are composed of several microservices, all written in Java Spring, and we need visibility into the traffic and performance of every module involved: how many business requests were made in total, how many replies succeeded or failed, how long each step took, and so on. So I looked into how to expose metrics from a Java Spring application, collect them centrally with Prometheus, and present them in Grafana with different dashboards.

First, let's define a simple business flow. Suppose we have two Spring applications: the first exposes an HTTP interface for business requests and, when a request arrives, publishes the information it carries to Kafka; the second subscribes to the Kafka topic, receives the business data sent by application one, and processes it.

Application One

Create a new project on start.spring.io with the artifact name kafka-sender-example, and select Spring for Apache Kafka, Actuator and Spring Web as dependencies. Open the generated project and add a class named RemoteCommandController that implements an HTTP endpoint. (The controller also uses Alibaba's fastjson for the request body, which start.spring.io does not offer, so that dependency has to be added to pom.xml manually.) The code is as follows:

package cn.roygao.kafkasenderexample;

import java.util.Collections;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.logging.Logger;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.alibaba.fastjson.JSONObject;

@RestController
public class RemoteCommandController {
    @Autowired
    private KafkaTemplate<Integer, String> template;

    private final static Logger LOGGER = Logger.getLogger(RemoteCommandController.class.getName());

    @PostMapping("/sendcommand")
    public ResponseEntity<Map<String, Object>> sendCommand(@RequestBody JSONObject commandMsg) {
        String requestId = UUID.randomUUID().toString();
        String vin = commandMsg.getString("vin");
        String command = commandMsg.getString("command");
        LOGGER.info("Send command to vehicle:" + vin + ", command:" + command);
        Map<String, Object> requestIdObj = Collections.singletonMap("requestId", requestId);
        // Publish the command to the "remotecommand" topic; the requestId is returned to the caller for tracking
        ProducerRecord<Integer, String> record = new ProducerRecord<>("remotecommand", 1, command);
        try {
            // Block for up to 10 seconds waiting for the broker to acknowledge the send
            template.send(record).get(10, TimeUnit.SECONDS);
        }
        catch (ExecutionException e) {
            LOGGER.info("Error");
            LOGGER.info(e.getMessage());
        }
        catch (TimeoutException | InterruptedException e) {
            LOGGER.info("Timeout");
            LOGGER.info(e.getMessage());
        }
        return ResponseEntity.accepted().body(requestIdObj);
    }
}

The code is simple: it exposes a POST /sendcommand endpoint. The caller provides the vehicle's VIN and the command to send; on receiving a request, the controller forwards this business request information to a Kafka message topic. The message is sent with a KafkaTemplate, so we also define a configuration class named KafkaSender:

package cn.roygao.kafkasenderexample;

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.IntegerSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

@Configuration
public class KafkaSender {
    // Create the "remotecommand" topic on startup if it does not already exist
    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("remotecommand")
                .build();
    }

    @Bean
    public ProducerFactory<Integer, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }

    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // See https://kafka.apache.org/documentation/#producerconfigs for more properties
        return props;
    }

    @Bean
    public KafkaTemplate<Integer, String> kafkaTemplate() {
        return new KafkaTemplate<Integer, String>(producerFactory());
    }
}

This class defines the Kafka broker address, the message topic, and the other producer settings.

Run ./mvnw clean package to compile and package the application.
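
After packaging, the application can be started with the Spring Boot Maven plugin or by running the built jar directly, for example (the jar name depends on your artifact and version, so adjust it to what is actually under target/):

./mvnw spring-boot:run
# or run the packaged jar, e.g.:
java -jar target/kafka-sender-example-0.0.1-SNAPSHOT.jar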

Application Two

Create another project on start.spring.io, this time with the artifact name kafka-receiver-example, and select Spring for Apache Kafka and Actuator as dependencies. Open the generated project and create a class named RemoteCommandHandler that receives the Kafka messages. The code is as follows:

package cn.roygao.kafkareceiverexample;

import java.util.concurrent.TimeUnit;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.listener.adapter.ConsumerRecordMetadata;
import org.springframework.stereotype.Component;

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

@Component
public class RemoteCommandHandler {
    private Timer timer;

    public RemoteCommandHandler(MeterRegistry registry) {
        // Timer that records how long a message takes from production to consumption,
        // publishing the 15th, 50th and 95th percentiles plus a percentile histogram
        this.timer = Timer
            .builder("kafka.process.latency")
            .publishPercentiles(0.15, 0.5, 0.95)
            .publishPercentileHistogram()
            .register(registry);
    }

    @KafkaListener(id = "myId", topics = "remotecommand")
    public void listen(String in, ConsumerRecordMetadata meta) {
        // Latency = time of consumption minus the record's produce timestamp
        long latency = System.currentTimeMillis() - meta.timestamp();
        timer.record(latency, TimeUnit.MILLISECONDS);
    }
}

The class constructor takes a MeterRegistry object and builds a Timer, one of the meter types Micrometer provides, which is used to record durations. The Timer is then registered with the MeterRegistry.

The listen method subscribes to the Kafka topic. For each message it reads the produce timestamp from the record metadata, compares it with the current time to work out how long the message took from production to consumption, and records that duration with the timer. The Timer then produces the percentile statistics defined above.
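
With a Prometheus meter registry on the classpath (more on that below), this timer shows up on the /actuator/prometheus endpoint roughly in the following form; the names follow Micrometer's Prometheus conventions, where kafka.process.latency is published in seconds as kafka_process_latency_seconds, and the values here are purely illustrative:

kafka_process_latency_seconds{quantile="0.15",} 0.052
kafka_process_latency_seconds{quantile="0.5",} 0.087
kafka_process_latency_seconds{quantile="0.95",} 0.213
kafka_process_latency_seconds_bucket{le="0.268435456",} 15.0
kafka_process_latency_seconds_count 17.0
kafka_process_latency_seconds_sum 1.94
kafka_process_latency_seconds_max 0.213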

We also need a Kafka configuration class for the consumer side:

package cn.roygao.kafkareceiverexample;

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.config.KafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;

@Configuration
@EnableKafka
public class KafkaConfig {
    @Bean
    KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, String>>
                        kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
                                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(3);
        factory.getContainerProperties().setPollTimeout(3000);
        return factory;
    }

    @Bean
    public ConsumerFactory<Integer, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.IntegerDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}

Add the following configuration to application.properties; it sets the server port to 7777 and exposes the Actuator endpoints, including the Prometheus one:

spring.kafka.consumer.auto-offset-reset=earliest
server.port=7777
management.endpoints.web.exposure.include=health,info,prometheus
management.endpoints.enabled-by-default=true
management.endpoint.health.show-details=always
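
For the /actuator/prometheus endpoint exposed above to actually exist, the Micrometer Prometheus registry has to be on the classpath. If start.spring.io did not add it for you, a sketch of the Maven dependency to put in pom.xml:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>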

Then run ./mvnw clean package to compile and package this application as well.

Starting Kafka

Here I start Kafka with Docker; the compose file is as follows:

---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:6.1.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker:
    image: confluentinc/cp-server:6.1.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "9101:9101"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_CONFLUENT_BALANCER_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9101
      KAFKA_JMX_HOSTNAME: localhost
      KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:29092
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'true'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'

  schema-registry:
    image: confluentinc/cp-schema-registry:6.1.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:29092'
      SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081

  connect:
    image: cnfldemos/cp-server-connect-datagen:0.4.0-6.1.0
    hostname: connect
    container_name: connect
    depends_on:
      - broker
      - schema-registry
    ports:
      - "8083:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker:29092'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      # CLASSPATH required due to CC-2422
      CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-6.1.0.jar
      CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
      CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR

  control-center:
    image: confluentinc/cp-enterprise-control-center:6.1.0
    hostname: control-center
    container_name: control-center
    depends_on:
      - broker
      - schema-registry
      - connect
      - ksqldb-server
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: 'broker:29092'
      CONTROL_CENTER_CONNECT_CLUSTER: 'connect:8083'
      CONTROL_CENTER_KSQL_KSQLDB1_URL: "http://ksqldb-server:8088"
      CONTROL_CENTER_KSQL_KSQLDB1_ADVERTISED_URL: "http://localhost:8088"
      CONTROL_CENTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
      CONFLUENT_METRICS_TOPIC_REPLICATION: 1
      PORT: 9021

  ksqldb-server:
    image: confluentinc/cp-ksqldb-server:6.1.0
    hostname: ksqldb-server
    container_name: ksqldb-server
    depends_on:
      - broker
      - connect
    ports:
      - "8088:8088"
    environment:
      KSQL_CONFIG_DIR: "/etc/ksql"
      KSQL_BOOTSTRAP_SERVERS: "broker:29092"
      KSQL_HOST_NAME: ksqldb-server
      KSQL_LISTENERS: "http://0.0.0.0:8088"
      KSQL_CACHE_MAX_BYTES_BUFFERING: 0
      KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
      KSQL_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
      KSQL_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
      KSQL_KSQL_CONNECT_URL: "http://connect:8083"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_REPLICATION_FACTOR: 1
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: 'true'
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: 'true'

  ksqldb-cli:
    image: confluentinc/cp-ksqldb-cli:6.1.0
    container_name: ksqldb-cli
    depends_on:
      - broker
      - connect
      - ksqldb-server
    entrypoint: /bin/sh
    tty: true

  ksql-datagen:
    image: confluentinc/ksqldb-examples:6.1.0
    hostname: ksql-datagen
    container_name: ksql-datagen
    depends_on:
      - ksqldb-server
      - broker
      - schema-registry
      - connect
    command: "bash -c 'echo Waiting for Kafka to be ready... && \
                       cub kafka-ready -b broker:29092 1 40 && \
                       echo Waiting for Confluent Schema Registry to be ready... && \
                       cub sr-ready schema-registry 8081 40 && \
                       echo Waiting a few seconds for topic creation to finish... && \
                       sleep 11 && \
                       tail -f /dev/null'"
    environment:
      KSQL_CONFIG_DIR: "/etc/ksql"
      STREAMS_BOOTSTRAP_SERVERS: broker:29092
      STREAMS_SCHEMA_REGISTRY_HOST: schema-registry
      STREAMS_SCHEMA_REGISTRY_PORT: 8081

  rest-proxy:
    image: confluentinc/cp-kafka-rest:6.1.0
    depends_on:
      - broker
      - schema-registry
    ports:
      - 8082:8082
    hostname: rest-proxy
    container_name: rest-proxy
    environment:
      KAFKA_REST_HOST_NAME: rest-proxy
      KAFKA_REST_BOOTSTRAP_SERVERS: 'broker:29092'
      KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
      KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'

Run nohup docker compose up > ./kafka.log 2>&1 & to start everything. Then open localhost:9021 in a browser to view Kafka's status in the Control Center UI.

Start application one and application two, then send a business request by calling POST http://localhost:8080/sendcommand, for example with the following command:

curl --location --request POST 'http://localhost:8080/sendcommand' \
--header 'Content-Type: application/json' \
--data-raw '{
    "vin": "ABC123",
    "command": "engine-start"
}'
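
If the send succeeds, the endpoint returns HTTP 202 Accepted with the generated requestId in the body, roughly like this (the UUID value is of course random):

{"requestId":"9a7e2f3b-1c4d-4e5f-8a6b-0c1d2e3f4a5b"}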

In the Kafka Control Center you can see a remotecommand topic, with one message produced and one consumed.

Starting Prometheus and Grafana

These are also started with docker compose; the compose file is as follows:

services:
  prometheus:
    image: prom/prometheus-linux-amd64
    #network_mode: host
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./config:/etc/prometheus/
    command:
      - '--config.file=/etc/prometheus/prometheus.yaml'
    ports:
      - 9090:9090
  grafana:
    image: grafana/grafana
    user: '472'
    #network_mode: host
    container_name: grafana
    restart: unless-stopped
    links:
      - prometheus:prometheus
    volumes:
      - ./data/grafana:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    ports:
      - 3000:3000
    depends_on:
      - prometheus

Create a config directory next to this compose file and put the Prometheus configuration file (prometheus.yaml) in it, with the following content:

scrape_configs:
  - job_name: 'Spring Boot Application input'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 2s
    static_configs:
      - targets: ['172.17.0.1:7777']
        labels:
          application: 'My Spring Boot Application'

Here targets points at the address exposed by application two (172.17.0.1 is typically the Docker bridge gateway on Linux, i.e. the host as seen from inside the Prometheus container), and metrics_path is the path the metrics are scraped from.

Also create a data/grafana directory next to the compose file to mount as Grafana's data directory. Note that you need to chmod 777 this directory, otherwise Grafana will fail with a permission error.
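
For example:

mkdir -p data/grafana
chmod 777 data/grafana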

Then run nohup docker compose up > ./prometheus.log 2>&1 & to start them.

Open localhost:9090 to access the Prometheus UI. Searching for kafka there shows the kafka_process_latency metrics reported by application two, with statistics for the 0.15, 0.5 and 0.95 percentiles we configured.

Open localhost:3000 to access Grafana. Configure a data source pointing at the Prometheus container's address (for example http://prometheus:9090, since the compose file links the two containers), then Save & Test. After that, create a new dashboard and add panels that chart the kafka_process_latency metrics.
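
As a sketch of the kind of queries such a panel could use (assuming Micrometer's default Prometheus naming, where the timer appears as kafka_process_latency_seconds; check the names on /actuator/prometheus first):

# 95th percentile processing latency, computed from the percentile histogram buckets
histogram_quantile(0.95, sum(rate(kafka_process_latency_seconds_bucket[5m])) by (le))

# average processing latency over the last 5 minutes
rate(kafka_process_latency_seconds_sum[5m]) / rate(kafka_process_latency_seconds_count[5m])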

[To be continued] Still to come: a Counter metric for the HTTP endpoint calls, plus more Grafana dashboards covering other service metrics.
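
As a rough preview of that HTTP counter, here is a minimal sketch, assuming a MeterRegistry is injected into RemoteCommandController through its constructor (the metric name and tag are made up for illustration):

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

// ...inside RemoteCommandController...
private final Counter commandCounter;

public RemoteCommandController(MeterRegistry registry) {
    // Counts every business request received on /sendcommand
    this.commandCounter = Counter.builder("remote.command.requests")
            .description("Number of remote command requests received")
            .tag("endpoint", "sendcommand")
            .register(registry);
}

// ...and at the start of sendCommand():
// commandCounter.increment();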
