Introduction
Flink provides a large set of metrics and metric reporters. The official documentation is at:
https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/metrics.html#io
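Flink's web console exposes monitoring metrics, but all of them are in-flight: checking the web UI or the REST API works fine while a job is running. A mature product or engineering project, however, needs to look back at historical records, so that a failed or misbehaving job can be analyzed from persisted metrics. This article therefore follows the official Flink documentation to build Flink monitoring on InfluxDB + Grafana.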
Installation
#InfluxDB
Ubuntu & Debian
SHA256:
a3296922db5ecb58f759f12abce6e98b55759079aeb838a6083b71325cf662b7
wget https://dl.influxdata.com/influxdb/releases/influxdb2-2.0.4-amd64.deb
sudo dpkg -i influxdb2-2.0.4-amd64.deb
Ubuntu & Debian (ARM 64-bit)
SHA256:
15ea8fd002a933df8488c6ed8a593d6db5ba534e655320c8217645e03cf67f6c
wget https://dl.influxdata.com/influxdb/releases/influxdb2-2.0.4-arm64.deb
sudo dpkg -i influxdb2-2.0.4-arm64.deb
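The SHA256 values listed above are only useful if the downloaded package is actually checked against them before installing. A minimal sketch of that check, assuming bash and GNU coreutils' sha256sum are available; the helper name verify_sha256 is made up for illustration and is not part of InfluxDB or any package:

```shell
#!/usr/bin/env bash
# verify_sha256 FILE EXPECTED_HASH
# Returns 0 when the SHA256 digest of FILE equals EXPECTED_HASH,
# non-zero otherwise. (Hypothetical helper name, for illustration only.)
verify_sha256() {
    local file="$1" expected="$2"
    # "sha256sum -c" reads lines of the form "<hash>  <filename>" and checks
    # each named file; --status suppresses output and reports via exit code.
    echo "${expected}  ${file}" | sha256sum -c --status -
}

# Example with the amd64 package and the hash listed above:
# verify_sha256 influxdb2-2.0.4-amd64.deb \
#     a3296922db5ecb58f759f12abce6e98b55759079aeb838a6083b71325cf662b7 \
#   && sudo dpkg -i influxdb2-2.0.4-amd64.deb
```

Chaining the install with && means dpkg only runs when the digest matches.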
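Initialize the username, password, retention period, and other settings. Note that influx setup talks to a running influxd instance, so the server must already be started.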
influx setup
Welcome to InfluxDB 2.0!
Please type your primary username: admin
Please type your password:
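For scripted installs, the interactive prompts above can be skipped by passing the same answers as flags. A rough sketch, assuming the influx 2.0 CLI flags --username, --password, --org, --bucket, --retention and --force; the function name setup_influx, the INFLUX_BIN override, and the example organization and bucket names are all hypothetical:

```shell
#!/usr/bin/env bash
# setup_influx USERNAME PASSWORD ORG BUCKET RETENTION
# Runs "influx setup" without prompts. (Hypothetical wrapper, for illustration.)
setup_influx() {
    local username="$1" password="$2" org="$3" bucket="$4" retention="$5"
    # INFLUX_BIN is an assumed override that allows a dry run,
    # e.g. INFLUX_BIN=echo prints the arguments instead of running influx.
    "${INFLUX_BIN:-influx}" setup \
        --username "$username" \
        --password "$password" \
        --org "$org" \
        --bucket "$bucket" \
        --retention "$retention" \
        --force
}

# Example (placeholder values; choose your own org, bucket and password):
# setup_influx admin 'change-me-please' my-org flink 72h
```

Passing the password on the command line exposes it in the process list and shell history, so the interactive prompt may still be preferable on shared machines.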
Start
influxd
2021-03-22T09:36:19.275226Z info Welcome to InfluxDB {"log_id": "0T2KWsIW000", "version": "2.0.4", "commit": "4e7a59bb9a", "build_date": "2021-02-08T17:47:02Z"}
2021-03-22T09:36:19.326180Z info Resources opened {"log_id": "0T2KWsIW000", "service": "bolt", "path": "/root/.influxdbv2/influxd.bolt"}
2021-03-22T09:36:19.339459Z info Bringing up metadata migrations {"log_id": "0T2KWsIW000", "service": "migrations", "migration_count": 14}
2021-03-22T09:36:20.324403Z info Using data dir {"log_id": "0T2KWsIW000", "service": "storage-engine", "service": "store", "path": "/root/.influxdbv2/engine/data"}
2021-03-22T09:36:20.324594Z info Compaction settings {"log_id": "0T2KWsIW000", "service": "storage-engine", "service": "store", "max_concurrent_compactions": 16, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648}
2021-03-22T09:36:20.324622Z info Open store (start) {"log_id": "0T2KWsIW000", "service": "storage-engine", "service": "store", "op_name": "tsdb_open", "op_event": "start"}
2021-03-22T09:36:20.324710Z info Open store (end) {"log_id": "0T2KWsIW000", "service": "storage-engine", "service": "store", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "0.073ms"}
2021-03-22T09:36:20.324800Z info Starting retention policy enforcement service {"log_id": "0T2KWsIW000", "service": "retention", "check_interval": "30m"}
2021-03-22T09:36:20.324840Z info Starting precreation service {"log_id": "0T2KWsIW000", "service": "shard-precreation", "check_interval": "10m", "advance_period": "30m"}
2021-03-22T09:36:20.324965Z info Starting query controller {"log_id": "0T2KWsIW000", "service": "storage-reads", "concurrency_quota": 10, "initial_memory_bytes_quota_per_query": 9223372036854775807, "memory_bytes_quota_per_query": 9223372036854775807, "max_memory_bytes": 0, "queue_size": 10}
2021-03-22T09:36:20.325917Z info Configuring InfluxQL statement executor (zeros indicate unlimited). {"log_id": "0T2KWsIW000", "max_select_point": 0, "max_select_series": 0, "max_select_buckets": 0}
2021-03-22T09:36:20.645481Z info Listening {"log_id": "0T2KWsIW000&