Prometheus High Availability with Thanos


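Thanos Architecture in Detail

Thanos is one of the high-availability solutions for Prometheus. It integrates seamlessly with Prometheus and provides several advanced features on top of it, covering four needs: long-term storage, unlimited scalability, a global query view, and non-invasiveness (no changes to Prometheus itself).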

[Figure: Thanos architecture diagram]

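The diagram shows several core Thanos components, though not all of them. A brief introduction to each:

Thanos Sidecar: connects to Prometheus, serves its data to Thanos Query, and/or uploads it to object storage for long-term retention.

Thanos Query: implements the Prometheus API and provides a global query view. It aggregates the data returned by the StoreAPI endpoints and returns the result to the querying client (such as Grafana).

Thanos Store Gateway: exposes the data held in object storage to Thanos Query.

Thanos Ruler: evaluates rules against the monitoring data and fires alerts. It can also compute new series, which it serves to Thanos Query and/or uploads to object storage for long-term retention.

Thanos Compact: compacts and downsamples the data in object storage, which speeds up queries over long time ranges.

Thanos Receiver: receives data from Prometheus's remote-write WAL, exposes it, and/or uploads it to cloud storage.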

Environment Plan

Host            Components
Prometheus01    Prometheus, alertmanager, thanos_sidecar, thanos_rule, thanos_query, thanos_compact, thanos_store
Prometheus02    Prometheus, alertmanager, thanos_sidecar, thanos_rule, thanos_query, thanos_compact, thanos_store
Prometheus03    Prometheus, alertmanager, thanos_sidecar, thanos_rule, thanos_query, thanos_compact, thanos_store

Prometheus Deployment

Prometheus configuration file (prometheus01)

[root@prometheus01 prometheus-2.36.1]# vi prometheus.yml

global:
  scrape_interval:     15s
  evaluation_interval: 15s
  external_labels:
    cluster: sincercloud_iaas
    replica: '0'   # used for data deduplication
alerting:
  alertmanagers:
  - follow_redirects: true
    enable_http2: true
    scheme: http
    timeout: 10s
    api_version: v2
    static_configs:
    - targets:
      - 10.250.38.201:9093
      - 10.250.38.202:9093
      - 10.250.38.203:9093
      labels:
        env: dev_sit

Prometheus configuration file (prometheus02)

[root@prometheus02 prometheus-2.36.1]# vi prometheus.yml

global:
  scrape_interval:     15s
  evaluation_interval: 15s
  external_labels:
    cluster: sincercloud_iaas
    replica: '1'   # used for data deduplication
alerting:
  alertmanagers:
  - follow_redirects: true
    enable_http2: true
    scheme: http
    timeout: 10s
    api_version: v2
    static_configs:
    - targets:
      - 10.250.38.201:9093
      - 10.250.38.202:9093
      - 10.250.38.203:9093
      labels:
        env: dev_sit

Prometheus configuration file (prometheus03)

[root@prometheus03 prometheus-2.36.1]# vi prometheus.yml

global:
  scrape_interval:     15s
  evaluation_interval: 15s
  external_labels:
    cluster: sincercloud_iaas
    replica: '2'   # used for data deduplication
alerting:
  alertmanagers:
  - follow_redirects: true
    enable_http2: true
    scheme: http
    timeout: 10s
    api_version: v2
    static_configs:
    - targets:
      - 10.250.38.201:9093
      - 10.250.38.202:9093
      - 10.250.38.203:9093
      labels:
        env: dev_sit
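The three configurations differ only in the replica external label, and that label is what makes duplicate series identifiable. The following is a toy illustration of the idea, not how Thanos implements deduplication internally: it strips the replica label from a few sample lines and keeps the unique results. The metric, the instance value, and the helper name strip_replica_label are hypothetical.

```shell
# Toy illustration only: remove the replica label from exposition-style lines
# and collapse identical series. Thanos itself merges samples over time; this
# only shows why a per-node replica label makes duplicates identifiable.
strip_replica_label() {
  sed -E 's/replica="[^"]*",?//; s/,\}/}/' | sort -u
}

# Hypothetical samples, one per Prometheus replica.
samples='up{cluster="sincercloud_iaas",replica="0",instance="node1:9100"} 1
up{cluster="sincercloud_iaas",replica="1",instance="node1:9100"} 1
up{cluster="sincercloud_iaas",replica="2",instance="node1:9100"} 1'

printf '%s\n' "$samples" | strip_replica_label
```

Once the replica label is removed, the three lines are identical and collapse into one, which is why each node needs a distinct replica value while every other external label stays the same.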

Prometheus service file (all nodes)

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/prometheus.service

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStartPre=/service/software/prometheus-2.36.1/promtool check config /service/software/prometheus-2.36.1/prometheus.yml
ExecStart=/service/software/prometheus-2.36.1/prometheus \
  --config.file /service/software/prometheus-2.36.1/prometheus.yml \
  --storage.tsdb.path /thanos-data/prometheus \
  --storage.tsdb.min-block-duration=2h \
  --storage.tsdb.max-block-duration=2h \
  --storage.tsdb.retention.time=2h \
  --web.console.templates=/service/software/prometheus-2.36.1/consoles \
  --web.console.libraries=/service/software/prometheus-2.36.1/console_libraries \
  --web.listen-address=:9090 \
  --web.enable-lifecycle \
  --web.enable-admin-api
ExecReload=/usr/bin/curl -XPOST http://127.0.0.1:9090/-/reload

[Install]
WantedBy=multi-user.target
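After the unit is installed, it helps to confirm that Prometheus is actually serving before moving on to the sidecar. A minimal sketch, assuming curl is available on the host: the helper name wait_ready and its defaults are hypothetical, while /-/ready is the readiness endpoint Prometheus exposes on its web port.

```shell
# Poll a readiness URL until it answers successfully or attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
wait_ready() {
  local url="$1" tries="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors such as 503.
    if curl -fsS -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $tries attempts: $url" >&2
  return 1
}
```

A typical use would be to run systemctl daemon-reload, then systemctl enable --now prometheus, then wait_ready http://127.0.0.1:9090/-/ready. The same helper can be pointed at the HTTP port of a Thanos component, which also serves /-/ready.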

Thanos_sidecar Deployment

Thanos_sidecar service file (all nodes)

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/thanos_sidecar.service

[Unit]
Description=Thanos-Sidecar
Wants=network-online.target
After=network-online.target prometheus.service

[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/service/software/thanos-0.25.0_sidecar/thanos sidecar \
  --tsdb.path /thanos-data/prometheus \
  --grpc-address 0.0.0.0:10901 \
  --http-address 0.0.0.0:10902 \
  --reloader.config-file /service/software/prometheus-2.36.1/prometheus.yml \
  --prometheus.url http://localhost:9090 \
  --objstore.config-file /service/software/thanos-0.25.0_store/ceph-oss.yaml \
  --log.level info \
  --log.format json

[Install]
WantedBy=multi-user.target

Thanos_query Deployment

Thanos_query service file (all nodes)

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/thanos_query.service

[Unit]
Description=Thanos-Query
Wants=network-online.target
After=network-online.target prometheus.service

[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/service/software/thanos-0.25.0_sidecar/thanos query \
  --grpc-address 0.0.0.0:10951 \
  --http-address 0.0.0.0:10952 \
  --query.replica-label replica \
  --query.replica-label rule_replica \
  --query.replica-label receive_replica \
  --query.auto-downsampling \
  --store=10.250.38.201:10901 \
  --store=10.250.38.202:10901 \
  --store=10.250.38.203:10901 \
  --store=10.250.38.202:10911 \
  --store=10.250.38.201:10911 \
  --store=10.250.38.201:10921 \
  --store=10.250.38.202:10921 \
  --store=10.250.38.203:10921 \
  --log.level=info \
  --log.format=json

[Install]
WantedBy=multi-user.target
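Because Thanos Query serves the Prometheus HTTP API on its HTTP port (10952 in the unit file), a quick way to check the global view is an instant query with and without deduplication. A minimal sketch, assuming curl is installed and that the dedup request parameter of the Thanos Query API is used to toggle deduplication; the helper name thanos_instant_query is hypothetical.

```shell
# Run an instant query against Thanos Query's Prometheus-compatible HTTP API.
# $1 = host:port of the query HTTP listener, $2 = PromQL expression,
# $3 = "true" or "false" for replica deduplication (default true)
thanos_instant_query() {
  local endpoint="$1" expr="$2" dedup="${3:-true}"
  # -G sends the url-encoded parameters as a GET query string.
  curl -fsS -G "http://${endpoint}/api/v1/query" \
    --data-urlencode "query=${expr}" \
    --data-urlencode "dedup=${dedup}"
}
```

For example, thanos_instant_query 10.250.38.201:10952 'count(up)' true compared with the same call using false should show whether the series from the three replicas are being merged.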

Thanos_rule Deployment

Create the directories (all hosts)

# mkdir -p /service/software/thanos-0.25.0_rule/rules /thanos-data/thanos_rule

Thanos_rule service file (all hosts)

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/thanos_rule.service

[Unit]
Description=Thanos-rule
Wants=network-online.target
After=network-online.target prometheus.service

[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
# Set rule_replica to 0, 1, and 2 on the three hosts respectively. The label
# tells the rulers apart, and dropping it from alerts avoids duplicate alerts.
ExecStart=/service/software/thanos-0.25.0_rule/thanos rule \
  --data-dir "/thanos-data/thanos_rule" \
  --http-address "0.0.0.0:10912" \
  --grpc-address "0.0.0.0:10911" \
  --eval-interval "30s" \
  --rule-file "/service/software/thanos-0.25.0_rule/rules/*rules.yaml" \
  --alert.query-url "http://0.0.0.0:9090" \
  --alertmanagers.url "http://10.250.38.201:9093" \
  --alertmanagers.url "http://10.250.38.202:9093" \
  --alertmanagers.url "http://10.250.38.203:9093" \
  --query "10.250.38.201:10952" \
  --query "10.250.38.202:10952" \
  --objstore.config-file "/service/software/thanos-0.25.0_store/ceph-oss.yaml" \
  --label 'rule_replica="2"' \
  --alert.label-drop "rule_replica" \
  --log.format json \
  --log.level info

[Install]
WantedBy=multi-user.target
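The ruler loads every file matching the --rule-file glob, but the rules directory is empty right after it is created. As a starting point, here is a minimal sketch that writes one alerting rule into a given directory. The helper name write_sample_rule, the file name node_rules.yaml, and the InstanceDown rule itself are hypothetical examples, not part of the original setup.

```shell
# Write a minimal alerting rule file into the given rules directory.
# The file name ends in "rules.yaml" so that it matches the *rules.yaml glob.
write_sample_rule() {
  local dir="$1"
  # The quoted 'EOF' keeps the shell from expanding $labels in the template.
  cat > "${dir}/node_rules.yaml" <<'EOF'
groups:
- name: example
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} is down"
EOF
}
```

Calling write_sample_rule /service/software/thanos-0.25.0_rule/rules creates the file. Since the rule file uses the Prometheus rule format, it can be validated beforehand with /service/software/prometheus-2.36.1/promtool check rules followed by the file path.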

Thanos_store Deployment

Create the directories (all nodes)

# mkdir /thanos-data/thanos_store /thanos-data/ceph-oss

Configuration file (all nodes)

[root@prometheus03 prometheus-2.36.1]# cat /service/software/thanos-0.25.0_store/ceph-oss.yaml

type: FILESYSTEM
config:
  directory: "/thanos-data/ceph-oss"

Thanos_store service file

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/thanos_store.service

[Unit]
Description=Thanos-Store
Wants=network-online.target
After=network-online.target prometheus.service

[Service]
User=prometheus
Group=prometheus
Type=simple
LimitNOFILE=262144
Restart=on-failure
RestartSec=5s
ExecStart=/service/software/thanos-0.25.0_store/thanos store \
  --grpc-address 0.0.0.0:10921 \
  --http-address 0.0.0.0:10922 \
  --data-dir=/thanos-data/thanos_store \
  --objstore.config-file=/service/software/thanos-0.25.0_store/ceph-oss.yaml \
  --log.level=info \
  --log.format=json

[Install]
WantedBy=multi-user.target
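With a FILESYSTEM object store, the uploaded blocks are ordinary directories, so a plain directory listing is enough to see whether the sidecar has shipped anything for the Store Gateway to serve. A minimal sketch, assuming each block is stored as a directory named after its block ID and containing a meta.json file; the helper name list_blocks is hypothetical.

```shell
# List block directories (those containing meta.json) under a FILESYSTEM
# object-store directory, one block ID per line.
list_blocks() {
  local dir="$1" meta
  for meta in "$dir"/*/meta.json; do
    [ -f "$meta" ] || continue   # the glob matched nothing
    basename "$(dirname "$meta")"
  done
}
```

Running list_blocks /thanos-data/ceph-oss prints nothing until Prometheus has completed its first 2h block and the sidecar has uploaded it, so an empty result shortly after startup is expected.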

Thanos_compact Deployment

Thanos_compact service file (all nodes)

Thanos expects only one compactor to run against a given bucket. If /thanos-data/ceph-oss is storage shared between the nodes, enable this service on a single node only.

[root@prometheus03 prometheus-2.36.1]# cat /usr/lib/systemd/system/thanos_compact.service

[Unit]
Description=Thanos-Compact
Wants=network-online.target
After=network-online.target prometheus.service

[Service]
User=root
Group=root
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/service/software/thanos-0.25.0_compact/thanos compact \
  --wait \
  --http-address 0.0.0.0:10932 \
  --debug.accept-malformed-index \
  --data-dir=/thanos-data/thanos_compact \
  --objstore.config-file=/service/software/thanos-0.25.0_store/ceph-oss.yaml \
  --retention.resolution-raw=90d \
  --retention.resolution-5m=180d \
  --retention.resolution-1h=360d

[Install]
WantedBy=multi-user.target

Alertmanager Deployment

Omitted.

Then start the services:

# systemctl start prometheus
# systemctl start thanos_sidecar thanos_query thanos_rule thanos_store thanos_compact
# systemctl start alertmanager
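After starting everything, a single pass over the units shows whether any of them failed to come up. A minimal sketch built on systemctl is-active, which prints the unit state and exits non-zero when the unit is not active; the helper name check_units is hypothetical, and the unit names are the ones used in this guide.

```shell
# Print the state of each unit and return non-zero if any is not active.
check_units() {
  local unit state rc=0
  for unit in "$@"; do
    # is-active prints the state and fails for anything other than "active".
    state="$(systemctl is-active "$unit" 2>/dev/null)" || rc=1
    printf '%-16s %s\n' "$unit" "${state:-unknown}"
  done
  return "$rc"
}
```

For this deployment the call would be check_units prometheus thanos_sidecar thanos_query thanos_rule thanos_store thanos_compact alertmanager, followed by journalctl -u with the unit name for anything reported as failed.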
