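Goal
Enable Prometheus service discovery for private-network targets in an AWS EC2 environment.
Assumptions
One EC2 instance already runs the Prometheus server, and a second EC2 instance runs the Prometheus Node Exporter.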
IAM Role
Create the policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeAvailabilityZones"
            ],
            "Resource": "*"
        }
    ]
}
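Create the role
Attach the policy created above to this role.
Attach the role to the EC2 instance
This step is required. The role has to be attached to the instance running the Prometheus server so that it is allowed to call the EC2 API.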
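The three IAM steps above can also be done from the command line. The following is a minimal sketch using the AWS CLI, not the procedure used in this post. The names `dev-prometheus` and `prometheus-ec2-sd`, the policy file path, and the instance ID argument are placeholders chosen for illustration. The trust policy is an assumption as well: in the aws-cn partition the service principal may need to be `ec2.amazonaws.com.cn`.

```shell
#!/bin/sh
# Hypothetical names for illustration; replace them with your own.
ROLE_NAME="dev-prometheus"
POLICY_NAME="prometheus-ec2-sd"

# Print a trust policy that lets EC2 instances assume the role.
# Assumption: in China regions the principal may need to be ec2.amazonaws.com.cn.
trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the policy and role, then attach the role to an instance.
# $1: path to the permissions policy JSON shown above
# $2: ID of the EC2 instance running Prometheus
create_prometheus_role() {
  policy_arn=$(aws iam create-policy \
    --policy-name "$POLICY_NAME" \
    --policy-document "file://$1" \
    --query 'Policy.Arn' --output text) || return 1

  aws iam create-role \
    --role-name "$ROLE_NAME" \
    --assume-role-policy-document "$(trust_policy)" || return 1

  aws iam attach-role-policy \
    --role-name "$ROLE_NAME" --policy-arn "$policy_arn" || return 1

  # An EC2 instance receives a role through an instance profile.
  aws iam create-instance-profile \
    --instance-profile-name "$ROLE_NAME" || return 1
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$ROLE_NAME" --role-name "$ROLE_NAME" || return 1

  aws ec2 associate-iam-instance-profile \
    --instance-id "$2" \
    --iam-instance-profile "Name=$ROLE_NAME"
}
```

Usage, with a placeholder file name and instance ID: create_prometheus_role ./policy.json i-xxxx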
prometheus.yml
sudo vim /etc/prometheus/prometheus.yml
Add the following under scrape_configs:
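  - job_name: "ec2"
    ec2_sd_configs:
      - region: cn-north-1
        # Note: Prometheus documents `profile` as a named AWS profile; with the
        # instance role attached, this line may be unnecessary.
        profile: arn:aws-cn:iam::xxxx:instance-profile/dev-prometheus
        port: 9100
        filters:
          - name: tag:environment
            values:
              - prod
          - name: tag:service
            values:
              - web
              - db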
The complete file looks like this:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node_exporter"
    static_configs:
      - targets: ["172.30.2.26:9100"]
  - job_name: "ec2"
    ec2_sd_configs:
      - region: cn-north-1
        profile: arn:aws-cn:iam::xxxx:instance-profile/dev-prometheus
        port: 9100
        filters:
          - name: tag:environment
            values:
              - prod
          - name: tag:service
            values:
              - web
              - db
Restart Prometheus
sudo systemctl restart prometheus
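As an optional safeguard, the configuration can be validated with promtool (which ships with Prometheus) before the restart command above is run, since a YAML indentation mistake or a misspelled key would otherwise stop the server from starting. A minimal sketch follows; the function name `check_and_restart` is made up for illustration, and it assumes promtool is on the PATH and that the systemd unit is named prometheus.

```shell
#!/bin/sh
# Validate the config and restart Prometheus only if validation passes.
# $1: config path (defaults to /etc/prometheus/prometheus.yml)
# Returns 2 if promtool is missing, 1 if validation fails.
check_and_restart() {
  config="${1:-/etc/prometheus/prometheus.yml}"

  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found; not restarting" >&2
    return 2
  fi

  # promtool exits non-zero when the file has syntax or semantic errors.
  promtool check config "$config" || return 1

  sudo systemctl restart prometheus
}
```

Usage: check_and_restart /etc/prometheus/prometheus.yml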
Test
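One way to check the result is to ask Prometheus which targets it is currently scraping, either on the Status > Targets page of the web UI or through the HTTP API. The sketch below takes the API route. It assumes Prometheus listens on localhost:9090; the helper names and the `PROM_URL` variable are made up for illustration. The grep-based extraction is deliberately crude, and a JSON tool such as jq would be more robust if it is installed.

```shell
#!/bin/sh
# Fetch the active targets from the Prometheus HTTP API.
# PROM_URL is a hypothetical override for the server address.
fetch_active_targets() {
  curl -s "${PROM_URL:-http://localhost:9090}/api/v1/targets?state=active"
}

# Read the API response on stdin and print one scrape URL per line.
extract_scrape_urls() {
  grep -o '"scrapeUrl":"[^"]*"' | cut -d'"' -f4
}
```

Usage: fetch_active_targets | extract_scrape_urls
Instances discovered by the "ec2" job should appear in the output with port 9100.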
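Summary
This completes the Prometheus service discovery setup. The filters restrict which nodes are discovered, and filter values may contain wildcards (for example, web*).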
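Because the filters are passed to the EC2 DescribeInstances API, the same filters can be tried with the AWS CLI to preview which instances they match before changing prometheus.yml. The following is a rough sketch that assumes the AWS CLI is installed and has credentials; the helper names `ec2_filter` and `preview_ec2_targets` are made up for illustration.

```shell
#!/bin/sh
# Build one --filters argument in the AWS CLI shorthand syntax.
# Usage: ec2_filter NAME VALUE [VALUE...]
ec2_filter() {
  name="$1"
  shift
  # Join the remaining arguments with commas.
  values=$(IFS=,; printf '%s' "$*")
  printf 'Name=%s,Values=%s\n' "$name" "$values"
}

# Print the private IPs of instances matching the filters used in this post.
# Separate filters are combined with AND; values within one filter with OR.
preview_ec2_targets() {
  aws ec2 describe-instances \
    --region cn-north-1 \
    --filters "$(ec2_filter tag:environment prod)" \
              "$(ec2_filter tag:service web db)" \
    --query 'Reservations[].Instances[].PrivateIpAddress' \
    --output text
}
```

A wildcard value is written the same way, for example: ec2_filter tag:service 'web*'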