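Problem Background
Goal: monitor the performance of Docker containers
Background: GTI (Grafana + Telegraf + InfluxDB) is already deployed
Problem: how can Telegraf collect performance metrics from Docker containers?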
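Solution
Telegraf ships with many input plugins for collecting metrics from different kinds of targets. The configuration file /etc/telegraf/telegraf.conf contains a commented-out [[inputs.docker]] section; uncommenting that section and setting its options appropriately is enough. A fairly complete listing of the section's options follows: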
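Before the full listing, here is a minimal sketch of what the uncommented section might look like. It assumes Telegraf talks to the local Docker daemon over the default Unix socket. It uses only options that appear in the stock listing, set to their stock values; treat it as a starting point rather than a recommended configuration.

```toml
[[inputs.docker]]
  # Assumption: the Docker daemon is local and reachable on the default socket
  endpoint = "unix:///var/run/docker.sock"

  # Empty include and exclude lists mean every container is collected
  container_name_include = []
  container_name_exclude = []

  # Timeout for docker list, info, and stats commands
  timeout = "5s"
```

After editing the file, Telegraf has to be restarted for the change to take effect. The commented section as it appears in the configuration file: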
[[inputs.docker]]
# ## Docker Endpoint
# ## To use TCP, set endpoint = "tcp://[ip]:[port]"
# ## To use environment variables (ie, docker-machine), set endpoint = "ENV"
# endpoint = "unix:///var/run/docker.sock"
#
# ## Set to true to collect Swarm metrics(desired_replicas, running_replicas)
# gather_services = false
#
# ## Only collect metrics for these containers, collect all if empty
# container_names = []
#
# ## Set the source tag for the metrics to the container ID hostname, eg first 12 chars
# source_tag = false
#
# ## Containers to include and exclude. Globs accepted.
# ## Note that an empty array for both will include all containers
# container_name_include = []
# container_name_exclude = []
#
# ## Container states to include and exclude. Globs accepted.
# ## When empty only containers in the "running" state will be captured.
# ## example: container_state_include = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
# ## example: container_state_exclude = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
# # container_state_include = []
# # container_state_exclude = []
#
# ## Timeout for docker list, info, and stats commands
# timeout = "5s"
#
# ## Whether to report for each container per-device blkio (8:0, 8:1...),
# ## network (eth0, eth1, ...) and cpu (cpu0, cpu1, ...) stats or not.
# ## Usage of this setting is discouraged since it will be deprecated in favor of 'perdevice_include'.
# ## Default value is 'true' for backwards compatibility, please set it to 'false' so that 'perdevice_include' setting
# ## is honored.