github.com/prometheus/client_golang/prometheus/promhttp 使用golang编写Prometheus Exporter

最新推荐文章于 2023-04-15 18:32:19 发布

y果子

最新推荐文章于 2023-04-15 18:32:19 发布

阅读量3k

点赞数 2

分类专栏： Go-Package

本文链接：https://blog.csdn.net/ayqy42602/article/details/109066342

版权

Go-Package 专栏收录该内容

23 篇文章 1 订阅

订阅专栏

Exporter是基于Prometheus实施的监控系统中重要的组成部分，承担数据指标的采集工作，官方的exporter列表中已经包含了常见的绝大多数的系统指标监控，比如用于机器性能监控的node_exporter, 用于网络设备监控的snmp_exporter等等。这些已有的exporter对于监控来说，仅仅需要很少的配置工作就能提供完善的数据指标采集。

有时我们需要自己去写一些与业务逻辑比较相关的指标监控，这些指标无法通过常见的exporter获取到。比如我们需要提供对于DNS解析情况的整体监控，了解如何编写exporter对于业务监控很重要，也是完善监控系统需要经历的一个阶段。接下来我们就介绍如何编写exporter。

搭建环境
首先确保机器上安装了go语言(1.7版本以上)，并设置好了对应的GOPATH。接下来我们就可以开始编写代码了。以下是一个简单的exporter

下载对应的prometheus包

go get github.com/prometheus/client_golang/prometheus/promhttp

程序主函数:

package main

import (
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"log"
	"net/http"
)

func main() {
	http.Handle("/metrics",promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080",nil))
}

这个代码中我们仅仅通过http模块指定了一个路径，并将client_golang库中的promhttp.Handler()作为处理函数传递进去后，就可以获取指标信息了,两行代码实现了一个exporter。这里内部其实是使用了一个默认的收集器将通过NewGoCollector采集当前Go运行时的相关信息比如go堆栈使用,goroutine的数据等等。通过访问http://localhost:8080/metrics即可查看详细的指标参数。

上面的代码仅仅展示了一个默认的采集器，并且通过接口调用隐藏了太多实施细节，对于下一步开发并没什么作用，为了实现自定义的监控我们需要先了解一些基本概念。

指标类别
Prometheus中主要使用的四类指标类型，如下所示

Counter (累加指标)
Gauge (测量指标)
Summary (概略图)
Histogram (直方图)

Counter 一个累加指标数据，这个值随着时间只会逐渐的增加，比如程序完成的总任务数量，运行错误发生的总次数。常见的还有交换机中snmp采集的数据流量也属于该类型，代表了持续增加的数据包或者传输字节累加值。

Gauge代表了采集的一个单一数据，这个数据可以增加也可以减少，比如CPU使用情况，内存使用量，硬盘当前的空间容量等等

Histogram和Summary使用的频率较少，两种都是基于采样的方式。另外有一些库对于这两个指标的使用和支持程度不同，有些仅仅实现了部分功能。这两个类型对于某一些业务需求可能比较常见，比如查询单位时间内：总的响应时间低于300ms的占比，或者查询95%用户查询的门限值对应的响应时间是多少。使用Histogram和Summary指标的时候同时会产生多组数据，_count代表了采样的总数，_sum则代表采样值的和。 _bucket则代表了落入此范围的数据。

下面是使用historam来定义的一组指标，计算出了平均五分钟内的查询请求小于0.3s的请求占比总量的比例值。

  sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job)
/
  sum(rate(http_request_duration_seconds_count[5m])) by (job)

如果需要聚合数据，可以使用histogram. 并且如果对于分布范围有明确的值的情况下（比如300ms），也可以使用histogram。但是如果仅仅是一个百分比的值（比如上面的95%），则使用Summary

定义指标
这里我们需要引入另一个依赖库

go get github.com/prometheus/client_golang/prometheus

下面先来定义了两个指标数据，一个是Guage类型，一个是Counter类型。分别代表了CPU温度和磁盘失败次数统计，使用上面的定义进行分类。

    cpuTemp = prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "cpu_temperature_celsius",
        Help: "Current temperature of the CPU.",
    })
    hdFailures = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "hd_errors_total",
            Help: "Number of hard-disk errors.",
        },
        []string{"device"},
    )

这里还可以注册其他的参数，比如上面的磁盘失败次数统计上，我们可以同时传递一个device设备名称进去，这样我们采集的时候就可以获得多个不同的指标。每个指标对应了一个设备的磁盘失败次数统计。

注册指标

func init() {
    // Metrics have to be registered to be exposed:
    prometheus.MustRegister(cpuTemp)
    prometheus.MustRegister(hdFailures)
}

使用prometheus.MustRegister是将数据直接注册到Default Registry，就像上面的运行的例子一样，这个Default Registry不需要额外的任何代码就可以将指标传递出去。注册后既可以在程序层面上去使用该指标了，这里我们使用之前定义的指标提供的API（Set和With().Inc）去改变指标的数据内容。

func main() {
    cpuTemp.Set(65.3)
    hdFailures.With(prometheus.Labels{"device":"/dev/sda"}).Inc()

    // The Handler function provides a default handler to expose metrics
    // via an HTTP server. "/metrics" is the usual endpoint for that.
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}

其中With函数是传递到之前定义的label=”device”上的值，也就是生成指标类似于

cpu_temperature_celsius 65.3
hd_errors_total{"device"="/dev/sda"} 1

完整代码：

package main

//Exporter是基于Prometheus实施的监控系统中重要的组成部分，承担数据指标的采集工作，官方的exporter列表中已经包含了常见的绝大多数的系统指标监控，比如用于机器性能监控的node_exporter, 用于网络设备监控的snmp_exporter等等。这些已有的exporter对于监控来说，仅仅需要很少的配置工作就能提供完善的数据指标采集。
//有时我们需要自己去写一些与业务逻辑比较相关的指标监控，这些指标无法通过常见的exporter获取到。比如我们需要提供对于DNS解析情况的整体监控，了解如何编写exporter对于业务监控很重要，也是完善监控系统需要经历的一个阶段。接下来我们就介绍如何编写exporter。

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"log"
	"net/http"
)

// 定义指标
// 先来定义了两个指标数据，一个是Guage类型， 一个是Counter类型。分别代表了CPU温度和磁盘失败次数统计，使用上面的定义进行分类。
var cpuTemp = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "cpu_temperature_celsius",
	Help: "Current temperature of the CPU",
})

// 这里还可以注册其他的参数，比如上面的磁盘失败次数统计上，我们可以同时传递一个device设备名称进去，这样我们采集的时候就可以获得多个不同的指标。每个指标对应了一个设备的磁盘失败次数统计。
var hdFailures = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "hd_errors_total",
		Help: "Number of hard-disk errors.",
	},
	[]string{"device"},
)

// 注册指标：
func init() {
	prometheus.MustRegister(cpuTemp)
	prometheus.MustRegister(hdFailures)
}

//使用prometheus.MustRegister是将数据直接注册到Default Registry，就像上面的运行的例子一样，这个Default Registry不需要额外的任何代码就可以将指标传递出去。注册后就可以在程序层面上去使用该指标了，这里我们使用之前定义的指标提供的API（Set和With().Inc）去改变指标的数据内容

func main() {
	cpuTemp.Set(65.3)
	hdFailures.With(prometheus.Labels{"device": "/dev/sda"}).Inc()

	// The Handler function provides a default handler to expose metrics
	// via an HTTP server. "/metrics" is the usual endpoint for that.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))

}

在浏览器输入：

http://localhost:8080/metrics

得到输出：
在这里插入图片描述

当然我们写在main函数中的方式是有问题的，这样这个指标仅仅改变了一次，不会随着我们下次采集数据的时候发生任何变化，我们希望的是每次执行采集的时候，程序都去自动的抓取指标并将数据通过http的方式传递给我们。

Counter数据采集实例
下面是一个采集Counter类型数据的实例，这个例子中实现了一个自定义的，满足采集器(Collector)接口的结构体，并手动注册该结构体后，使其每次查询的时候自动执行采集任务。

我们先来看下采集器Collector接口的实现


type Collector interface {
    // 用于传递所有可能的指标的定义描述符
    // 可以在程序运行期间添加新的描述，收集新的指标信息
    // 重复的描述符将被忽略。两个不同的Collector不要设置相同的描述符
    Describe(chan<- *Desc)

    // Prometheus的注册器调用Collect执行实际的抓取参数的工作，
    // 并将收集的数据传递到Channel中返回
    // 收集的指标信息来自于Describe中传递，可以并发的执行抓取工作，但是必须要保证线程的安全。
    Collect(chan<- Metric)
}

了解了接口的实现后，我们就可以写自己的实现了，先定义结构体，这是一个集群的指标采集器，每个集群都有自己的Zone,代表集群的名称。另外两个是保存的采集的指标。

type ClusterManager struct {
    Zone         string
    OOMCountDesc *prometheus.Desc
    RAMUsageDesc *prometheus.Desc
}

我们来实现一个采集工作,放到了ReallyExpensiveAssessmentOfTheSystemState函数中实现，每次执行的时候，返回一个按照主机名作为键采集到的数据，两个返回值分别代表了OOM错误计数，和RAM使用指标信息。

func (c *ClusterManager) ReallyExpensiveAssessmentOfTheSystemState() (
    oomCountByHost map[string]int, ramUsageByHost map[string]float64,
) {
    oomCountByHost = map[string]int{
        "foo.example.org": int(rand.Int31n(1000)),
        "bar.example.org": int(rand.Int31n(1000)),
    }
    ramUsageByHost = map[string]float64{
        "foo.example.org": rand.Float64() * 100,
        "bar.example.org": rand.Float64() * 100,
    }
    return
}

实现Describe接口，传递指标描述符到channel

// Describe simply sends the two Descs in the struct to the channel.
func (c *ClusterManager) Describe(ch chan<- *prometheus.Desc) {
    ch <- c.OOMCountDesc
    ch <- c.RAMUsageDesc
}

Collect函数将执行抓取函数并返回数据，返回的数据传递到channel中，并且传递的同时绑定原先的指标描述符。以及指标的类型（一个Counter和一个Guage）

func (c *ClusterManager) Collect(ch chan<- prometheus.Metric) {
    oomCountByHost, ramUsageByHost := c.ReallyExpensiveAssessmentOfTheSystemState()
    for host, oomCount := range oomCountByHost {
        ch <- prometheus.MustNewConstMetric(
            c.OOMCountDesc,
            prometheus.CounterValue,
            float64(oomCount),
            host,
        )
    }
    for host, ramUsage := range ramUsageByHost {
        ch <- prometheus.MustNewConstMetric(
            c.RAMUsageDesc,
            prometheus.GaugeValue,
            ramUsage,
            host,
        )
    }
}

创建结构体及对应的指标信息,NewDesc参数第一个为指标的名称，第二个为帮助信息，显示在指标的上面作为注释，第三个是定义的label名称数组，第四个是定义的Labels

func NewClusterManager(zone string) *ClusterManager {
    return &ClusterManager{
        Zone: zone,
        OOMCountDesc: prometheus.NewDesc(
            "clustermanager_oom_crashes_total",
            "Number of OOM crashes.",
            []string{"host"},
            prometheus.Labels{"zone": zone},
        ),
        RAMUsageDesc: prometheus.NewDesc(
            "clustermanager_ram_usage_bytes",
            "RAM usage as reported to the cluster manager.",
            []string{"host"},
            prometheus.Labels{"zone": zone},
        ),
    }
}

执行主程序

func main() {
    workerDB := NewClusterManager("db")
    workerCA := NewClusterManager("ca")

    // Since we are dealing with custom Collector implementations, it might
    // be a good idea to try it out with a pedantic registry.
    reg := prometheus.NewPedanticRegistry()
    reg.MustRegister(workerDB)
    reg.MustRegister(workerCA)
}

如果直接执行上面的参数的话，不会获取任何的参数，因为程序将自动推出，我们并未定义http接口去暴露数据出来，因此数据在执行的时候还需要定义一个httphandler来处理http请求。

添加下面的代码到main函数后面，即可实现数据传递到http接口上：

    gatherers := prometheus.Gatherers{
        prometheus.DefaultGatherer,
        reg,
    }

    h := promhttp.HandlerFor(gatherers,
        promhttp.HandlerOpts{
            ErrorLog:      log.NewErrorLogger(),
            ErrorHandling: promhttp.ContinueOnError,
        })
    http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
        h.ServeHTTP(w, r)
    })
    log.Infoln("Start server at :8080")
    if err := http.ListenAndServe(":8080", nil); err != nil {
        log.Errorf("Error occur when start server %v", err)
        os.Exit(1)
    }

其中prometheus.Gatherers用来定义一个采集数据的收集器集合，可以merge多个不同的采集数据到一个结果集合，这里我们传递了缺省的DefaultGatherer，所以他在输出中也会包含go运行时指标信息。同时包含reg是我们之前生成的一个注册对象，用来自定义采集数据。

promhttp.HandlerFor()函数传递之前的Gatherers对象，并返回一个httpHandler对象，这个httpHandler对象可以调用其自身的ServHTTP函数来接手http请求，并返回响应。其中promhttp.HandlerOpts定义了采集过程中如果发生错误时，继续采集其他的数据。

尝试刷新几次浏览器获取最新的指标信息

clustermanager_oom_crashes_total{host="bar.example.org",zone="ca"} 364
clustermanager_oom_crashes_total{host="bar.example.org",zone="db"} 90
clustermanager_oom_crashes_total{host="foo.example.org",zone="ca"} 844
clustermanager_oom_crashes_total{host="foo.example.org",zone="db"} 801
# HELP clustermanager_ram_usage_bytes RAM usage as reported to the cluster manager.
# TYPE clustermanager_ram_usage_bytes gauge
clustermanager_ram_usage_bytes{host="bar.example.org",zone="ca"} 10.738111282075208
clustermanager_ram_usage_bytes{host="bar.example.org",zone="db"} 19.003276633920805
clustermanager_ram_usage_bytes{host="foo.example.org",zone="ca"} 79.72085409108028
clustermanager_ram_usage_bytes{host="foo.example.org",zone="db"} 13.041384617379178

每次刷新的时候，我们都会获得不同的数据，类似于实现了一个数值不断改变的采集器。当然，具体的指标和采集函数还需要按照需求进行修改，满足实际的业务需求。

完整代码：

package main

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/prometheus/common/log"
	"math/rand"
	"net/http"
	"os"
)

//先定义结构体，这是一个集群的指标采集器，每个集群都有自己的Zone,代表集群的名称。另外两个是保存的采集的指标。
type ClusterManager struct {
	Zone         string
	OOMCountDesc *prometheus.Desc
	RAMUsageDesc *prometheus.Desc
}

// 我们来实现一个采集工作,放到了ReallyExpensiveAssessmentOfTheSystemState函数中实现，每次执行的时候，返回一个按照主机名作为键采集到的数据，两个返回值分别代表了OOM错误计数，和RAM使用指标信息。
func (c *ClusterManager) ReallyExpensiveAssessmentOfTheSystemState() (oomCountByHost map[string]int, ramUsageByHost map[string]float64) {
	oomCountByHost = map[string]int{
		"foo.example.org": int(rand.Int31n(1000)),
		"bar.example.org": int(rand.Int31n(1000)),
	}
	ramUsageByHost = map[string]float64{
		"foo.example.org": rand.Float64() * 100,
		"bar.example.org": rand.Float64() * 100,
	}
	return
}

// 实现Describe接口，传递指标描述符到channel
func (c *ClusterManager) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.OOMCountDesc
	ch <- c.RAMUsageDesc
}

// Collect函数将执行抓取函数并返回数据，返回的数据传递到channel中，并且传递的同时绑定原先的指标描述符。以及指标的类型（一个Counter和一个Guage）
func (c *ClusterManager) Collect(ch chan<- prometheus.Metric) {
	oomCountByHost, ramUsageByHost := c.ReallyExpensiveAssessmentOfTheSystemState()
	for host, oomCount := range oomCountByHost {
		ch <- prometheus.MustNewConstMetric(
			c.OOMCountDesc,
			prometheus.CounterValue,
			float64(oomCount),
			host,
		)
	}
	for host, ramUsage := range ramUsageByHost {
		ch <- prometheus.MustNewConstMetric(
			c.RAMUsageDesc,
			prometheus.GaugeValue,
			ramUsage,
			host,
		)
	}
}

//创建结构体及对应的指标信息,NewDesc参数第一个为指标的名称，第二个为帮助信息，显示在指标的上面作为注释，第三个是定义的label名称数组，第四个是定义的Labels
func NewClusterManager(zone string) *ClusterManager {
	return &ClusterManager{
		Zone: zone,
		OOMCountDesc: prometheus.NewDesc(
			"clustermanager_oom_crashes_total",
			"Number of OOM crashes.",
			[]string{"host"},
			prometheus.Labels{"zone": zone},
		),
		RAMUsageDesc: prometheus.NewDesc(
			"clustermanager_ram_usage_bytes",
			"RAM usage as reported to the cluster manager.",
			[]string{"host"},
			prometheus.Labels{"zone": zone},
		),
	}
}

func main() {
	workerDB := NewClusterManager("db")
	workerCA := NewClusterManager("ca")

	// Since we are dealing with custom Collector implementations, it might
	// be a good idea to try it out with a pedantic registry.
	reg := prometheus.NewPedanticRegistry()
	reg.MustRegister(workerDB)
	reg.MustRegister(workerCA)
	//如果直接执行上面的参数的话，不会获取任何的参数，因为程序将自动推出，我们并未定义http接口去暴露数据出来，
	//因此数据在执行的时候还需要定义一个httphandler来处理http请求。

	//添加下面的代码到main函数后面，即可实现数据传递到http接口上：

	//其中prometheus.Gatherers用来定义一个采集数据的收集器集合，可以merge多个不同的采集数据到一个结果集合，
	//这里我们传递了缺省的DefaultGatherer，所以他在输出中也会包含go运行时指标信息。同时包含reg是我们之前生成的一个注册对象，用来自定义采集数据。
	gatherers := prometheus.Gatherers{
		prometheus.DefaultGatherer,
		reg,
	}

	//promhttp.HandlerFor()函数传递之前的Gatherers对象，并返回一个httpHandler对象，这个httpHandler对象可以调用其自身的ServHTTP函数来接手http请求，并返回响应。
	//其中promhttp.HandlerOpts定义了采集过程中如果发生错误时，继续采集其他的数据。
	h := promhttp.HandlerFor(gatherers,
		promhttp.HandlerOpts{
			ErrorLog:            log.NewErrorLogger(),
			ErrorHandling:       promhttp.ContinueOnError,
		})
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		h.ServeHTTP(w,r)
	})
	log.Infoln("Start server at:8080")
	if err := http.ListenAndServe(":8080",nil);err != nil {
		log.Errorf("Error occur when start server %v",err)
		os.Exit(1)
	}
}

在浏览器访问：http://localhost:8080/metrics
在这里插入图片描述

https://blog.csdn.net/u014029783/article/details/80001251

y果子

关注

2
点赞
踩
11

收藏

觉得还不错? 一键收藏
0
评论
github.com/prometheus/client_golang/prometheus/promhttp 使用golang编写Prometheus Exporter

Exporter是基于Prometheus实施的监控系统中重要的组成部分，承担数据指标的采集工作，官方的exporter列表中已经包含了常见的绝大多数的系统指标监控，比如用于机器性能监控的node_exporter, 用于网络设备监控的snmp_exporter等等。这些已有的exporter对于监控来说，仅仅需要很少的配置工作就能提供完善的数据指标采集。有时我们需要自己去写一些与业务逻辑比较相关的指标监控，这些指标无法通过常见的exporter获取到。比如我们需要提供对于DNS解析情况的整体监控，了解如
复制链接

扫一扫