Flink Source Code Analysis: flink-metrics-reporters

Matty_Blog

Category: Flink

Article link: https://blog.csdn.net/a1240466196/article/details/105423946

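This article introduces the metric reporters that Flink already supports, focusing on the submodules of flink-metrics in the Flink source code. It closes with a practical account of building a metrics platform for Flink.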

1. Metric reporters

Flink ships with several built-in metric reporters, such as JMX, SLF4J, Graphite, Prometheus, InfluxDB, StatsD, and Datadog.

Figure 1: MetricReporter class diagram

1.1 flink-metrics-dropwizard

This module only wraps and converts between Flink's own metric type, org.apache.flink.metrics.Metric, and the com.codahale.metrics.Metric interface and its subclasses defined by Dropwizard.
It also implements ScheduledDropwizardReporter:

public static final String ARG_HOST = "host";
public static final String ARG_PORT = "port";
public static final String ARG_PREFIX = "prefix";
public static final String ARG_CONVERSION_RATE = "rateConversion";
public static final String ARG_CONVERSION_DURATION = "durationConversion";

// ------------------------------------------------------------------------
/**
 * The MetricRegistry from the dropwizard package.
 */
protected final MetricRegistry registry;
/**
 * The ScheduledReporter from the dropwizard package.
 */
protected ScheduledReporter reporter;

private final Map<Gauge<?>, String> gauges = new HashMap<>();
private final Map<Counter, String> counters = new HashMap<>();
private final Map<Histogram, String> histograms = new HashMap<>();
private final Map<Meter, String> meters = new HashMap<>();

/**
 * When a metric is added, convert the Flink Metric into a dropwizard Metric
 * and register it in dropwizard's MetricRegistry.
 */
@Override
public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
	final String fullName = group.getMetricIdentifier(metricName, this);

	synchronized (this) {
		if (metric instanceof Counter) {
			counters.put((Counter) metric, fullName);
			registry.register(fullName, new FlinkCounterWrapper((Counter) metric));
		} else if (metric instanceof Gauge) {
			gauges.put((Gauge<?>) metric, fullName);
			registry.register(fullName, FlinkGaugeWrapper.fromGauge((Gauge<?>) metric));
		} else if (metric instanceof Histogram) {
			Histogram histogram = (Histogram) metric;
			histograms.put(histogram, fullName);

			// A histogram that already wraps a dropwizard histogram is unwrapped rather than wrapped again.
			if (histogram instanceof DropwizardHistogramWrapper) {
				registry.register(fullName, ((DropwizardHistogramWrapper) histogram).getDropwizardHistogram());
			} else {
				registry.register(fullName, new FlinkHistogramWrapper(histogram));
			}
		} else if (metric instanceof Meter) {
			Meter meter = (Meter) metric;
			meters.put(meter, fullName);

			// Same idea for meters: reuse the underlying dropwizard meter when there is one.
			if (meter instanceof DropwizardMeterWrapper) {
				registry.register(fullName, ((DropwizardMeterWrapper) meter).getDropwizardMeter());
			} else {
				registry.register(fullName, new FlinkMeterWrapper(meter));
			}
		} else {
			log.warn("Cannot add metric of type {}. This indicates that the reporter " +
				"does not support this metric type.", metric.getClass().getName());
		}
	}
}

/**
 * On report, fetch all metrics directly from dropwizard's internal MetricRegistry
 * and invoke the ScheduledReporter's report method.
 */
@Override
public void report() {
	// we do not need to lock here, because the dropwizard registry is
	// internally a concurrent map
	@SuppressWarnings("rawtypes")
	final SortedMap<String, com.codahale.metrics.Gauge> gauges = registry.getGauges();
	final SortedMap<String, com.codahale.metrics.Counter> counters = registry.getCounters();
	final SortedMap<String, com.codahale.metrics.Histogram> histograms = registry.getHistograms();
	final SortedMap<String, com.codahale.metrics.Meter> meters = registry.getMeters();
	final SortedMap<String, com.codahale.metrics.Timer> timers = registry.getTimers();

	this.reporter.report(gauges, counters, histograms, meters, timers);
}

public abstract ScheduledReporter getReporter(MetricConfig config);

Only the flink-metrics-graphite module depends on this module; it directly reuses the GraphiteReporter provided by the dropwizard package.
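
The wrapping idea described in this section can be illustrated without any Flink or Dropwizard dependency. The following is only a minimal sketch under stated assumptions: SourceCounter, TargetGauge, CounterWrapper, and the metric name used in main are hypothetical stand-ins invented for illustration, not the real Flink or Dropwizard types. It shows a metric from one API being exposed through another API's interface, registered under its full name, and read back at report time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

final class WrapperSketch {
    /** Hypothetical stand-in for the metric type owned by the engine. */
    interface SourceCounter { long getCount(); }

    /** Hypothetical stand-in for the metric type the reporting library understands. */
    interface TargetGauge { Object getValue(); }

    /** Adapter: reads through to the source counter instead of copying its value. */
    static final class CounterWrapper implements TargetGauge {
        private final SourceCounter counter;
        CounterWrapper(SourceCounter counter) { this.counter = counter; }
        @Override public Object getValue() { return counter.getCount(); }
    }

    /** Hypothetical registry: a sorted concurrent map keyed by the full metric name. */
    private final Map<String, TargetGauge> registry = new ConcurrentSkipListMap<>();

    /** Wrap the source metric and register the wrapper under its full name. */
    void notifyOfAddedMetric(SourceCounter counter, String fullName) {
        registry.put(fullName, new CounterWrapper(counter));
    }

    /** Report by reading whatever is currently in the registry, one "name=value" line per metric. */
    String report() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, TargetGauge> e : registry.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue().getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        AtomicLong records = new AtomicLong();
        WrapperSketch reporter = new WrapperSketch();
        reporter.notifyOfAddedMetric(records::get, "host.taskmanager.numRecordsIn");
        records.addAndGet(2); // the wrapper sees later updates because it reads through
        System.out.print(reporter.report());
    }
}
```

Because the wrapper holds a reference to the source metric rather than a snapshot, values updated after registration are still picked up on the next report.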

1.2 flink-metrics-graphite

Reporter implementation
GraphiteReporter extends ScheduledDropwizardReporter from the flink-metrics-dropwizard module,
so it only needs to implement the abstract getReporter() method:

@Override
public ScheduledReporter getReporter(MetricConfig config) {
	String host = config.getString(ARG_HOST, null);
	int port = config.getInteger(ARG_PORT, -1);

	// Host and port are mandatory; fail fast when either is missing or invalid.
	if (host == null || host.length() == 0 || port < 1) {
		throw new IllegalArgumentException("Invalid host/port configuration. Host: " + host + " Port: " + port);
	}

	String prefix = config.getString(ARG_PREFIX, null);
	String conversionRate = config.getString(ARG_CONVERSION_RATE, null);
	String conversionDuration = config.getString(ARG_CONVERSION_DURATION, null);
	String protocol = config.getString(ARG_PROTOCOL, "TCP");

	// Reuse the GraphiteReporter provided by the dropwizard package.
	com.codahale.metrics.graphite.GraphiteReporter.Builder builder =
		com.codahale.metrics.graphite.GraphiteReporter.forRegistry(registry);

	if (prefix != null) {
		builder.prefixedWith(prefix);
	}

	if (conversionRate != null) {
		builder.convertRatesTo(TimeUnit.valueOf(conversionRate));
	}

	if (conversionDuration != null) {
		builder.convertDurationsTo(TimeUnit.valueOf(conversionDuration));
	}

	// An unrecognized protocol value falls back to TCP instead of failing.
	Protocol prot;
	try {
		prot = Protocol.valueOf(protocol);
	} catch (IllegalArgumentException iae) {
		log.warn("Invalid protocol configuration: " + protocol + " Expected: TCP or UDP, defaulting to TCP.");
		prot = Protocol.TCP;
	}

	log.info("Configured GraphiteReporter with {host:{}, port:{}, protocol:{}}", host, port, prot);
	switch(prot) {
		case UDP:
			return builder.build(new GraphiteUDP(host, port));
		case TCP:
		default:
			return builder.build(new Graphite(host, port));
	}
}
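
One detail of getReporter() worth spelling out is that both the protocol and the conversion units are parsed with Enum valueOf, which matches constant names exactly and is case-sensitive. The sketch below isolates that behavior with the standard library only. ConversionSketch and its local Protocol enum are hypothetical names used for illustration; only java.util.concurrent.TimeUnit is the real class used by the method above.

```java
import java.util.concurrent.TimeUnit;

final class ConversionSketch {
    /** Local stand-in for the reporter's protocol enum (illustration only). */
    enum Protocol { TCP, UDP }

    /** Mirrors the fallback shown above: an unknown name yields TCP instead of an error. */
    static Protocol parseProtocol(String raw) {
        try {
            return Protocol.valueOf(raw);
        } catch (IllegalArgumentException e) {
            return Protocol.TCP;
        }
    }

    /** No fallback here: an unknown unit name propagates IllegalArgumentException. */
    static TimeUnit parseUnit(String raw) {
        return TimeUnit.valueOf(raw);
    }

    public static void main(String[] args) {
        System.out.println(parseProtocol("UDP"));   // exact constant name, so UDP is selected
        System.out.println(parseProtocol("udp"));   // lower case does not match, so TCP is used
        System.out.println(parseUnit("SECONDS"));   // exact TimeUnit constant name
    }
}
```

In other words, a value such as udp would silently fall back to TCP, while a rateConversion or durationConversion value that is not an exact TimeUnit constant name (for example seconds instead of SECONDS) would make getReporter() throw.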

Configuration

Copy flink-metrics-graphite-xxx.jar into $FLINK_HOME/lib.
Add the following to flink-conf.yaml:

metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.grph.host: localhost  # Graphite server host
metrics.reporter.grph.port: 2003       # Graphite server port
metrics.reporter.grph.protocol: TCP    # protocol to use (TCP/UDP)
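
The keys above follow the pattern metrics.reporter.<name>.<option>, where grph is simply the name chosen for this reporter instance. As a rough illustration of how such keys can be grouped per reporter, here is a minimal sketch. It is not Flink's actual configuration-parsing code; ReporterConfigSketch and groupByReporter are hypothetical names introduced for this example.

```java
import java.util.Map;
import java.util.TreeMap;

final class ReporterConfigSketch {
    static final String PREFIX = "metrics.reporter.";

    /** Groups "metrics.reporter.<name>.<option>" entries into one option map per reporter name. */
    static Map<String, Map<String, String>> groupByReporter(Map<String, String> conf) {
        Map<String, Map<String, String>> result = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(PREFIX)) {
                continue; // unrelated configuration key
            }
            String rest = key.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            if (dot <= 0 || dot == rest.length() - 1) {
                continue; // no reporter name or no option after it
            }
            String name = rest.substring(0, dot);
            String option = rest.substring(dot + 1);
            result.computeIfAbsent(name, k -> new TreeMap<>()).put(option, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("metrics.reporter.grph.class", "org.apache.flink.metrics.graphite.GraphiteReporter");
        conf.put("metrics.reporter.grph.host", "localhost");
        conf.put("metrics.reporter.grph.port", "2003");
        conf.put("metrics.reporter.grph.protocol", "TCP");
        System.out.println(groupByReporter(conf).get("grph"));
    }
}
```

The option names that come out of the grouping (host, port, protocol) line up with the ARG_HOST, ARG_PORT, and ARG_PROTOCOL lookups in getReporter().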

1.3 flink-metrics-influxdb

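Basic InfluxDB concepts

For usage, see: 时序数据库 Influxdb 使用详解 (a detailed guide to using the InfluxDB time-series database).
To make the InfluxdbReporter implementation easier to follow, here is a brief overview of a few InfluxDB concepts: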

name: census
-------------
time                     butterflies     honeybees     location   scientist
2015-08-18T00:00:00Z      12                23           1         langstroth
2015-08-18T00:00:00Z      1                 30           1         perpetua
2015-08-18T00:06:00Z      11                28           1         langstroth
2015-08-18T00:06:00Z      3                 28           1         perpetua
2015-08-18T05:54:00Z      2                 11           2         langstroth
2015-08-18T06:00:00Z      1                 10           2         langstroth
2015-08-18T06:06:00Z      8                 23           2         perpetua
2015-08-18T06:12:00Z      7                 22           2         perpetua

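timestamp
Since InfluxDB is a time-series database, every row of data has a column named time.
field key, field value, field set
butterflies and honeybees are field keys. Field keys are strings and store metadata.
The values 12 through 7 are the field values of butterflies, and the values 23 through 22 are the field values of honeybees. A field value can be a string, float, integer, or boolean.
The collection of field key and field value pairs is called a field set, as follows: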

butterflies = 12 honeybees = 23
butterflies = 1 honeybees = 30
butterflies = 11 honeybees = 28
butterflies = 3 honeybees = 28
butterflies = 2 honeybees = 11
butterflies = 1 honeybees = 10
butterflies = 8 honeybees = 23
butterflies = 7 honeybees = 22

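In InfluxDB, fields are required but not indexed. A query that filters on a field value has to scan every value that matches the query's other conditions, so a field behaves like an unindexed column in SQL.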

tag key, tag value, tag set
location and scientist are the two tags. location has two tag values, 1 and 2; scientist has two tag values, langstroth and perpetua.
The collection of tag key and tag value pairs is called a tag set, as follows:

location = 1, scientist = langstroth
location = 2, scientist = langstroth
location = 1, scientist = perpetua
location = 2, scientist = perpetua

In InfluxDB, tags are optional, but a tag behaves like an indexed column in SQL, so using tags is strongly recommended.

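measurement
A measurement is the container for fields, tags, and the time column.
retention policy
The data retention policy. The default is autogen, which keeps data forever (it never expires) with a replication factor of 1.
series
A series is the collection of data that shares the same retention policy, measurement, and tag set, as follows: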

| Arbitrary series number | Retention policy | Measurement | Tag set                         |
| ----------------------- | ---------------- | ----------- | ------------------------------- |
| series 1                | autogen          | census      | location=1,scientist=langstroth |
| series 2                | autogen          | census      | location=2,scientist=langstroth |
| series 3                | autogen          | census      | location=1,scientist=perpetua   |
| series 4                | autogen          | census      | location=2,scientist=perpetua   |
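
Since a series is identified by retention policy, measurement, and tag set together, the number of series in the census sample can be checked by collecting the distinct combinations. The following is a small sketch of that counting, using the eight (location, scientist) pairs from the census rows; SeriesSketch and its key format are hypothetical and only serve to illustrate the definition, not how InfluxDB stores series internally.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SeriesSketch {
    /** Builds one key per distinct (retention policy, measurement, tag set) combination. */
    static Set<String> seriesKeys(String retentionPolicy, String measurement, List<String[]> tagRows) {
        Set<String> keys = new LinkedHashSet<>();
        for (String[] row : tagRows) {
            // row[0] is the location tag value, row[1] is the scientist tag value
            keys.add(retentionPolicy + "/" + measurement
                + ",location=" + row[0] + ",scientist=" + row[1]);
        }
        return keys;
    }

    /** The (location, scientist) pairs of the eight census rows, in table order. */
    static List<String[]> censusRows() {
        return Arrays.asList(
            new String[] {"1", "langstroth"}, new String[] {"1", "perpetua"},
            new String[] {"1", "langstroth"}, new String[] {"1", "perpetua"},
            new String[] {"2", "langstroth"}, new String[] {"2", "langstroth"},
            new String[] {"2", "perpetua"}, new String[] {"2", "perpetua"});
    }

    public static void main(String[] args) {
        // Field values and timestamps do not take part in the series identity.
        seriesKeys("autogen", "census", censusRows()).forEach(System.out::println);
    }
}
```

The eight rows collapse into four keys, one per distinct tag set, which matches the four tag sets listed earlier.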

point
A point is the field set at a given timestamp within a single series; a point is the equivalent of a row in SQL. For example:

name: census
-----------------
time                  butterflies    honeybees   location    scientist
2015-08-18T00:00:00Z
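
To tie the concepts together, the sketch below renders one point in the InfluxDB 1.x line protocol, whose general shape is measurement, then the comma-separated tag set, a space, the comma-separated field set, a space, and a timestamp in nanoseconds. This is a simplified illustration under stated assumptions: LineProtocolSketch and toLine are hypothetical names, all fields are treated as integers (hence the i suffix), and escaping of spaces, commas, and equals signs is deliberately left out.

```java
import java.time.Instant;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

final class LineProtocolSketch {
    /** Renders a single point; tags and fields are emitted in sorted key order. */
    static String toLine(String measurement, Map<String, String> tags,
                         Map<String, Long> fields, Instant time) {
        StringBuilder sb = new StringBuilder(measurement);
        for (Map.Entry<String, String> t : new TreeMap<>(tags).entrySet()) {
            sb.append(',').append(t.getKey()).append('=').append(t.getValue());
        }
        StringJoiner fieldSet = new StringJoiner(",");
        for (Map.Entry<String, Long> f : new TreeMap<>(fields).entrySet()) {
            fieldSet.add(f.getKey() + "=" + f.getValue() + "i"); // "i" marks an integer field
        }
        long nanos = time.getEpochSecond() * 1_000_000_000L + time.getNano();
        return sb.append(' ').append(fieldSet).append(' ').append(nanos).toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new TreeMap<>();
        tags.put("location", "1");
        tags.put("scientist", "langstroth");
        Map<String, Long> fields = new TreeMap<>();
        fields.put("butterflies", 12L);
        fields.put("honeybees", 23L);
        // First row of the census sample data.
        System.out.println(toLine("census", tags, fields, Instant.parse("2015-08-18T00:00:00Z")));
    }
}
```

Read against the definitions above: the measurement and tag set before the first space identify the series (together with the retention policy chosen at write time), and the field set plus timestamp make up the point within it.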
