storm metric的使用说明
@(STORM)[storm]
(一)概述
storm metric类似于hadoop的counter,用于收集应用程序中的特定指标,输出到外部。在storm中是存储到各个机器logs目录下的metric.log文件中。有时我们想保存一些计算的中间变量,当达到一定状态时,统一在一个位置输出,或者统计整个应用的一些指标,metric是个很好的选择。
(二)使用storm metric的关键步骤
1、在bolt的prepare中注册metric
transient CountMetric _countMetric;
transient MultiCountMetric _wordCountMetric;
transient ReducedMetric _wordLengthMeanMetric;
void initMetrics(TopologyContext context) {
_countMetric = new CountMetric();
_wordCountMetric = new MultiCountMetric();
_wordLengthMeanMetric = new ReducedMetric(new MeanReducer());
context.registerMetric("execute_count", _countMetric, 5);
context.registerMetric("word_count", _wordCountMetric, 60);
context.registerMetric("word_length", _wordLengthMeanMetric, 60);
}
(1)metric都定义为了transient。因为这些Metric实现都没有实现Serializable,而在storm的spout/bolt中,所有非transient的变量都必须Serializable
(2)三个参数为metric名称,metric对象,以及时间间隔。时间间隔表示多久一次metric将数据发送给metric consumer。
2、在bolt的execute方法中更新metric
void updateMetrics(String word) {
_countMetric.incr();
_wordCountMetric.scope(word).incr();
_wordLengthMeanMetric.update(word.length());
}
3、在topo中指定将metric consumer,这里使用了storm自带的consumer将其输出到日志文件中,也可以自定义consumer。见下面。
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 2);
4、示例程序完整代码
package com.lujinhong.demo.storm.metrics;
import java.util.Map;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.metric.LoggingMetricsConsumer;
import backtype.storm.metric.api.CountMetric;
import backtype.storm.metric.api.MeanReducer;
import backtype.storm.metric.api.MultiCountMetric;
import backtype.storm.metric.api.ReducedMetric;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.testing.TestWordSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;
public class ExclamationTopology {
public static class ExclamationBolt extends BaseRichBolt {
OutputCollector _collector;
// 定义指标统计对象
transient CountMetric _countMetric;
transient MultiCountMetric _wordCountMetric;
transient ReducedMetric _wordLengthMeanMetric;
@Override
public void prepare(Map conf, TopologyContext context,
OutputCollector collector) {
_collector = collector;
initMetrics(context);
}
@Override
public void execute(Tuple tuple) {
_collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
_collector.ack(tuple);
updateMetrics(tuple.getString(0));
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("word"));
}
// 初始化计数器
void initMetrics(TopologyContext context) {
_countMetric = new CountMetric();
_wordCountMetric = new MultiCountMetric();
_wordLengthMeanMetric = new ReducedMetric(new MeanReducer());
context.registerMetric("execute_count", _countMetric, 5);
context.registerMetric("word_count", _wordCountMetric, 60);
context.registerMetric("word_length", _wordLengthMeanMetric, 60);
}
// 更新计数器
void updateMetrics(String word) {
_countMetric.incr();
_wordCountMetric.scope(word).incr();
_wordLengthMeanMetric.update(word.length());
}
}
public static void main(String[] args) throws Exception {
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("word", new TestWordSpout(), 10);
builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping(
"word");
builder.setBolt("exclaim2", new ExclamationBolt(), 2).shuffleGrouping(
"exclaim1");
Config conf = new Config();
conf.setDebug(true);
// 输出统计指标值到日志文件中
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 2);
if (args != null && args.length > 0) {
conf.setNumWorkers(3);
StormSubmitter.submitTopology(args[0], conf,
builder.createTopology());
} else {
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("test", conf, builder.createTopology());
Utils.sleep(10000);
cluster.killTopology("test");
cluster.shutdown();
}
}
}
5、输出示例
2015-11-05 14:31:06,801 38311 1446705066 gdc-storm11-storm.i.nease.net:6727 8:exclaim1 execute_count 165
2015-11-05 14:31:06,835 38345 1446705066 gdc-storm08-storm.i.nease.net:6727 6:exclaim1 execute_count 165
2015-11-05 14:31:11,801 43311 1446705071 gdc-storm11-storm.i.nease.net:6727 8:exclaim1 execute_count 166
2015-11-05 14:31:11,835 43345 1446705071 gdc-storm08-storm.i.nease.net:6727 9:exclaim2 execute_count 250
2015-11-05 14:32:32,391 123901 1446705152 gdc-storm08-storm.i.nease.net:6727 6:exclaim1 word_length 5.852204408817635
2015-11-05 14:33:32,373 183883 1446705212 gdc-storm11-storm.i.nease.net:6727 8:exclaim1 word_length 5.797094188376754
2015-11-05 14:33:32,391 183901 1446705212 gdc-storm08-storm.i.nease.net:6727 6:exclaim1 word_length 5.776606425702811
2015-11-05 14:31:32,374 63884 1446705092 gdc-storm11-storm.i.nease.net:6727 8:exclaim1 word_count {bertels=413, jackson=354, nathan=432, mike=380, golda=412}
2015-11-05 14:31:32,391 63901 1446705092 gdc-storm08-storm.i.nease.net:6727 6:exclaim1 word_count {bertels=374, jackson=396, nathan=411, mike=416, golda=392}
2015-11-05 14:31:32,404 63914 1446705092 gdc-storm08-storm.i.nease.net:6727 9:exclaim2 word_count {bertels!!!=593, golda!!!=615, jackson!!!=571, nathan!!!=646, mike!!!=562}
(三)详细说明
1、metric和metric consumer
storm中关于metric的API主要有2部分:metric和metric consumer。前者用于生成一些统计值,后者将其处理后输出。更详细请见下面。
2、metric
(1)metric在用于在spout/bolt中收集度量值。
(2)metric通过TopologyContext.registerMetric(…)来注册到相应的spout或者bolt中,一般是在prepare/open方法中注册metric,然后在execute方法中更新metric值。
(3)Metric必须实现backtype.storm.metric.api.IMetric接口。
3、自带的Metric实现
storm提供了几个常用的metric实现:
AssignableMetric – set the metric to the explicit value you supply. Useful if it’s an external value or in the case that you are already calculating the summary statistic yourself. Note: Useful for statsd Gauges.
CombinedMetric – generic interface for metrics that can be updated associatively.
CountMetric – a running total of the supplied values. Call incr() to increment by one, incrBy(n) to add/subtract the given number.
Note: Useful for statsd counters.
MultiCountMetric– a hashmap of count metrics. Note: Useful for many Counters where you may not know the name of the metric a priori or where creating many Counter’s manually is burdensome.
MeanReducer – an implementation of ReducedMetric that tracks a running average of values given to its reduce() method. (It accepts Double, Integer or Long values, and maintains the internal average as a Double.) Despite his reputation, the MeanReducer is actually a pretty nice guy in person.
4、Metric Consumer
(1)MetricsConsumer用于收集拓扑中注册的所有Metric,并进行处理、输出等。处理的对象为DataPoint类的对象,同时包括了一些额外信息,如worker主机,worker端口,组件ID,taskID,时间戳,更新周期等。
(2)MetricsConsumer使用backtype.storm.Config.registerMetricsConsumer(…)在创建topo时注册,或者在storm的配置文件中指定topology.metrics.consumer.register。
(3)MetricsConsumer必须实现backtype.storm.metric.api.IMetricsConsumer接口。
5、自定义MetricsConsumer
上面例子中的
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 2);
其中LoggingMetricsConsumer是由系统提供的类,它将内容写入metrics.log文件中,格式可参考上面的输出。
如果对输出有特殊要求,比如要将日志输出到其它地方,或者是输出的内容不同,可以自定义一个consumer。例如:
public class MonitorKafkaLogConsumer implements IMetricsConsumer {
public static final Logger LOG = LoggerFactory
.getLogger(MonitorKafkaLogConsumer.class);
@Override
public void prepare(Map stormConf, Object registrationArgument,
TopologyContext context, IErrorReporter errorReporter) {
}
@Override
public void handleDataPoints(TaskInfo taskInfo,
Collection<DataPoint> dataPoints) {
StringBuilder sb = new StringBuilder();
for (DataPoint p : dataPoints) {
sb.append(p.name).append(":").append(p.value);
LOG.info(sb.toString());
}
}
@Override
public void cleanup() {
}
}
6、storm ui的变化
在storm的UI中会有一个metric的节点,它与所有的节点都有交集。
没有metric的情况:
加上metric的情况:
(四)应用场景
1、调试
将一些调整级别的信息输出到metric中,以协助定位分析问题
2、监控
将一些核心指标定期输出到外部系统,如监控系统,当发生异常时自动报警。