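The Hadoop Metrics2 implementation was probably already quite mature by around 2014, and many people have studied it. Something that has been around for a long time can still be worth learning from. For example: what did Metrics2 improve over the first-generation Metrics? If we were to design our own metrics system, which parts could we borrow?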
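- All Sources and Sinks are configurable (much like a typical Log4j configuration), so metrics monitoring can be customized without modifying source code.
- Metrics2 is more flexible than Metrics1: it supports several kinds of Filter, so users can extract only the subset of metrics they care about.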
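To make the filtering idea concrete, here is a minimal sketch of an include/exclude name filter based on glob patterns. It is not Hadoop's own filter implementation: the class `NameFilter`, its constructor arguments, and the rule "exclude wins, then include must match if present" are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical include/exclude name filter; a simplification, not Hadoop's GlobFilter. */
class NameFilter {
    private final Pattern include; // null means "no include rule"
    private final Pattern exclude; // null means "no exclude rule"

    NameFilter(String includeGlob, String excludeGlob) {
        this.include = includeGlob == null ? null : Pattern.compile(globToRegex(includeGlob));
        this.exclude = excludeGlob == null ? null : Pattern.compile(globToRegex(excludeGlob));
    }

    /** Translates a glob using '*' and '?' into a regex; all other characters are literal. */
    static String globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    /** Assumed rule: an exclude match rejects; otherwise the include pattern (if any) must match. */
    boolean accepts(String name) {
        if (exclude != null && exclude.matcher(name).matches()) {
            return false;
        }
        return include == null || include.matcher(name).matches();
    }

    public static void main(String[] args) {
        NameFilter filter = new NameFilter("Mem*", "*NonHeap*");
        String[] names = {"MemHeapUsedM", "MemNonHeapUsedM", "GcCount"};
        for (String name : names) {
            // Prints: MemHeapUsedM -> true, MemNonHeapUsedM -> false, GcCount -> false
            System.out.println(name + " -> " + filter.accepts(name));
        }
    }
}
```

With a filter like this sitting between a source and a sink, the sink only ever sees the metric names the user asked for, which is the flexibility the second bullet describes.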
Source code structure
- package : org.apache.hadoop.metrics2
The MetricsSystemImpl service
// 1. At system startup, start the configured sources and sinks
private synchronized void configure(String prefix) {
  config = MetricsConfig.create(prefix);
  configureSinks();
  configureSources();
  configureSystem();
}
// 2. Start a timer that periodically samples the metrics and publishes them
startTimer();
synchronized void onTimerEvent() {
  logicalTime += period;
  if (sinks.size() > 0) {
    publishMetrics(sampleMetrics(), false);
  }
}
// 3. Collect the metrics data
/**
 * Sample all the sources for a snapshot of metrics/tags
 * @return the metrics buffer containing the snapshot
 */
synchronized MetricsBuffer sampleMetrics() {
  collector.clear();
  MetricsBufferBuilder bufferBuilder = new MetricsBufferBuilder();
  for (Entry<String, MetricsSourceAdapter> entry : sources.entrySet()) {
    if (sourceFilter == null || sourceFilter.accepts(entry.getKey())) {
      snapshotMetrics(entry.getValue(), bufferBuilder);
    }
  }
  if (publishSelfMetrics) {
    snapshotMetrics(sysSource, bufferBuilder);
  }
  MetricsBuffer buffer = bufferBuilder.get();
  return buffer;
}
// 4. Send the metrics data to the sinks
/**
 * Publish a metrics snapshot to all the sinks
 * @param buffer the metrics snapshot to publish
 * @param immediate indicates that we should publish metrics immediately
 *                  instead of using a separate thread.
 */
synchronized void publishMetrics(MetricsBuffer buffer, boolean immediate) {
  int dropped = 0;
  for (MetricsSinkAdapter sa : sinks.values()) {
    long startTime = Time.now();
    boolean result;
    if (immediate) {
      result = sa.putMetricsImmediate(buffer);
    } else {
      result = sa.putMetrics(buffer, logicalTime);
    }
    dropped += result ? 0 : 1;
    publishStat.add(Time.now() - startTime);
  }
  droppedPubAll.incr(dropped);
}
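The four steps above (configure, tick, sample, publish) can be condensed into a small stand-alone sketch. Everything here is hypothetical and uses only the JDK: `MiniMetricsSystem`, its `Sink` interface, and the use of `Supplier<Long>` as a "source" are stand-ins for illustration, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical miniature of the sample-then-publish cycle; all names are illustrative. */
class MiniMetricsSystem {
    /** A sink returns false when it cannot accept the snapshot, which counts as a drop. */
    interface Sink {
        boolean put(Map<String, Long> snapshot);
    }

    private final Map<String, Supplier<Long>> sources = new LinkedHashMap<>();
    private final List<Sink> sinks = new ArrayList<>();
    private final long period;
    private long logicalTime = 0;
    private int dropped = 0;

    MiniMetricsSystem(long period) {
        this.period = period;
    }

    synchronized void register(String name, Supplier<Long> source) {
        sources.put(name, source);
    }

    synchronized void addSink(Sink sink) {
        sinks.add(sink);
    }

    /** One timer tick: advance logical time, then sample and publish if any sink exists. */
    synchronized void onTimerEvent() {
        logicalTime += period;
        if (!sinks.isEmpty()) {
            publish(sample());
        }
    }

    /** Reads every registered source once and returns the values as a snapshot. */
    synchronized Map<String, Long> sample() {
        Map<String, Long> snapshot = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Long>> e : sources.entrySet()) {
            snapshot.put(e.getKey(), e.getValue().get());
        }
        return snapshot;
    }

    /** Offers the snapshot to every sink and counts the sinks that refused it. */
    synchronized void publish(Map<String, Long> snapshot) {
        for (Sink sink : sinks) {
            if (!sink.put(snapshot)) {
                dropped++;
            }
        }
    }

    synchronized long logicalTime() { return logicalTime; }

    synchronized int dropped() { return dropped; }

    public static void main(String[] args) {
        MiniMetricsSystem system = new MiniMetricsSystem(10);
        system.register("freeMemoryBytes", () -> Runtime.getRuntime().freeMemory());
        system.addSink(snapshot -> { System.out.println(snapshot); return true; });
        system.onTimerEvent(); // in a real system a timer thread would call this
        System.out.println("logicalTime=" + system.logicalTime() + " dropped=" + system.dropped());
    }
}
```

The point of the sketch is the separation of concerns: sources only know how to report a value, sinks only know how to accept a snapshot, and the system in the middle owns the clock and the drop counter.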
MetricsSource
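- MetricsInfo: the descriptive information about a metric itself, such as its name and description. For example:
enum JvmMetricsInfo implements MetricsInfo
- MetricsSource: responsible for collecting the metrics described by MetricsInfo
class JvmMetrics implements MetricsSource
- MetricsSourceAdapter: maps one-to-one to a MetricsSource and adds features on top of it, such as filters
class MetricsSourceAdapter implements DynamicMBean
- MetricsSourceAdapter implements the DynamicMBean interface, so its metrics can be read over JMX. The getAttribute(), getAttributes(), and getMBeanInfo() methods each include a call that refreshes the JMX information.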
// 1. The source adapter (sa) calls the source's getMetrics() method to obtain the metrics
// JvmMetrics is used as the example here
@Override
public void getMetrics(MetricsCollector collector, boolean all) {
  // 2. Create a metrics record builder through the MetricsCollector
  MetricsRecordBuilder rb = collector.addRecord(JvmMetrics)
      .setContext("jvm").tag(ProcessName, processName)
      .tag(SessionId, sessionId);
  // Collect the gauges and counters
  getMemoryUsage(rb);
  getGcUsage(rb);
  getThreadUsage(rb);
  getEventCounters(rb);
}
// Obtain the MXBean objects
final MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
final List<GarbageCollectorMXBean> gcBeans =
    ManagementFactory.getGarbageCollectorMXBeans();
final ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
// 3. MetricsRecordBuilderImpl maintains a list of AbstractMetric; builder.addGauge() adds a metric to it
// The gauge values are read from the MXBean objects
// JVM memory values:
// used      : the amount of memory currently in use
// committed : the amount of memory guaranteed to be available for the Java virtual machine to use
// max       : the maximum amount of memory that can be used for memory management
private void getMemoryUsage(MetricsRecordBuilder rb) {
  MemoryUsage memNonHeap = memoryMXBean.getNonHeapMemoryUsage();
  MemoryUsage memHeap = memoryMXBean.getHeapMemoryUsage();
  Runtime runtime = Runtime.getRuntime();
  rb.addGauge(MemNonHeapUsedM, memNonHeap.getUsed() / M)
    .addGauge(MemNonHeapCommittedM, memNonHeap.getCommitted() / M)
    .addGauge(MemNonHeapMaxM, memNonHeap.getMax() / M)
    .addGauge(MemHeapUsedM, memHeap.getUsed() / M)
    .addGauge(MemHeapCommittedM, memHeap.getCommitted() / M)
    .addGauge(MemHeapMaxM, memHeap.getMax() / M)
    .addGauge(MemMaxM, runtime.maxMemory() / M);
}
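The MXBean calls used above are plain JDK APIs and can be tried without Hadoop. The following is a small stand-alone sketch; the class `HeapGauges` and the method `heapInMb` are hypothetical names, and the explicit handling of an undefined maximum (`getMax()` returning -1) is an addition for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper that reads heap usage the same way the snippet above does. */
class HeapGauges {
    static final long M = 1024 * 1024; // bytes per megabyte

    /** Returns {used, committed, max} in MB; max is -1 when the JVM reports it as undefined. */
    static long[] heapInMb() {
        MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memoryMXBean.getHeapMemoryUsage();
        long max = heap.getMax(); // -1 means the maximum is undefined
        return new long[] {
            heap.getUsed() / M,
            heap.getCommitted() / M,
            max < 0 ? -1 : max / M
        };
    }

    public static void main(String[] args) {
        long[] mb = heapInMb();
        System.out.println("heap used MB      = " + mb[0]);
        System.out.println("heap committed MB = " + mb[1]);
        System.out.println("heap max MB       = " + mb[2]);
    }
}
```

Running it with different `-Xms` and `-Xmx` settings is a quick way to see the difference between committed and max.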
MetricsSink
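- The MetricsSink interface methods are shown below; a concrete implementation can write to a file, Ganglia, and so on.
void putMetrics(MetricsRecord record);
void flush();
- MetricsSinkAdapter maps one-to-one to a MetricsSink. Its method
void consume(MetricsBuffer buffer)
passes every entry.record in the buffer through the configured filters and then hands it to the sink.
- SinkQueue: a simple blocking message queue; the consumer waits for a message to arrive and then sinks it.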
// MetricsSinkAdapter.java and SinkQueue.java
// 1. MetricsSinkAdapter: put the incoming buffer on the queue, which notifies the consumer so the sink can process it
boolean putMetrics(MetricsBuffer buffer, long logicalTime) {
  if (logicalTime % period == 0) {
    LOG.debug("enqueue, logicalTime="+ logicalTime);
    // enqueue() calls notify() after adding the buffer, which wakes the consumer
    if (queue.enqueue(buffer)) return true;
    dropped.incr();
    return false;
  }
  return true; // OK
}
// 2. SinkQueue: waitForData() is woken up and the consumer is invoked
void consume(Consumer<T> consumer) throws InterruptedException {
  T e = waitForData();
  try {
    consumer.consume(e); // can take forever
    _dequeue();
  }
  finally {
    clearConsumerLock();
  }
}
// 3. MetricsSinkAdapter: after a series of filters, call sink.putMetrics so that each sink instance writes out the metrics
@Override
public void consume(MetricsBuffer buffer) {
  long ts = 0;
  for (MetricsBuffer.Entry entry : buffer) {
    if (sourceFilter == null || sourceFilter.accepts(entry.name())) {
      for (MetricsRecordImpl record : entry.records()) {
        if ((context == null || context.equals(record.context())) &&
            (recordFilter == null || recordFilter.accepts(record))) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("Pushing record "+ entry.name() +"."+ record.context() +
                "."+ record.name() +" to "+ name);
          }
          sink.putMetrics(metricFilter == null
              ? record
              : new MetricsRecordFiltered(record, metricFilter));
          if (ts == 0) ts = record.timestamp();
        }
      }
    }
  }
  if (ts > 0) {
    sink.flush();
    latency.add(Time.now() - ts);
  }
  if (buffer instanceof WaitableMetricsBuffer) {
    ((WaitableMetricsBuffer)buffer).notifyAnyWaiters();
  }
  LOG.debug("Done");
}
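The producer/consumer hand-off in steps 1 and 2 can be sketched with JDK classes only. This is not how SinkQueue is written: here `java.util.concurrent.ArrayBlockingQueue` is swapped in for the hand-written queue, and `DroppingSinkQueue` with its `put`, `take`, and `dropped` methods are hypothetical names. What it keeps from the snippets above is the behaviour: the producer never blocks, a full queue counts as a drop, and a sink only accepts data when the logical time is a multiple of its period.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical bounded queue between a publisher and one sink thread. */
class DroppingSinkQueue<T> {
    private final BlockingQueue<T> queue;
    private final long period;
    private int dropped = 0;

    DroppingSinkQueue(int capacity, long period) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.period = period;
    }

    /**
     * Producer side: never blocks. Returns false only when the item was due
     * for this sink but the queue was full, so the item had to be dropped.
     */
    synchronized boolean put(T item, long logicalTime) {
        if (logicalTime % period != 0) {
            return true; // not this sink's turn, nothing is lost
        }
        if (queue.offer(item)) {
            return true;
        }
        dropped++;
        return false;
    }

    /** Consumer side: blocks until an item is available. Not synchronized, so producers are never held up. */
    T take() throws InterruptedException {
        return queue.take();
    }

    synchronized int dropped() {
        return dropped;
    }

    public static void main(String[] args) throws InterruptedException {
        DroppingSinkQueue<String> q = new DroppingSinkQueue<>(1, 10);
        Thread sinkThread = new Thread(() -> {
            try {
                System.out.println("sink received: " + q.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sinkThread.start();
        q.put("snapshot@10", 10);
        sinkThread.join();
        System.out.println("dropped=" + q.dropped());
    }
}
```

Dropping instead of blocking is the important design choice: a slow sink costs some data points but cannot stall the thread that samples the sources.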