What is Hadoop Metrics2?

Reposted on 2015-11-19 21:23:01

source:http://blog.cloudera.com/blog/2012/10/what-is-hadoop-metrics2/


Metrics are collections of information about Hadoop daemons, events and measurements; for example, data nodes collect metrics such as the number of blocks replicated, number of read requests from clients, and so on. For that reason, metrics are an invaluable resource for monitoring Apache Hadoop services and an indispensable tool for debugging system problems. 

This blog post focuses on the features and use of the Metrics2 system for Hadoop, which allows multiple metrics output plugins to be used in parallel, supports dynamic reconfiguration of metrics plugins, provides metrics filtering, and allows all metrics to be exported via JMX.

Metrics vs. MapReduce Counters

When speaking about metrics, a question about their relationship to MapReduce counters usually arises. The differences can be described in two ways. First, Hadoop daemons and services are generally the scope for metrics, whereas MapReduce applications are the scope for MapReduce counters (which are collected for MapReduce tasks and aggregated for the whole job). Second, whereas Hadoop administrators are the main audience for metrics, MapReduce users are the audience for MapReduce counters.

Contexts and Prefixes

For organizational purposes, metrics are grouped into named contexts, e.g., jvm for Java Virtual Machine metrics or dfs for distributed file system metrics. Hadoop-1 and Hadoop-2 support different sets of contexts; the ones supported by each branch are listed below.

Branch-1

– jvm
– rpc
– rpcdetailed
– metricssystem
– mapred
– dfs
– ugi

Branch-2

– yarn
– jvm
– rpc
– rpcdetailed
– metricssystem
– mapred
– dfs
– ugi

A Hadoop daemon collects metrics in several contexts. For example, data nodes collect metrics for the “dfs”, “rpc” and “jvm” contexts. The daemons that collect different metrics in Hadoop (for Hadoop-1 and Hadoop-2) are listed below:

Branch-1 Daemons/Prefixes

– namenode
– datanode
– jobtracker
– tasktracker
– maptask
– reducetask

Branch-2 Daemons/Prefixes

– namenode
– secondarynamenode
– datanode
– resourcemanager
– nodemanager
– mrappmaster
– maptask
– reducetask

System Design

The Metrics2 framework is designed to collect and dispatch per-process metrics to monitor the overall status of the Hadoop system. Producers register the metrics sources with the metrics system, while consumers register the sinks. The framework marshals metrics from sources to sinks based on (per source/sink) configuration options. This design is depicted below.

 

Here is an example class implementing the MetricsSource:
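A minimal sketch of what such a source could look like. So that it compiles without Hadoop on the classpath, it declares simplified stand-in types (GaugeCollector, SketchSource) rather than the real interfaces; in Hadoop itself a source implements org.apache.hadoop.metrics2.MetricsSource and its getMetrics(MetricsCollector, boolean) method. The context name "MyContext" and the class ConnectionMetrics are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for a metrics collector; NOT the Hadoop interface.
interface GaugeCollector {
    void addGauge(String context, String name, String description, int value);
}

// Simplified stand-in for a metrics source; NOT the Hadoop interface.
interface SketchSource {
    void getMetrics(GaugeCollector collector);
}

// Hypothetical source exposing one gauge, "MyMetric": the number of open connections.
class ConnectionMetrics implements SketchSource {
    private final AtomicInteger openConnections = new AtomicInteger();

    void connectionOpened() { openConnections.incrementAndGet(); }
    void connectionClosed() { openConnections.decrementAndGet(); }

    @Override
    public void getMetrics(GaugeCollector collector) {
        // A framework would call this once per sampling period to take a snapshot.
        collector.addGauge("MyContext", "MyMetric",
                "Number of open connections", openConnections.get());
    }
}

class SourceSketchDemo {
    // Takes one snapshot of the source and returns it as "context.name" -> value.
    static Map<String, Integer> snapshot(SketchSource source) {
        Map<String, Integer> values = new LinkedHashMap<>();
        source.getMetrics((context, name, description, value) ->
                values.put(context + "." + name, value));
        return values;
    }

    public static void main(String[] args) {
        ConnectionMetrics metrics = new ConnectionMetrics();
        metrics.connectionOpened();
        metrics.connectionOpened();
        metrics.connectionClosed();
        System.out.println(snapshot(metrics)); // prints {MyContext.MyMetric=1}
    }
}
```

The source only reports its current state when asked; it does not decide where the values go or how often they are sampled.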

The “MyMetric” in the listing above could be, for example, the number of open connections for a specific server.

Here is an example class implementing the MetricsSink:
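A sink could be sketched along the following lines. The types SketchRecord and SketchSink are simplified stand-ins, not the Hadoop classes; the method names init, putMetrics and flush mirror those of org.apache.hadoop.metrics2.MetricsSink, but the parameter types here are assumptions chosen to keep the example free of external dependencies. The "separator" option is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for one metrics record; NOT the Hadoop MetricsRecord.
final class SketchRecord {
    final long timestamp;
    final String context;
    final String name;
    final Map<String, Number> metrics;

    SketchRecord(long timestamp, String context, String name, Map<String, Number> metrics) {
        this.timestamp = timestamp;
        this.context = context;
        this.name = name;
        this.metrics = Collections.unmodifiableMap(new LinkedHashMap<>(metrics));
    }
}

// Simplified stand-in for a sink; NOT the Hadoop interface.
interface SketchSink {
    void init(Map<String, String> options);
    void putMetrics(SketchRecord record);
    void flush();
}

// Hypothetical sink that formats each record as one line and buffers it until flush().
class LineBufferSink implements SketchSink {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>();
    private String separator = ", ";

    @Override
    public void init(Map<String, String> options) {
        // "separator" is a made-up option showing how per-instance settings arrive.
        separator = options.getOrDefault("separator", ", ");
    }

    @Override
    public void putMetrics(SketchRecord record) {
        StringBuilder line = new StringBuilder();
        line.append(record.timestamp).append(' ')
            .append(record.context).append('.').append(record.name).append(':');
        String sep = " ";
        for (Map.Entry<String, Number> e : record.metrics.entrySet()) {
            line.append(sep).append(e.getKey()).append('=').append(e.getValue());
            sep = separator;
        }
        buffer.add(line.toString());
    }

    @Override
    public void flush() {
        // Move buffered lines to the "written" list; a real sink would write to a file or socket.
        flushed.addAll(buffer);
        buffer.clear();
    }

    List<String> flushedLines() { return flushed; }
}
```

Separating putMetrics from flush lets a sink batch several records before doing any expensive I/O.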

To use the Metrics2 framework, the system needs to be initialized and sources and sinks registered. Here is an example initialization:
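In Hadoop, initialization goes through DefaultMetricsSystem.initialize(prefix), and sources and sinks are registered on DefaultMetricsSystem.instance() via register(name, description, object). The sketch below does not call those classes; it models the same flow with a toy MiniMetricsSystem and stand-in interfaces (all hypothetical names) so that it runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-ins for illustration only; these are NOT the Hadoop metrics2 types.
interface SnapshotSource {
    Map<String, Number> snapshot();
}

interface SnapshotSink {
    void putMetrics(String sourceName, Map<String, Number> metrics);
}

// Toy metrics system: sources and sinks register by name, and publishNow()
// marshals one snapshot from every source to every sink.
class MiniMetricsSystem {
    private final String prefix;
    private final Map<String, SnapshotSource> sources = new LinkedHashMap<>();
    private final Map<String, SnapshotSink> sinks = new LinkedHashMap<>();

    MiniMetricsSystem(String prefix) { this.prefix = prefix; }

    String prefix() { return prefix; }

    void registerSource(String name, SnapshotSource source) {
        if (sources.putIfAbsent(name, source) != null) {
            throw new IllegalArgumentException("Source already registered: " + name);
        }
    }

    void registerSink(String name, SnapshotSink sink) {
        if (sinks.putIfAbsent(name, sink) != null) {
            throw new IllegalArgumentException("Sink already registered: " + name);
        }
    }

    void publishNow() {
        for (Map.Entry<String, SnapshotSource> entry : sources.entrySet()) {
            Map<String, Number> snapshot = entry.getValue().snapshot();
            for (SnapshotSink sink : sinks.values()) {
                sink.putMetrics(entry.getKey(), snapshot);
            }
        }
    }
}

class MiniMetricsDemo {
    public static void main(String[] args) {
        // "test" plays the role of the prefix passed at initialization.
        MiniMetricsSystem system = new MiniMetricsSystem("test");
        system.registerSource("MyMetrics", () -> {
            Map<String, Number> m = new LinkedHashMap<>();
            m.put("MyMetric", 3);
            return m;
        });
        List<String> lines = new ArrayList<>();
        system.registerSink("MySink",
                (src, m) -> lines.add(system.prefix() + "." + src + " " + m));
        system.publishNow();
        System.out.println(lines); // prints [test.MyMetrics {MyMetric=3}]
    }
}
```

The real framework samples on a timer and applies per-source and per-sink configuration; the toy version publishes only on demand to keep the data flow visible.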

Configuration and Filtering

The Metrics2 framework uses the PropertiesConfiguration class from the Apache Commons Configuration library.

Sinks are specified in a configuration file (e.g., “hadoop-metrics2-test.properties”), as:
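A line of the following shape would fit here, with test as the prefix and mysink0 as the sink instance name. The class name com.example.metrics.MySink is a hypothetical placeholder for a custom sink implementation.

```properties
# com.example.metrics.MySink is a hypothetical class name used for illustration
test.sink.mysink0.class=com.example.metrics.MySink
```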

The configuration syntax is:
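For reference, the stock hadoop-metrics2.properties file shipped with Hadoop summarizes the key syntax in a comment of roughly this form:

```properties
# syntax: [prefix].[source|sink].[instance].[options]
```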

In the previous example, test is the prefix and mysink0 is an instance name. DefaultMetricsSystem first tries to load hadoop-metrics2-[prefix].properties and, if that file is not found, falls back to the default hadoop-metrics2.properties on the class path. Note that [instance] is an arbitrary name that uniquely identifies a particular sink instance. The asterisk (*) can be used to specify default options.

Here is an example with inline comments to identify the different configuration sections:
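The fragment below is a sketch modeled on the commented-out examples in the stock hadoop-metrics2.properties file. FileSink is Hadoop's built-in file sink; the instance names (file, file_jvm) and output file names are arbitrary choices and should be treated as assumptions.

```properties
# Defaults for every prefix: sinks named "file" write records to a file
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# Default sampling period, in seconds
*.period=10

# NameNode: send metrics from all contexts to one file
namenode.sink.file.filename=namenode-metrics.out
# Override the sampling period for the NameNode's sinks only
namenode.sink.*.period=8

# NodeManager: a second FileSink instance restricted to the jvm context
nodemanager.sink.file_jvm.class=org.apache.hadoop.metrics2.sink.FileSink
nodemanager.sink.file_jvm.context=jvm
nodemanager.sink.file_jvm.filename=nodemanager-jvm-metrics.out
```

Options set under the asterisk apply to every daemon, and a more specific key such as namenode.sink.*.period overrides them for that prefix.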

Here is an example set of NodeManager metrics that are dumped into the NodeManager sink file:

Each line starts with a timestamp, followed by the context and record name and then the value of each metric.

Filtering

By default, filtering can be done by source, context, record and metrics. More discussion of different filtering strategies can be found in the Javadoc and wiki.

Example:
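A sketch of filter settings at each level, using the jobtracker prefix from Branch-1. GlobFilter and RegexFilter are the filter classes that ship with Hadoop; the patterns (*Details, foo*, cpu.*), the sink instance names (file, sink0) and the output file name are illustrative assumptions.

```properties
# Use glob patterns for source, record and metric filters unless overridden
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
*.record.filter.class=${*.source.filter.class}
*.metric.filter.class=${*.source.filter.class}

# Source level: drop any source whose name ends with "Details"
jobtracker.*.source.filter.exclude=*Details

# Record level: in the source named "rpc", drop records matching foo*
jobtracker.source.rpc.record.filter.exclude=foo*

# Metric level: for the sink instance "file" only, drop metrics matching foo*
jobtracker.sink.file.metric.filter.exclude=foo*
jobtracker.sink.file.filename=jt-metrics.out

# Context level: the sink instance "sink0" only receives the jvm context
jobtracker.sink.sink0.context=jvm
# Switch this sink to regular expressions and keep only metrics matching cpu.*
jobtracker.sink.sink0.metric.filter.class=org.apache.hadoop.metrics2.filter.RegexFilter
jobtracker.sink.sink0.metric.filter.include=cpu.*
```

Filtering at the source or record level avoids collecting data that no sink wants, while sink-level filters let different outputs see different subsets of the same metrics.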

Conclusion

The Metrics2 system for Hadoop provides a gold mine of real-time and historical data that help monitor and debug problems associated with the Hadoop services and jobs. 

