Concepts
Ambari Metrics is the Ambari component responsible for monitoring cluster state. Its main concepts are:
| Terminology | Description |
|---|---|
| Ambari Metrics System (“AMS”) | The built-in metrics collection system for Ambari. |
| Metrics Collector | The standalone server that collects metrics, aggregates metrics, serves metrics from the Hadoop service sinks and the Metrics Monitor. |
| Metrics Hadoop Sinks | Plug into the various Hadoop components to send Hadoop metrics to the Metrics Collector. |
| Metrics Monitor | Installed on each host in the cluster to collect system-level metrics and forward to the Metrics Collector. |
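In short, Ambari gathers two kinds of information and stores them on the Collector:
1. System-level metrics from each node
2. Metrics from each Hadoop component
The former are collected by the Metrics Monitor (that is, an agent) installed on every node; the latter are collected by Sinks that target specific Hadoop components (conceptually the same as a Flume Sink).
One more note: the Collector stores its metrics data in HBase.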
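The two collection paths described above can be pictured with a small toy model. This is only a conceptual sketch, not how AMS is implemented: the class names, method names, and metric names (`cpu_user`, `rpc_queue_time`) are all hypothetical, and the real Collector persists data in HBase rather than in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    host: str
    source: str   # "system" for a Metrics Monitor, else a Hadoop component name
    value: float


class Collector:
    """Toy stand-in for the Metrics Collector: stores and aggregates metrics."""

    def __init__(self):
        self._store = defaultdict(list)

    def receive(self, metric: Metric) -> None:
        self._store[metric.name].append(metric)

    def cluster_average(self, name: str) -> float:
        # Aggregate one metric across every host that reported it.
        if name not in self._store:
            raise KeyError(f"no data for metric {name!r}")
        values = [m.value for m in self._store[name]]
        return sum(values) / len(values)


class MetricsMonitor:
    """Per-host agent that forwards system-level metrics."""

    def __init__(self, host: str, collector: Collector):
        self.host = host
        self.collector = collector

    def report(self, readings: dict) -> None:
        for name, value in readings.items():
            self.collector.receive(Metric(name, self.host, "system", value))


class HadoopSink:
    """Sink attached to one Hadoop component on one host."""

    def __init__(self, component: str, host: str, collector: Collector):
        self.component = component
        self.host = host
        self.collector = collector

    def emit(self, name: str, value: float) -> None:
        self.collector.receive(Metric(name, self.host, self.component, value))


collector = Collector()
MetricsMonitor("node1", collector).report({"cpu_user": 20.0})
MetricsMonitor("node2", collector).report({"cpu_user": 40.0})
HadoopSink("namenode", "node1", collector).emit("rpc_queue_time", 3.0)
print(collector.cluster_average("cpu_user"))  # 30.0
```

Both paths end at the same `receive` call, which mirrors the point that the Collector is the single place where system-level and component-level metrics meet.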
Architecture

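Configuration
Configuring Ambari Metrics for distributed mode
In a default installation Ambari Metrics runs in embedded mode, so all collected data is stored on the local disk of the Collector node, and a large volume of metrics data can take up a lot of local storage. After switching to distributed mode, metrics data is stored on HDFS instead, so this is usually an essential step after installing Ambari. For the specific steps, see: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_ambari_reference_guide/content/_configuring_ambari_metrics_for_distributed_mode.html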
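As a rough illustration of what the switch involves, the sketch below lists the settings that, to the best of my recollection, the linked guide changes, and renders them in Hadoop's `<configuration>` XML layout. Treat the property names as assumptions to verify against the guide; the NameNode address and HDFS path are hypothetical placeholders, and `to_hadoop_xml` is a helper written for this example, not part of Ambari. In practice these values are normally edited through the Ambari web UI.

```python
import xml.etree.ElementTree as ET

# Assumed property names; verify against the linked guide before use.
AMS_SITE = {
    "timeline.metrics.service.operation.mode": "distributed",
}
AMS_HBASE_SITE = {
    "hbase.cluster.distributed": "true",
    # Hypothetical NameNode host, port, and path.
    "hbase.rootdir": "hdfs://namenode.example.com:8020/apps/ams/metrics",
}


def to_hadoop_xml(props: dict) -> str:
    """Render a dict of properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


print(to_hadoop_xml(AMS_SITE))
print(to_hadoop_xml(AMS_HBASE_SITE))
```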
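Configuring the metrics data lifecycle
Large volumes of metrics take up a great deal of storage space, so setting a retention time (TTL) for metrics data is essential. The parameters that control retention are in ams-site.xml; the relevant settings are:
| Property | Default (seconds) | Description |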
|---|---|---|
| timeline.metrics.host.aggregator.ttl | 86400 | 1 minute resolution data purge interval. Default is 1 day. |
| timeline.metrics.host.aggregator.minute.ttl | 604800 | Host based X minutes resolution data purge interval. Default is 7 days. (X = configurable interval, default interval is 2 minutes) |
| timeline.metrics.host.aggregator.hourly.ttl | 2592000 | Host based hourly resolution data purge interval. Default is 30 days. |
| timeline.metrics.host.aggregator.daily.ttl | 31536000 | Host based daily resolution data purge interval. Default is 1 year. |
| timeline.metrics.cluster.aggregator.minute.ttl | 2592000 | Cluster wide minute resolution data purge interval. Default is 30 days. |
| timeline.metrics.cluster.aggregator.hourly.ttl | 31536000 | Cluster wide hourly resolution data purge interval. Default is 1 year. |
| timeline.metrics.cluster.aggregator.daily.ttl | 63072000 | Cluster wide daily resolution data purge interval. Default is 2 years. |
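The defaults in the table are plain second counts, so converting between days and TTL values is simple arithmetic. The following sketch checks the table's defaults and shows how a custom retention period could be turned into a TTL value; the helper names `ttl_days` and `days_to_ttl` are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Default TTLs (in seconds) copied from the table above.
DEFAULT_TTLS = {
    "timeline.metrics.host.aggregator.ttl": 86400,
    "timeline.metrics.host.aggregator.minute.ttl": 604800,
    "timeline.metrics.host.aggregator.hourly.ttl": 2592000,
    "timeline.metrics.host.aggregator.daily.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.minute.ttl": 2592000,
    "timeline.metrics.cluster.aggregator.hourly.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.daily.ttl": 63072000,
}


def ttl_days(seconds: int) -> float:
    """Convert a TTL in seconds to days."""
    return seconds / SECONDS_PER_DAY


def days_to_ttl(days: int) -> int:
    """Convert a retention period in days to a TTL in seconds."""
    return days * SECONDS_PER_DAY


for name, seconds in DEFAULT_TTLS.items():
    print(f"{name}: {seconds} s = {ttl_days(seconds):g} days")

# Example: keep host-level hourly data for 14 days instead of 30.
print(days_to_ttl(14))  # 1209600
```

Here "1 year" corresponds to 365 days and "2 years" to 730 days, which matches the 31536000 and 63072000 defaults.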