1 GANGLIA Overview
Ganglia is a scalable, distributed system for monitoring server health and performance metrics. Based on the stored data it provides excellent reporting and data-visualization features: its graphs make it easy to see the working state of every node, which is a great help in tuning and allocating system resources and improving overall performance. It is accessed through a browser. Note that Ganglia itself focuses on metric collection and visualization: it does not monitor hardware-level indicators, and alerting (e.g. mail notifications) is usually handled by pairing it with a tool such as Nagios.
Official site: http://ganglia.info/
1.1 GANGLIA Components
Ganglia consists of three components: gmond, gmetad, and gweb.
gmond (Ganglia Monitoring Daemon) is a lightweight service installed on every node whose metrics you want to collect. With gmond you can easily gather many system metrics, such as CPU, memory, disk, network, and active-process data.
gmetad (Ganglia Meta Daemon) is the service that aggregates all of this information and stores it on disk in RRD format.
gweb (Ganglia Web) is Ganglia's visualization tool: a PHP front end that displays the data stored by gmetad in the browser. The web interface presents the various metrics collected from the running cluster as graphs.
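gmond serves the metrics it holds as an XML tree on its TCP accept channel (port 8649 by default), which is what gmetad polls. The sketch below parses such a dump; the sample values are made up, but the GANGLIA_XML/CLUSTER/HOST/METRIC structure matches what gmond emits:

```python
import xml.etree.ElementTree as ET

# Trimmed sample of the XML that gmond serves on its TCP accept channel
# (port 8649 by default); `nc hadoop102 8649` would dump the real thing.
# Metric values here are invented for illustration.
SAMPLE = """
<GANGLIA_XML VERSION="3.7.1" SOURCE="gmond">
  <CLUSTER NAME="hadoop102" OWNER="unspecified" LATLONG="unspecified" URL="unspecified">
    <HOST NAME="hadoop102" IP="192.168.139.130">
      <METRIC NAME="load_one" VAL="0.12" TYPE="float" UNITS=""/>
      <METRIC NAME="mem_free" VAL="812345" TYPE="float" UNITS="KB"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""

def host_metrics(xml_text):
    """Return {host: {metric_name: value}} from a gmond XML dump."""
    metrics = {}
    for host in ET.fromstring(xml_text).iter("HOST"):
        metrics[host.get("NAME")] = {
            m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")
        }
    return metrics

print(host_metrics(SAMPLE)["hadoop102"]["load_one"])  # -> 0.12
```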
2 Installing GANGLIA on the Server Node
2.1 Component Versions
Component | Version
---|---
CentOS | 6.8
Java | 1.8.0_121
httpd | 2.2.15
php | 5.6.40
Ganglia | 3.7.1
2.2 Environment Preparation
2.2.1 CentOS 6.8
The CentOS 6.8 installation procedure is omitted here. Create the user/group zhouchen in advance, and pre-install JDK 1.8.0_121.
2.2.2 Disable the Firewall (as root)
[zhouchen@hadoop102 software]$ sudo service iptables stop
[zhouchen@hadoop102 software]$ sudo chkconfig iptables off
2.3 Deploying GANGLIA
2.3.1 Installation
1. Install the httpd service and PHP
[zhouchen@hadoop102 flume]$ sudo yum -y install httpd php
2. Install other dependencies
[zhouchen@hadoop102 flume]$ sudo yum -y install rrdtool perl-rrdtool rrdtool-devel
[zhouchen@hadoop102 flume]$ sudo yum -y install apr-devel
3. Install Ganglia
[zhouchen@hadoop102 flume]$ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[zhouchen@hadoop102 flume]$ sudo yum -y install ganglia-gmetad
[zhouchen@hadoop102 flume]$ sudo yum -y install ganglia-web
[zhouchen@hadoop102 flume]$ sudo yum -y install ganglia-gmond
2.3.2 Configuring Ganglia
- Edit the httpd configuration for the Ganglia web front end
[zhouchen@hadoop102 flume]$ sudo vim /etc/httpd/conf.d/ganglia.conf
Change the access-control settings as follows:
# Ganglia monitoring system php web frontend
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
Order deny,allow
#Deny from all
Allow from all
# Allow from 127.0.0.1
# Allow from ::1
# Allow from .example.com
</Location>
- Edit the configuration file /etc/ganglia/gmetad.conf
[zhouchen@hadoop102 flume]$ sudo vim /etc/ganglia/gmetad.conf
Change it to:
data_source "hadoop102" 192.168.139.130
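gmetad polls every host listed on a data_source line; the general format is data_source "cluster-name" [polling interval] host[:port] host[:port] .... The sketch below parses such lines (a hypothetical helper for illustration, not part of Ganglia):

```python
import shlex

def parse_data_sources(conf_text):
    """Parse gmetad.conf data_source lines into {cluster: [hosts]}.

    Format: data_source "name" [polling interval in seconds] host1 host2 ...
    """
    sources = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("data_source"):
            continue
        parts = shlex.split(line)  # honors the quoted cluster name
        name, rest = parts[1], parts[2:]
        if rest and rest[0].isdigit():  # optional polling interval
            rest = rest[1:]
        sources[name] = rest
    return sources

conf = 'data_source "hadoop102" 192.168.139.130\n'
print(parse_data_sources(conf))  # -> {'hadoop102': ['192.168.139.130']}
```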
- Edit the configuration file /etc/ganglia/gmond.conf
[zhouchen@hadoop102 flume]$ sudo vim /etc/ganglia/gmond.conf
Change it to:
cluster {
name = "hadoop102"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
# mcast_join = 239.2.11.71
host = 192.168.139.130
port = 8649
ttl = 1
}
udp_recv_channel {
# mcast_join = 239.2.11.71
port = 8649
bind = 192.168.139.130
retry_bind = true
# Size of the UDP buffer. If you are handling lots of metrics you really
# should bump it up to e.g. 10MB or even higher.
# buffer = 10485760
}
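The two channels above replace Ganglia's default multicast setup with unicast: each gmond's udp_send_channel targets 192.168.139.130:8649, where the server's udp_recv_channel is bound. Below is a minimal loopback sketch of that send/receive path, using plain Python sockets and a made-up payload (real gmond packets are XDR-encoded):

```python
import socket

# Minimal illustration of the unicast UDP path configured above:
# udp_recv_channel binds a port, udp_send_channel targets host:port.
# (Real gmond traffic is XDR-encoded; a plain string stands in here.)
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))          # 0 = let the OS pick a free port
recv.settimeout(5)
host, port = recv.getsockname()

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(b"load_one=0.12", (host, port))

data, addr = recv.recvfrom(4096)
print(data.decode())                  # -> load_one=0.12
send.close()
recv.close()
```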
2.4 Starting GANGLIA
2.4.1 Start httpd
1. Start the service
[zhouchen@hadoop102 software]$ sudo service httpd start
2. Enable start at boot
[zhouchen@hadoop102 software]$ sudo chkconfig httpd on
2.4.2 Start gmetad
1. Start the service
[zhouchen@hadoop102 software]$ sudo service gmetad start
2. Enable start at boot
[zhouchen@hadoop102 software]$ sudo chkconfig --add gmetad
[zhouchen@hadoop102 software]$ sudo chkconfig gmetad on
2.4.3 Start gmond
1. Start the service
[zhouchen@hadoop102 software]$ sudo service gmond start
2. Enable start at boot
[zhouchen@hadoop102 software]$ sudo chkconfig --add gmond
[zhouchen@hadoop102 software]$ sudo chkconfig gmond on
2.4.4 Open the Ganglia Home Page
http://hadoop102/ganglia
3 Basic GANGLIA Operations
3.1 Monitoring FLUME
1. Edit the flume-env.sh configuration in /opt/module/flume/conf
export JAVA_OPTS="-Dflume.monitoring.type=ganglia
-Dflume.monitoring.hosts=hadoop102:8649
-Xms100m
-Xmx200m"
2. Create a Flume transfer job
Create the Flume agent configuration file flume-netcat-logger.conf in the conf directory.
[zhouchen@hadoop102 conf]$ vim flume-netcat-logger.conf
Add the following content to flume-netcat-logger.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
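The agent file above declares source r1, sink k1, and channel c1, then wires them together in the last two lines. A small hypothetical validator (for illustration, not part of Flume) that parses such key=value properties and checks the wiring:

```python
def check_agent(conf_text, agent="a1"):
    """Parse Flume-style key=value properties and verify channel wiring.

    Hypothetical validator for illustration; not part of Flume itself.
    """
    props = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()

    channels = set(props.get(f"{agent}.channels", "").split())
    errors = []
    for src in props.get(f"{agent}.sources", "").split():
        bound = set(props.get(f"{agent}.sources.{src}.channels", "").split())
        if not bound or not bound <= channels:
            errors.append(f"source {src} not bound to a declared channel")
    for sink in props.get(f"{agent}.sinks", "").split():
        if props.get(f"{agent}.sinks.{sink}.channel") not in channels:
            errors.append(f"sink {sink} not bound to a declared channel")
    return errors

SAMPLE = """a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""
print(check_agent(SAMPLE))  # -> []
```

Running it over the flume-netcat-logger.conf above should likewise return an empty error list.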
3. Start Flume
[zhouchen@hadoop102 flume]$ bin/flume-ng agent --conf conf/ --name a1 --conf-file conf/flume-netcat-logger.conf -Dflume.root.logger=INFO,console
4. Use the netcat tool to send data to port 44444 on the local machine
[zhouchen@hadoop102 ~]$ nc localhost 44444
hello
zhouchen
...
5. Check the Ganglia monitoring page
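In the Ganglia UI, Flume metrics appear under the reporting host with names of the form component-type.name.counter, e.g. CHANNEL.c1.ChannelFillPercentage (the exact names here are assumed from Flume's counter-group naming). ChannelFillPercentage is simply queued events relative to the configured channel capacity:

```python
def fill_percentage(channel_size, capacity):
    """ChannelFillPercentage as a memory channel would report it:
    events currently queued as a percentage of configured capacity."""
    return 100.0 * channel_size / capacity

# With the a1 config above (capacity = 1000) and 150 queued events:
print(fill_percentage(150, 1000))  # -> 15.0
```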
3.2 Monitoring HADOOP with GANGLIA
1. Edit the Hadoop configuration $HADOOP_HOME/etc/hadoop/hadoop-metrics2.properties
Note: comment out the existing contents first and add only the following. Add the namenode, datanode, resourcemanager, nodemanager, and jobhistoryserver entries according to your actual node layout, then distribute the file to every node of the Hadoop cluster.
*.sink.Ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.Ganglia.period=10
*.sink.Ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.Ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
namenode.sink.Ganglia.servers=hadoop102:8649
datanode.sink.Ganglia.servers=hadoop102:8649,hadoop103:8649,hadoop104:8649
resourcemanager.sink.Ganglia.servers=hadoop103:8649
nodemanager.sink.Ganglia.servers=hadoop102:8649,hadoop103:8649,hadoop104:8649
jobhistoryserver.sink.Ganglia.servers=hadoop103:8649
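Keys in hadoop-metrics2.properties have the shape prefix.sink.name.option, where the * prefix applies to every daemon and a daemon-specific prefix such as datanode overrides it. A toy resolver illustrating that precedence (an illustration of the rule, not Hadoop's actual implementation):

```python
def resolve(props, daemon, sink, option):
    """Resolve a metrics2 option: daemon-specific key wins over the '*' wildcard."""
    return props.get(f"{daemon}.sink.{sink}.{option}",
                     props.get(f"*.sink.{sink}.{option}"))

props = {
    "*.sink.Ganglia.period": "10",
    "namenode.sink.Ganglia.servers": "hadoop102:8649",
    "datanode.sink.Ganglia.servers": "hadoop102:8649,hadoop103:8649,hadoop104:8649",
}

print(resolve(props, "datanode", "Ganglia", "period"))   # -> 10
print(resolve(props, "namenode", "Ganglia", "servers"))  # -> hadoop102:8649
```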
2. Start the Hadoop cluster
3. Start all Ganglia services
4. Check the Ganglia monitoring page
3.3 Monitoring HBASE with GANGLIA
1. Edit the HBase configuration $HBASE_HOME/conf/hadoop-metrics2-hbase.properties
Note: comment out the existing contents first and add only the following, then distribute the file to all HBase nodes.
*.sink.Ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.Ganglia.period=10
hbase.sink.Ganglia.period=10
# hadoop102 is the HBase Master node
hbase.sink.Ganglia.servers=hadoop102:8649
2. Start the HBase cluster
3. Start all Ganglia services
4. Check the Ganglia monitoring page