Monitoring Hadoop and YARN with Prometheus + Grafana


This setup relies mainly on jmx_exporter and Prometheus to collect Hadoop metrics, and on Grafana for dashboards and alerting.

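1. Install jmx_exporter and create its configuration files

1. Download the jmx_prometheus_javaagent jar. It can be found by searching for jmx_prometheus_javaagent on the Aliyun Maven mirror: https://maven.aliyun.com/mvn/search

2. Create one configuration file per daemon, for example namenode.yaml and datanode.yaml (the names are arbitrary and can reflect the purpose; the file to use is specified when the daemon starts)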

namenode.yaml:

startDelaySeconds: 0
hostPort: localhost:1234  # 1234 is the JMX port to use (any unused port will do)
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false

datanode.yaml:

startDelaySeconds: 0
hostPort: localhost:1235  # 1235 is the JMX port to use (any unused port will do)
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false

3. Put the three files above (the jar and the two configuration files) into /usr/local/prometheus_jmx_export_0.3.1

 then run chown -R hadoop:root /usr/local/prometheus_jmx_export_0.3.1
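Steps 2 and 3 can be scripted. The following is a minimal sketch, assuming the file names namenode.yaml and datanode.yaml and the directory used in this post; the helper name write_jmx_config is hypothetical and only illustrates the idea.

```shell
# Write a minimal jmx_exporter configuration to a file.
# $1 = target file, $2 = JMX port of the daemon
# (1234 for the NameNode and 1235 for the DataNode in this post).
write_jmx_config() {
  cat > "$1" <<EOF
startDelaySeconds: 0
hostPort: localhost:$2
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
EOF
}

# Example usage (run as a user that may write to /usr/local):
# EXPORTER_DIR=/usr/local/prometheus_jmx_export_0.3.1
# mkdir -p "$EXPORTER_DIR"
# write_jmx_config "$EXPORTER_DIR/namenode.yaml" 1234
# write_jmx_config "$EXPORTER_DIR/datanode.yaml" 1235
# chown -R hadoop:root "$EXPORTER_DIR"
```

Generating both files from one template keeps them identical except for the port, which is the only value that differs between the daemons.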

4. Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh (note: the ports 1234 and 1235 must match the JMX ports set in the configuration files above)

export HADOOP_NAMENODE_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1234 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9222:/usr/local/prometheus_jmx_export_0.3.1/namenode.yaml"
export HADOOP_DATANODE_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1235 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9322:/usr/local/prometheus_jmx_export_0.3.1/datanode.yaml"
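The two option strings differ only in the JMX port, the HTTP port for /metrics, and the configuration file. As a sketch, they could be built by a small helper; the function name jmx_opts and the EXPORTER_DIR variable are assumptions for illustration, not something Hadoop provides.

```shell
# Build the JVM options for one daemon.
# $1 = JMX remote port, $2 = HTTP port served by the agent, $3 = config file name.
# EXPORTER_DIR is a hypothetical override for the install directory.
jmx_opts() {
  exporter_dir="${EXPORTER_DIR:-/usr/local/prometheus_jmx_export_0.3.1}"
  printf '%s\n' "-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=$1 -javaagent:$exporter_dir/jmx_prometheus_javaagent-0.3.1.jar=$2:$exporter_dir/$3"
}

# Example usage inside hadoop-env.sh:
# export HADOOP_NAMENODE_JMX_OPTS="$(jmx_opts 1234 9222 namenode.yaml)"
# export HADOOP_DATANODE_JMX_OPTS="$(jmx_opts 1235 9322 datanode.yaml)"
```

The javaagent argument has the form jar=HTTP_PORT:CONFIG_FILE, so the second and third parameters end up after the equals sign.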

5. Edit $HADOOP_HOME/bin/hdfs and change the namenode and datanode startup parameters as follows

if [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_JMX_OPTS $HADOOP_NAMENODE_OPTS"
.......
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_JMX_OPTS"
  if [ "$starting_secure_dn" = "true" ]; then
    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
  else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
  fi

6. Restart the Hadoop DFS cluster. The metrics are then available at http://xxx:9222/metrics on the NameNode machine and at http://xxx:9322/metrics on the DataNode machines
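A quick way to confirm that the agents are serving data is to fetch the endpoints from the command line. This is a minimal sketch that assumes curl is installed; the function names metrics_url and check_metrics are hypothetical.

```shell
# Build the /metrics URL for a host and exporter HTTP port.
metrics_url() {
  printf 'http://%s:%s/metrics\n' "$1" "$2"
}

# Fetch the endpoint and report whether it returns Prometheus text output.
# A "# HELP" line is expected in any non-empty exposition.
check_metrics() {
  if curl -sf --max-time 5 "$(metrics_url "$1" "$2")" | grep -q '^# HELP'; then
    echo "OK   $1:$2"
  else
    echo "FAIL $1:$2"
    return 1
  fi
}

# Example usage (host names taken from the prometheus.yml in this post):
# check_metrics binamenode01 9222
# check_metrics bidatanode01 9322
```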

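2. Install Prometheus and create its configuration file

1. Download the Linux build of Prometheus from https://github.com/prometheus/prometheus/releases/download/v2.3.2/prometheus-2.3.2.linux-amd64.tar.gz into /usr/local/,

extract it, and run  chown -R hadoop:root prometheus-2.3.2.linux-amd64  on the extracted directory

2. Edit the configuration file prometheus.yml and add the following entries under scrape_configs (note: this example comes from a test environment; every machine to be monitored needs its own job or target entry, and the indentation matters because the file is YAML)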

- job_name: hadoop-namenode
  static_configs:
  - targets: ['binamenode01:9222']
- job_name: hadoop-datanode
  static_configs:
  - targets: ['bidatanode01:9322']
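For context, the jobs above live under the scrape_configs key of prometheus.yml. The sketch below prints a complete minimal file; the 15s scrape_interval is an assumed value, and the helper names scrape_job and print_prometheus_config are hypothetical.

```shell
# Print one scrape job. $1 = job name, $2 = host:port of the exporter.
scrape_job() {
  cat <<EOF
  - job_name: $1
    static_configs:
      - targets: ['$2']
EOF
}

# Print a minimal prometheus.yml for the two HDFS daemons.
# The scrape interval is an assumption; adjust it as needed.
print_prometheus_config() {
  echo 'global:'
  echo '  scrape_interval: 15s'
  echo 'scrape_configs:'
  scrape_job hadoop-namenode binamenode01:9222
  scrape_job hadoop-datanode bidatanode01:9322
}

# Example usage:
# print_prometheus_config > /usr/local/prometheus-2.3.2.linux-amd64/prometheus.yml
```

Several machines of the same kind can also share one job by listing more than one host:port in the targets array.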

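3. Start Prometheus as the hadoop user (startPromethous.sh is a self-written wrapper script, not part of the Prometheus distribution)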

cd /usr/local/prometheus-2.3.2.linux-amd64
./startPromethous.sh
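The contents of the wrapper script are not shown here. A minimal sketch of what such a script might run, assuming the default config file name and port, is given below; the function name prometheus_cmd and the log file name are hypothetical.

```shell
# Print the command line used to start Prometheus.
# $1 = installation directory containing the prometheus binary and prometheus.yml.
prometheus_cmd() {
  printf '%s/prometheus --config.file=%s/prometheus.yml --web.listen-address=:9090\n' "$1" "$1"
}

# Example usage (runs Prometheus in the background and keeps a log file):
# nohup $(prometheus_cmd /usr/local/prometheus-2.3.2.linux-amd64) > prometheus.log 2>&1 &
```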

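4. Open http://master:9090/targets to check that the targets were added (Prometheus listens on port 9090 by default)

Clicking a target there, for example http://bidatanode01:9322/metrics, shows its metrics data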

3. Install Grafana and create its configuration file

1. Download Grafana and extract it

cd /usr/local
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.2.2.linux-amd64.tar.gz 
tar -zxvf grafana-5.2.2.linux-amd64.tar.gz 
chown -R hadoop:root grafana-5.2.2

2. Start Grafana as the hadoop user

cd /usr/local/grafana-5.2.2/bin/
nohup ./grafana-server start &

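3. Once started, Grafana is reachable at http://master:3000/ (the default account and password are admin/admin; the default port is 3000)

4. Connect Grafana to Prometheus

Click Data Sources

Click Add data source, fill in the Prometheus details, and save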
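The same data source can also be created through Grafana's HTTP API instead of the UI. The sketch below builds the JSON payload; the data source name "Prometheus", the proxy access mode, and the helper name datasource_json are assumptions chosen for illustration.

```shell
# Print the JSON body for creating a Prometheus data source.
# $1 = data source name, $2 = base URL of the Prometheus server.
datasource_json() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# Example usage (default admin/admin credentials, hosts as in this post):
# curl -s -u admin:admin -H 'Content-Type: application/json' \
#   -X POST -d "$(datasource_json Prometheus http://master:9090)" \
#   http://master:3000/api/datasources
```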

4. Configure Grafana alert e-mail delivery

1. Check whether mailx is installed

rpm -qa | grep mailx

If it is not installed, install it with the following command

yum -y install mailx

2. Edit /usr/local/grafana-5.2.2/conf/defaults.ini

...
 
#################################### SMTP / Emailing #####################
[smtp]
enabled = true
host = smtp.xx.com:587
user = sys_sender@xx.com
# Mailbox password. If it contains # or ; wrap it in triple double quotes, for example: """QWER123;4!@#$"""
password = xxxxxxx
cert_file =
key_file =
skip_verify = true
from_address = sys_sender@xx.com
from_name = sys_sender
ehlo_identity =
[emails]
welcome_email_on_sign_up = false
templates_pattern = emails/*.html
 
...
 
#################################### Alerting ############################
[alerting]
# Disable alerting engine & UI features
enabled = true
# Makes it possible to turn off alert rule execution but alerting UI is visible
execute_alerts = true

 
 

3. Test Grafana e-mail delivery

Edit the e-mail notification settings and click the test button; the test should report OK

=======================================================================================

Added 2018-08-27:

Hooking up YARN works in much the same way.

In ${HADOOP_HOME}/etc/hadoop/yarn-env.sh, add the options that enable metrics and specify the ports

export YARN_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=1236 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9422:/usr/local/prometheus_jmx_export_0.3.1/yarn.yaml"

Then edit ${HADOOP_HOME}/bin/yarn

elif [ "$COMMAND" = "resourcemanager" ] ; then
  CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/rm-config/log4j.properties
  CLASS='org.apache.hadoop.yarn.server.resourcemanager.ResourceManager'
  YARN_OPTS="$YARN_OPTS $YARN_JMX_OPTS $YARN_RESOURCEMANAGER_OPTS"
  if [ "$YARN_RESOURCEMANAGER_HEAPSIZE" != "" ]; then
    JAVA_HEAP_MAX="-Xmx""$YARN_RESOURCEMANAGER_HEAPSIZE""m"
  fi
......
elif [ "$COMMAND" = "nodemanager" ] ; then
  CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
  CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
  YARN_OPTS="$YARN_OPTS $YARN_JMX_OPTS -server $YARN_NODEMANAGER_OPTS"
  if [ "$YARN_NODEMANAGER_HEAPSIZE" != "" ]; then
    JAVA_HEAP_MAX="-Xmx""$YARN_NODEMANAGER_HEAPSIZE""m"
  fi

 

Add the yarn.yaml file under prometheus_jmx_export_0.3.1, then restart YARN

Edit the configuration file prometheus.yml

- job_name: yarn
  static_configs:
  - targets: ['binamenode01:9422']

Restart Prometheus and the new target is picked up
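Each JVM on a host needs its own JMX port and its own HTTP port, otherwise the second process fails to bind. Because YARN_JMX_OPTS is shared by the ResourceManager and the NodeManager, this matters if both run on the same machine. A minimal sketch of a duplicate check follows; the function name duplicate_ports is hypothetical.

```shell
# Print every port number that appears more than once in the arguments.
# Prints nothing when all ports are distinct.
duplicate_ports() {
  printf '%s\n' "$@" | sort -n | uniq -d
}

# Example usage with the JMX and HTTP ports planned for one host:
# dups="$(duplicate_ports 1234 1235 1236 9222 9322 9422)"
# [ -z "$dups" ] || echo "port used more than once: $dups"
```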

=======================================================================================

Added 2018-08-29

Monitoring HBase:

Edit the script $HBASE_HOME/bin/hbase

At the line

# figure out which class to run

add:

#prometheus jmx export start

HBASE_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=1237 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9522:/usr/local/prometheus_jmx_export_0.3.1/hbase.yaml"

#prometheus jmx export end
......
elif [ "$COMMAND" = "master" ] ; then
  CLASS='org.apache.hadoop.hbase.master.HMaster'
  if [ "$1" != "stop" ] && [ "$1" != "clear" ] ; then
    HBASE_OPTS="$HBASE_OPTS $HBASE_JMX_OPTS $HBASE_MASTER_OPTS"
  fi
elif [ "$COMMAND" = "regionserver" ] ; then
  CLASS='org.apache.hadoop.hbase.regionserver.HRegionServer'
  if [ "$1" != "stop" ] ; then
    HBASE_OPTS="$HBASE_OPTS $HBASE_JMX_OPTS $HBASE_REGIONSERVER_OPTS"
  fi

Add the hbase.yaml file under prometheus_jmx_export_0.3.1, then restart HBase

Edit the configuration file prometheus.yml

- job_name: hbase
  static_configs:
  - targets: ['binamenode01:9522']

Restart Prometheus and the new target is picked up

=======================================================================================

Added 2018-09-01

Adding Kylin monitoring

Edit kylin.sh and add the following options to its startup command

-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=1239 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9722:/usr/local/prometheus_jmx_export_0.3.1/kylin.yaml \

Add the kylin.yaml file under prometheus_jmx_export_0.3.1, then restart Kylin

Edit the configuration file prometheus.yml

- job_name: kylin
  static_configs:
  - targets: ['binamenode01:9722']

Restart Prometheus and the new target is picked up

=======================================================================================

Added 2018-09-01

Adding Hive monitoring

Edit the file

${HIVE_HOME}/conf/hive-env.sh and add the following code

if [ "$SERVICE" = "hiveserver2" ] ; then
        HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1240 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9822:/usr/local/prometheus_jmx_export_0.3.1/hive_hiveserver2.yaml"
fi
if [ "$SERVICE" = "metastore" ] ; then
        HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1241 -javaagent:/usr/local/prometheus_jmx_export_0.3.1/jmx_prometheus_javaagent-0.3.1.jar=9922:/usr/local/prometheus_jmx_export_0.3.1/hive_metastore.yaml"
fi

Add the hive_metastore.yaml and hive_hiveserver2.yaml files under prometheus_jmx_export_0.3.1

Restart the Hive metastore and hiveserver2

Edit the configuration file prometheus.yml

- job_name: hive
  static_configs:
  - targets: ['binamenode01:9822','binamenode01:9922']

Restart Prometheus and the new targets are picked up
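Prometheus refuses to load a configuration in which two scrape jobs share the same job_name, which is easy to get wrong when copying job blocks from one service to the next. A minimal sketch of a pre-restart check is shown below; the function name duplicate_jobs is hypothetical, and it only understands the simple one-line job_name form used in this post.

```shell
# Print every job_name that occurs more than once in a prometheus.yml.
# $1 = path to the configuration file. Prints nothing when all names are unique.
duplicate_jobs() {
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*job_name:[[:space:]]*//p' "$1" \
    | tr -d "\"'" | sort | uniq -d
}

# Example usage before restarting Prometheus:
# dups="$(duplicate_jobs /usr/local/prometheus-2.3.2.linux-amd64/prometheus.yml)"
# [ -z "$dups" ] || echo "duplicate job_name: $dups"
```

The promtool binary shipped in the Prometheus tarball offers a fuller validation with `promtool check config prometheus.yml`.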

References

Git: https://github.com/prometheus/jmx_exporter

Official site: https://prometheus.io/docs/instrumenting/exporters/#other-monitoring-systems
