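CDH cluster reports "Could not issue query: Host Monitor is not running"
When the problem appeared, the first thing I did was check the Cloudera Manager server log: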
/var/log/cloudera-scm-server/cloudera-scm-server.log
2018-06-01 11:58:00,001 INFO 618348787@agentServer-2579:com.cloudera.server.common.MonitoringThreadPool: agentServer: waiting in queue stats: average=0ms, min=0ms, max=1ms.
2018-06-01 11:58:24,208 INFO ScmActive-0:com.cloudera.server.cmf.components.ScmActive: (119 skipped) ScmActive completed successfully.
2018-06-01 11:58:38,291 ERROR DatabaseSizeGauge-0:com.cloudera.enterprise.DatabaseSizeGauge: Failed to execute db size query.
java.lang.NullPointerException
at com.cloudera.enterprise.dbutil.DbUtil.getDatabaseSize(DbUtil.java:736)
at com.cloudera.enterprise.DatabaseSizeGauge.run(DatabaseSizeGauge.java:75)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2018-06-01 11:58:59,293 INFO CMMetricsForwarder-0:com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder: Failed to send metrics.
java.lang.ArrayIndexOutOfBoundsException: 0
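The ERROR entry above is easy to miss among the surrounding INFO lines. A minimal sketch for pulling only the ERROR entries out of the server log is shown below. The helper name recent_errors is my own and not part of any Cloudera tool, and the match on ' ERROR ' assumes the log layout seen in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not a Cloudera tool): print the last N lines logged
# at ERROR level, each prefixed with its line number in the file.
# Assumes the "<date> <time> ERROR <thread>:<class>: ..." layout shown above.
recent_errors() {
    log_file=$1
    count=${2:-20}
    grep -n ' ERROR ' "$log_file" | tail -n "$count"
}

# Example (path taken from this post):
# recent_errors /var/log/cloudera-scm-server/cloudera-scm-server.log 20
```

The line numbers make it easy to jump to the matching stack trace in an editor or pager.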
Next I checked the disk usage:
# df -h
Filesystem                            Size  Used Avail Use% Mounted on
/dev/mapper/vg_bjesbpmaster-lv_root    50G   50G     0 100% /
tmpfs                                  16G   68K   16G   1% /dev/shm
/dev/sda2                             485M   69M  392M  15% /boot
/dev/sda1                             200M  9.9M  190M   5% /boot/efi
/dev/mapper/vg_bjesbpmaster-lv_data   3.6T   33G  3.6T   1% /data
cm_processes                           16G   38M   16G   1% /var/run/cloudera-scm-agent/process
The root filesystem (/) is completely full: 50G used, 0 available. Sigh.
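To find out what is filling the root filesystem, something along these lines may help. This is only a sketch: the helper name largest_dirs is hypothetical, and the --max-depth option assumes GNU du, as shipped on typical Linux distributions.

```shell
#!/bin/sh
# Hypothetical helper: list the largest entries directly under a directory,
# biggest first, sizes in KiB. The first line is the total for the directory
# itself. -x keeps du on a single filesystem, so a separate mount such as
# /data is not counted against /.
largest_dirs() {
    dir=$1
    count=${2:-10}
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: run "largest_dirs / 10", then repeat on the biggest subdirectory
# to narrow things down.
```

Running it repeatedly on the largest subdirectory usually leads to the files that need to be cleaned up or moved to the larger /data volume.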
Approach to the fix:
1. First resolve the full root filesystem.
2. Try restarting the Cloudera Management Service once. Doing so surfaced one important log line:
Removing any leveldbjni library files left over from previous runs
Locate the following two directories:
cloudera-host-monitor
cloudera-service-monitor
On my machine these two directories are located under /data/var/lib. I renamed both of them to keep a backup:
mv cloudera-host-monitor/ cloudera-host-monitor_BAK
mv cloudera-service-monitor cloudera-service-monitor_BAK
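The same rename can be wrapped in a small helper that adds a timestamp, so that a repeated run never collides with an earlier backup. This is a sketch only: backup_dir is a hypothetical name, the /data/var/lib location is specific to the setup described here, and it assumes the Cloudera Management Service roles are stopped while the directories are renamed.

```shell
#!/bin/sh
# Hypothetical helper: rename a directory to <name>_BAK_<timestamp> and
# print the new path. Refuses to act if the source is missing or the
# target name is already taken.
backup_dir() {
    src=${1%/}                                  # drop a trailing slash
    dst="${src}_BAK_$(date +%Y%m%d%H%M%S)"
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "already exists: $dst" >&2
        return 1
    fi
    mv "$src" "$dst" && echo "$dst"
}

# Example (paths as used in this post):
# backup_dir /data/var/lib/cloudera-host-monitor
# backup_dir /data/var/lib/cloudera-service-monitor
```

Because the old directories are only renamed and not deleted, they can be removed later once the monitors are confirmed healthy and the disk space is needed.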
I then tried restarting the Cloudera Management Service again, and after a few seconds it started successfully.
With that, the problem was resolved.