Ambari 2.6.4
On a freshly installed cluster, the Metrics panel of every component was empty.
Red box in the screenshot below (shown after the fix):
[Screenshot: component Metrics widgets in the Ambari UI]
Before the fix they all showed "No data available".
ambari-metrics-collector depends on HBase: the collected metrics are first written into HBase, and HBase had not been installed.
After installing HBase it still did not work, and restarting HBase and ambari-metrics did not help either. Checking the collector log turned up this error:
Table Namespace Manager not fully initialized, try again later
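For reference, the error above came from the collector log; on a typical install it sits under /var/log/ambari-metrics-collector/ (the exact path may differ on your cluster):

tail -f /var/log/ambari-metrics-collector/ambari-metrics-collector.log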
Use the hbase shell to create a test table: create 't1','f1'
1. If the table creation fails, the HBase logs should point to a fix. In my case I went into the zk client and deleted HBase's znode, deleted HBase's HDFS path /apps/hbase, and restarted; after that the table was created successfully (a rough command sketch follows after step 2).
2. If the table creation succeeds, HBase itself is fine, so storing the collected metrics in HBase is not the problem; the problem must be on the ambari-metrics side.
Going back into the ambari-metrics-collector log, the error was still "Table Namespace Manager not fully initialized, try again later".
At that point I was stuck, so I took a break.
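For reference, the cleanup in step 1 amounts roughly to the commands below. This is only a sketch: the znode name (/hbase-unsecure), the ZooKeeper host, and the zkCli.sh path assume a default unsecured HDP layout, and removing the znode plus /apps/hbase wipes all HBase data, so only do this on a fresh cluster.

# verify HBase by creating a throwaway table
hbase shell
> create 't1','f1'

# if creation fails: stop HBase, remove its znode and HDFS data, then restart HBase
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server <zk-host>:2181
> rmr /hbase-unsecure
hdfs dfs -rm -r /apps/hbase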
Later, glancing at the screen again, I noticed new errors:
(1)WARN org.apache.zookeeper.ClientCnxn: Session 0x169a43422df0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
(2)com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:228)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:292)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:62896)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceState.isMasterRunning(ConnectionManager.java:1465)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.isKeepAliveMasterConnectedAndRunning(ConnectionManager.java:2146)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1728)
at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4427)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4416)
at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:752)
at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:673)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1065)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1418)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2190)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:872)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2330)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2330)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:237)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:205)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
(3)ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x2cd2c8fe-0x169a43422df0010, quorum=cj-data3:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:420)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:919)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$400(ConnectionManager.java:557)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1510)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1551)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1580)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1731)
at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4427)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4416)
at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:752)
at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:673)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1065)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1418)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2190)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:872)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2330)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2330)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:237)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
The key one is the third error:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
In ambari-metrics-collector's ams-hbase-site configuration:
[Screenshot: the znode setting on the ambari-metrics-collector side]
In HBase's configuration:
[Screenshot: the znode setting on the HBase side]
Make the two values identical; I changed the ambari-metrics-collector side to /hbase-unsecure.
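The property involved is presumably zookeeper.znode.parent: in ams-hbase-site (Ambari Metrics > Configs) it pointed at /ams-hbase-unsecure, while the cluster HBase's hbase-site uses /hbase-unsecure. After the change the two sides should agree:

# ams-hbase-site (ambari-metrics-collector), changed from /ams-hbase-unsecure
zookeeper.znode.parent = /hbase-unsecure
# hbase-site (cluster HBase), unchanged
zookeeper.znode.parent = /hbase-unsecure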
After restarting Metrics, table creation succeeded. The log shows:
2019-03-22 15:35:58,379 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.CATALOG
2019-03-22 15:36:06,020 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.SEQUENCE
2019-03-22 15:36:07,282 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.STATS
2019-03-22 15:36:08,534 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.FUNCTION
2019-03-22 15:36:09,791 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRICS_METADATA
2019-03-22 15:36:17,081 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created HOSTED_APPS_METADATA
2019-03-22 15:36:18,317 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created INSTANCE_HOST_METADATA
2019-03-22 15:36:19,568 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created CONTAINER_METRICS
2019-03-22 15:36:20,870 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_RECORD
2019-03-22 15:36:22,118 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_RECORD_MINUTE
2019-03-22 15:36:23,362 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_RECORD_HOURLY
2019-03-22 15:36:24,601 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_RECORD_DAILY
2019-03-22 15:36:25,838 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_AGGREGATE
2019-03-22 15:36:27,078 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_AGGREGATE_MINUTE
2019-03-22 15:36:28,314 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_AGGREGATE_HOURLY
2019-03-22 15:36:29,552 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRIC_AGGREGATE_DAILY
...........
######################### Cluster HA state ########################
CLUSTER: ambari-metrics-cluster
RESOURCE: METRIC_AGGREGATORS
PARTITION: METRIC_AGGREGATORS_0 cj-data3_12001 ONLINE
PARTITION: METRIC_AGGREGATORS_1 cj-data3_12001 ONLINE
##################################################
I had not run into this error before. The likely cause is that HBase was installed after Metrics, so the two configurations were never brought in sync. Only after working through it did the cause turn out to be this simple.