1. After starting Hadoop with start-all.sh, I launched hiveserver2 from the command line, but the console just kept printing session IDs:
Hive Session ID = 2957fc3d-ff23-4d2b-aab0-e73378a64d3e
Hive Session ID = 3f35108d-4a7c-4a1e-8b13-99227c21db15
Hive Session ID = 4bd02ce7-114a-42fb-ad03-96f68d7ba4f2
Hive Session ID = 7000dfd8-54e4-45f5-939b-aeb42a62b9d5
Hive Session ID = 8d8a8f4e-a99a-4a86-9447-3ce9269653e4
Hive Session ID = ae37c8cd-6e2d-46c4-bdc4-62258226700a
Hive Session ID = 2532996b-1434-44c5-879d-eb107a3842d5
Hive Session ID = 1b11f925-f588-4ef0-bab8-3e47b0af6e9b
Hive Session ID = d7cc4a64-6996-4d1a-b153-e953c988b8e1
Hive Session ID = 2eabb766-542f-43d4-955f-4e334325d1ee
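For reference, hiveserver2 is typically launched in one of the following ways (a sketch, not my exact invocation; $HIVE_HOME would be /opt/module/apache-hive-2.1.1-bin in this setup):

# Foreground start -- this is what produced the session-ID output above
$HIVE_HOME/bin/hiveserver2

# Or backgrounded via the service wrapper, with output captured to a file
nohup hive --service hiveserver2 > /tmp/hiveserver2.out 2>&1 &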
2. Checked the listening ports: hiveserver2's default port 10000 is not among them.
[hadoop@node100 ~]$ netstat -nltp|grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.1:39173 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:8040 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9864 0.0.0.0:* LISTEN 1912/java
tcp 0 0 192.168.5.100:9000 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:8042 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9866 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9867 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9868 0.0.0.0:* LISTEN 2073/java
tcp 0 0 192.168.5.100:9870 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:41972 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8088 0.0.0.0:* LISTEN 2299/java
tcp 0 0 0.0.0.0:13562 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8030 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8031 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8032 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8033 0.0.0.0:* LISTEN 2299/java
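(For context: 10000 is the default value of hive.server2.thrift.port, and 10002 is the default HiveServer2 web UI port, hive.server2.webui.port. If hive-site.xml overrides these, grep for the configured values instead. The relevant properties look like this:)

<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>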
3. Entered the local Hive CLI and got an error: a directory could not be created. The last two lines roughly say that the reported blocks have not reached the required fraction of total blocks, so the NameNode has entered safe mode.
[hadoop@node100 ~]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin:/home/hadoop/bin:/opt/module/jdk1.8.0_144/bin:/opt/module/hadoop-3.3.0/bin:/opt/module/hadoop-3.3.0/sbin:/opt/module/apache-hive-2.1.1-bin/bin:/opt/module/sqoop/bin:/opt/module/azkaban-2.5.0/azkaban-web-2.5.0/bin:/opt/module/azkaban-2.5.0/azkaban-executor-2.5.0/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop-3.3.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 7fa59ebc-f38c-42eb-a01b-f2369cdd5432
Logging initialized using configuration in jar:file:/opt/module/apache-hive-2.1.1-bin/lib/hive-common-3.0.0.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/hadoop/7fa59ebc-f38c-42eb-a01b-f2369cdd5432. Name node is in safe mode.
The reported blocks 219 needs additional 1 blocks to reach the threshold 0.9990 of total blocks 221.
The minimum number of live datanodes is not required. Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:node100
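The safe-mode state can also be confirmed directly, without going through Hive (a quick check; on this cluster it would have reported that safe mode is ON):

# Query the NameNode's current safe-mode state
hdfs dfsadmin -safemode get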
4. Check the Hive runtime log. The configured location:
<property>
<name>hive.querylog.location</name>
<value>${system:java.io.tmpdir}/${system:user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
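(Strictly speaking, hive.querylog.location controls the structured query log; the hive.log file itself is placed by conf/hive-log4j2.properties, whose stock defaults point at the same directory:)

property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}
property.hive.log.file = hive.log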
On my machine that works out to /tmp/hadoop/hive.log. The log shows "Error starting HiveServer2 on attempt 2, will retry in 60000ms": the second attempt to start hiveserver2 failed, it retried 60 seconds later, and then kept retrying. Every attempt creates a new session ID, which is why the console printed so many of them. The underlying error is the same one: the reported blocks had not reached the threshold, so the NameNode was in safe mode.
2024-09-06T18:56:53,217 INFO [main] server.HiveServer2: Starting HiveServer2
2024-09-06T18:56:53,319 INFO [main] SessionState: Hive Session ID = a4fe191c-6fe4-4963-adc8-1c11c7cb3b8e
2024-09-06T18:56:53,367 INFO [main] server.HiveServer2: Shutting down HiveServer2
2024-09-06T18:56:53,367 INFO [main] server.HiveServer2: Stopping/Disconnecting tez sessions.
2024-09-06T18:56:53,367 WARN [main] server.HiveServer2: Error starting HiveServer2 on attempt 2, will retry in 60000ms
java.lang.RuntimeException: Error applying authorization policy on hive configuration: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/hadoop/a4fe191c-6fe4-4963-adc8-1c11c7cb3b8e. Name node is in safe mode.
The reported blocks 219 needs additional 1 blocks to reach the threshold 0.9990 of total blocks 221.
The minimum number of live datanodes is not required. Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:node100
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1570)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1557)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3406)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1161)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:739)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)
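To watch these retries happen live while debugging:

# Follow new HiveServer2 log entries as they are written
tail -f /tmp/hadoop/hive.log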
5. Turn off the NameNode's safe mode. I found this command online:
hadoop dfsadmin -safemode leave
// A bit of searching shows this command must be run as the HDFS superuser, and that `hadoop dfsadmin` is deprecated: Hadoop substitutes `hdfs dfsadmin` for it.
// Ha, it still runs anyway:
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
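For the record, the non-deprecated form and its subcommands (run as the HDFS superuser, i.e. the user that started the NameNode, hadoop in this cluster):

hdfs dfsadmin -safemode get    # query the current state
hdfs dfsadmin -safemode leave  # force the NameNode out of safe mode
hdfs dfsadmin -safemode enter  # force it into safe mode
hdfs dfsadmin -safemode wait   # block until safe mode ends on its own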
6. Started hiveserver2 again, and this time it came up, with both ports listening:
tcp6 0 0 :::10000 :::* LISTEN 3555/java
tcp6 0 0 :::10002 :::* LISTEN 3555/java
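A quick connection test from the client side (a sketch; beeline ships with Hive, and this assumes the default authentication setting):

# Connect over JDBC to the now-listening Thrift port
beeline -u jdbc:hive2://node100:10000 -n hadoop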
7. Root cause, summarized
At bottom, a file block stored on a DataNode was corrupted, so the count of reported blocks stayed below the threshold and the HDFS NameNode stayed in safe mode. In safe mode, directories cannot be created, so hiveserver2 failed to start and kept retrying, generating a new session ID on every attempt. As long as safe mode never turned off, hiveserver2 never came up, and remote clients were refused because there was no port listening at all.
As for what to do about the corrupted block itself, I wasn't sure...
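One reasonable way to investigate (a sketch I have not run against this cluster; note that -delete permanently removes the affected files, so inspect them first):

# Report overall HDFS health, including missing/corrupt blocks
hdfs fsck /

# List only the files that have corrupt blocks
hdfs fsck / -list-corruptfileblocks

# If the affected files are expendable (e.g. temp data), delete them so
# the block threshold can be met and safe mode can exit on its own
hdfs fsck / -delete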
8. A special case: if root started hiveserver2, other users running netstat -nltp|grep java will not see port 10000; you have to grep for the port itself with netstat -nltp|grep 10000 to check whether it is up.
// As root
[root@node100 jartest]# netstat -nltp|grep java
tcp 0 0 127.0.0.1:39173 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:8040 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9864 0.0.0.0:* LISTEN 1912/java
tcp 0 0 192.168.5.100:9000 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:8042 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9866 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9867 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9868 0.0.0.0:* LISTEN 2073/java
tcp 0 0 192.168.5.100:9870 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:41972 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8088 0.0.0.0:* LISTEN 2299/java
tcp 0 0 0.0.0.0:13562 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8030 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8031 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8032 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8033 0.0.0.0:* LISTEN 2299/java
tcp6 0 0 :::10000 :::* LISTEN 6436/java
tcp6 0 0 :::10002 :::* LISTEN 6436/java
// As the hadoop user: info for processes owned by other users is not shown
[hadoop@node100 ~]$ netstat -nltp|grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.1:39173 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:8040 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9864 0.0.0.0:* LISTEN 1912/java
tcp 0 0 192.168.5.100:9000 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:8042 0.0.0.0:* LISTEN 2710/java
tcp 0 0 0.0.0.0:9866 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9867 0.0.0.0:* LISTEN 1912/java
tcp 0 0 0.0.0.0:9868 0.0.0.0:* LISTEN 2073/java
tcp 0 0 192.168.5.100:9870 0.0.0.0:* LISTEN 1668/java
tcp 0 0 0.0.0.0:41972 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8088 0.0.0.0:* LISTEN 2299/java
tcp 0 0 0.0.0.0:13562 0.0.0.0:* LISTEN 2710/java
tcp 0 0 192.168.5.100:8030 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8031 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8032 0.0.0.0:* LISTEN 2299/java
tcp 0 0 192.168.5.100:8033 0.0.0.0:* LISTEN 2299/java
// But grepping for the port number directly does find it; the owning process is just not identified (the '-' at the end)
[hadoop@node100 ~]$ netstat -nltp|grep 10000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::10000 :::* LISTEN -
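Alternatively, run the check with root privileges so the process column is filled in (assuming sudo is configured for the account):

# Shows the owning PID/program name for port 10000
sudo netstat -nltp | grep 10000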