2021-03-11 21:16:30,478 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.98.166:8485, 192.168.98.167:8485, 192.168.98.168:8485]. Skipping.
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1508)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1532)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:652)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
在namenode启动之后,过一段时间namenode就死掉了。默认情况下namenode启动10s(maxRetries=10, sleepTime=1000)后journalnode还没有启动,就会报上述错误。
处理方法:
在hdfs-site.xml中添加如下配置:
1 <!--修改core-site.xml中的ipc参数,防止出现连接journalnode服务ConnectException-->
2 <property>
3 <name>ipc.client.connect.max.retries</name>
4 <value>100</value>
5 <description>Indicates the number of retries a client will make to establish a server connection.</description>
6 </property>
7 <property>
8 <name>ipc.client.connect.retry.interval</name>
9 <value>10000</value>
10 <description>Indicates the number of milliseconds a client will wait for before retrying to establish a server connection.</description>
11 </property>
详细介绍可以参考文章(致谢):https://www.cnblogs.com/tibit/p/7447190.html