I plan to set up an HA cluster with 3 nodes, laid out as follows:
      | NameNode | DataNode | JournalNode
node1 | yes      | yes      | yes
node2 | yes      | yes      | yes
node3 | yes      | yes      |
hdfs-site.xml is configured as follows:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
  </property>
  <property>
    <name>dfs.ha.nameservice.cluster1</name>
    <value>node1,node2,node3</value>
  </property>
  <!-- ########################## namenode cluster start ########################## -->
  <property>
    <name>dfs.namenode.rpc-address.cluster1.node1</name>
    <value>node1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.node1</name>
    <value>node1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.node2</name>
    <value>node2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.node2</name>
    <value>node2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.node3</name>
    <value>node3:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.node3</name>
    <value>node3:50070</value>
  </property>
  <!-- ########################## namenode cluster end ########################## -->
  <property>
    <name>dfs.ha.automatic-failover.enabled.cluster1</name>
    <value>false</value>
  </property>
  <!-- ########################## journal node cluster start ########################## -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;node2:8485;node3:8485/cluster1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/data/journal_tmp_dir</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- ########################## journal node cluster end ########################## -->
</configuration>
core-site.xml is configured as follows:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://cluster1</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/data/hadoop_tmp_dir</value>
</property>
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
slaves:
node1
node2
node3
Running bin/hdfs namenode -format produced the following error:
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41; compiled by 'jenkins' on 2016-01-26T00:08Z
STARTUP_MSG: java = 1.7.0_76
************************************************************/
17/06/30 13:16:22 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/06/30 13:16:22 INFO namenode.NameNode: createNameNode [-format]
17/06/30 13:16:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-ea34a456-6ef6-4781-86bd-ef1fea9cf067
17/06/30 13:16:24 INFO namenode.FSNamesystem: No KeyProvider found.
17/06/30 13:16:24 INFO namenode.FSNamesystem: fsLock is fair:true
17/06/30 13:16:25 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/06/30 13:16:25 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/06/30 13:16:25 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/06/30 13:16:25 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Jun 30 13:16:25
17/06/30 13:16:25 INFO util.GSet: Computing capacity for map BlocksMap
17/06/30 13:16:25 INFO util.GSet: VM type = 64-bit
17/06/30 13:16:25 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/06/30 13:16:25 INFO util.GSet: capacity = 2^21 = 2097152 entries
17/06/30 13:16:25 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/06/30 13:16:25 INFO blockmanagement.BlockManager: defaultReplication = 3
17/06/30 13:16:25 INFO blockmanagement.BlockManager: maxReplication = 512
17/06/30 13:16:25 INFO blockmanagement.BlockManager: minReplication = 1
17/06/30 13:16:25 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
17/06/30 13:16:25 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/06/30 13:16:25 INFO blockmanagement.BlockManager: encryptDataTransfer = false
17/06/30 13:16:25 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
17/06/30 13:16:25 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
17/06/30 13:16:25 INFO namenode.FSNamesystem: supergroup = supergroup
17/06/30 13:16:25 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/06/30 13:16:25 INFO namenode.FSNamesystem: Determined nameservice ID: cluster1
17/06/30 13:16:25 INFO namenode.FSNamesystem: HA Enabled: false
17/06/30 13:16:25 WARN namenode.FSNamesystem: Configured NNs:
17/06/30 13:16:25 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:762)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:984)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1429)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
17/06/30 13:16:25 INFO namenode.FSNamesystem: Stopping services started for active state
17/06/30 13:16:25 INFO namenode.FSNamesystem: Stopping services started for standby state
17/06/30 13:16:25 WARN namenode.NameNode: Encountered exception during format:
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:762)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:984)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1429)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
17/06/30 13:16:25 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:762)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:984)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1429)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
17/06/30 13:16:25 INFO util.ExitUtil: Exiting with status 1
17/06/30 13:16:25 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node1/172.16.73.143
************************************************************/
[root@node1 hadoop-2.7.2]#
Searching Baidu for "FSNamesystem initialization failed" turned up an article saying that dfs.ha.namenodes.<nameservice> should be added to hdfs-site.xml.
The Hadoop 2.7.2 documentation describes this parameter as follows:
dfs.ha.namenodes.EXAMPLENAMESERVICE
The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
In other words, it takes the list of NameNode IDs for the nameservice.
So I added this property to hdfs-site.xml:
<property>
  <name>dfs.ha.namenodes.cluster1</name>
  <value>node1,node2,node3</value>
</property>
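A quick way to confirm the key is spelled correctly is to grep the config file: the form Hadoop actually reads is dfs.ha.namenodes.<nameservice>, whereas the dfs.ha.nameservice.cluster1 key used earlier is not a recognized property. A minimal sketch; the here-doc fragment stands in for the real etc/hadoop/hdfs-site.xml, which is where `conf` should point on a real cluster:

```shell
# Sanity-check sketch: the HA NameNode list key must be
# dfs.ha.namenodes.<nameservice>; dfs.ha.nameservice.<nameservice>
# is not a key Hadoop reads. The here-doc stands in for the real
# etc/hadoop/hdfs-site.xml -- point `conf` at that file instead.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<property>
  <name>dfs.ha.namenodes.cluster1</name>
  <value>node1,node2,node3</value>
</property>
EOF

if grep -q '<name>dfs.ha.namenodes.cluster1</name>' "$conf"; then
  result="dfs.ha.namenodes.cluster1 is set"
else
  result="dfs.ha.namenodes.cluster1 is MISSING"
fi
echo "$result"
rm -f "$conf"
```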
cluster1 is the nameservice ID (it also appears as the journal ID in the qjournal URI); the plan is to run the JournalNode quorum on these same 3 nodes.
After running bin/hdfs namenode -format again, the earlier FSNamesystem initialization failed exception was gone, but a new problem appeared: it seems to be trying to connect to port 8485 on node1, node2, and node3.
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41; compiled by 'jenkins' on 2016-01-26T00:08Z
STARTUP_MSG: java = 1.7.0_76
************************************************************/
17/06/30 13:05:35 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/06/30 13:05:35 INFO namenode.NameNode: createNameNode [-format]
17/06/30 13:05:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-ec2fe77c-4cb1-4248-b003-f6f9bfc1d82f
17/06/30 13:05:37 INFO namenode.FSNamesystem: No KeyProvider found.
17/06/30 13:05:37 INFO namenode.FSNamesystem: fsLock is fair:true
17/06/30 13:05:37 INFO blockmanageme
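Port 8485 is the default JournalNode RPC port, so these connection attempts suggest the JournalNode daemons are not running yet: under the Quorum Journal Manager, the JournalNodes have to be started on every journal host before hdfs namenode -format can write to the shared edits directory. A dry-run sketch of the start-up order (it only prints the commands instead of executing them; the /opt/hadoop-2.7.2 path is an assumption taken from the shell prompt above):

```shell
# Port 8485 is the JournalNode RPC port, so "connecting to node[1-3]:8485"
# failures at format time usually mean the JournalNode daemons are not up.
# Dry-run sketch: builds and prints the start command for each node.
# HADOOP_HOME is an assumed path based on the shell prompt above.
HADOOP_HOME=/opt/hadoop-2.7.2
cmds=""
for host in node1 node2 node3; do
  cmd="ssh $host $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode"
  echo "$cmd"
  cmds="$cmds$cmd; "
done
# Once all three JournalNodes are up, re-run the format on node1:
echo "$HADOOP_HOME/bin/hdfs namenode -format"
```

Swapping the echoed ssh commands for real execution would start the daemons; `jps` on each host should then show a JournalNode process before the format is retried.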