1. Configure the nameservice in hdfs-site.xml
<property>
<name>dfs.nameservices</name>
<value>hearain</value>
</property>
<property>
<name>dfs.ha.namenodes.hearain</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hearain.nn1</name>
<value>namenode1-address:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hearain.nn2</name>
<value>namenode2-address:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hearain.nn1</name>
<value>namenode1-address:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hearain.nn2</name>
<value>namenode2-address:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://(host:port of each JournalNode, separated by semicolons)/(journal ID)</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hearain</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hadoop/journal/data</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
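The value of dfs.namenode.shared.edits.dir is easy to mistype, so a small helper can assemble it. This is only a sketch: the function name build_qjournal_uri is made up for illustration, the JournalNodes are assumed to run on node1, node2 and node3 on the default port 8485, and the nameservice name hearain is reused as the journal ID.

```shell
# Build a qjournal:// URI for dfs.namenode.shared.edits.dir.
# Usage: build_qjournal_uri <journal-id> <host> [<host> ...]
# Assumption: every JournalNode listens on port 8485 (the default).
build_qjournal_uri() {
  journal_id="$1"; shift
  port=8485
  joined=""
  for host in "$@"; do
    # Append "host:port", putting ';' between entries but not before the first.
    joined="${joined:+$joined;}${host}:${port}"
  done
  printf 'qjournal://%s/%s\n' "$joined" "$journal_id"
}

# Hypothetical host names; replace them with the real JournalNode machines.
build_qjournal_uri hearain node1 node2 node3
```

Paste the printed string into the value element above after replacing the host names with your own.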
2. Configure core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://hearain (the nameservice name)</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>host:port of each ZooKeeper server, separated by commas, e.g. node1:2181,node2:2181,node3:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp/</value>
</property>
3. Install and configure ZooKeeper
tickTime=2000
dataDir=/opt/zookeeper/tmp (create this directory if it does not exist)
clientPort=2181
initLimit=5
syncLimit=2
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
Then, on each machine, create a myid file in dataDir containing the number that follows "server." in that machine's entry.
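The myid step can be sketched as a small shell function. The name write_myid is hypothetical; the function creates the data directory if needed and writes the server number into a file named myid inside it.

```shell
# Write the ZooKeeper server id into <dataDir>/myid.
# Usage: write_myid <dataDir> <id>
write_myid() {
  data_dir="$1"
  id="$2"
  # Create dataDir if it is missing, then write the id on a single line.
  mkdir -p "$data_dir" && printf '%s\n' "$id" > "$data_dir/myid"
}

# Example, assuming the dataDir and server numbers from the configuration above:
#   on node1: write_myid /opt/zookeeper/tmp 1
#   on node2: write_myid /opt/zookeeper/tmp 2
#   on node3: write_myid /opt/zookeeper/tmp 3
```

Run it once per machine with that machine's own number; ZooKeeper reads the file at startup to learn which server entry it is.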
4. Start a JournalNode on each journal machine:
./hadoop-daemon.sh start journalnode
5. Once the JournalNodes are running, format HDFS on either NameNode machine (from the bin directory):
hostname node1
./hdfs namenode -format
6. Start the NameNode that was just formatted:
./hadoop-daemon.sh start namenode
7. On the NameNode that was not formatted, run:
./hdfs namenode -bootstrapStandby
8. Run ./stop-dfs.sh
9. Run ./start-dfs.sh
10. If jps shows that ZKFC did not start, a likely cause is that the ZKFC state has not been formatted in ZooKeeper.
11. Format ZKFC on one of the NameNodes; from the bin directory, run:
./hdfs zkfc -formatZK
12. Repeat steps 8 and 9.
13. Configure MapReduce in mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
14. Configure YARN in yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
15. Start the ResourceManager and the NodeManagers:
./start-yarn.sh
Run jps on each node to check that everything started; if a process failed to start, check the detailed logs in the logs directory.
node1:
node2:
node3:
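The jps check can be partly automated. The sketch below is only an illustration: missing_daemons is a made-up helper that takes jps output (lines of "pid ClassName") plus a list of expected process names and prints the ones that are absent. Which daemons belong on which node depends on your layout, so the lists in the comments are assumptions.

```shell
# Print every expected daemon name that does not appear in the jps output.
# Usage: missing_daemons "<jps output>" <name> [<name> ...]
missing_daemons() {
  jps_output="$1"; shift
  for name in "$@"; do
    # jps prints "<pid> <ClassName>"; compare the second column exactly.
    printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$name" \
      || printf '%s\n' "$name"
  done
}

# Example (assumed layout; adjust the lists to your own nodes):
#   on a NameNode host: missing_daemons "$(jps)" NameNode DFSZKFailoverController
#   on a worker host:   missing_daemons "$(jps)" DataNode NodeManager
```

No output means every listed daemon is running; any name that is printed is the one whose log to open first.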