This walkthrough builds on the Hadoop cluster set up in the previous post. server1 acts as the master (NameNode), server5 as the standby master, and server2, server3, and server4 as the cluster (worker) nodes.
1. Bring up a new server, server5, and install nfs-utils:
[root@server5 ~]# yum install nfs-utils -y
On all five servers, create the hadoop user if it does not exist yet (use the same UID, 800, everywhere so that file ownership maps correctly over NFS):
[root@server4 ~]# useradd -u 800 hadoop
Then start the NFS services on all five servers, and on server2, server3, server4, and server5 mount 172.25.17.1:/home/hadoop onto the local /home/hadoop directory:
[root@server2 ~]# /etc/init.d/rpcbind start
[root@server2 ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
[root@server2 ~]# mount 172.25.17.1:/home/hadoop/ /home/hadoop/
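For these mounts to succeed, /home/hadoop must already be exported on server1 (set up in the previous post). As a reference, a minimal /etc/exports entry on server1 could look like the following, with the anonymous IDs mapped to the hadoop user (uid/gid 800; the exact options may differ in your setup):

[root@server1 ~]# cat /etc/exports
/home/hadoop    *(rw,anonuid=800,anongid=800)
[root@server1 ~]# exportfs -rv
exporting *:/home/hadoop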
For a clean lab environment, clear out the leftovers from the previous run on all five servers (optional):
[root@server2 ~]# rm -fr /tmp/*
2. On server1, switch to the hadoop user, unpack the ZooKeeper tarball, and copy zoo_sample.cfg to zoo.cfg:
Because the other hosts mount server1's NFS export, everything done under /home/hadoop on server1 is immediately visible on the other four hosts; in other words, the /home/hadoop contents of all five machines are identical.
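A quick way to confirm the share is working (nfs_test is just an arbitrary file name):

[root@server1 ~]# touch /home/hadoop/nfs_test
[root@server2 ~]# ls /home/hadoop/nfs_test
/home/hadoop/nfs_test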
[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ ls
hadoop hadoop-2.7.3.tar.gz jdk1.7.0_79 zookeeper-3.4.9.tar.gz
hadoop-2.7.3 java jdk-7u79-linux-x64.tar.gz
[hadoop@server1 ~]$ tar zxf zookeeper-3.4.9.tar.gz
[hadoop@server1 ~]$ cd zookeeper-3.4.9
[hadoop@server1 zookeeper-3.4.9]$ cd conf/
[hadoop@server1 conf]$ cp zoo_sample.cfg zoo.cfg
Edit the zoo.cfg file and add the three cluster nodes:
server.1=172.25.17.2:2888:3888
server.2=172.25.17.3:2888:3888
server.3=172.25.17.4:2888:3888
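The rest of zoo.cfg keeps the zoo_sample.cfg defaults; the two defaults worth noting are the data directory (where the myid file goes in the next step) and the client port (referenced later by ha.zookeeper.quorum in core-site.xml):

dataDir=/tmp/zookeeper
clientPort=2181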
On the three cluster nodes, create the data directory and the myid file:
The myid content on each host must match the number assigned in zoo.cfg: server2 is server.1, so it gets 1; server3 is server.2, so it gets 2; and so on.
[hadoop@server2 ~]$ mkdir /tmp/zookeeper
[hadoop@server2 ~]$ cd /tmp/zookeeper/
[hadoop@server2 zookeeper]$ echo 1 > myid
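Likewise on the other two cluster nodes, with only the number changing:

[hadoop@server3 ~]$ mkdir /tmp/zookeeper && echo 2 > /tmp/zookeeper/myid
[hadoop@server4 ~]$ mkdir /tmp/zookeeper && echo 3 > /tmp/zookeeper/myid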
3. Start the ZooKeeper service on the three cluster nodes and check its status; server3 has been elected leader:
On server2:
[hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
On server3:
[hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
On server4:
[hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
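The quorum can also be probed over the client port with ZooKeeper's stat four-letter command from any node (assuming nc is installed; the Mode line should match what zkServer.sh status reports):

[hadoop@server2 ~]$ echo stat | nc 172.25.17.3 2181 | grep Mode
Mode: leader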
4. On server1, edit the Hadoop configuration:
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ cd etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml
<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- clients address the HA pair through the logical name "masters" -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://masters</value>
    </property>

    <!-- ZooKeeper ensemble used for automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>172.25.17.2:2181,172.25.17.3:2181,172.25.17.4:2181</value>
    </property>
</configuration>
Edit the hdfs-site.xml file:
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
    <!-- keep three replicas, one per datanode -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <!-- logical name of the HA nameservice, matching fs.defaultFS -->
    <property>
        <name>dfs.nameservices</name>
        <value>masters</value>
    </property>

    <!-- the two NameNodes behind the "masters" nameservice -->
    <property>
        <name>dfs.ha.namenodes.masters</name>
        <value>h1,h2</value>
    </property>

    <!-- h1: server1 -->
    <property>
        <name>dfs.namenode.rpc-address.masters.h1</name>
        <value>172.25.17.1:9000</value>
    </property>

    <property>
        <name>dfs.namenode.http-address.masters.h1</name>
        <value>172.25.17.1:50070</value>
    </property>

    <!-- h2: server5 -->
    <property>
        <name>dfs.namenode.rpc-address.masters.h2</name>
        <value>172.25.17.5:9000</value>
    </property>

    <property>
        <name>dfs.namenode.http-address.masters.h2</name>
        <value>172.25.17.5:50070</value>
    </property>

    <!-- shared edit log kept on the three JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://172.25.17.2:8485;172.25.17.3:8485;172.25.17.4:8485/masters</value>
    </property>

    <!-- local storage for JournalNode edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/tmp/journaldata</value>
    </property>

    <!-- let the ZKFC fail over automatically -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.masters</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- fence the old active over ssh, falling back to /bin/true -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
        sshfence
        shell(/bin/true)
        </value>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
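To confirm the HA settings were picked up, hdfs getconf can be queried for individual keys; with the configuration above it should return:

[hadoop@server1 hadoop]$ bin/hdfs getconf -confKey dfs.nameservices
masters
[hadoop@server1 hadoop]$ bin/hdfs getconf -confKey dfs.ha.namenodes.masters
h1,h2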
Edit the slaves file:
[hadoop@server1 hadoop]$ vim slaves
172.25.17.2
172.25.17.3
172.25.17.4
5. Start the JournalNodes on the cluster nodes:
On server2:
[hadoop@server2 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server2.out
[hadoop@server2 hadoop]$ jps
1526 Jps
1475 JournalNode
1279 QuorumPeerMain
On server3:
[hadoop@server3 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server3.out
[hadoop@server3 hadoop]$ jps
1320 JournalNode
1370 Jps
1233 QuorumPeerMain
On server4:
[hadoop@server4 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server4.out
[hadoop@server4 hadoop]$ jps
1579 QuorumPeerMain
1660 JournalNode
1710 Jps
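Each JournalNode should now be listening on RPC port 8485, the port referenced in dfs.namenode.shared.edits.dir; a quick check (assuming netstat is available, output trimmed):

[hadoop@server2 hadoop]$ netstat -tln | grep 8485
tcp        0      0 0.0.0.0:8485        0.0.0.0:*        LISTEN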
6. Test passwordless SSH from server1 (the NameNode must be able to reach every node, including server5, without a password prompt):
[hadoop@server1 hadoop]$ ssh server5
[hadoop@server1 hadoop]$ ssh 172.25.17.5
[hadoop@server1 hadoop]$ ssh 172.25.17.4
[hadoop@server1 hadoop]$ ssh server4
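If any of these logins asks for a password, generate a key for the hadoop user and authorize it; because /home/hadoop is shared over NFS, a single authorized_keys file covers all five hosts. A minimal sketch, assuming an empty passphrase (this is also the key that dfs.ha.fencing.ssh.private-key-files points to):

[hadoop@server1 ~]$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
[hadoop@server1 ~]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[hadoop@server1 ~]$ chmod 600 ~/.ssh/authorized_keys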
7. Initialization. Format the NameNode:
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
Copy the NameNode metadata to the standby, server5:
[hadoop@server1 hadoop]$ scp -r /tmp/hadoop-hadoop/ 172.25.17.5:/tmp
fsimage_0000000000000000000 100% 353 0.3KB/s 00:00
seen_txid 100% 2 0.0KB/s 00:00
VERSION 100% 204 0.2KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
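The scp is needed because the metadata lives under hadoop.tmp.dir (/tmp/hadoop-hadoop by default), which is on each node's local /tmp rather than on the NFS share. An alternative is Hadoop's built-in bootstrap command, run on server5 once the NameNode on server1 is up:

[hadoop@server5 hadoop]$ bin/hdfs namenode -bootstrapStandby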
[hadoop@server1 hadoop]$ bin/hdfs zkfc -formatZK
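formatZK creates the election znode under /hadoop-ha in ZooKeeper, which can be verified from any cluster node with the ZooKeeper CLI:

[hadoop@server2 zookeeper-3.4.9]$ bin/zkCli.sh -server 172.25.17.2:2181
[zk: 172.25.17.2:2181(CONNECTED) 0] ls /hadoop-ha
[masters]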
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server1 server5]
server5: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server5.out
server1: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server1.out
The authenticity of host '172.25.17.4 (172.25.17.4)' can't be established.
RSA key fingerprint is 4c:43:a6:a9:2c:4c:df:20:9f:a5:f2:7e:0a:16:77:6e.
Are you sure you want to continue connecting (yes/no)? 172.25.17.3: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server3.out
172.25.17.2: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server2.out
yes
172.25.17.4: Warning: Permanently added '172.25.17.4' (RSA) to the list of known hosts.
172.25.17.4: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server4.out
Starting journal nodes [172.25.17.2 172.25.17.3 172.25.17.4]
172.25.17.4: journalnode running as process 1660. Stop it first.
172.25.17.3: journalnode running as process 1320. Stop it first.
172.25.17.2: journalnode running as process 1475. Stop it first.
Starting ZK Failover Controllers on NN hosts [server1 server5]
server1: starting zkfc, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-zkfc-server1.out
server5: starting zkfc, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-zkfc-server5.out
[hadoop@server1 hadoop]$ jps
1499 NameNode
2197 Jps
1793 DFSZKFailoverController
Test: server1 is currently the active NameNode.
Kill the namenode process on server1:
[hadoop@server1 hadoop]$ jps
2573 Jps
1499 NameNode
1793 DFSZKFailoverController
[hadoop@server1 hadoop]$ kill -9 1499
server5 takes over as active.
Restart the namenode on server1:
[hadoop@server1 hadoop]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server1.out
server1 comes back as standby:
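The role of each NameNode can be confirmed at any point with hdfs haadmin, using the IDs defined in hdfs-site.xml; after the failover and restart the states should be:

[hadoop@server1 hadoop]$ bin/hdfs haadmin -getServiceState h1
standby
[hadoop@server1 hadoop]$ bin/hdfs haadmin -getServiceState h2
active

The same roles are also shown on the NameNode web UIs at 172.25.17.1:50070 and 172.25.17.5:50070.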