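This walkthrough shows how to scale out, scale in, and migrate a ZooKeeper ensemble online, provided that a Leader exists at all times.
ZooKeeper keeps serving requests throughout the whole process.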
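1. Precautions
First, the ensemble must keep serving while we grow it from 3 nodes to 5.
A 3-node ensemble needs at least 2 running nodes to stay in service (X/2+1, using integer division), while a 5-node ensemble needs at least 3.
We therefore have to make sure that at least 3 ZooKeeper nodes are running at all times.
In addition, restart the machine with the smallest myid first and proceed in ascending myid order.
The Leader is restarted last, whatever its myid.
The reason is that in ZooKeeper a server with a larger myid initiates connections to servers with smaller myids, while a smaller one does not initiate connections to a larger one. If the machine with the smallest myid is restarted last, it may therefore fail to rejoin the ensemble.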
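The quorum rule and the restart order above can be sketched in a few lines of Python. This is only an illustration, not part of ZooKeeper: the function names `quorum_size` and `restart_order` are hypothetical, and `X/2+1` is read as integer division.

```python
def quorum_size(ensemble_size: int) -> int:
    """Minimum number of running servers for an ensemble of this size (X/2+1)."""
    return ensemble_size // 2 + 1


def restart_order(myids, leader_myid):
    """Rolling-restart order: ascending myid, with the Leader moved to the end."""
    followers = sorted(m for m in myids if m != leader_myid)
    # Only append the Leader if it is actually part of the list we were given.
    return followers + ([leader_myid] if leader_myid in myids else [])


if __name__ == "__main__":
    print(quorum_size(3))                            # 2
    print(quorum_size(5))                            # 3
    print(restart_order([1, 2, 3], leader_myid=2))   # [1, 3, 2]
```

With the three original machines and the Leader on myid 2, the order is myid 1, then myid 3, then the Leader, which matches the sequence used later in this walkthrough.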
2. Environment
Five machines:
IP | Hostname |
---|---|
192.168.2.102 | hadoop102 |
192.168.2.103 | hadoop103 |
192.168.2.104 | hadoop104 |
192.168.2.105 | hadoop105 |
192.168.2.106 | hadoop106 |
JDK: 1.8.144
ZooKeeper: zookeeper-3.4.10
myid: 1-5
Configuration file:
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
3. Set up a 3-node ZooKeeper ensemble
hadoop102
[victor@hadoop102 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
hadoop103
[victor@hadoop103 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
hadoop104
[victor@hadoop104 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
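4. Expand it to a 5-node ZooKeeper ensemble
1) First check the state of the original ZooKeeper ensemble
echo mntr | nc localhost 2181
This four-letter-word command reports the state of the ensemble. The follower-related metrics are only visible on the Leader.
2) Check on hadoop103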
[victor@hadoop103 bin]$ echo mntr|nc localhost 2181
zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 3
zk_packets_sent 2
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 27
zk_max_file_descriptor_count 65535
zk_followers 2
zk_synced_followers 2
zk_pending_syncs 0
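The same check can be scripted instead of piping through `nc`. The following is a minimal sketch, assuming the server answers the four-letter word `mntr` on the client port and then closes the connection; the helper names `parse_mntr` and `fetch_mntr` are hypothetical.

```python
import socket


def parse_mntr(text: str) -> dict:
    """Parse mntr output (one 'key<whitespace>value' pair per line) into a dict."""
    metrics = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1]
    return metrics


def fetch_mntr(host: str = "localhost", port: int = 2181, timeout: float = 5.0) -> dict:
    """Send 'mntr' over a plain TCP socket and parse the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", errors="replace"))


if __name__ == "__main__":
    sample = "zk_server_state\tleader\nzk_followers\t2\nzk_synced_followers\t2\n"
    print(parse_mntr(sample)["zk_followers"])  # 2
```

All values are kept as strings, so numeric fields such as `zk_followers` need an explicit `int(...)` before comparison.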
3) Start ZooKeeper on the other two machines. Their configuration file is:
zoo.cfg
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
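Since the same `server.N` block has to be written to every machine, it can help to generate it from one mapping rather than editing it by hand. This is a small sketch under the assumption that every node uses the ports shown above (2888 for follower-to-Leader traffic, 3888 for leader election); the function name `server_lines` is hypothetical.

```python
def server_lines(hosts_by_myid, peer_port=2888, election_port=3888):
    """Build the server.N lines of zoo.cfg from a {myid: hostname} mapping."""
    return [
        f"server.{myid}={host}:{peer_port}:{election_port}"
        for myid, host in sorted(hosts_by_myid.items())
    ]


if __name__ == "__main__":
    five_nodes = {
        1: "hadoop102",
        2: "hadoop103",
        3: "hadoop104",
        4: "hadoop105",
        5: "hadoop106",
    }
    print("\n".join(server_lines(five_nodes)))
```

Generating the block once and copying it to each host reduces the chance that two machines end up with different member lists by accident.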
4) Start them
hadoop105
[victor@hadoop105 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
hadoop106
[victor@hadoop106 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
5) Check the ensemble again
Still on hadoop103:
[victor@hadoop103 bin]$ echo mntr|nc localhost 2181
zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 4
zk_packets_sent 3
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 31
zk_max_file_descriptor_count 65535
zk_followers 4
zk_synced_followers 4
zk_pending_syncs 0
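zk_followers is now 4: the number of connected followers went from 2 to 4, and zk_synced_followers is also 4, so the two new nodes have finished syncing.
Next, we do a rolling restart of the first three machines (myid 1-3), starting with hadoop102. If you want extra reassurance, watch the logs on the Leader or on the two newly added machines while it is down.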
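The "is it safe to continue?" judgment made here can be written down as a small predicate over the Leader's `mntr` fields. This is a sketch only: the function name `followers_synced` is hypothetical, and it assumes the metrics have already been collected into a dict of strings keyed by the field names shown in the output above.

```python
def followers_synced(metrics: dict, expected_followers: int) -> bool:
    """True if this node is the Leader and all expected followers are connected and synced."""
    return (
        metrics.get("zk_server_state") == "leader"
        and int(metrics.get("zk_followers", 0)) == expected_followers
        and int(metrics.get("zk_synced_followers", 0)) == expected_followers
        and int(metrics.get("zk_pending_syncs", 0)) == 0
    )


if __name__ == "__main__":
    leader_metrics = {
        "zk_server_state": "leader",
        "zk_followers": "4",
        "zk_synced_followers": "4",
        "zk_pending_syncs": "0",
    }
    print(followers_synced(leader_metrics, expected_followers=4))  # True
```

Running such a check between every stop/start step makes the rolling restart easier to automate, because the next machine is only touched once the previous one has rejoined.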
6) Stop hadoop102
[victor@hadoop102 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
7) Modify its configuration file
From the original:
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
To the new:
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
Start it:
[victor@hadoop102 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop102 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
Next, skip hadoop103, which is the Leader, and handle hadoop104 first.
8) Stop hadoop104
[victor@hadoop104 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
Modify its configuration file:
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
Start it:
[victor@hadoop104 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop104 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
9) Finally, handle hadoop103, the original Leader
Stop it:
[victor@hadoop103 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
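New ZooKeeper Leader
Check which machine is the new Leader. ZooKeeper tends to elect the machine with the largest myid as Leader (when the candidates' data is equally up to date), so hadoop106, whose myid is 5, became the Leader.
[victor@hadoop106 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader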
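Why the largest myid wins here can be illustrated with a simplified sketch of the vote ordering. It assumes, to the best of my understanding of ZooKeeper's fast leader election, that votes are compared by epoch first, then by last zxid, then by myid. The real election also exchanges notifications in rounds and needs a quorum to agree, which is not modeled. The `Vote` type, the `preferred` helper, and the epoch/zxid values are hypothetical.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    epoch: int  # election epoch of the proposed Leader
    zxid: int   # last transaction id seen by the proposed Leader
    myid: int   # server id from the myid file


def preferred(votes):
    """Return the vote that wins under (epoch, zxid, myid) ordering."""
    # NamedTuple instances compare field by field, in declaration order.
    return max(votes)


if __name__ == "__main__":
    # hadoop103 (myid 2) is stopped; the others hold the same data (hypothetical values).
    remaining = [Vote(1, 100, 1), Vote(1, 100, 3), Vote(1, 100, 4), Vote(1, 100, 5)]
    print(preferred(remaining).myid)  # 5
```

The myid is only the final tie-breaker: a server with a smaller myid but a newer zxid would be preferred over one with a larger myid and older data.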
Modify the configuration file on hadoop103:
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
Start it:
[victor@hadoop103 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop103 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
Check the ensemble state on the new Leader, hadoop106:
[victor@hadoop106 bin]# echo mntr|nc localhost 2181
zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency 1
zk_max_latency 4
zk_min_latency 0
zk_packets_received 12
zk_packets_sent 11
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 33
zk_max_file_descriptor_count 65535
zk_followers 4
zk_synced_followers 4
zk_pending_syncs 0
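Everything looks normal. At this point the original 3-node ensemble has been expanded to 5 nodes, which is half of the job. Once we shrink the 5 nodes back to 3, leaving out the original machines with myid 1-2, the migration is complete.
Following the precautions above, at least 3 of the 5 nodes must be running at any time. We therefore first change the configuration files on machines 3-5 to the 3-node layout, and only then shut down machines 1-2.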
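Before running the shrink steps, the planned sequence can be checked against the "at least 3 running" rule. The sketch below only counts running processes after each step and does not model which member list each node has loaded; the names `lowest_running`, `HOSTS`, and `SHRINK_PLAN` are hypothetical.

```python
def lowest_running(all_hosts, events):
    """Replay ("stop"|"start", host) events and return the smallest number of running nodes seen."""
    running = set(all_hosts)
    lowest = len(running)
    for action, host in events:
        if action == "stop":
            running.discard(host)
        elif action == "start":
            running.add(host)
        else:
            raise ValueError(f"unknown action: {action}")
        lowest = min(lowest, len(running))
    return lowest


HOSTS = ["hadoop102", "hadoop103", "hadoop104", "hadoop105", "hadoop106"]

# Reconfigure machines 3-5 one at a time, then retire machines 1-2.
SHRINK_PLAN = [
    ("stop", "hadoop104"), ("start", "hadoop104"),
    ("stop", "hadoop105"), ("start", "hadoop105"),
    ("stop", "hadoop106"), ("start", "hadoop106"),
    ("stop", "hadoop102"), ("stop", "hadoop103"),
]

if __name__ == "__main__":
    print(lowest_running(HOSTS, SHRINK_PLAN))  # 3
```

A plan that stopped machines 1-2 before restarting any of machines 3-5 would dip below 3 as soon as a third node was taken down, which is exactly the ordering the precautions warn against.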
5. Shrink it to 3 nodes
1) Stop hadoop104
[victor@hadoop104 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
2) Change the hadoop104 configuration file to
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
3) Start hadoop104
[victor@hadoop104 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop104 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
4) Next, modify hadoop105
Stop it:
[victor@hadoop105 bin]# ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ...
5) With hadoop105 stopped, change its configuration file to
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
6) Start hadoop105
[victor@hadoop105 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop105 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
7) Finally, modify hadoop106
Stop hadoop106:
[victor@hadoop106 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
8) Change the hadoop106 configuration file to
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
After hadoop106 was stopped, leadership moved to hadoop105, which has the second-largest myid.
9) Start it
[victor@hadoop106 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[victor@hadoop106 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
10) Check on the Leader, hadoop105
[victor@hadoop105 bin]$ echo mntr| nc localhost 2181
zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 4
zk_packets_sent 3
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 27
zk_max_file_descriptor_count 65535
zk_followers 2
zk_synced_followers 2
zk_pending_syncs 0
zk_followers is now 2, which shows that the Leader no longer recognizes machines 1-2. They can now be shut down.
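Which machines are safe to retire follows directly from comparing the old and new member lists. The sketch below parses the `server.N=host:port:port` lines used throughout this walkthrough and lists the entries that were dropped; the helper names `parse_servers` and `removed_servers` are hypothetical.

```python
def parse_servers(cfg_text: str) -> dict:
    """Extract {myid: hostname} from the server.N lines of a zoo.cfg."""
    servers = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if line.startswith("server.") and "=" in line:
            key, value = line.split("=", 1)
            servers[int(key[len("server."):])] = value.split(":")[0]
    return servers


def removed_servers(old: dict, new: dict) -> dict:
    """Members present in the old configuration but absent from the new one."""
    return {myid: host for myid, host in old.items() if myid not in new}


OLD_CFG = """\
server.1=hadoop102:2888:3888
server.2=hadoop103:2888:3888
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
"""

NEW_CFG = """\
server.3=hadoop104:2888:3888
server.4=hadoop105:2888:3888
server.5=hadoop106:2888:3888
"""

if __name__ == "__main__":
    print(removed_servers(parse_servers(OLD_CFG), parse_servers(NEW_CFG)))
    # {1: 'hadoop102', 2: 'hadoop103'}
```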
11) Stop hadoop102
[victor@hadoop102 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
12) Stop hadoop103
[victor@hadoop103 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ...
13) Check again on the Leader, hadoop105
[victor@hadoop105 bin]$ echo mntr | nc localhost 2181
zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 5
zk_packets_sent 4
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 27
zk_max_file_descriptor_count 65535
zk_followers 2
zk_synced_followers 2
zk_pending_syncs 0