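ZooKeeper is an open-source coordination service for distributed (clustered) applications. It is a foundational service in the Hadoop big-data ecosystem, but its use is not limited to Hadoop. This series of experiments covers deploying, expanding, shrinking, operating, and monitoring a ZooKeeper cluster. This document records the results as a research archive so that gaps can be found and filled later.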
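1. ZooKeeper cluster layout and server information
The previous part, ZooKeeper Cluster Management (1): ZooKeeper Cluster Deployment, built a 3-node cluster. This part records expanding that cluster without stopping the service. After the expansion the cluster has five nodes: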
Hostname | Server ID | OS | IP address | Role | Status |
---|---|---|---|---|---|
zookeeper-node-1 | server.1 | CentOS7 | 10.120.67.19 | zookeeper | online (follower) |
zookeeper-node-2 | server.2 | CentOS7 | 10.120.67.20 | zookeeper | online (leader) |
zookeeper-node-3 | server.3 | CentOS7 | 10.120.67.21 | zookeeper | online (follower) |
zookeeper-node-4 | server.4 | CentOS7 | 10.120.67.22 | zookeeper | new node (expansion) |
zookeeper-node-5 | server.5 | CentOS7 | 10.120.67.23 | zookeeper | new node (expansion) |
The last two nodes in the table are the targets of this expansion. The ZooKeeper service stays available to clients throughout the expansion.
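As background for why the service can stay up while servers are added and restarted one at a time: ZooKeeper keeps serving as long as a majority of the voting servers is available. The sketch below only does that arithmetic; the function names are made up for illustration.

```shell
#!/bin/sh
# Majority quorum for an ensemble of N voting servers: floor(N/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that may be down while a majority is still available.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For 3 servers this gives a quorum of 2 (one server may be down), and for 5 servers a quorum of 3 (two may be down), which is why restarting a single node at a time does not take the cluster offline.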
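2. Server preparation
Prepare the servers with the following steps. For the details of each step, see the previous part, ZooKeeper Cluster Management (1): ZooKeeper Cluster Deployment.
1. Disable SELinux and firewalld (iptables)
2. Configure the hostname and the /etc/hosts file (not actually used in this example)
The /etc/hosts file on the three existing nodes must also be updated to include the new nodes: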
10.120.67.19 zookeeper-node-1
10.120.67.20 zookeeper-node-2
10.120.67.21 zookeeper-node-3
10.120.67.22 zookeeper-node-4
10.120.67.23 zookeeper-node-5
3. Tune Linux system parameters
4. Synchronize the node clocks with NTP
5. Install the JDK on all nodes
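For the /etc/hosts step, one possible way to add the new entries on the existing nodes without creating duplicates is sketched below. The function name and the HOSTS_FILE override are hypothetical helpers for illustration, not something used in the original experiment.

```shell
#!/bin/sh
# Append "IP NAME" to a hosts file only if NAME is not already listed.
# HOSTS_FILE defaults to /etc/hosts; override it to try the function on a copy.
add_host_entry() {
    ip=$1
    name=$2
    file=${HOSTS_FILE:-/etc/hosts}
    # Look for NAME as an exact hostname field (any field after the IP).
    if ! awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                           END { exit !found }' "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example, to be run as root on each of the three existing nodes:
#   add_host_entry 10.120.67.22 zookeeper-node-4
#   add_host_entry 10.120.67.23 zookeeper-node-5
```

Running the function twice with the same arguments leaves a single entry, so it is safe to repeat on every node.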
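3. Installing and configuring the ZooKeeper cluster
For downloading, installing, and configuring ZooKeeper, see the previous part, ZooKeeper Cluster Management (1): ZooKeeper Cluster Deployment. Download: zookeeper-3.4.12
Note: at this point the two new nodes are only configured, not started.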
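The configuration itself is not shown in this post. As a rough sketch, the server list in zoo.cfg on the new nodes would name all five servers. The IPs and server numbers come from the table above; the ports 2888 and 3888 are the conventional quorum and election ports and are an assumption here, as is the DATA_DIR placeholder.

```shell
#!/bin/sh
# Print the server.N lines for the five-node ensemble.
# IPs and IDs are taken from the table above; ports 2888 (quorum) and
# 3888 (leader election) are conventional values, assumed for this sketch.
print_server_lines() {
    id=1
    for ip in 10.120.67.19 10.120.67.20 10.120.67.21 10.120.67.22 10.120.67.23; do
        echo "server.${id}=${ip}:2888:3888"
        id=$((id + 1))
    done
}

print_server_lines

# Each node also needs a myid file holding its own number, e.g. on node 4:
#   echo 4 > "$DATA_DIR/myid"   # DATA_DIR stands for the dataDir set in zoo.cfg
```

The printed lines can be compared against, or pasted into, conf/zoo.cfg on each node.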
4. Expanding the cluster online
The leader of the current 3-node cluster is node 2:
[root@zookeeper-node-2 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/app/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[root@zookeeper-node-2 ~]# echo mntr|nc localhost 2181
zk_version 3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 6
zk_packets_sent 5
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 13
zk_watch_count 0
zk_ephemerals_count 2
zk_approximate_data_size 212
zk_open_file_descriptor_count 35
zk_max_file_descriptor_count 102400
zk_followers 2
zk_synced_followers 2
zk_pending_syncs 0
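When the same check is repeated several times during the expansion, it can help to pull a single value out of the mntr output instead of reading the whole list. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Read `mntr` output on stdin and print the value of one key.
# Key and value are separated by whitespace, so awk's default splitting works.
mntr_value() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Typical use on the leader (needs nc and a running server):
#   echo mntr | nc localhost 2181 | mntr_value zk_synced_followers
```

On the 3-node cluster above this would print 2 for zk_synced_followers. Note that zk_followers and related keys are only reported by the leader.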
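Create znodes in ZooKeeper so that the state can be compared before and after the expansion, on both old and new nodes. The znodes are ephemeral, and the client connection that created them is kept open throughout the expansion.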
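The exact commands were not recorded. Judging from the `ls /tmp` output shown later in this post, the session probably looked something like the following sketch. The data value "demo" is a placeholder, and the wrapper function exists only so the commands can be printed as a checklist.

```shell
#!/bin/sh
# Print the commands to type into an interactive zkCli.sh session.
# Paths follow the `ls /tmp` output shown later; "demo" is placeholder data.
print_ephemeral_test_commands() {
    cat <<'EOF'
create /tmp demo
create -s -e /tmp/tmp_ demo
create -s -e /tmp/tmp_ demo
ls /tmp
EOF
}

# Open the session on node 1 and leave it running during the expansion:
#   zkCli.sh -server localhost:2181
print_ephemeral_test_commands
```

The `-e` flag makes a znode ephemeral, so it lives only as long as the client session, and `-s` appends a sequence number to the name. The session has to stay open in an interactive client, because the ephemeral znodes are removed once it is closed.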
Start the two new nodes.
Check the status of the two new nodes.
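The output of these two steps was not captured. A small sketch of how the role of each new node could be checked, using the same zkServer.sh script as elsewhere in this post (the helper name is hypothetical):

```shell
#!/bin/sh
# Extract the role from `zkServer.sh status` output read on stdin.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# On each new node:
#   zkServer.sh start
#   zkServer.sh status | zk_mode    # expected to print "follower" once the node has joined
```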
On the leader, check the follower information:
[root@zookeeper-node-2 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/app/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[root@zookeeper-node-2 ~]# echo mntr|nc localhost 2181
zk_version 3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 10
zk_packets_sent 9
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 13
zk_watch_count 0
zk_ephemerals_count 2
zk_approximate_data_size 212
zk_open_file_descriptor_count 39
zk_max_file_descriptor_count 102400
zk_followers 4 #Here
zk_synced_followers 4 #Here
zk_pending_syncs 0 #Here
Update the configuration on nodes 1-3 so that it lists all five servers, then restart them one at a time.
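The restarts in this experiment were done by hand. If they were scripted, a rolling restart could look roughly like the sketch below. It assumes passwordless ssh as root, zkServer.sh on the remote PATH, and a zoo.cfg that has already been updated on each node. The function names and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Poll the four-letter command `ruok` until the server answers `imok`.
wait_until_ok() {
    host=$1
    tries=0
    while [ "$tries" -lt 30 ]; do
        [ "$(echo ruok | nc "$host" 2181 2>/dev/null)" = "imok" ] && return 0
        tries=$((tries + 1))
        sleep 1
    done
    return 1
}

# Restart the given hosts one by one, waiting for each to answer again
# before moving on, so that only one server is down at any time.
rolling_restart() {
    for host in "$@"; do
        run ssh "root@$host" zkServer.sh restart
        if [ "${DRY_RUN:-0}" != "1" ]; then
            wait_until_ok "$host" || { echo "$host did not come back" >&2; return 1; }
        fi
    done
}

# Dry run: only prints the commands that would be executed.
DRY_RUN=1
rolling_restart zookeeper-node-1 zookeeper-node-2 zookeeper-node-3
```

Waiting for `imok` only shows that the server process is answering. Checking `zk_synced_followers` on the leader, as done above, is a stronger confirmation that the restarted node has rejoined.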
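Restart node 1. The zkCli.sh session that created the ephemeral nodes was connected to node 1; after the client reconnects, the ephemeral nodes are still there: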
[zk: localhost:2181(CONNECTED) 13] 2018-05-31 00:53:10,417 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1161] - Unable to read additional data from server sessionid 0x1001f18ae870000, likely server has closed socket, closing socket connection and attempting reconnect
WATCHER::
WatchedEvent state:Disconnected type:None path:null
2018-05-31 00:53:11,727 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2018-05-31 00:53:11,730 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1165] - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
2018-05-31 00:53:13,112 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2018-05-31 00:53:13,113 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1165] - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
2018-05-31 00:53:14,256 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2018-05-31 00:53:14,256 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@878] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2018-05-31 00:53:14,272 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1001f18ae870000, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 13] ls /tmp
[tmp_0000000000, tmp_0000000001]
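Restart node 2. This is the leader; some documentation recommends restarting the leader last, but here it was restarted before node 3 with no noticeable impact.
Afterwards, node 5 had become the leader.
The connection established on node 1 logged warnings again: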
[zk: localhost:2181(CONNECTED) 14] 2018-05-31 00:56:55,195 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1161] - Unable to read additional data from server sessionid 0x1001f18ae870000, likely server has closed socket, closing socket connection and attempting reconnect
WATCHER::
WatchedEvent state:Disconnected type:None path:null
2018-05-31 00:56:56,622 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2018-05-31 00:56:56,622 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@878] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2018-05-31 00:56:56,626 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1001f18ae870000, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 14] ls /tmp
[tmp_0000000000, tmp_0000000001]
Restart node 3.
Check the status on the new leader:
echo mntr| nc localhost 2181
zk_version 3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 2
zk_packets_sent 1
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 13
zk_watch_count 0
zk_ephemerals_count 2
zk_approximate_data_size 212
zk_open_file_descriptor_count 39
zk_max_file_descriptor_count 102400
zk_followers 4
zk_synced_followers 4
zk_pending_syncs 0
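The connection that created the ephemeral znodes is still working and the ephemeral nodes still exist. After that connection was closed and a new one opened, the ephemeral nodes had been removed automatically.
Everything works as expected: the cluster has now been expanded from 3 nodes to 5. When needed, client configuration files can be updated to add the new nodes to the ZooKeeper host list.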
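As an illustration of the updated client host list, the sketch below builds a comma-separated connection string for the five nodes. Port 2181 is the client port used throughout this post; the function name is hypothetical.

```shell
#!/bin/sh
# Build a comma-separated host:port connection string from a list of hosts.
connect_string() {
    result=""
    for host in "$@"; do
        result="${result:+$result,}$host:2181"
    done
    echo "$result"
}

connect_string 10.120.67.19 10.120.67.20 10.120.67.21 10.120.67.22 10.120.67.23
```

The resulting string can be passed to a client, for example `zkCli.sh -server "$(connect_string ...)"`, so that the client can fail over to any of the five servers.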
The next part covers shrinking the cluster and migrating nodes online.