Background: three Alibaba Cloud servers running CentOS 7
1. Download
Links:
http://zookeeper.apache.org/releases.html
https://archive.apache.org/dist/zookeeper/
2. Upload
Use the SFTP feature of MobaXterm to upload the file to /opt/pkg.
1) Download locally, then upload
2) Or download directly on the server:
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
3. Extract
Install and configure on all three machines.
tar -zxvf zookeeper-3.4.6.tar.gz -C /opt/software/   # extract to the given directory
4. Configuration file
cd /opt/software/zookeeper-3.4.6/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
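Comment out these bundled defaults:
clientPort=2181
dataDir=/tmp/zookeeper
(Files under /tmp are deleted periodically, so dataDir must not point there; otherwise errors occur.)
First machine
clientPort=2181
dataDir=/opt/software/zookeeper/data
dataLogDir=/opt/software/zookeeper/logs
# On Alibaba Cloud, set the local machine's own entry to 0.0.0.0 and use the public IP for the other two machines.
server.1=0.0.0.0:2881:3881
# Host mappings are configured, so hostnames stand in for the IPs here.
server.2=cdh02:2881:3881
server.3=cdh03:2881:3881
Second machine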
clientPort=2181
dataDir=/opt/software/zookeeper/data
dataLogDir=/opt/software/zookeeper/logs
server.1=cdh01:2881:3881
server.2=0.0.0.0:2881:3881
server.3=cdh03:2881:3881
Third machine
clientPort=2181
dataDir=/opt/software/zookeeper/data
dataLogDir=/opt/software/zookeeper/logs
server.1=cdh01:2881:3881
server.2=cdh02:2881:3881
server.3=0.0.0.0:2881:3881
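The three per-machine blocks above differ only in which server.N line carries 0.0.0.0. A minimal sketch of generating those lines from a machine's own id, assuming the hostnames cdh01 to cdh03 and the ports 2881/3881 used above; the function name server_lines is hypothetical:

```shell
# Print the server.N lines for the machine whose id is $1.
# The local entry uses 0.0.0.0; the other entries use hostnames.
server_lines() {
    my_id=$1
    for id in 1 2 3; do
        if [ "$id" = "$my_id" ]; then
            host=0.0.0.0
        else
            host="cdh0$id"
        fi
        echo "server.$id=$host:2881:3881"
    done
}

# Lines for the second machine
server_lines 2
```

The output can be pasted into zoo.cfg on the matching machine; it does not replace checking the file by hand.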
5. Create the data directories and the myid file
mkdir -p /opt/software/zookeeper/data
mkdir -p /opt/software/zookeeper/logs
cd /opt/software/zookeeper/data/
Run only the line that matches the current machine:
echo 1 > myid   # first machine
echo 2 > myid   # second machine
echo 3 > myid   # third machine
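Since each machine's id matches the number in its hostname here, the id can also be derived rather than typed. A small sketch, assuming hostnames of the form cdh01, cdh02, cdh03 (letters followed by a zero-padded number); myid_for_host is a hypothetical helper name:

```shell
# Derive the ZooKeeper id from a hostname such as cdh02 -> 2.
# Strips the leading non-digit prefix, then any leading zeros.
myid_for_host() {
    echo "$1" | sed -e 's/^[^0-9]*//' -e 's/^0*//'
}

myid_for_host cdh02
```

On a real machine the result could be written with: myid_for_host "$(hostname)" > /opt/software/zookeeper/data/myid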
6. Configure environment variables
vim /etc/profile.d/my_env.sh
export ZOOKEEPER_HOME=/opt/software/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source /etc/profile.d/my_env.sh   # reload
7. Start
Start the three machines one after another.
cd /opt/software/zookeeper-3.4.6/
bin/zkServer.sh start
Check the status
bin/zkServer.sh status
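Running the same command on three machines by hand is easy to get out of step. A sketch of a wrapper that issues one zkServer.sh action on every host, assuming passwordless SSH as root and that java is on the PATH of non-interactive shells; HOSTS, ZK_HOME, RUN and zk_all are illustrative names, not part of ZooKeeper:

```shell
# Hosts as named in zoo.cfg; install path as used in the steps above.
HOSTS="cdh01 cdh02 cdh03"
ZK_HOME=/opt/software/zookeeper-3.4.6

# Print the ssh command for each host, or execute it when RUN=1 is set.
zk_all() {
    action=$1
    for host in $HOSTS; do
        cmd="ssh root@$host $ZK_HOME/bin/zkServer.sh $action"
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Dry run by default: only prints the three commands.
zk_all start
```

Printing first and executing only with RUN=1 makes it possible to check the commands before touching the cluster.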
8. Errors
1) Port already in use
Error message:
ERROR [/0.0.0.0:3881:QuorumCnxManager$Listener@517] - Exception while listening java.net.BindException: Address already in use (Bind failed)
Checking the port confirms it is occupied (2181 turned out to be occupied as well):
netstat -tunlp | grep 3881
Edit zoo.cfg:
1) Change 2181 to 2189 (2181 was also found to be in use)
2) Change 2881 to 2888 and 3881 to 3888 (on all three machines)
After that, the restart succeeded.
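The port check can be done for every port in zoo.cfg before starting, instead of after a failed bind. A sketch, assuming the config path used above and that netstat is installed; cfg_ports is a hypothetical helper:

```shell
# List every port named in a zoo.cfg: clientPort plus the two ports
# on each server.N line. Commented-out lines are skipped.
cfg_ports() {
    grep -v '^[[:space:]]*#' "$1" |
        sed -n -e 's/^clientPort=\([0-9]*\).*/\1/p' \
               -e 's/^server\.[0-9]*=[^:]*:\([0-9]*\):\([0-9]*\).*/\1 \2/p' |
        tr ' ' '\n' | sort -un
}

CFG=/opt/software/zookeeper-3.4.6/conf/zoo.cfg
if [ -f "$CFG" ]; then
    for port in $(cfg_ports "$CFG"); do
        # Same check as above: is anything already listening on this port?
        netstat -tunlp | grep ":$port " || echo "port $port looks free"
    done
fi
```

Run it while ZooKeeper is stopped; otherwise ZooKeeper's own listeners show up as the occupants.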
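2) The IP address does not belong to a local network interface
Cloud servers are virtualized: the public IP is bound to a NIC on the physical gateway, and no interface inside the virtual machine carries it, so ZooKeeper cannot listen on that address.
Solution: listen on all interfaces. Either of the following works; pick one.
# Option 1: configure the public IPs as usual and enable this parameter
quorumListenOnAllIPs=true
# Option 2: set the local machine's own entry to 0.0.0.0 (see the full zoo.cfg above)
server.1=0.0.0.0:2888:3888
Note: on Alibaba Cloud, remember to open these ports in the security group.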
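For the first option, the parameter has to be added on every machine. A sketch that appends it only when no uncommented setting exists yet, so running it twice does no harm; enable_listen_all is a hypothetical helper, and the path is the one used in this walkthrough:

```shell
# Append quorumListenOnAllIPs=true to the given zoo.cfg unless an
# uncommented quorumListenOnAllIPs line is already present.
enable_listen_all() {
    if ! grep -q '^quorumListenOnAllIPs=' "$1"; then
        echo 'quorumListenOnAllIPs=true' >> "$1"
    fi
}

CFG=/opt/software/zookeeper-3.4.6/conf/zoo.cfg
if [ -f "$CFG" ]; then
    enable_listen_all "$CFG"
fi
```

ZooKeeper reads zoo.cfg at startup, so a restart is needed afterwards.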
3) Error: zookeeper Unexpected exception, tries=3, connecting to cdh03…
ERROR [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@230] - Unexpected exception
java.net.SocketTimeoutException: connect timed out
[root@cdh01 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
[root@cdh01 bin]# ./zkServer.sh start
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 15869.
cd /opt/software/zookeeper-3.4.6
# View the log
[root@cdh01 zookeeper-3.4.6]# cat zookeeper.out
# jps shows the process is still running; kill it
[root@cdh01 bin]# jps
1056 ResourceManager
32224 DataNode
2225 AlertPublisher
1041 JobHistoryServer
32227 NameNode
1989 HeadlampServer
32229 SecondaryNameNode
1990 EventCatcherService
17165 Jps
32223 QuorumPeerMain
[root@cdh01 bin]# kill 32223
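The "already running as process ..." message is based on a PID file. To the best of my knowledge, zkServer.sh in 3.4.x stores it as zookeeper_server.pid under dataDir; treat that location as an assumption and confirm it on your install. A sketch that checks whether the recorded PID is actually alive (pid_file_alive is a hypothetical helper):

```shell
# Succeed only if the pid file exists and its PID is a live process.
# kill -0 sends no signal; it only tests that the process exists.
pid_file_alive() {
    [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

# Assumed location: dataDir from zoo.cfg plus zookeeper_server.pid
PID_FILE=/opt/software/zookeeper/data/zookeeper_server.pid
if pid_file_alive "$PID_FILE"; then
    echo "ZooKeeper is running as process $(cat "$PID_FILE")"
else
    echo "no live process behind $PID_FILE"
fi
```

This gives the same answer as scanning the jps output for QuorumPeerMain, without having to pick the PID out by eye.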
4) Error
Error message: It is probably not running.
[root@cdh02 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
View the log
[root@cdh02 bin]# cat zookeeper.out
Log contents: not enough memory
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x000000008ee00000, 1237319680, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1237319680 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /opt/software/zookeeper-3.4.6/bin/hs_err_pid24458.log
For a fix, see this blog post: https://blog.csdn.net/u014039577/article/details/49148063/
Find files larger than 100 MB; be careful when deleting logs
[root@cdh02 bin]# find / -size +100M -exec ls -lh {} \;
Drop the caches
[root@cdh01 ~]# free -m
total used free shared buff/cache available
Mem: 7552 1545 872 0 5134 5707
Swap: 0 0 0
[root@cdh01 ~]# echo 1 > /proc/sys/vm/drop_caches
[root@cdh01 ~]# free -m
total used free shared buff/cache available
Mem: 7552 1545 5390 0 616 5748
Swap: 0 0 0
[root@cdh01 ~]# echo 2 > /proc/sys/vm/drop_caches
[root@cdh01 ~]# free -m
total used free shared buff/cache available
Mem: 7552 1543 5889 0 118 5815
Swap: 0 0 0
[root@cdh01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@cdh01 ~]# free -m
total used free shared buff/cache available
Mem: 7552 1545 5889 0 117 5814
Swap: 0 0 0
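The byte count in the JVM error is easier to judge once converted to the same unit that free -m reports. A small worked example; the 872 figure is the "free" column from the first cdh01 output above, whereas the error itself was logged on cdh02, so the comparison is only illustrative:

```shell
# Convert a byte count to whole MiB using shell integer arithmetic.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

requested=$(bytes_to_mib 1237319680)   # size from the JVM error message
free_mib=872                           # "free" column before dropping caches
echo "JVM asked for ${requested} MiB, ${free_mib} MiB free"
if [ "$requested" -gt "$free_mib" ]; then
    echo "the request is larger than the free column"
fi
```

The request works out to 1180 MiB, which is why freeing a few hundred MiB of cache was enough to change the picture.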
Error
[root@cdh02 bin]# cat zookeeper.out
Connection broken for id 1, my id = 2, error =
java.net.SocketException: Connection reset
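This is probably related to how ZooKeeper was started. (Note: the nodes really do need to be started together; exactly why is still unclear to me. One likely factor is that a three-node ensemble needs at least two nodes up before it can elect a leader.)
So stop ZooKeeper on the other two machines and restart all three together.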
[root@cdh02 bin]# jps
19843 QuorumPeerMain
2678 DatartServerApplication
32250 Jps
[root@cdh02 bin]# kill 19843
[root@cdh02 bin]# jps
32275 Jps
2678 DatartServerApplication
Stop all three together, then start them together
[root@cdh01 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[root@cdh01 bin]# ./zkServer.sh stop
[root@cdh02 bin]# ./zkServer.sh stop
[root@cdh03 bin]# ./zkServer.sh stop
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[root@cdh01 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
[root@cdh01 bin]# ./zkServer.sh start
[root@cdh02 bin]# ./zkServer.sh start
[root@cdh03 bin]# ./zkServer.sh start
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Started successfully
[root@cdh01 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[root@cdh02 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[root@cdh03 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
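The final state above, one leader and two followers, can be checked mechanically. A sketch that reads the combined zkServer.sh status output of all three machines on stdin; ensemble_ok is a hypothetical helper, and the sample input below simply replays the Mode lines shown above:

```shell
# Succeed only if the status text on stdin contains exactly one
# "Mode: leader" line and exactly two "Mode: follower" lines.
ensemble_ok() {
    out=$(cat)
    leaders=$(echo "$out" | grep -c '^Mode: leader' || true)
    followers=$(echo "$out" | grep -c '^Mode: follower' || true)
    [ "$leaders" -eq 1 ] && [ "$followers" -eq 2 ]
}

printf 'Mode: leader\nMode: follower\nMode: follower\n' | ensemble_ok && echo "ensemble looks healthy"
```

In practice the input would be the status output collected from cdh01, cdh02 and cdh03, for example over SSH.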