Installing ZooKeeper is straightforward. After downloading the release package, extract it to:
/home/dcc/zookeeper-3.4.6
Correspondingly, the data directory is:
/home/dcc/zookeeper-3.4.6-data
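The two directories above can be prepared in one step. The following is a minimal sketch, not an official installer: the function name install_zk is made up for illustration, and it assumes the download is a gzip-compressed tarball whose top-level folder is zookeeper-3.4.6.

```shell
#!/bin/sh
# Hypothetical helper: unpack the release and create the data directory.
# usage: install_zk <tarball> <base_dir>
install_zk() {
  # Create the base directory, extract the release into it,
  # then create the sibling data directory next to the install.
  mkdir -p "$2" &&
  tar -xzf "$1" -C "$2" &&
  mkdir -p "$2/zookeeper-3.4.6-data"
}
```

For the layout used here the call would be install_zk zookeeper-3.4.6.tar.gz /home/dcc, where the tarball file name is an assumption.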
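Go into the conf directory, where you will find a file named zoo_sample.cfg. The file is self-explanatory, so we simply copy it to zoo.cfg.
After opening zoo.cfg, only the following configuration is needed: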
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/dcc/zookeeper-3.4.6-data/
# the port at which the clients will connect
clientPort=7181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
#zookeeper cluster
server.1=hdh1:57001:57701
server.2=hdh2:57001:57701
server.3=hdh3:57001:57701
server.4=hdh4:57001:57701
server.5=hdh5:57001:57701
Of the settings above, the one that must be configured is
dataDir=/home/dcc/zookeeper-3.4.6-data/
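This specifies the directory where ZooKeeper stores its data. Note that clientPort is also set to 7181 here rather than the sample file's default of 2181, which is why the client connects to port 7181 later on.
The entries under #zookeeper cluster enable cluster mode: each one declares a member of the ensemble.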
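Each cluster entry has the form server.N=host:portA:portB. In ZooKeeper, the first port is used by followers to connect to the leader and the second is used for leader election. As a sketch of how such entries could be generated rather than typed by hand, the helper below (server_lines is a hypothetical name, not part of ZooKeeper) prints one entry per host, numbering from 1.

```shell
#!/bin/sh
# Hypothetical helper: print one server.N line per host, numbering from 1.
# usage: server_lines <peer_port> <election_port> host...
server_lines() {
  peer=$1
  election=$2
  shift 2
  n=1
  for h in "$@"; do
    echo "server.$n=$h:$peer:$election"
    n=$((n + 1))
  done
}
```

Calling server_lines 57001 57701 hdh1 hdh2 hdh3 hdh4 hdh5 reproduces the five entries in the configuration above.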
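Distribute the configuration to the other nodes, and on each node create a file named myid (all lowercase) in its dataDir. The file contains only the X from that node's own "server.X=host_ip:port_A:port_B" entry.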
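The distribution and myid steps can be scripted. The sketch below is one possible approach, not the only one. All three function names are hypothetical. It assumes passwordless SSH between the nodes, the same install path on every host, and that the host name passed to myid_for_host matches the name used in zoo.cfg.

```shell
#!/bin/sh
# Look up the N of the server.N entry for a given host in zoo.cfg.
# usage: myid_for_host <zoo.cfg> <hostname>
myid_for_host() {
  sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Write the id into <data_dir>/myid, creating the directory if needed.
# usage: write_myid <id> <data_dir>
write_myid() {
  mkdir -p "$2" && printf '%s\n' "$1" > "$2/myid"
}

# Copy zoo.cfg from the current install to the other nodes.
# Assumption: passwordless SSH and identical paths on every host.
push_cfg() {
  for h in hdh2 hdh3 hdh4 hdh5; do
    scp conf/zoo.cfg "$h:/home/dcc/zookeeper-3.4.6/conf/zoo.cfg"
  done
}
```

On each node, something like write_myid "$(myid_for_host conf/zoo.cfg "$(hostname)")" /home/dcc/zookeeper-3.4.6-data would then create the file, provided hostname prints the short name hdhN.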
Once that is done, go to the ZooKeeper directory on each machine and run:
bin/zkServer.sh start
After a successful start, you can test the connection from hdh1:
bin/zkCli.sh -server hdh1:7181
Then type help at the prompt to see the available commands.
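For a quick non-interactive check, ZooKeeper 3.4.x also answers the four-letter command ruok on the client port with imok when the server is running. The sketch below assumes the nc (netcat) utility is installed. check_zk is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the server answers "imok" to "ruok".
# usage: check_zk <host> <client_port>
check_zk() {
  [ "$(echo ruok | nc "$1" "$2")" = "imok" ]
}
```

For this cluster, check_zk hdh1 7181 && echo "hdh1 is up" would report a running server. Some netcat variants need an extra timeout option to exit cleanly.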
This raises a question: why can't ZooKeeper, like Hadoop, start the whole cluster by running a single command on one of the nodes?
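One practical observation: the scripts in ZooKeeper's bin directory (zkServer.sh, zkCli.sh and so on) act on the local node only, whereas Hadoop's start scripts reach the other nodes over SSH. A similar effect can be approximated with a small wrapper. This is a sketch under assumptions, not something ZooKeeper provides: zk_all is a hypothetical name, and it assumes passwordless SSH, the same install path on every host, and a Java environment that is available to non-interactive SSH sessions.

```shell
#!/bin/sh
ZK_HOME=/home/dcc/zookeeper-3.4.6
ZK_HOSTS="hdh1 hdh2 hdh3 hdh4 hdh5"

# Hypothetical helper: run a zkServer.sh subcommand on every node in turn.
# usage: zk_all start|stop|status
zk_all() {
  for h in $ZK_HOSTS; do
    echo "== $h =="
    ssh "$h" "cd $ZK_HOME && bin/zkServer.sh $1"
  done
}
```

With this in place, zk_all start brings up all five servers from one machine, and zk_all status shows which node was elected leader.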