1 Environment
Operating system: CentOS 7
JDK: Java 1.8
Kafka version: kafka_2.11-2.2.1.tar.gz
ZooKeeper version: zookeeper-3.4.8.tar.gz
2 Architecture overview
The ZooKeeper cluster consists of three ZooKeeper servers.
3 Deployment steps
3.1 Deploy the ZooKeeper cluster
zookeeper-3.4.8.tar.gz is installed in the same directory, /usr/local/zookeeper, on every node.
3.1.1 Extract the files
cp zookeeper-3.4.8.tar.gz /usr/local/
cd /usr/local/
tar -zxvf zookeeper-3.4.8.tar.gz
mv zookeeper-3.4.8 zookeeper
3.1.2 Edit the configuration file
The ZooKeeper configuration file is /usr/local/zookeeper/conf/zoo.cfg (copy conf/zoo_sample.cfg to zoo.cfg if it does not already exist):
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.0.209:2888:3888
server.2=192.168.0.134:2888:3888
server.3=192.168.0.135:2888:3888
The settings are explained as follows. dataDir is the directory where ZooKeeper stores the myid file, zookeeper_server.pid, and the other data it must persist while running. For a ZooKeeper cluster, the most important of these is the myid file.
clientPort is the client port that ZooKeeper exposes to Kafka. Kafka acts as a ZooKeeper client, so it needs this port number. The default is 2181.
server.1=192.168.0.209:2888:3888
server.2=192.168.0.134:2888:3888
server.3=192.168.0.135:2888:3888
These three lines are required for leader election among the ZooKeeper nodes.
Each ZooKeeper server listens on several ports: 2181 is the client port used by the Kafka brokers, 2888 is the port followers use to connect to the leader, and 3888 is the port the ZooKeeper nodes use for leader election.
server.1 is the entry for the ZooKeeper node whose id is 1; a node's id is set through its myid file, so the myid file on the server.1 host contains just the number 1. Ids 2 and 3 are configured in the same way.
3.1.3 Copy the configuration file
Copy the zoo.cfg file to /usr/local/zookeeper/conf on all three servers. The contents are identical on every node.
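A minimal sketch, assuming zoo.cfg was edited on 192.168.0.209 first and that root SSH access to the other nodes is available:
scp /usr/local/zookeeper/conf/zoo.cfg root@192.168.0.134:/usr/local/zookeeper/conf/
scp /usr/local/zookeeper/conf/zoo.cfg root@192.168.0.135:/usr/local/zookeeper/conf/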
3.1.4 Create the myid file
Create a myid file in /data/zookeeper/data on each of the three servers; each file contains only that node's id, matching the server.N lines in zoo.cfg, as shown below.
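A sketch of the per-node steps (the directories come from dataDir and dataLogDir above; run the echo line that matches each host):
mkdir -p /data/zookeeper/data /data/zookeeper/log
echo 1 > /data/zookeeper/data/myid    # on 192.168.0.209 (server.1)
echo 2 > /data/zookeeper/data/myid    # on 192.168.0.134 (server.2)
echo 3 > /data/zookeeper/data/myid    # on 192.168.0.135 (server.3)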
3.1.5 Open the required ports in the firewall
If the firewall is enabled, ZooKeeper leader election will fail at startup, or the Kafka service will not start properly, so open the corresponding ports with the following commands:
firewall-cmd --zone=public --add-port=2181/tcp --permanent
firewall-cmd --reload
firewall-cmd --zone=public --add-port=3888/tcp --permanent
firewall-cmd --reload
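zoo.cfg also uses port 2888 (followers connecting to the leader), and the Kafka brokers listen on 9092; depending on the environment these may need to be opened as well:
firewall-cmd --zone=public --add-port=2888/tcp --permanent
firewall-cmd --zone=public --add-port=9092/tcp --permanent
firewall-cmd --reload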
3.1.6 Start ZooKeeper
In the /usr/local/zookeeper/bin directory on each of the three servers, run the following commands to start ZooKeeper:
cd /usr/local/zookeeper/bin
./zkServer.sh start &
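Once all three nodes are running, the cluster state can be checked on each node; one node should report Mode: leader and the other two Mode: follower:
./zkServer.sh status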
3.2 Deploy the Kafka cluster
3.2.1 Extract the files
Kafka is installed in /usr/local/kafka on every node.
cp kafka_2.11-2.2.1.tar.gz /usr/local/
cd /usr/local/
tar -zxvf kafka_2.11-2.2.1.tar.gz
mv kafka_2.11-2.2.1 kafka
3.2.2 Edit the configuration file
The Kafka broker configuration file is /usr/local/kafka/config/server.properties; the key settings are:
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
host.name=192.168.0.209
listeners=PLAINTEXT://192.168.0.209:9092
advertised.listeners=PLAINTEXT://192.168.0.209:9092
log.dirs=/data/kafka/data
zookeeper.connect=192.168.0.209:2181,192.168.0.134:2181,192.168.0.135:2181
broker.id is the unique id of each Kafka broker; ids start at 0 and increase by one per broker.
host.name is the host name of the machine the broker runs on (an IP address is used here).
listeners and advertised.listeners are the IP address and port that external clients use to reach Kafka.
log.dirs is the directory where Kafka stores topic data.
zookeeper.connect is the ZooKeeper connection string used by every broker: multiple ip:port entries separated by commas.
Note: if you need to tune thread and memory usage for your server, edit the kafka-server-start.sh script. Find KAFKA_HEAP_OPTS, which defaults to -Xmx1G -Xms1G; setting it to about half of the machine's memory is recommended. The script below shows KAFKA_HEAP_OPTS already changed to 16 GB:
#!/bin/bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
if [ $# -lt 1 ];
then
    echo "USAGE: $0 [-daemon] server.properties [--override property=value]*"
    exit 1
fi

base_dir=$(dirname $0)

if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/log4j.properties"
fi

if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx16G -Xms16G"
fi

EXTRA_ARGS=${EXTRA_ARGS-'-name kafkaServer -loggc'}

COMMAND=$1
case $COMMAND in
  -daemon)
    EXTRA_ARGS="-daemon "$EXTRA_ARGS
    shift
    ;;
  *)
    ;;
esac
exec $base_dir/kafka-run-class.sh $EXTRA_ARGS kafka.Kafka "$@"
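Because the script only sets KAFKA_HEAP_OPTS when the variable is not already defined, an alternative to editing the file is to export the variable before starting the broker; a sketch, assuming a machine with 32 GB of RAM:
export KAFKA_HEAP_OPTS="-Xmx16G -Xms16G"
./kafka-server-start.sh ../config/server.properties &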
3.2.3 Copy the configuration file
Copy server.properties to the corresponding directory on every Kafka server.
On each node, adjust broker.id, host.name, listeners, and advertised.listeners, as in the sketch below.
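For example, on 192.168.0.134 the per-node lines would look like this (assigning broker.id=1 to this host is an assumption that simply follows the "start at 0 and increase by one" rule above; all other settings stay the same):
broker.id=1
host.name=192.168.0.134
listeners=PLAINTEXT://192.168.0.134:9092
advertised.listeners=PLAINTEXT://192.168.0.134:9092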
3.2.4 Start the Kafka cluster
Start each Kafka broker; once every broker has started successfully, the Kafka cluster deployment is complete.
cd /usr/local/kafka/bin
./kafka-server-start.sh ../config/server.properties &
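As a quick smoke test (a sketch; the topic name test is arbitrary), create a replicated topic from any broker and then list the topics:
./kafka-topics.sh --bootstrap-server 192.168.0.209:9092 --create --topic test --partitions 3 --replication-factor 3
./kafka-topics.sh --bootstrap-server 192.168.0.209:9092 --list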