Environment preparation
Three virtual machines (192.168.209.188, 192.168.209.189, 192.168.209.190)
kafka_2.10-0.10.0.1
zookeeper-3.4.12.tar.gz
Download links
Zookeeper:
http://mirror.bit.edu.cn/apache/zookeeper/current/
Scala:
http://www.scala-lang.org/download/2.11.8.html
Kafka:
http://kafka.apache.org/downloads
Deploying the ZooKeeper cluster
After downloading and extracting, the zookeeper and kafka directories live under /opt/software
[root@hadoop002 software]# ll
total 1108584
lrwxrwxrwx. 1 root root 19 Feb 28 22:30 kafka -> kafka_2.10-0.10.0.1
drwxr-xr-x. 7 root root 118 Mar 2 15:39 kafka_2.10-0.10.0.1
-rw-r--r--. 1 root root 32609012 Aug 9 2016 kafka_2.10-0.10.0.1.tgz
drwxr-xr-x. 11 root root 4096 Mar 2 15:11 zookeeper
-rw-r--r--. 1 root root 36667596 Apr 25 2018 zookeeper-3.4.12.tar.gz
Go into zookeeper/conf and configure ZooKeeper
[root@hadoop002 software]# cd zookeeper/conf
[root@hadoop002 conf]# cp zoo_sample.cfg zoo.cfg
[root@hadoop002 conf]# ll
total 16
-rw-rw-r--. 1 root root 535 Mar 27 2018 configuration.xsl
-rw-rw-r--. 1 root root 2161 Mar 27 2018 log4j.properties
-rw-r--r--. 1 root root 1042 Mar 2 15:14 zoo.cfg
-rw-rw-r--. 1 root root 922 Mar 27 2018 zoo_sample.cfg
[root@hadoop002 conf]#
[root@hadoop002 conf]# vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/software/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.209.188:2888:3888
server.2=192.168.209.189:2888:3888
server.3=192.168.209.190:2888:3888
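The tick-based limits in zoo.cfg resolve to absolute timeouts by multiplying by tickTime; a quick shell sketch of the arithmetic, using the values from the file above:

```shell
# Timeouts implied by this zoo.cfg
tickTime=2000   # milliseconds per tick
initLimit=10    # ticks a follower may take for the initial sync with the leader
syncLimit=5     # ticks a follower may lag before it is considered out of sync
echo "initial sync timeout: $(( tickTime * initLimit )) ms"   # 20000 ms
echo "sync timeout:         $(( tickTime * syncLimit )) ms"   # 10000 ms
```

So with these defaults a follower has 20 seconds to complete its initial sync and may lag at most 10 seconds afterwards.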
The ZooKeeper configuration is identical on all three machines, so you can simply scp it to the other two nodes.
[root@hadoop002 conf]# cd ../
[root@hadoop002 zookeeper]# mkdir data
[root@hadoop002 zookeeper]# touch data/myid
[root@hadoop002 zookeeper]# echo 1 > data/myid
[root@hadoop002 zookeeper]#
On the second node (192.168.209.189): echo 2 > data/myid
On the third node (192.168.209.190): echo 3 > data/myid
Note: never write echo 3>data/myid. Keep the spaces around >: without them, bash parses 3> as a redirection of file descriptor 3, and 3 is never written to the myid file.
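The pitfall above is worth seeing concretely. A small demo, using a temporary directory in place of the real dataDir:

```shell
# Demo of the redirection pitfall (temp dir stands in for the real dataDir)
d=$(mktemp -d)
echo 3 > "$d/myid"   # correct: "3" is echoed and redirected into myid
cat "$d/myid"        # prints 3
echo 3>"$d/bad"      # WRONG: bash reads "3>" as redirecting fd 3; echo gets no argument
wc -c < "$d/bad"     # prints 0 -- the file was created but is empty
```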
Next, start the ZooKeeper cluster (run zkServer.sh start on each of the three nodes):
[root@hadoop002 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop002 bin]#
[root@hadoop002 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop002 bin]#
[root@hadoop002 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop002 bin]#
Check the status on each node:
[root@hadoop002 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop002 bin]#
[root@hadoop002 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[root@hadoop002 bin]#
[root@hadoop002 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop002 bin]#
192.168.209.189 was elected leader. Since my VMs were cloned and I never changed the hostnames, all three prompts look identical.
Enter the ZooKeeper CLI (bin/zkCli.sh):
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper, kafka]
[zk: localhost:2181(CONNECTED) 2] help
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port
[zk: localhost:2181(CONNECTED) 3]
Before installing Kafka, install Scala.
With Scala deployed, download the Kafka 0.10.0.1 build matching your Scala version (the archive used here, kafka_2.10-0.10.0.1, is the Scala 2.10 build).
Create a logs directory and edit server.properties:
[root@hadoop002 software]# cd kafka
[root@hadoop002 kafka]# mkdir logs
[root@hadoop002 kafka]# cd config/
[root@hadoop002 config]# vi server.properties
broker.id=1
port=9092
host.name=192.168.209.188
log.dirs=/opt/software/kafka/logs
zookeeper.connect=192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka
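server.properties must differ per broker: broker.id has to be unique in the cluster, and host.name should be the node's own IP, while log.dirs and zookeeper.connect stay the same. A sketch (using this setup's IPs and paths) of adjusting the copies after scp'ing the kafka directory to the other two nodes:

```shell
# Run on 192.168.209.189 after copying the kafka directory over
sed -i 's/^broker.id=.*/broker.id=2/; s/^host.name=.*/host.name=192.168.209.189/' \
    /opt/software/kafka/config/server.properties

# Run on 192.168.209.190
sed -i 's/^broker.id=.*/broker.id=3/; s/^host.name=.*/host.name=192.168.209.190/' \
    /opt/software/kafka/config/server.properties
```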
Environment variables
[root@hadoop001 config]# vi /etc/profile
export KAFKA_HOME=/opt/software/kafka
export PATH=$KAFKA_HOME/bin:$PATH
[root@hadoop001 config]# source /etc/profile
Run the startup script with no arguments to see its usage:
[root@hadoop002 bin]# ./kafka-server-start.sh
USAGE: ./kafka-server-start.sh [-daemon] server.properties [--override property=value]*
[root@hadoop002 bin]#
The usage message shows that -daemon starts Kafka as a daemon, followed by the configuration file:
[root@hadoop002 bin]# ./kafka-server-start.sh -daemon ../config/server.properties
[root@hadoop002 bin]# jps
8676 QuorumPeerMain
9252 Jps
9193 Kafka
[root@hadoop002 bin]#
Kafka is now running; it can be stopped later with bin/kafka-server-stop.sh.
In one terminal, start a producer that sends messages to a topic named my-topic (with auto topic creation enabled, the broker creates it on first use):
bin/kafka-console-producer.sh \
--broker-list 192.168.209.188:9092,192.168.209.189:9092,192.168.209.190:9092 --topic my-topic
In another terminal, start a consumer subscribed to my-topic:
bin/kafka-console-consumer.sh \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
--from-beginning --topic my-topic
Type message lines in the producer terminal and they appear in the consumer terminal.
Producing
[root@hadoop002 kafka]# bin/kafka-console-producer.sh \
> --broker-list 192.168.209.188:9092,192.168.209.189:9092,192.168.209.190:9092 --topic my-topic
hello kafka I am faker
[2019-03-02 21:08:14,090] WARN Error while fetching metadata with correlation id 0 : {my-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
Consuming (the LEADER_NOT_AVAILABLE warning above is expected on the first send: the topic had just been auto-created and had no leader yet)
[root@hadoop002 kafka]# bin/kafka-console-consumer.sh \
> --zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
> --from-beginning --topic my-topic
hello kafka I am faker
Core concepts
Create a topic
[root@hadoop002 kafka]# bin/kafka-topics.sh --create \
> --zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
> --replication-factor 3 --partitions 3 --topic topic-first
Created topic "topic-first".
Topic: topic-first is the topic name.
partitions: 3 partitions, numbered from 0; a partition is a physical division of the topic.
replication: 3 replicas, meaning each partition is copied to 3 brokers.
Layout across the 3 machines:
             192.168.209.188   192.168.209.189   192.168.209.190
partition 0: topic-first-0     topic-first-0     topic-first-0
partition 1: topic-first-1     topic-first-1     topic-first-1
partition 2: topic-first-2     topic-first-2     topic-first-2
Describe
[root@hadoop002 bin]# ./kafka-topics.sh --describe \
> --zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
> --topic topic-first
Topic:topic-first PartitionCount:3 ReplicationFactor:3 Configs:
Topic: topic-first Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: topic-first Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: topic-first Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
[root@hadoop002 bin]#
Partition: 0
Leader: 3 means the broker with broker.id=3 serves all reads and writes for this partition.
Replicas: the brokers holding copies of this partition; the first entry is the preferred leader (a static view).
Isr: In-Sync Replicas, the replicas currently caught up with the leader (a dynamic view).
When the leader dies, a new leader is elected from this list.
broker: a single Kafka server instance (node).
Typical production settings:
--replication-factor 3 --partitions 3
The replication factor cannot exceed your broker count; the partition count is usually sized relative to it.
consumer group:
A group contains one or more consumers, and each partition is consumed by at most one consumer within a group.
In practice you use multiple partitions to scale out processing.
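A toy illustration of why partition count caps parallelism within a group: each partition goes to exactly one consumer. This sketch assigns partitions round-robin in plain shell arithmetic (Kafka's actual assignor may distribute them differently, but the one-consumer-per-partition rule holds):

```shell
# 3 partitions shared by 2 consumers in one group (round-robin for illustration)
partitions=3
consumers=2
for p in $(seq 0 $(( partitions - 1 ))); do
  echo "partition $p -> consumer $(( p % consumers ))"
done
# Consumer 0 ends up with two partitions; with only 3 partitions,
# any consumer beyond the third in this group would sit idle.
```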
Common commands
### Create
./kafka-topics.sh --create \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
--replication-factor 3 --partitions 3 --topic test
### List
./kafka-topics.sh --list \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka
### Describe a single topic
./kafka-topics.sh --describe \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
--topic test
### Delete
./kafka-topics.sh --delete \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
--topic test
Topic test is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.
After this "deletion" you can still produce and consume: the topic is only marked for deletion (a soft delete), and on this version it was not removed even with delete.topic.enable=true configured.
To remove it completely, delete both Kafka's metadata (stored in ZooKeeper) and its data (the log segments on disk).
In the ZooKeeper CLI, delete the metadata (this setup uses the /kafka chroot in zookeeper.connect, so the topic znodes live under /kafka):
rmr /kafka/config/topics/topic-first
rmr /kafka/brokers/topics/topic-first
rmr /kafka/admin/delete_topics/topic-first
Delete the data:
rm -rf $KAFKA_HOME/logs/topic-first-*
The topic is now completely removed.
Alter / modify
./kafka-topics.sh --alter \
--zookeeper 192.168.209.188:2181,192.168.209.189:2181,192.168.209.190:2181/kafka \
--partitions 4 --topic topic-two
The partition count can only be increased, never decreased.