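Kafka runs on the JVM (it is written in Scala and Java)
Install a JDK or JRE first
JDK installation is not covered here
Kafka needs ZooKeeper, so install ZooKeeper first
These notes use the CDH 5.7.0 packages throughout (http://archive.cloudera.com/cdh5/cdh/5/)
Add ZooKeeper to the environment variables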
export ZOOKEEPER_HOME=/Users/xiejundong/software/zookeeper-3.4.5-cdh5.7.0
export PATH=$PATH:$ZOOKEEPER_HOME/bin
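Copy the ZooKeeper sample configuration: copy $ZOOKEEPER_HOME/conf/zoo_sample.cfg to zoo.cfg in the same directory
In $ZOOKEEPER_HOME/conf/zoo.cfg, change the dataDir parameter (the default is under /tmp, where data can be lost on reboot)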
dataDir=/Users/xiejundong/software/tmp/zookeeper
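The copy-and-edit step above can be scripted. This is only a sketch: make_zoo_cfg is a hypothetical helper name, and the demo runs against a throwaway directory with an abbreviated stand-in for zoo_sample.cfg rather than a real ZooKeeper install.

```shell
# make_zoo_cfg is a hypothetical helper name, not a ZooKeeper tool.
# It copies zoo_sample.cfg to zoo.cfg, replacing only the dataDir line.
make_zoo_cfg() {
    conf_dir=$1   # directory holding zoo_sample.cfg
    data_dir=$2   # persistent directory for ZooKeeper data
    mkdir -p "$data_dir"
    sed "s|^dataDir=.*|dataDir=$data_dir|" "$conf_dir/zoo_sample.cfg" > "$conf_dir/zoo.cfg"
}

# Demo in a throwaway directory with an abbreviated stand-in sample file
# (the real one lives in $ZOOKEEPER_HOME/conf).
zk_demo=$(mktemp -d)
mkdir -p "$zk_demo/conf"
printf 'tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n' > "$zk_demo/conf/zoo_sample.cfg"
make_zoo_cfg "$zk_demo/conf" "$zk_demo/data"
cat "$zk_demo/conf/zoo.cfg"
```

On a real install the call would presumably be make_zoo_cfg "$ZOOKEEPER_HOME/conf" /Users/xiejundong/software/tmp/zookeeper.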
Start ZooKeeper
zkServer.sh start
Download the Kafka installation package (version 0.9.0.0 here) and unpack it
Configure the Kafka environment variables
export KAFKA_HOME=/Users/xiejundong/software/kafka_2.11-0.9.0.0
export PATH=$PATH:$KAFKA_HOME/bin
Edit $KAFKA_HOME/config/server.properties
Change the following parameters
#A broker is one Kafka instance; broker.id must be unique for every instance in the cluster
broker.id=0
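#Defaults to a directory under /tmp; change it to another working directory
log.dirs=/Users/xiejundong/software/tmp/kafka-logs
#Listening port; if several Kafka instances run on the same machine, each needs a different port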
listeners=PLAINTEXT://:9092
#Host name; optional
host.name=notebook
#ZooKeeper address
#For a ZooKeeper ensemble, list every node separated by commas, e.g. 127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002
zookeeper.connect=notebook:2181
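After editing server.properties it can help to list only the settings that actually take effect, since a malformed key=value line is easy to miss among the comments. The sketch below assumes plain key=value lines with no spaces around "="; effective_props and get_prop are hypothetical helper names, and the demo file is a stand-in holding the values from these notes.

```shell
# effective_props and get_prop are hypothetical helper names.

# Print only the lines that take effect: skip comments and blank lines.
effective_props() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Print the value of one key (expects key=value with no spaces around "=").
get_prop() {
    effective_props "$1" | awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); print }'
}

# Demo on a stand-in file holding the settings from these notes.
props_demo=$(mktemp)
cat > "$props_demo" <<'EOF'
# unique id of this broker
broker.id=0

log.dirs=/Users/xiejundong/software/tmp/kafka-logs
listeners=PLAINTEXT://:9092
zookeeper.connect=notebook:2181
EOF
effective_props "$props_demo"
get_prop "$props_demo" zookeeper.connect
```

If get_prop prints nothing for a key that was just edited, the line is probably commented out or mistyped.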
Start Kafka; the properties configuration file must be given on the command line
kafka-server-start.sh $KAFKA_HOME/config/server.properties
To run it in the background as a daemon, add the -daemon option
kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
jps -m should now list the Kafka process
Create a Kafka topic
kafka-topics.sh --create --zookeeper notebook:2181 --replication-factor 1 --partitions 1 --topic hello_topic
Note: creating a topic requires the ZooKeeper address, passed with --zookeeper. This first topic has one partition and a replication factor of 1
List all topics
kafka-topics.sh --list --zookeeper notebook:2181
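Use the console tools shipped with Kafka, kafka-console-producer.sh and kafka-console-consumer.sh, to check that the topic works
Start the producer
Note: the producer takes --broker-list and sends messages directly to the Kafka broker, without going through ZooKeeper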
kafka-console-producer.sh --broker-list notebook:9092 --topic hello_topic
Start the consumer
Note: the consumer is given the ZooKeeper address here; --from-beginning makes it consume from the start of the topic
kafka-console-consumer.sh --zookeeper notebook:2181 --topic hello_topic --from-beginning
Show topic details
xiejundongsiMac:~ xiejundong$ kafka-topics.sh --describe --zookeeper notebook:2181
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
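Configuring a multi-broker Kafka cluster
Running several brokers is simple: make a few copies of $KAFKA_HOME/config/server.properties, change a few parameters in each copy, and start each Kafka instance with its own properties file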
cp $KAFKA_HOME/config/server.properties $KAFKA_HOME/config/server1.properties
cp $KAFKA_HOME/config/server.properties $KAFKA_HOME/config/server2.properties
cp $KAFKA_HOME/config/server.properties $KAFKA_HOME/config/server3.properties
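The main parameters to change are listed below; no two files may share the same value. The values shown are those of the original server.properties; the output later in these notes corresponds to broker.id 1, 2, 3 and ports 9093, 9094, 9095, each with its own log directory
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
listeners=PLAINTEXT://:9092
# A comma separated list of directories under which to store log files
log.dirs=/Users/xiejundong/software/tmp/kafka-logs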
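The three copies can also be generated rather than edited by hand. The sketch below assumes the base file contains uncommented broker.id, listeners and log.dirs lines; the ids 1 to 3 and ports 9093 to 9095 match the output later in these notes, while make_broker_configs and the kafka-logs-N directory names are hypothetical choices. The demo uses a stand-in base file in a temporary directory.

```shell
# make_broker_configs is a hypothetical helper name.
# It writes server1..3.properties, each with its own broker.id, port and log dir.
make_broker_configs() {
    base=$1       # base server.properties to copy from
    out_dir=$2    # directory that receives server1..3.properties
    log_root=$3   # parent directory for the per-broker log directories
    for n in 1 2 3; do
        port=$((9092 + n))   # 9093, 9094, 9095
        sed -e "s|^broker\.id=.*|broker.id=$n|" \
            -e "s|^listeners=.*|listeners=PLAINTEXT://:$port|" \
            -e "s|^log\.dirs=.*|log.dirs=$log_root/kafka-logs-$n|" \
            "$base" > "$out_dir/server$n.properties"
    done
}

# Demo with a stand-in base file in a throwaway directory.
cluster_demo=$(mktemp -d)
printf 'broker.id=0\nlisteners=PLAINTEXT://:9092\nlog.dirs=/tmp/kafka-logs\nzookeeper.connect=notebook:2181\n' \
    > "$cluster_demo/server.properties"
make_broker_configs "$cluster_demo/server.properties" "$cluster_demo" "$cluster_demo/logs"
grep -E '^(broker\.id|listeners|log\.dirs)=' "$cluster_demo"/server[123].properties
```

All other settings, such as zookeeper.connect, are copied through unchanged, so every broker joins the same cluster.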
Start the Kafka instances (-daemon already puts each broker in the background, so no trailing & is needed)
kafka-server-start.sh -daemon $KAFKA_HOME/config/server1.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server2.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server3.properties
Check the processes
xiejundongsiMac:zookeeper-3.4.5-cdh5.7.0 xiejundong$ jps -m
1107 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server3.properties
1109 Jps -m
876 QuorumPeerMain /Users/xiejundong/software/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
1102 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server1.properties
1103 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server2.properties
Create a topic with 3 replicas and test Kafka's fault tolerance
kafka-topics.sh --create --zookeeper notebook:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Check the current state of the topics
xiejundongsiMac:zookeeper-3.4.5-cdh5.7.0 xiejundong$ kafka-topics.sh --describe --zookeeper notebook:2181
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
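my-replicated-topic has one partition and 3 replicas: the leader is the instance with broker.id 1, the replicas live on brokers 1, 2 and 3, and the in-sync replicas (Isr) are 1, 2 and 3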
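Reading the leader and Isr columns by eye gets tedious once there are many topics. The sketch below pulls one column out of the --describe output with awk; partition_field is a hypothetical helper name, and the sample text is the output pasted above rather than a live query.

```shell
# partition_field is a hypothetical helper name.
# Reads `kafka-topics.sh --describe` output on stdin and prints
# "topic partition value" for the requested column (Leader, Replicas or Isr).
partition_field() {
    awk -v want="$1:" '
        # Per-partition lines start with "Topic: <name> Partition: <n>";
        # the summary line starts with "Topic:<name>" and is skipped.
        $1 == "Topic:" && $3 == "Partition:" {
            for (i = 1; i < NF; i++)
                if ($i == want) print $2, $4, $(i + 1)
        }'
}

# Sample copied from the describe output in these notes.
describe_sample='Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3'

printf '%s\n' "$describe_sample" | partition_field Leader   # my-replicated-topic 0 1
printf '%s\n' "$describe_sample" | partition_field Isr      # my-replicated-topic 0 1,2,3
```

Against a running cluster the input would come from kafka-topics.sh --describe --zookeeper notebook:2181 instead of the sample variable.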
Check that messages can be sent and received
Start the producer
kafka-console-producer.sh --broker-list notebook:9093,notebook:9094,notebook:9095 --topic my-replicated-topic
Start the consumer
kafka-console-consumer.sh --zookeeper notebook:2181 --topic my-replicated-topic
Now kill one of the non-leader brokers
xiejundongsiMac:~ xiejundong$ jps -m
1120 ConsoleProducer --broker-list notebook:9093,notebook:9094,notebook:9095 --topic my-replicated-topic
1107 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server3.properties
1124 Jps -m
876 QuorumPeerMain /Users/xiejundong/software/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
1102 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server1.properties
1103 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server2.properties
xiejundongsiMac:~ xiejundong$ kill -9 1103
Messages are still sent and received normally
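Picking the pid out of the jps -m listing by hand is error-prone when several brokers are running. As a sketch, the lookup can be done by the properties file each broker was started with; broker_pid is a hypothetical helper name, and the sample lines are copied from the listing above.

```shell
# broker_pid is a hypothetical helper name.
# Reads `jps -m` output on stdin and prints the pid of the Kafka process
# whose properties file path ends in /<name given as $1>.
broker_pid() {
    awk -v cfg="/$1" '
        $2 == "Kafka" {
            n = length($3) - length(cfg) + 1
            if (n > 0 && substr($3, n) == cfg) print $1
        }'
}

# Sample copied from the jps -m listing in these notes.
jps_sample='1107 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server3.properties
876 QuorumPeerMain /Users/xiejundong/software/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
1102 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server1.properties
1103 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server2.properties'

printf '%s\n' "$jps_sample" | broker_pid server2.properties   # 1103
```

On a live machine this would be combined as kill -9 "$(jps -m | broker_pid server2.properties)". kill -9 simulates a crash, which is the point of this test; a plain kill would let the broker shut down cleanly.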
Next, kill the leader Kafka instance
xiejundongsiMac:~ xiejundong$ jps -m
1138 ConsoleProducer --broker-list notebook:9093,notebook:9094,notebook:9095 --topic my-replicated-topic
1107 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server3.properties
1141 ConsoleConsumer --zookeeper notebook:2181 --topic my-replicated-topic
1148 Jps -m
876 QuorumPeerMain /Users/xiejundong/software/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
1102 Kafka /Users/xiejundong/software/kafka_2.11-0.9.0.0/config/server1.properties
xiejundongsiMac:~ xiejundong$ kill -9 1102
Check the topic state
xiejundongsiMac:~ xiejundong$ kafka-topics.sh --describe --zookeeper notebook:2181
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 3 Replicas: 1,2,3 Isr: 3
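The console shows some error messages, but messages can still be sent and received: with brokers 1 and 2 down, broker 3 has taken over as leader and is the only in-sync replica, so this topic survived the loss of two of its three brokers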
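The brokers that have dropped out are simply the ones listed under Replicas but missing from Isr. A small sketch of that comparison, using the values from the describe output above; out_of_sync is a hypothetical helper name.

```shell
# out_of_sync is a hypothetical helper name.
# $1 = Replicas list, $2 = Isr list (comma-separated broker ids).
# Prints, one per line, each replica that is not in the Isr list.
out_of_sync() {
    for id in $(printf '%s' "$1" | tr ',' ' '); do
        case ",$2," in
            *",$id,"*) ;;                 # still in sync
            *) printf '%s\n' "$id" ;;     # assigned a replica but not in sync
        esac
    done
}

# Values from the describe output after both kills: Replicas: 1,2,3  Isr: 3
out_of_sync 1,2,3 3
```

For the state shown above this prints 1 and 2, the two brokers that were killed.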
Below is Kafka's log working directory