A Brief Introduction to Kafka

Kafka:

 Kafka is a high-throughput, distributed publish-subscribe messaging system. According to the official Kafka site, Kafka is now positioned as a distributed streaming platform: it scales horizontally, delivers high throughput, and a growing number of open-source distributed processing systems (Flume, Apache Storm, Spark) support integration with Kafka.

 Kafka is a distributed message queue. Messages are organized by topic: a sender is called a producer and a receiver is called a consumer. A Kafka cluster consists of multiple Kafka instances, and each instance (server) is called a broker.

 A Kafka cluster depends on ZooKeeper, which stores metadata to keep the system available.
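At the storage level, each topic partition behaves as an append-only log that records are written to and read from by offset. The sketch below is a toy illustration of that model, not Kafka's implementation; the `Partition` class and its methods are invented for this example:

```python
# Toy model of a topic partition as an append-only log (illustration only,
# not Kafka's implementation): producers append, every record gets a
# monotonically increasing offset, and reads never delete data.

class Partition:
    def __init__(self):
        self.log = []                      # the append-only record list

    def append(self, record):
        """Append a record and return its offset."""
        self.log.append(record)
        return len(self.log) - 1

    def read(self, offset, max_records=10):
        """Read from an offset; records stay in the log afterwards."""
        return self.log[offset:offset + max_records]

p = Partition()
for msg in ["a", "b", "c"]:
    p.append(msg)

assert p.read(0) == ["a", "b", "c"]        # read from the beginning
assert p.read(1) == ["b", "c"]             # a second reader starts at offset 1
assert p.read(0) == ["a", "b", "c"]        # data is still there after reads
```

Because reading does not remove records, many independent consumers can read the same partition, each keeping its own position.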

Components:
  • Kafka server: the messaging middleware; it receives messages from producers and serves subscriptions from consumers.
  • Topic: a named category that holds messages; it can be thought of as a queue.
  • Broker: one Kafka server is one broker, i.e. one node of the cluster. A cluster is made up of multiple brokers, and one broker can host multiple topics.
  • Leader: the replica of a partition that handles writes and serves consumers.
  • Follower: replicates the leader's data as a backup; if the leader fails, a follower is elected as the new leader.
  • Consumer group: Kafka's mechanism for both broadcast and unicast delivery of a topic's messages. A group is one logical consumer containing multiple consumer instances; each partition's messages are processed by exactly one instance within a group, so a message is never processed twice inside the same group.
  • Consumer: subscribes to and consumes messages from Kafka.
  • Producer: produces messages and sends them to Kafka.
  • Partition: the unit that stores messages, similar to a queue in RabbitMQ.
  • Offset: a position marker for records within a partition; even after a message is consumed, it remains in the append log until retention expires.
  • Consumer id registry: every consumer has its own identifier; this registry stores those consumer ids.
  • Consumer offset tracking: tracks the highest offset each consumer has consumed.
  • Partition owner registry: records which consumer owns (consumes) each partition.
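The consumer-group behavior above can be illustrated with a toy round-robin assignment (a simplification; Kafka's real assignors are more elaborate, and the `assign` helper here is invented for this sketch): within one group each partition goes to exactly one consumer, while every group independently sees all partitions.

```python
# Toy round-robin partition assignment (a simplification of Kafka's
# assignors). Within one group each partition is owned by exactly one
# consumer; every group independently covers all partitions.

def assign(partitions, consumers):
    mapping = {c: [] for c in consumers}
    for i, part in enumerate(partitions):
        mapping[consumers[i % len(consumers)]].append(part)
    return mapping

partitions = [0, 1, 2, 3]
group_a = assign(partitions, ["a1", "a2"])   # unicast inside the group
group_b = assign(partitions, ["b1"])         # a second group: broadcast across groups

assert group_a == {"a1": [0, 2], "a2": [1, 3]}
assert sorted(group_a["a1"] + group_a["a2"]) == partitions  # no duplicates in a group
assert group_b == {"b1": [0, 1, 2, 3]}       # group B still sees every partition
```

This is why putting consumers in the same group gives queue-style load balancing, while separate groups give publish-subscribe broadcast.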
Message delivery semantics:
  • at most once: a message is sent at most once; whether it succeeds or fails, it is never resent.
  • at least once: a message is sent at least once; if delivery fails, it is sent again.
  • exactly once: a message is delivered exactly once, with neither loss nor duplication.
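A toy simulation makes the difference between the first two semantics concrete. The `FlakyBroker` class and helper functions are hypothetical, not Kafka's API; the broker drops the first attempt so the two modes diverge:

```python
# Toy simulation of the delivery modes above (hypothetical classes and
# helpers, not Kafka's API). The broker always drops the first attempt
# so the difference between the two modes is visible.

class FlakyBroker:
    def __init__(self):
        self.log = []
        self.attempts = 0

    def send(self, msg):
        self.attempts += 1
        if self.attempts == 1:             # first attempt always fails
            raise IOError("network error")
        self.log.append(msg)

def at_most_once(broker, msg):
    try:
        broker.send(msg)                   # fire and forget: no retry
    except IOError:
        pass                               # the message is simply lost

def at_least_once(broker, msg, retries=3):
    for _ in range(retries):
        try:
            broker.send(msg)               # a retry after a lost ack can duplicate
            return
        except IOError:
            continue

b1 = FlakyBroker()
at_most_once(b1, "m")
assert b1.log == []                        # lost: at most once

b2 = FlakyBroker()
at_least_once(b2, "m")
assert b2.log == ["m"]                     # delivered on the second attempt
```

Exactly-once is harder: it needs the retry of at-least-once plus deduplication (or transactions) on the receiving side, which is why it is not shown as a few lines here.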
Flow diagram: (the original image is not reproduced here)

Installation:
Host1:
[root@localhost ~]# tar -zxvf zookeeper-3.4.5.tar.gz  -C /usr/src/
[root@localhost ~]# cd /usr/src/
[root@localhost src]# mv zookeeper-3.4.5 /usr/local/zookeeper
[root@localhost src]# cd /usr/local/zookeeper/conf/
[root@localhost conf]# cp zoo_sample.cfg zoo.cfg
[root@localhost conf]# vim zoo.cfg 
Modify:
dataDir=/usr/local/zookeeper/data               
Add:
dataLogDir=/usr/local/zookeeper/datalog         
server.1=192.168.43.176:2888:3888               
server.2=192.168.43.104:2888:3888
server.3=192.168.43.23:2888:3888
Configuration file explanation:
# The number of milliseconds of each tick
tickTime=2000                                   #basic time unit in ms; the interval between heartbeats exchanged by ZooKeeper nodes.
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10                                    #when a new follower joins, it must copy the leader's data within this window: 10 ticks * 2000 ms.
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5                                     #ticks allowed between a request and its acknowledgement before a node is considered dead.
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/local/zookeeper/data               #path where the data (snapshots) are stored.

dataLogDir=/usr/local/zookeeper/datalog         #path where the transaction logs are stored.
server.1=192.168.43.176:2888:3888               #cluster member list; port 2888 is for communication between nodes, 3888 for leader election.
server.2=192.168.43.104:2888:3888
server.3=192.168.43.23:2888:3888

# the port at which the clients will connect
clientPort=2181                                 #port ZooKeeper listens on for clients.
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
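The tick-based limits above become wall-clock timeouts by multiplying with tickTime; a quick sketch of the arithmetic for the values in this config:

```python
# The tick-based limits in zoo.cfg become wall-clock timeouts by
# multiplying with tickTime; the numbers below are this config's values.

config = {
    "tickTime": 2000,    # milliseconds per tick
    "initLimit": 10,     # ticks a new follower gets for the initial sync
    "syncLimit": 5,      # ticks allowed between a request and its ack
}

init_timeout_ms = config["tickTime"] * config["initLimit"]
sync_timeout_ms = config["tickTime"] * config["syncLimit"]

assert init_timeout_ms == 20000   # 20 s to finish the initial sync
assert sync_timeout_ms == 10000   # 10 s before a peer is considered gone
```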
Host1:
[root@localhost conf]# cd ../
[root@localhost zookeeper]# mkdir data
[root@localhost zookeeper]# mkdir datalog
[root@localhost zookeeper]# cd data
[root@localhost data]# echo "1" > myid
[root@localhost data]# scp -r /usr/local/zookeeper root@192.168.43.104:/usr/local/
[root@localhost data]# scp -r /usr/local/zookeeper root@192.168.43.23:/usr/local/
[root@localhost data]# iptables -F
[root@localhost data]# systemctl stop firewalld
[root@localhost data]# setenforce 0
[root@localhost data]# iptables-save 
Host2:
[root@localhost ~]# cd /usr/local/zookeeper/data
[root@localhost data]# vim myid 
2
[root@localhost data]# iptables -F
[root@localhost data]# systemctl stop firewalld
[root@localhost data]# setenforce 0
[root@localhost data]# iptables-save 
Host3:
[root@localhost ~]# cd /usr/local/zookeeper/data
[root@localhost data]# vim myid 
3
[root@localhost data]# iptables -F
[root@localhost data]# systemctl stop firewalld
[root@localhost data]# setenforce 0
[root@localhost data]# iptables-save 
Host1:
[root@localhost data]# cd ../bin/
[root@localhost bin]# ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Host2:
[root@localhost data]# cd ../bin/
[root@localhost bin]# ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Host3:
[root@localhost data]# cd ../bin/
[root@localhost bin]# ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Host1:
[root@localhost bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
Host2:
[root@localhost bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader
Host3:
[root@localhost bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
Host1:
[root@localhost ~]# tar -zxvf kafka_2.12-2.1.1.tgz  -C /usr/src/
[root@localhost src]# mv kafka_2.12-2.1.1 /usr/local/kafka
[root@localhost src]# cd /usr/local/kafka/config/
[root@localhost config]# vim server.properties 
Modify:
broker.id=1                                     #unique id of this Kafka broker.
listeners=PLAINTEXT://192.168.43.176:9092       #address and port the broker listens on.
log.dirs=/usr/local/kafka/data                  #path where message data is stored.
log.retention.hours=168                         #how long partition messages are retained, in hours.
Add:
message.max.bytes=1024000                       #maximum message size the broker accepts from producers, in bytes.
default.replication.factor=2                    #default number of replicas for each partition.
replica.fetch.max.bytes=102400                  #maximum bytes a replica fetch request reads per partition.
Modify:
zookeeper.connect=192.168.43.176:2181,192.168.43.104:2181,192.168.43.23:2181   
                                                #ZooKeeper connection string listing every node.
num.partitions=1                                #default number of partitions for new topics.
[root@localhost config]# mkdir ../data
[root@localhost config]# scp -r /usr/local/kafka root@192.168.43.104:/usr/local/
[root@localhost config]# scp -r /usr/local/kafka root@192.168.43.23:/usr/local/
Host2:
[root@localhost ~]# vim /usr/local/kafka/config/server.properties 
Modify:
broker.id=2
listeners=PLAINTEXT://192.168.43.104:9092
Host3:
[root@localhost ~]# vim /usr/local/kafka/config/server.properties 
Modify:
broker.id=3
listeners=PLAINTEXT://192.168.43.23:9092
Host1:
[root@localhost config]# cd ../bin/
[root@localhost bin]# ./kafka-server-start.sh -daemon ../config/server.properties 
[root@localhost bin]# netstat -anput | grep 9092
tcp6       0      0 192.168.43.176:9092     :::*                    LISTEN      8434/java           
tcp6       0      0 192.168.43.176:36610    192.168.43.176:9092     ESTABLISHED 8434/java           
tcp6       0      0 192.168.43.176:9092     192.168.43.176:36610    ESTABLISHED 8434/java           
Host2:
[root@localhost ~]# cd /usr/local/kafka/bin/
[root@localhost bin]# ./kafka-server-start.sh -daemon ../config/server.properties 
[root@localhost bin]# netstat -anput | grep 9092
tcp6       0      0 192.168.43.104:9092     :::*                    LISTEN      8052/java           
tcp6       0      0 192.168.43.104:9092     192.168.43.176:43400    ESTABLISHED 8052/java 
Host3:
[root@localhost ~]# cd /usr/local/kafka/bin/
[root@localhost bin]# ./kafka-server-start.sh -daemon ../config/server.properties 
[root@localhost bin]# netstat -anput | grep 9092
tcp6       0      0 192.168.43.23:9092      :::*                    LISTEN      5978/java           
tcp6       0      0 192.168.43.23:9092      192.168.43.176:54410    ESTABLISHED 5978/java           
Host1:
Create a topic:
[root@localhost bin]# ./kafka-topics.sh  --create --zookeeper 192.168.43.176:2181 --partitions 2 --replication-factor 2 --topic topic
Created topic "topic".
List the created topics:
[root@localhost bin]# ./kafka-topics.sh --list --zookeeper 192.168.43.176:2181
topic
Show detailed topic information:
[root@localhost bin]# ./kafka-topics.sh --zookeeper 192.168.43.176:2181 --describe
Topic:__consumer_offsets	PartitionCount:50	ReplicationFactor:1	Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
    Topic: __consumer_offsets	Partition: 0	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 1	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 2	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 3	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 4	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 5	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 6	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 7	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 8	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 9	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 10	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 11	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 12	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 13	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 14	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 15	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 16	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 17	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 18	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 19	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 20	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 21	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 22	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 23	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 24	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 25	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 26	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 27	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 28	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 29	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 30	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 31	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 32	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 33	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 34	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 35	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 36	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 37	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 38	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 39	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 40	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 41	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 42	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 43	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 44	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 45	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 46	Leader: 1	Replicas: 1	Isr: 1
    Topic: __consumer_offsets	Partition: 47	Leader: 2	Replicas: 2	Isr: 2
    Topic: __consumer_offsets	Partition: 48	Leader: 3	Replicas: 3	Isr: 3
    Topic: __consumer_offsets	Partition: 49	Leader: 1	Replicas: 1	Isr: 1
Topic:topic	PartitionCount:2	ReplicationFactor:2	Configs:
	Topic: topic	Partition: 0	Leader: 3	Replicas: 3,1	Isr: 3,1
	Topic: topic	Partition: 1	Leader: 1	Replicas: 1,2	Isr: 1,2
Produce messages:
[root@localhost bin]# ./kafka-console-producer.sh  --broker-list 192.168.43.104:9092,192.168.43.23:9092 --topic topic
Host2:
Consume messages:
[root@localhost bin]# ./kafka-console-consumer.sh  --bootstrap-server 192.168.43.104:9092 --topic topic --from-beginning
test
Host3:
Consume messages:
[root@localhost bin]# ./kafka-console-consumer.sh  --bootstrap-server 192.168.43.104:9092 --topic topic --from-beginning
test
Command reference:
Topic:
kafka-topics.sh
--create 				        #create a topic.
--zookeeper			            #the Kafka cluster depends on ZooKeeper, and topic metadata is written to ZooKeeper as well; this specifies the ZooKeeper address (host ip, port 2181).
--partitions			        #number of partitions to create.
--replication-factor		    #number of replicas; must not exceed the number of brokers.
--topic				            #name of the topic.
Create a topic:
./kafka-topics.sh --create --zookeeper <host-ip>:2181	--partitions 2 --replication-factor  2  --topic first
List topics:
./kafka-topics.sh --list --zookeeper <host-ip>:2181
Show detailed topic information:
./kafka-topics.sh  --zookeeper <host-ip>:2181	--describe
Delete a topic:
./kafka-topics.sh --delete  --zookeeper <zookeeper-cluster-address> --topic <topic-name>
Producer command:
kafka-console-producer.sh
--broker-list  		#specify the brokers; the port is 9092.
--topic				#specify the topic.
Consumer command:
kafka-console-consumer.sh 
--zookeeper 		#specify ZooKeeper.
--topic			    #specify the topic.
--from-beginning	#consume all data produced since the topic was created.
--bootstrap-server	#the port is 9092; specifies the Kafka cluster itself. Kafka already communicates with partition leaders, and if offsets were kept in ZooKeeper, consumers would have to talk to ZooKeeper frequently, hurting efficiency. When connecting via bootstrap (the Kafka cluster), consumer offsets are stored in an internal topic named __consumer_offsets.
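The offset storage described above can be sketched as a map keyed by (group, topic, partition): a simplified view of what the __consumer_offsets topic records. All names here are invented for illustration; this is not Kafka client code.

```python
# Simplified view of what the __consumer_offsets topic records: one
# committed offset per (group, topic, partition). All names here are
# invented for illustration; this is not Kafka client code.

committed = {}

def commit(group, topic, partition, offset):
    committed[(group, topic, partition)] = offset

def fetch_committed(group, topic, partition):
    # -1 is this sketch's convention for "nothing committed yet"
    return committed.get((group, topic, partition), -1)

commit("group-a", "topic", 0, 42)
commit("group-b", "topic", 0, 7)

assert fetch_committed("group-a", "topic", 0) == 42
assert fetch_committed("group-b", "topic", 0) == 7   # groups progress independently
assert fetch_committed("group-a", "topic", 1) == -1  # partition 1: no commit yet
```

Keying by group is what lets two consumer groups replay the same topic from different positions without interfering with each other.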