Getting started with Kafka (a hands-on guide for beginners, like me)

(This article documents my own first steps learning Kafka as a beginner; if it infringes on anything, let me know and I will take it down.)

Overview: Kafka is a high-throughput distributed publish-subscribe messaging system with the following features:
- Message persistence through an O(1) disk data structure that stays stable over long periods, even with terabytes of stored messages.
- High throughput: even on very ordinary hardware, Kafka can support millions of messages per second.
- Message partitioning across Kafka brokers and consumer clusters.
- Parallel data loading into Hadoop.

Kafka terminology:

- **Broker**: a Kafka cluster consists of one or more servers; each such server is called a broker.
- **Topic**: every message published to a Kafka cluster has a category, called a topic. (Physically, messages of different topics are stored separately; logically, a topic's messages may live on one or more brokers, but users only need to name the topic to produce or consume data, without caring where the data is stored.)
- **Partition**: a physical concept; each topic contains one or more partitions.
- **Producer**: publishes messages to Kafka brokers.
- **Consumer**: a client that reads messages from Kafka brokers.
- **Consumer Group**: every consumer belongs to a specific consumer group (a group name can be assigned per consumer; consumers without one fall into the default group).

*Prerequisites:*

- VMware Workstation 14 Pro
- One CentOS 7 virtual machine
- JDK: 1.8.0_181 recommended
- ZooKeeper 3.5.5 (recommended): http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.5.5/apache-zookeeper-3.5.5-bin.tar.gz
- Kafka 2.12-2.3.1: https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.3.1/kafka_2.12-2.3.1.tgz

**Installation and deployment (down to business)**
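The two downloads above can be fetched and unpacked in one step. This is a minimal sketch, assuming the mirror URLs from the prerequisites list and `/usr/local/logdeal` as the install prefix used later in this guide; the function name is mine, not part of any tool:

```shell
# Hedged sketch: download and extract ZooKeeper and Kafka.
# Assumes wget and tar are installed and /usr/local/logdeal exists.
fetch_and_unpack() {
  cd /usr/local/logdeal || return 1
  for url in \
    http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.5.5/apache-zookeeper-3.5.5-bin.tar.gz \
    https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.3.1/kafka_2.12-2.3.1.tgz
  do
    # ${url##*/} strips everything up to the last '/', leaving the filename
    wget "$url" && tar -zxvf "${url##*/}"
  done
}
```

Run `java -version` first to confirm JDK 1.8 is in place, then call `fetch_and_unpack`.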
**Part 1: ZooKeeper installation**

[1] Configure the ZooKeeper environment variables:

```shell
tar -zxvf apache-zookeeper-3.5.5-bin.tar.gz
vi /etc/profile
```

Append the ZooKeeper defaults:

```
# ZooKeeper environment variables
ZOOKEEPER_HOME=/usr/local/logdeal/apache-zookeeper-3.5.5-bin   # match the directory you actually extracted
PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$ZOOKEEPER_HOME/lib:
TOMCAT_HOME=/usr/local/tomcat7
CATALINA_HOME=/usr/local/tomcat7
export ZOOKEEPER_HOME
export JAVA_HOME
export PATH
export CLASSPATH
export TOMCAT_HOME
export CATALINA_HOME
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
```

**The configuration file lives under $ZOOKEEPER_HOME/conf/; rename zoo_sample.cfg to zoo.cfg. The default contents are:**
```
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
```

With the configuration in place, switch to $ZOOKEEPER_HOME/bin:

```shell
./zkServer.sh start                # start
netstat -tunlp | grep 2181         # check the ZooKeeper port
./zkServer.sh stop                 # stop
```

**Part 2: Kafka installation and configuration**

[1] Switch to the config directory and edit server.properties as follows (add any missing defaults yourself):

```
broker.id=0
listeners=PLAINTEXT://:9092
port=9092
host.name=192.168.4.166
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs0          # each broker needs its own log directory
num.partitions=5
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
queued.max.requests=500
log.cleanup.policy=delete
```

Save with :wq, then from the Kafka directory run:

```shell
bin/kafka-server-start.sh config/server.properties &   # start kafka
```

Wait a few seconds; if no errors appear, it is up. Check ports 2181 and 9092:

```shell
netstat -tunlp | egrep "(2181|9092)"
# or: lsof -i:2181 / lsof -i:9092  (any output means the port is open)
```

***Single-machine connectivity test***

Open two terminal (CRT) sessions: one as a producer sending messages, one as a consumer receiving them.

Run the producer:

```shell
bin/kafka-console-producer.sh --broker-list 192.168.4.166:9092 --topic test
```

Type anything; it is sent to the queue. Run the consumer:

```shell
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
```

You should see the list of messages just sent.

**Building a multi-broker "cluster"**

From the Kafka directory:

```shell
cd config
touch server1.properties server2.properties   # create two broker config files
```

Add a configuration to each broker:

```shell
vi server1.properties
```

```
broker.id=1
listeners=PLAINTEXT://:9093
port=9093
host.name=192.168.4.166
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs1
num.partitions=5
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=192.168.4.166:2181
zookeeper.connection.timeout.ms=6000
queued.max.requests=500
log.cleanup.policy=delete
```

Save with :wq.
Edit server2.properties:

```shell
vi server2.properties
```

```
broker.id=2
listeners=PLAINTEXT://:9094
port=9094
host.name=192.168.4.166
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs2
num.partitions=5
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
queued.max.requests=500
log.cleanup.policy=delete
```
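Hand-editing three nearly identical files invites typos: only the broker.id, listener port, and log.dirs lines differ. As an alternative, a small helper can derive each extra broker config from the base server.properties. This is a sketch of mine, not part of Kafka, and it assumes the base file contains exactly those keys:

```shell
# Derive a per-broker config from a base server.properties.
# Only broker.id, listeners, port and log.dirs are rewritten.
# Usage: gen_broker_conf server.properties 1 > server1.properties
gen_broker_conf() {
  base=$1
  id=$2
  port=$((9092 + id))                 # broker 1 -> 9093, broker 2 -> 9094
  sed -e "s|^broker.id=.*|broker.id=${id}|" \
      -e "s|^listeners=.*|listeners=PLAINTEXT://:${port}|" \
      -e "s|^port=.*|port=${port}|" \
      -e "s|^log.dirs=.*|log.dirs=/tmp/kafka-logs${id}|" \
      "$base"
}
```

Running `gen_broker_conf server.properties 1 > server1.properties` and the same for broker 2 reproduces the two files above.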
Start the brokers:

```shell
bin/kafka-server-start.sh config/server.properties  &   # start broker 0
bin/kafka-server-start.sh config/server1.properties &   # start broker 1
bin/kafka-server-start.sh config/server2.properties &   # start broker 2
```

Check ports 2181, 9092, 9093, and 9094:

```shell
netstat -tunlp | egrep "(2181|9092|9093|9094)"
```

One ZooKeeper instance listens on 2181, and the three Kafka brokers listen on 9092, 9093, and 9094 respectively.
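Beyond checking ports, you can confirm that all three brokers actually registered themselves in ZooKeeper: each live broker creates an ephemeral znode under /brokers/ids. A small wrapper around the zkCli.sh shell that ships with ZooKeeper (the function name is mine):

```shell
# List the broker ids registered in ZooKeeper.
# With all three brokers up, this should show [0, 1, 2].
list_broker_ids() {
  "$ZOOKEEPER_HOME/bin/zkCli.sh" -server localhost:2181 ls /brokers/ids
}
```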
Create a topic:

```shell
bin/kafka-topics.sh --create --topic topic_1 --partitions 1 --replication-factor 3 --zookeeper localhost:2181
```

Check the topic:

```shell
bin/kafka-topics.sh --list --zookeeper localhost:2181
bin/kafka-topics.sh --describe --topic topic_1 --zookeeper localhost:2181
```

In the describe output, Leader is the broker currently serving reads and writes for the partition, Replicas lists every broker holding a copy, and Isr ("in-sync replicas") is the subset currently caught up with the leader.
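Re-running the create command against an existing topic fails, so a guard makes the step safe to repeat. This is a sketch of mine using only the --list and --create invocations shown above:

```shell
# Create the topic only if it is not already listed (idempotent re-runs).
ensure_topic() {
  bin/kafka-topics.sh --list --zookeeper localhost:2181 | grep -qx "$1" || \
    bin/kafka-topics.sh --create --topic "$1" --partitions 1 \
      --replication-factor 3 --zookeeper localhost:2181
}
```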
Simulate a client sending messages:

```shell
bin/kafka-console-producer.ssh --topic topic_1 --broker-list 192.168.4.166:9092,192.168.4.166:9093,192.168.4.166:9094
```

Simulate a client receiving messages:

```shell
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic_1 --from-beginning
```

The producer has now published the topic across all three brokers, and you get a first real taste of what "distributed" means. If the commands run without errors and the messages come through, the cluster works.
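The two interactive consoles can also be driven non-interactively, which turns the check above into a repeatable smoke test. The --max-messages and --timeout-ms flags do exist on the 2.3.x console consumer; the script itself is my sketch, not part of the original walkthrough:

```shell
# Pipe one message through the cluster and read it back, then exit.
smoke_test() {
  echo "hello from the smoke test" | \
    bin/kafka-console-producer.sh --broker-list 192.168.4.166:9092 --topic topic_1
  # --max-messages 1 exits after the first message;
  # --timeout-ms aborts instead of hanging if nothing arrives.
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic topic_1 --from-beginning --max-messages 1 --timeout-ms 10000
}
```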