Prerequisites: Zookeeper, Scala, Kafka, JDK
Download links:
Zookeeper:
http://mirror.bit.edu.cn/apache/zookeeper/current/
Scala:
http://www.scala-lang.org/download/2.11.8.html
Kafka:
http://kafka.apache.org/downloads
I. Zookeeper deployment
1. Download and extract zookeeper-3.4.6.tar.gz
2. Edit the configuration
zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/software/zookeeper/data
clientPort=2181
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
zookeeper]# mkdir data
zookeeper]# touch data/myid
zookeeper]# echo 1 > data/myid
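The id written to data/myid has to equal the N of this host's server.N line in zoo.cfg. A minimal sketch of looking that number up, assuming the hostnames from the zoo.cfg above; myid_for_host is a hypothetical helper (not part of Zookeeper), and the sample config is written to a temporary file purely for illustration:

```shell
# Hypothetical helper (not part of Zookeeper): print the N of the
# server.N line in a zoo.cfg that names the given host.
# Assumes a plain hostname without regex metacharacters.
myid_for_host() {
  cfg_file=$1
  host=$2
  sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg_file"
}

# Sample config in a temporary file, mirroring the server lines above.
sample_cfg=$(mktemp)
cat > "$sample_cfg" <<'EOF'
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
EOF

myid_for_host "$sample_cfg" hadoop002   # prints 2
```

On a real node the output could be redirected into data/myid instead of typing the number by hand.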
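3. Copy the directory to hadoop002 and hadoop003, then set each node's own myid as follows
software]# scp -r zookeeper 192.168.137.141:/opt/software/
software]# scp -r zookeeper 192.168.137.142:/opt/software/
zookeeper]# echo 2 > data/myid    # on hadoop002
zookeeper]# echo 3 > data/myid    # on hadoop003
### Never write echo 3>data/myid. Keep the space before >, otherwise the shell reads 3> as a redirection of file descriptor 3 and the 3 is never written to myid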
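The difference between the two spellings can be reproduced safely in a scratch directory. This is a small sketch; the directory and file names are made up for the demonstration:

```shell
demo_dir=$(mktemp -d)

# "3>" is parsed as "redirect file descriptor 3", so echo receives no
# argument: the file is created but stays empty.
echo 3>"$demo_dir/myid_wrong"

# With a space before ">", 3 is an argument to echo and stdout is
# redirected into the file.
echo 3 > "$demo_dir/myid_right"

cat "$demo_dir/myid_right"
```

The same trap applies to any single digit in front of >, because the shell treats it as a file descriptor number.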
4. Start the Zookeeper cluster (run on each of the three nodes)
bin]# ./zkServer.sh start
bin]# ./zkServer.sh start
bin]# ./zkServer.sh start
5. Check the Zookeeper status (one node reports leader, the other two follower)
bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: follower
bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: leader
bin]# ./zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper/bin/../conf/zoo.cfg
Mode: follower
II. Kafka deployment
1. Extract Scala and set its environment variables
2. Download Kafka 0.10.0.1 built for Scala 2.11
software]# tar -xzvf kafka_2.11-0.10.0.1.tgz
software]# ln -s kafka_2.11-0.10.0.1 kafka    # symbolic link
3. Create the logs directory and edit server.properties
kafka]# mkdir logs
kafka]# cd config/
config]# vi server.properties
broker.id=1
port=9092
host.name=***.***.***.***
log.dirs=/opt/software/kafka/logs
zookeeper.connect=***.***.***.***:2181,***.***.***.***:2181,***.***.***.***:2181/kafka
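4. Set the Kafka environment variables
5. Repeat the steps above on the other two machines, giving each broker a unique broker.id and its own host.name
6. Start / stop (run the start command on each of the three brokers)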
kafka]# nohup kafka-server-start.sh config/server.properties &
kafka]# nohup kafka-server-start.sh config/server.properties &
kafka]# nohup kafka-server-start.sh config/server.properties &
### Stop
bin/kafka-server-stop.sh
---------------------------------------------------------------------------------------------------------------------------------------------
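Kafka: message middleware --> distributed streaming platform
Kafka vs Flume
Kafka       Flume
Producer    source
Broker      channel
Consumer    sink
Symbolic links
ln -s <real directory/file> <link directory/file>
1. Deleting the link removes only the link, not the real directory/file, which is safer
2. Multi-version management: repoint the link to switch versions
3. Hard links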
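The first two points can be tried out in a scratch directory. This is only a sketch: the directories are empty stand-ins, and the second version number is a placeholder chosen for illustration rather than something taken from these notes:

```shell
work_dir=$(mktemp -d)

# Two empty stand-in version directories (the second one is a placeholder).
mkdir "$work_dir/kafka_2.11-0.10.0.1" "$work_dir/kafka_2.11-0.10.1.0"

# Point the stable name "kafka" at the first version.
ln -s kafka_2.11-0.10.0.1 "$work_dir/kafka"

# Switch versions by repointing the link; -n replaces the link itself
# instead of creating a new link inside the directory it points to.
ln -sfn kafka_2.11-0.10.1.0 "$work_dir/kafka"
current_target=$(readlink "$work_dir/kafka")
echo "kafka -> $current_target"

# Removing the link leaves both real directories untouched.
rm "$work_dir/kafka"
ls "$work_dir"
```

Scripts and environment variables that refer to the link name keep working across a version switch, which is the reason for linking kafka to the versioned directory in the deployment steps.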
Common commands:
Create a topic:
kafka-topics.sh
bin/kafka-topics.sh --create \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka \
--replication-factor 3 \
--partitions 3 \
--topic test
List all topics:
bin/kafka-topics.sh --list \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka
Producer:
bin/kafka-console-producer.sh \
--broker-list yws85:9092,yws86:9092,yws87:9092 \
--topic test
Consumer:
bin/kafka-console-consumer.sh \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka \
--topic test \
--from-beginning
Describe a topic:
bin/kafka-topics.sh \
--describe \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka \
--topic test
Alter a topic (the partition count can only be increased):
bin/kafka-topics.sh \
--alter \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka \
--topic test \
--partitions 4
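Cleanly delete a topic (unless delete.topic.enable=true is set on the brokers, --delete only marks the topic for deletion, so the leftovers are removed by hand):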
bin/kafka-topics.sh --delete \
--zookeeper yws85:2181,yws86:2181,yws87:2181/kafka \
--topic test
Zookeeper (inside zkCli.sh):
rmr /kafka/admin/delete_topics/test
rmr /kafka/config/topics/test
rmr /kafka/brokers/topics/test
Kafka logs (on every broker):
rm -rf logs/test-*
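Kafka concepts:
Kafka concepts and architecture
Kafka basic concepts and principles
Kafka ordering
Within a partition: ordered
Across partitions (globally): unordered
So how do we guarantee data ordering in production?
MySQL BINLOG log file, read in order ----ordered----> Kafka topic test with 3 partitions
Example:
ruozedata.stu // the following operations on this table are sent to Kafka as log events
id name age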
insert into stu values(1,'jepson',18);
insert into stu values(2,'ruoze',28);
update stu set age=26 where id=1;
delete from stu where id=1;
Stored by partition (messages spread across partitions without regard to the row they touch):
test-0:
insert into stu values(1,'jepson',18);
test-1:
insert into stu values(2,'ruoze',28);
delete from stu where id=1;
test-2:
update stu set age=26 where id=1;
A consumer can then replay the statements in the following wrong order, and the row with id=1 still exists at the end:
insert into stu values(2,'ruoze',28);
delete from stu where id=1;
insert into stu values(1,'jepson',18);
update stu set age=26 where id=1;
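Core of the solution: send related data (here, every change to the same row) to a single partition of the same topic so that their order is preserved.
Assemble a characteristic key such as ruozedata_stu_id=1 and use a custom partitioner: hash(ruozedata_stu_id=1) modulo the partition count yields 0, 1 or 2.
The partitions then hold the following, and ordering is guaranteed for each row: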
test-0:
insert into stu values(1,'jepson',18);
update stu set age=26 where id=1;
delete from stu where id=1;
test-1:
insert into stu values(2,'ruoze',28);
test-2:
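The key-to-partition rule can be sketched in a few lines of shell. This is an illustration only: partition_for is a hypothetical helper, and the CRC from cksum is used as a stand-in hash, which is not the hash function Kafka itself uses, so the partition numbers it prints will not necessarily match the layout above:

```shell
# Hypothetical stand-in for a custom partitioner: CRC of the key modulo
# the partition count. Kafka's own hash function is different.
partition_for() {
  key=$1
  num_partitions=$2
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  crc=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $(( crc % num_partitions ))
}

# Every change to the row id=1 carries the same key, so all of them
# land in the same partition and keep their relative order.
for statement in insert update delete; do
  echo "$statement id=1 -> test-$(partition_for 'ruozedata_stu_id=1' 3)"
done
echo "insert id=2 -> test-$(partition_for 'ruozedata_stu_id=2' 3)"
```

Because the result depends on the partition count, raising the number of partitions with --alter can move a key to a different partition, so ordering for that key is only guaranteed while the partition count stays fixed.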