Building a flume+kafka+zookeeper+storm real-time computation environment (Part 2)


Setting up the flume+kafka+storm environment

Here, Flume collects data from MySQL and writes it into Kafka: MySQL is the source, a memory channel is the channel, and Kafka is the sink. Collecting the MySQL data requires the flume-ng-sql-source plugin.

Prerequisites:
1. Flume 1.8.0
2. Kafka 2.0.0
3. Storm 1.2.2
4. ZooKeeper 3.4.13

flume
Download:
http://flume.apache.org/download.html
Download apache-flume-1.8.0-bin.tar.gz
Extract:
tar -zxvf ~/Desktop/apache-flume-1.8.0-bin.tar.gz
sudo mv apache-flume-1.8.0-bin flume1.8.0
sudo mkdir -p /usr/local/bigdata
sudo mv flume1.8.0 /usr/local/bigdata
Change the owner:
sudo chown -R storm:storm /usr/local/bigdata
Edit Flume's configuration file in the conf directory:
sudo cp flume-conf.properties.template flume.conf
sudo vim flume.conf

a1.sources=sql
a1.channels=mem
a1.sinks=kafka

a1.sources.sql.type=org.keedio.flume.source.SQLSource
a1.sources.sql.hibernate.connection.url=jdbc:mysql://<database address>

a1.sources.sql.hibernate.connection.user=root
a1.sources.sql.hibernate.connection.password=123456
a1.sources.sql.hibernate.connection.autocommit=true
a1.sources.sql.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
a1.sources.sql.hibernate.connection.driver_class=com.mysql.jdbc.Driver
a1.sources.sql.run.query.delay=5000
a1.sources.sql.status.file.path=<path where the SQL status file is saved>
a1.sources.sql.status.file.name=<SQL status file name>
# Custom query
a1.sources.sql.start.from=0
a1.sources.sql.custom.query=<your SQL query>
a1.sources.sql.batch.size=1000
a1.sources.sql.max.rows=1000
a1.sources.sql.hibernate.connection.provider_class=org.hibernate.connection.C3P0ConnectionProvider
a1.sources.sql.hibernate.c3p0.min_size=1
a1.sources.sql.hibernate.c3p0.max_size=10
# use memory as the channels
a1.channels.mem.type=memory
a1.channels.mem.capacity=1000
a1.channels.mem.transactionCapacity=1000
a1.channels.mem.byteCapacityBufferPercentage=20
a1.channels.mem.byteCapacity=800000

a1.sinks.kafka.type=org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka.topic=<name of the Kafka topic you created>
a1.sinks.kafka.brokerList=master:9092,slave1:9092,slave2:9092
a1.sinks.kafka.requiredAcks=1
a1.sinks.kafka.batchSize=20
a1.sinks.kafka.partitionIdHeader=0

a1.sources.sql.channels=mem
a1.sinks.kafka.channel=mem
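
With the configuration saved, the agent can be started from the Flume home directory. A minimal sketch (the agent name a1 and the file name flume.conf match the config above; adjust paths to your setup):

cd /usr/local/bigdata/flume1.8.0
# start agent a1 with the config written above; log to the console for easier debugging
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name a1 -Dflume.root.logger=INFO,console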

Configure the environment variables
sudo vim /etc/profile
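The original screenshot of the /etc/profile entries is not reproduced here; assuming the install path above, they would look roughly like this:

# hypothetical entries, adjust to your actual install path
export FLUME_HOME=/usr/local/bigdata/flume1.8.0
export PATH=$PATH:$FLUME_HOME/bin

Run source /etc/profile afterwards so the changes take effect.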

The flume-ng-sql-source plugin
Download:
https://github.com/xiaomoo/oeasy
Create a plugins.d directory under the Flume home directory.
The directory structure (screenshot omitted) follows Flume's plugins.d layout:
lib holds the compiled flume-ng-sql-source.jar
libext holds the MySQL JDBC driver jar
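
A sketch of the commands, assuming the plugin directory is named sql-source and both jars were downloaded to the home directory (the jar file names are examples):

cd /usr/local/bigdata/flume1.8.0
# plugins.d/<plugin-name>/lib and libext is Flume's standard plugin layout
mkdir -p plugins.d/sql-source/lib plugins.d/sql-source/libext
cp ~/flume-ng-sql-source-1.5.2.jar plugins.d/sql-source/lib/
cp ~/mysql-connector-java-5.1.47.jar plugins.d/sql-source/libext/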

zookeeper
Download:
http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.13/
Extract:
tar -zxvf zookeeper-3.4.13.tar.gz
sudo mv zookeeper-3.4.13 zookeeper3.4.13
sudo mv zookeeper3.4.13 /usr/local/bigdata
Edit the configuration file in the conf directory (copy zoo_sample.cfg to zoo.cfg first if it does not exist yet):
sudo vim zoo.cfg
Single-node configuration

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataLogDir=/usr/local/bigdata/zookeeper3.4.13/logs
dataDir=/usr/local/bigdata/zookeeper3.4.13/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

Then create the data directory; a single node runs in standalone mode.
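
The dataDir and dataLogDir referenced in zoo.cfg have to exist; assuming the paths above:

cd /usr/local/bigdata/zookeeper3.4.13
mkdir -p data logs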

ZooKeeper cluster configuration

The cluster zoo.cfg is identical to the single-node configuration above, with these server entries added at the end:
server.1=0.0.0.0:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

Create the data directory and write this node's id into data/myid:
mkdir -p data
echo 1 > data/myid
Copy the zookeeper directory to slave1 and slave2 over ssh:
scp -r zookeeper3.4.13 storm@slave1:/usr/local/bigdata/
scp -r zookeeper3.4.13 storm@slave2:/usr/local/bigdata/
On slave1 and slave2, set each node's id in data/myid and adjust zoo.cfg.
slave1:
echo 2 > data/myid
Edit zoo.cfg:

server.1=master:2888:3888
server.2=0.0.0.0:2888:3888
server.3=slave2:2888:3888

slave2:
echo 3 > data/myid
Edit zoo.cfg:

server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=0.0.0.0:2888:3888

The ZooKeeper cluster elects one leader; the other nodes become followers.
Configure the environment variables
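The screenshot is omitted; a typical /etc/profile entry, assuming the path above (the later steps reference $ZK_HOME):

# hypothetical entries, adjust to your actual install path
export ZK_HOME=/usr/local/bigdata/zookeeper3.4.13
export PATH=$PATH:$ZK_HOME/bin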

Start ZooKeeper with bin/zkServer.sh start
Stop it with bin/zkServer.sh stop
Check the cluster status with bin/zkServer.sh status

kafka
Download:
http://kafka.apache.org/downloads
Extract:
tar -zxvf kafka_2.11-2.0.0.tgz
sudo mv kafka_2.11-2.0.0 kafka2.0.0
sudo mv kafka2.0.0 /usr/local/bigdata
cd /usr/local/bigdata/kafka2.0.0/config
Edit Kafka's server.properties:
sudo vim server.properties

broker.id=0
host.name=<this machine's IP address>

listeners=PLAINTEXT://192.168.5.211:9092

num.network.threads=3
num.io.threads=8

socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400

socket.request.max.bytes=104857600
delete.topic.enable=true

log.dirs=/usr/local/bigdata/kafka2.0.0/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1

offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

zookeeper.connect=<your zookeeper address; a single node or a comma-separated cluster list>
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0

Open zookeeper.properties:
sudo vim zookeeper.properties

dataDir=<path to your zookeeper data directory>
clientPort=2181

Copy Kafka to slave1 and slave2:
scp -r kafka2.0.0 storm@slave1:/usr/local/bigdata/
scp -r kafka2.0.0 storm@slave2:/usr/local/bigdata/
Edit config/server.properties on slave1 and slave2:
slave1:
change broker.id to 1
slave2:
change broker.id to 2
(host.name and listeners should also point at each node's own address)

Configure the environment variables
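The screenshot is omitted; assuming the path above (the later steps reference $KAFKA_HOME):

# hypothetical entries, adjust to your actual install path
export KAFKA_HOME=/usr/local/bigdata/kafka2.0.0
export PATH=$PATH:$KAFKA_HOME/bin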

storm
Download
http://storm.apache.org/downloads.html
Extract
tar -zxvf apache-storm-1.2.2.tar.gz
sudo mv apache-storm-1.2.2 storm1.2.2
sudo mv storm1.2.2 /usr/local/bigdata

Edit the configuration file in the conf directory:
sudo vim storm.yaml

storm.local.dir: "/usr/local/bigdata/storm1.2.2/localdir"
storm.zookeeper.servers:
     - <zookeeper address>
     
nimbus.seeds: ["master"]
ui.host: 0.0.0.0
ui.port: 8093
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703

drpc.servers:
     - "master"
     - "slave1"
     - "slave2"

Copy Storm to slave1 and slave2:
scp -r storm1.2.2 storm@slave1:/usr/local/bigdata/
scp -r storm1.2.2 storm@slave2:/usr/local/bigdata/

Configure the environment variables
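The screenshot is omitted; assuming the path above:

# hypothetical entries, adjust to your actual install path
export STORM_HOME=/usr/local/bigdata/storm1.2.2
export PATH=$PATH:$STORM_HOME/bin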

Testing the environment
Start ZooKeeper first:
$ZK_HOME/bin/zkServer.sh start   (I used a single node)
$ZK_HOME/bin/zkServer.sh status to check the status

Start the Kafka cluster (run the command on each broker):
$KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties &
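
The topic referenced by the Flume Kafka sink has to exist before the agent writes to it; creating it and running a quick console smoke test looks roughly like this (the topic name mysql-data is only an example):

# create the topic on the three-broker cluster (Kafka 2.0 creates topics via ZooKeeper)
$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 3 --partitions 3 --topic mysql-data
# type a line in the producer; it should appear in the consumer
$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list master:9092 --topic mysql-data
$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic mysql-data --from-beginning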

Start Storm
master:
storm nimbus &
storm ui &

slave1:
storm supervisor &

slave2:
storm supervisor &
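
Once nimbus, the UI, and the supervisors are up, the Storm UI should be reachable at http://master:8093 (the ui.port configured above).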

Check the running processes on each node with jps (screenshots for master, slave1, and slave2 omitted).

At this point, the environment setup is complete.
