Building a flume+kafka+zookeeper+storm real-time computation environment (Part 2)


Setting up the flume+kafka+storm environment

Here, Flume collects data from MySQL and writes it into Kafka: MySQL is the source, a memory channel is the channel, and Kafka is the sink. Collecting the MySQL data requires the flume-ng-sql-source plugin.

Prerequisites:
1. Flume 1.8.0
2. Kafka 2.0.0
3. Storm 1.2.2
4. ZooKeeper 3.4.13

flume
Download:
http://flume.apache.org/download.html
Download apache-flume-1.8.0-bin.tar.gz
Extract:
tar -zxvf ~/Desktop/apache-flume-1.8.0-bin.tar.gz
sudo mv apache-flume-1.8.0-bin flume1.8.0
sudo mkdir -p /usr/local/bigdata
sudo mv flume1.8.0 /usr/local/bigdata
Change the owner:
sudo chown -R storm:storm /usr/local/bigdata
Edit Flume's configuration file in the conf directory:
sudo cp flume-conf.properties.template flume.conf
sudo vim flume.conf

a1.sources=sql
a1.channels=mem
a1.sinks=kafka

a1.sources.sql.type=org.keedio.flume.source.SQLSource
a1.sources.sql.hibernate.connection.url=jdbc:mysql://<database address>

a1.sources.sql.hibernate.connection.user=root
a1.sources.sql.hibernate.connection.password=123456
a1.sources.sql.hibernate.connection.autocommit=true
a1.sources.sql.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
a1.sources.sql.hibernate.connection.driver_class=com.mysql.jdbc.Driver
a1.sources.sql.run.query.delay=5000
a1.sources.sql.status.file.path=<path where the SQL status file is saved>
a1.sources.sql.status.file.name=<SQL status file name>
# Custom query
a1.sources.sql.start.from=0
a1.sources.sql.custom.query=<your SQL query>
a1.sources.sql.batch.size=1000
a1.sources.sql.max.rows=1000
a1.sources.sql.hibernate.connection.provider_class=org.hibernate.connection.C3P0ConnectionProvider
a1.sources.sql.hibernate.c3p0.min_size=1
a1.sources.sql.hibernate.c3p0.max_size=10
# use memory as the channels
a1.channels.mem.type=memory
a1.channels.mem.capacity=1000
a1.channels.mem.transactionCapacity=1000
a1.channels.mem.byteCapacityBufferPercentage=20
a1.channels.mem.byteCapacity=800000

a1.sinks.kafka.type=org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka.topic=<name of the Kafka topic you created>
a1.sinks.kafka.brokerList=master:9092,slave1:9092,slave2:9092
a1.sinks.kafka.requiredAcks=1
a1.sinks.kafka.batchSize=20
a1.sinks.kafka.partitionIdHeader=0

a1.sources.sql.channels=mem
a1.sinks.kafka.channel=mem
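
With the configuration saved, the agent can be started from the Flume home directory. A minimal sketch (the agent name a1 and the file name flume.conf match the config above; adjust paths to your setup):

cd /usr/local/bigdata/flume1.8.0
# start agent a1 with the config written above; log to the console for easier debugging
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name a1 -Dflume.root.logger=INFO,console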

Configure the environment variables
sudo vim /etc/profile
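The original screenshot of the /etc/profile entries is not reproduced here; assuming the install path above, they would look roughly like this:

# hypothetical entries, adjust to your actual install path
export FLUME_HOME=/usr/local/bigdata/flume1.8.0
export PATH=$PATH:$FLUME_HOME/bin

Run source /etc/profile afterwards so the changes take effect.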

The flume-ng-sql-source plugin
Download:
https://github.com/xiaomoo/oeasy
Create a plugins.d directory under the Flume home directory.
The directory structure (screenshot omitted) follows Flume's plugins.d layout:
lib holds the compiled flume-ng-sql-source.jar
libext holds the MySQL JDBC driver jar
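
A sketch of the commands, assuming the plugin directory is named sql-source and both jars were downloaded to the home directory (the jar file names are examples):

cd /usr/local/bigdata/flume1.8.0
# plugins.d/<plugin-name>/lib and libext is Flume's standard plugin layout
mkdir -p plugins.d/sql-source/lib plugins.d/sql-source/libext
cp ~/flume-ng-sql-source-1.5.2.jar plugins.d/sql-source/lib/
cp ~/mysql-connector-java-5.1.47.jar plugins.d/sql-source/libext/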

zookeeper
Download:
http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.13/
Extract:
tar -zxvf zookeeper-3.4.13.tar.gz
sudo mv zookeeper-3.4.13 zookeeper3.4.13
sudo mv zookeeper3.4.13 /usr/local/bigdata
Edit the configuration file in the conf directory (copy zoo_sample.cfg to zoo.cfg first if it does not exist yet):
sudo vim zoo.cfg
Single-node configuration

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataLogDir=/usr/local/bigdata/zookeeper3.4.13/logs
dataDir=/usr/local/bigdata/zookeeper3.4.13/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

Then create the data directory; a single node runs in standalone mode.
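
The dataDir and dataLogDir referenced in zoo.cfg have to exist; assuming the paths above:

cd /usr/local/bigdata/zookeeper3.4.13
mkdir -p data logs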

ZooKeeper cluster configuration

The cluster zoo.cfg is identical to the single-node configuration above, with these server entries added at the end:
server.1=0.0.0.0:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

Create the data directory and write this node's id into data/myid:
mkdir -p data
echo 1 > data/myid
Copy the zookeeper directory to slave1 and slave2 over ssh:
scp -r zookeeper3.4.13 storm@slave1:/usr/local/bigdata/
scp -r zookeeper3.4.13 storm@slave2:/usr/local/bigdata/
On slave1 and slave2, set each node's id in data/myid and adjust zoo.cfg.
slave1:
echo 2 > data/myid
Edit zoo.cfg:

server.1=master:2888:3888
server.2=0.0.0.0:2888:3888
server.3=slave2:2888:3888

slave2:
echo 3 > data/myid
Edit zoo.cfg:

server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=0.0.0.0:2888:3888

The ZooKeeper cluster elects one leader; the other nodes become followers.
Configure the environment variables
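The screenshot is omitted; a typical /etc/profile entry, assuming the path above (the later steps reference $ZK_HOME):

# hypothetical entries, adjust to your actual install path
export ZK_HOME=/usr/local/bigdata/zookeeper3.4.13
export PATH=$PATH:$ZK_HOME/bin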

Start ZooKeeper with bin/zkServer.sh start
Stop it with bin/zkServer.sh stop
Check the cluster status with bin/zkServer.sh status

kafka
Download:
http://kafka.apache.org/downloads
Extract:
tar -zxvf kafka_2.11-2.0.0.tgz
sudo mv kafka_2.11-2.0.0 kafka2.0.0
sudo mv kafka2.0.0 /usr/local/bigdata
cd /usr/local/bigdata/kafka2.0.0/config
Edit Kafka's server.properties:
sudo vim server.properties

broker.id=0
host.name=<this machine's IP address>

listeners=PLAINTEXT://192.168.5.211:9092

num.network.threads=3
num.io.threads=8

socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400

socket.request.max.bytes=104857600
delete.topic.enable=true

log.dirs=/usr/local/bigdata/kafka2.0.0/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1

offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

zookeeper.connect=<your zookeeper address; a single node or a comma-separated cluster list>
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0

Open zookeeper.properties:
sudo vim zookeeper.properties

dataDir=<path to your zookeeper data directory>
clientPort=2181

Copy Kafka to slave1 and slave2:
scp -r kafka2.0.0 storm@slave1:/usr/local/bigdata/
scp -r kafka2.0.0 storm@slave2:/usr/local/bigdata/
Edit config/server.properties on slave1 and slave2:
slave1:
change broker.id to 1
slave2:
change broker.id to 2
(host.name and listeners should also point at each node's own address)

Configure the environment variables
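The screenshot is omitted; assuming the path above (the later steps reference $KAFKA_HOME):

# hypothetical entries, adjust to your actual install path
export KAFKA_HOME=/usr/local/bigdata/kafka2.0.0
export PATH=$PATH:$KAFKA_HOME/bin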

storm
Download
http://storm.apache.org/downloads.html
Extract
tar -zxvf apache-storm-1.2.2.tar.gz
sudo mv apache-storm-1.2.2 storm1.2.2
sudo mv storm1.2.2 /usr/local/bigdata

Edit the configuration file in the conf directory:
sudo vim storm.yaml

storm.local.dir: "/usr/local/bigdata/storm1.2.2/localdir"
storm.zookeeper.servers:
     - <zookeeper address>
     
nimbus.seeds: ["master"]
ui.host: 0.0.0.0
ui.port: 8093
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703

drpc.servers:
     - "master"
     - "slave1"
     - "slave2"

Copy Storm to slave1 and slave2:
scp -r storm1.2.2 storm@slave1:/usr/local/bigdata/
scp -r storm1.2.2 storm@slave2:/usr/local/bigdata/

Configure the environment variables
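The screenshot is omitted; assuming the path above:

# hypothetical entries, adjust to your actual install path
export STORM_HOME=/usr/local/bigdata/storm1.2.2
export PATH=$PATH:$STORM_HOME/bin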

Testing the environment
Start ZooKeeper first:
$ZK_HOME/bin/zkServer.sh start   (I used a single node)
$ZK_HOME/bin/zkServer.sh status to check the status

Start the Kafka cluster (run the command on each broker):
$KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties &
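
The topic referenced by the Flume Kafka sink has to exist before the agent writes to it; creating it and running a quick console smoke test looks roughly like this (the topic name mysql-data is only an example):

# create the topic on the three-broker cluster (Kafka 2.0 creates topics via ZooKeeper)
$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 3 --partitions 3 --topic mysql-data
# type a line in the producer; it should appear in the consumer
$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list master:9092 --topic mysql-data
$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic mysql-data --from-beginning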

Start Storm
master:
storm nimbus &
storm ui &

slave1:
storm supervisor &

slave2:
storm supervisor &
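
Once nimbus, the UI, and the supervisors are up, the Storm UI should be reachable at http://master:8093 (the ui.port configured above).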

Check the running processes on each node with jps (screenshots for master, slave1, and slave2 omitted).

At this point, the environment setup is complete.
