Environment: CentOS 6.5
zookeeper-3.4.8
kafka_2.11-0.10.1.1
apache-storm-0.10.0.tar
Installing dependency tools
Components and tools required on CentOS to build ZeroMQ:
yum install gcc
yum install gcc-c++
yum install make
yum install uuid-devel
yum install libuuid-devel
yum install libtool
Install Python:
wget http://www.python.org/ftp/python/2.7.2/Python-2.7.2.tgz
tar zxvf Python-2.7.2.tgz
cd Python-2.7.2
./configure --without-libsodium
make
make install
vi /etc/ld.so.conf
Append /usr/local/lib/
sudo ldconfig
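If the build went through, the new interpreter should land under /usr/local (the default prefix); a quick sanity check:
/usr/local/bin/python2.7 -V    # should report Python 2.7.2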
1. Setting up ZooKeeper
- 1. Download the ZooKeeper binary package v3.4.6 from:
http://apache.fayea.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
- 2. Extract it:
sudo tar -zxvf zookeeper-3.4.6.tar.gz
- 3. Configure environment variables: vi ~/.bashrc and add ZOOKEEPER_HOME:
export JAVA_HOME=/usr/java/jdk1.8.0_60
export ZOOKEEPER_HOME=/opt/software/zookeeper-3.4.6
export STORM_HOME=/opt/software/apache-storm-0.9.5
export PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$STORM_HOME/bin:$PATH
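A quick check that the variables took effect (assuming they were appended to ~/.bashrc as above):
source ~/.bashrc
echo $ZOOKEEPER_HOME    # should print /opt/software/zookeeper-3.4.6
which zkServer.sh       # should resolve inside $ZOOKEEPER_HOME/bin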
- 4. Edit conf/zoo.cfg; pay attention to dataDir, clientPort, and the server.N entries at the bottom.
Remember to create the dataDir directory manually beforehand, otherwise startup will fail.
[root@localhost conf]# cat zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/var/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.3.160:2888:3888
server.2=192.168.3.161:2888:3888
server.3=192.168.3.162:2888:3888
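Since zoo.cfg lists three server.N entries, each node also needs a myid file inside dataDir whose content matches its own server number. A minimal sketch for the first node (the 1 -> 192.168.3.160 mapping follows the server.1 line above):
mkdir -p /var/zookeeper/data
echo 1 > /var/zookeeper/data/myid    # use 2 on 192.168.3.161, 3 on 192.168.3.162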
- 5. Common commands
zookeeper-3.4.6/bin/zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
Run start first, then check the state with status.
[root@localhost zookeeper-3.4.6]# bin/zkServer.sh start
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@localhost software]# zookeeper-3.4.6/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: standalone
[root@localhost zookeeper-3.4.6]# bin/zkServer.sh stop
JMX enabled by default
Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
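Besides zkServer.sh status, ZooKeeper answers four-letter commands on the client port, which makes a quick liveness probe (requires nc; the IP is the first node from zoo.cfg above):
echo ruok | nc 192.168.3.160 2181    # a healthy server replies "imok"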
Installing zeromq and jzmq:
jzmq appears to depend on zeromq, so install zeromq first, then jzmq.
1) Install zeromq:
wget http://download.zeromq.org/zeromq-2.2.0.tar.gz
tar zxf zeromq-2.2.0.tar.gz
cd zeromq-2.2.0
./configure
make
make install
sudo ldconfig (refresh the shared-library cache)
zeromq is now installed.
Note: if you hit dependency errors, install the following jzmq dependency packages:
sudo yum install uuid*
sudo yum install libtool
sudo yum install libuuid
sudo yum install libuuid-devel
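To sanity-check the zeromq install, the shared library should now show up in the linker cache (assuming the default /usr/local prefix and the ldconfig run above):
ldconfig -p | grep zmq    # should list libzmq.so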
2) Install jzmq
yum install git
git clone git://github.com/nathanmarz/jzmq.git
cd jzmq
./autogen.sh
./configure
make
make install
jzmq is now installed.
Note: if ./autogen.sh fails with "autogen.sh: error: could not find libtool. libtool is required to run autogen.sh", libtool is missing; fix it with yum install libtool*.
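A rough way to confirm the jzmq install landed (paths assume the default /usr/local prefix and may differ if --prefix was passed to ./configure):
ls /usr/local/lib | grep jzmq    # native JNI library
ls /usr/local/share/java         # zmq.jar is usually installed here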
2. Setting up Kafka
- 1. Download: http://apache.fayea.com/kafka/0.8.2.2/kafka_2.11-0.8.2.2.tgz
- 2. Extract it:
sudo tar -zxvf kafka_2.11-0.8.2.2.tgz
- 3. Edit config/server.properties; note host.name=192.168.3.160 and zookeeper.connect=192.168.3.163:2181
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
# The port the socket server listens on
port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=192.168.3.160
...
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=192.168.3.163:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
- 4. Start the Kafka server: nohup bin/kafka-server-start.sh config/server.properties
nohup makes the process ignore hangups, so it keeps running after the remote session is closed.
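To smoke-test the broker, a common approach is to create a topic and push a few messages through the console producer and consumer (the topic name test is arbitrary; the addresses follow server.properties above):
bin/kafka-topics.sh --create --zookeeper 192.168.3.163:2181 --replication-factor 1 --partitions 1 --topic test
bin/kafka-console-producer.sh --broker-list 192.168.3.160:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper 192.168.3.163:2181 --topic test --from-beginning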
3. Setting up Storm
- 1. Download: http://apache.fayea.com/storm/apache-storm-0.9.5/apache-storm-0.9.5.tar.gz
- 2. Extract it:
sudo tar -zxvf apache-storm-0.9.5.tar.gz
- 3. Edit conf/storm.yaml:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

########### These MUST be filled in for a storm configuration
# storm.zookeeper.servers:
#     - "server1"
#     - "server2"
storm.zookeeper.servers:
    - "192.168.3.161"
storm.zookeeper.port: 2181
#
# nimbus.host: "nimbus"
#
nimbus.host: "192.168.3.160"
nimbus.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true
ui.childopts: -Xmx768m -Djava.net.preferIPv4Stack=true
ui.host: 0.0.0.0
ui.port: 8080
supervisor.childopts: -Djava.net.preferIPv4Stack=true
worker.childopts: -Xmx768m -Dfile.encoding=utf-8 -Djava.net.preferIPv4Stack=true
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
storm.local.dir: /data/cluster/storm
storm.log.dir: /data/cluster/storm/logs
logviewer.port: 8000
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
#     - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
#     - "server1"
#     - "server2"
drpc.servers:
    - "192.168.3.160"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "backtype.storm.metric.LoggingMetricsConsumer"
#     parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
#     parallelism.hint: 1
#     argument:
#       - endpoint: "metrics-collector.mycompany.org"
- This config file is picky about whitespace: each entry must start with a space, and every colon must be followed by a space, otherwise Storm will not recognize the file.
A few notes: storm.local.dir is the local working directory Storm needs (see the mkdir below). nimbus.host specifies which machine is the master, i.e. nimbus. storm.zookeeper.servers lists the ZooKeeper servers, and storm.zookeeper.port must match the port configured in ZooKeeper, otherwise communication errors will occur. supervisor.slots.ports defines the slots of a supervisor node, i.e. how many worker processes it can run at most (a topology starts with a single worker by default, but can request more through its conf).
supervisor.slots.ports: for each supervisor node this configures how many workers the node may run. Each worker takes a dedicated port for receiving messages, and this option lists the ports available to workers. By default a node can run 4 workers, on ports 6700, 6701, 6702 and 6703.
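As noted above, storm.local.dir and storm.log.dir should point at writable directories on every node; it does no harm to create them up front (paths follow storm.yaml above):
mkdir -p /data/cluster/storm/logs    # also creates /data/cluster/storm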
- 4. Common commands
# bin/storm nimbus        (start the master node, nimbus)
# bin/storm supervisor    (start a worker node, supervisor)
# bin/storm ui            (start the UI; open ip:8080 in a browser to watch the cluster)
# bin/storm jar storm-demo-1.0.jar io.sterm.demo.topology.WordCountTopology
  (storm jar ships the jar to Storm for execution; the trailing argument is the topology name you define)
# bin/storm kill word-count
- Start nimbus: bin/storm nimbus
- Start the supervisor: bin/storm supervisor
- Start the UI: bin/storm ui
- The web UI is then reachable at http://ip:8080 (see below for keeping these processes running in the background)
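The storm commands above run in the foreground; to keep them alive after the SSH session closes, the same nohup trick used for Kafka works (a sketch, log redirection optional):
nohup bin/storm nimbus > /dev/null 2>&1 &
nohup bin/storm supervisor > /dev/null 2>&1 &
nohup bin/storm ui > /dev/null 2>&1 &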
Download storm-starter, build it, and import it into Eclipse:
(http://blog.csdn.net/guoqiangma/article/details/7212677)
1. Get the storm-starter code: git clone https://github.com/nathanmarz/storm-starter.git
2. Build it with mvn -f m2-pom.xml package
3. Copy m2-pom.xml in the storm-starter directory to pom.xml, since Eclipse expects a pom.xml
4. Generate the Eclipse project with mvn eclipse:eclipse
5. In Eclipse, choose Import and select the storm-starter path. After importing, check whether the project builds; you may need to configure the M2_REPO classpath variable.
To configure M2_REPO: right-click the project -> Properties -> Java Build Path -> Libraries -> Add Variable -> Configure Variables -> New
Enter Name: M2_REPO and Path: your local Maven repository path, click OK, and refresh the project; once it compiles cleanly you can start developing.
6. Once it builds, run WordCountTopology under the storm.starter package locally; if it runs through, Storm's local mode works.
Use Eclipse's Export feature to export the project as a jar so the same logic can later be submitted to the distributed cluster (submit command sketched below).
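A sketch of submitting the exported jar to the cluster, assuming the jar is named storm-starter.jar and the main class is storm.starter.WordCountTopology (adjust both to whatever the export produced); the last argument is the topology name:
bin/storm jar storm-starter.jar storm.starter.WordCountTopology word-count
bin/storm kill word-count    # remove the topology when finished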
Fix for the storm-starter build failing on missing twitter4j packages:
(http://www.cnblogs.com/zeutrap/archive/2012/10/11/2720528.html)
Edit storm-starter's pom file m2-pom.xml and change the dependency versions of twitter4j-core and twitter4j-stream as follows:
<dependency>
  <groupId>org.twitter4j</groupId>
  <artifactId>twitter4j-core</artifactId>
  <version>[2.2,)</version>
</dependency>
<dependency>
  <groupId>org.twitter4j</groupId>
  <artifactId>twitter4j-stream</artifactId>
  <version>[2.2,)</version>
</dependency>