Setting up a zookeeper-Kafka-Storm messaging system on CentOS 6.5

Environment: CentOS 6.5

zookeeper-3.4.8
kafka_2.11-0.10.1.1
apache-storm-0.10.0

Installing dependencies

Install the components and tools that ZeroMQ needs on CentOS:

yum install gcc

yum install gcc-c++

yum install make

yum install uuid-devel

yum install libuuid-devel

yum install libtool

Install Python:
wget http://www.python.org/ftp/python/2.7.2/Python-2.7.2.tgz 
tar zxvf Python-2.7.2.tgz 
cd Python-2.7.2 
./configure 
make 
make install 
vi /etc/ld.so.conf 
append the line /usr/local/lib 
sudo ldconfig
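The last two steps above (editing /etc/ld.so.conf and running ldconfig) can also be scripted; a minimal sketch, assuming you are root:

```shell
# Append /usr/local/lib to the dynamic loader search path (only if it is
# not already listed), then rebuild the shared-library cache.
grep -qx '/usr/local/lib' /etc/ld.so.conf || echo '/usr/local/lib' >> /etc/ld.so.conf
ldconfig
```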

1. Setting up ZooKeeper

  • 1. Download the ZooKeeper binary package, v3.4.6, from: 
    http://apache.fayea.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
  • 2. Unpack it: 
    sudo tar -zxvf zookeeper-3.4.6.tar.gz
  • 3. Configure environment variables: run vi ~/.bashrc and add ZOOKEEPER_HOME
  • export JAVA_HOME=/usr/java/jdk1.8.0_60
    
    export ZOOKEEPER_HOME=/opt/software/zookeeper-3.4.6
    
    export STORM_HOME=/opt/software/apache-storm-0.9.5
    export PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$STORM_HOME/bin:$PATH
    • 4. In conf, set dataDir and clientPort, and pay attention to the server entries at the bottom. 
      Remember to create the dataDir directory manually, or startup will fail.
    • [root@localhost conf]# cat zoo.cfg 
      # The number of milliseconds of each tick
      tickTime=2000
      # The number of ticks that the initial 
      # synchronization phase can take
      initLimit=10
      # The number of ticks that can pass between 
      # sending a request and getting an acknowledgement
      syncLimit=5
      # the directory where the snapshot is stored.
      # do not use /tmp for storage, /tmp here is just 
      # example sakes.
      dataDir=/var/zookeeper/data
      # the port at which the clients will connect
      clientPort=2181
      # the maximum number of client connections.
      # increase this if you need to handle more clients
      #maxClientCnxns=60
      #
      # Be sure to read the maintenance section of the 
      # administrator guide before turning on autopurge.
      #
      # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
      #
      # The number of snapshots to retain in dataDir
      #autopurge.snapRetainCount=3
      # Purge task interval in hours
      # Set to "0" to disable auto purge feature
      #autopurge.purgeInterval=1
      
      server.1=192.168.3.160:2888:3888
      server.2=192.168.3.161:2888:3888
      server.3=192.168.3.162:2888:3888
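For the three server.N entries above to work, each node also needs a myid file inside dataDir whose number matches its own server.N line; a sketch for server.1, using the dataDir configured above:

```shell
# On 192.168.3.160 (server.1): create the dataDir and write this node's id.
mkdir -p /var/zookeeper/data
echo 1 > /var/zookeeper/data/myid
```

On 192.168.3.161 write 2, and on 192.168.3.162 write 3; without myid, a clustered zkServer.sh start fails.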
      • 5. Common commands 
        zookeeper-3.4.6/bin/zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd} 
        Run start first, then check the state with status.
      [root@localhost zookeeper-3.4.6]# bin/zkServer.sh start
      JMX enabled by default
      Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
      Starting zookeeper ... STARTED
      
      [root@localhost software]# zookeeper-3.4.6/bin/zkServer.sh status
      JMX enabled by default
      Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
      Mode: standalone
      
      [root@localhost zookeeper-3.4.6]# bin/zkServer.sh stop
      JMX enabled by default
      Using config: /opt/software/zookeeper-3.4.6/bin/../conf/zoo.cfg
      Stopping zookeeper ... STOPPED

    • Install ZeroMQ and jzmq:

      jzmq builds against zeromq, so install zeromq first and jzmq after it. 
      1) Install zeromq: 
      wget http://download.zeromq.org/zeromq-2.2.0.tar.gz 
      tar zxf zeromq-2.2.0.tar.gz 
      cd zeromq-2.2.0 
      ./configure 
      make 
      make install 
      sudo ldconfig (refreshes the shared-library cache) 
      That completes the zeromq install. 
      Note: if you hit dependency errors, install the jzmq dependency packages: 
      sudo yum install uuid* 
      sudo yum install libtool 
      sudo yum install libuuid 
      sudo yum install libuuid-devel 
      2) Install jzmq 
      yum install git 
      git clone git://github.com/nathanmarz/jzmq.git 
      cd jzmq 
      ./autogen.sh 
      ./configure 
      make 
      make install 
      That completes the jzmq install. 
      Note: if the ./autogen.sh step fails with "autogen.sh: error: could not find libtool. libtool is required to run autogen.sh", libtool is missing; yum install libtool* fixes it.

      2. Setting up Kafka

      The key settings in config/server.properties:

      ############################# Server Basics #############################
      
      # The id of the broker. This must be set to a unique integer for each broker.
      broker.id=0
      
      ############################# Socket Server Settings #############################
      
      # The port the socket server listens on
      port=9092
      
      # Hostname the broker will bind to. If not set, the server will bind to all interfaces
      host.name=192.168.3.160
      
      ...
      ############################# Zookeeper #############################
      
      # Zookeeper connection string (see zookeeper docs for details).
      # This is a comma separated host:port pairs, each corresponding to a zk
      # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
      # You can also append an optional chroot string to the urls to specify the
      # root directory for all kafka znodes.
      zookeeper.connect=192.168.3.163:2181
      
      # Timeout in ms for connecting to zookeeper
      zookeeper.connection.timeout.ms=6000

      • 4. Start the Kafka server: nohup bin/kafka-server-start.sh config/server.properties &
        With nohup (and the trailing &) the command ignores hangup signals, so the server keeps running after the remote session is closed.
      The brokers of a Kafka cluster connect to the same zookeeper; producers send messages to a broker, and consumers obtain their subscriptions through zookeeper.
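Since each broker in the cluster registers in the same zookeeper, the other machines differ only in broker.id and host.name (IPs assumed from the zookeeper section above). For example, on 192.168.3.161 the fragment would read:

```properties
# Hypothetical config/server.properties for the second broker (192.168.3.161).
# broker.id must be a unique integer per broker; the rest can match broker 0.
broker.id=1
port=9092
host.name=192.168.3.161
zookeeper.connect=192.168.3.163:2181
```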
  • 3. Setting up Storm

    Edit conf/storm.yaml:

    # Licensed to the Apache Software Foundation (ASF) under one
    # or more contributor license agreements.  See the NOTICE file
    # distributed with this work for additional information
    # regarding copyright ownership.  The ASF licenses this file
    # to you under the Apache License, Version 2.0 (the
    # "License"); you may not use this file except in compliance
    # with the License.  You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    ########### These MUST be filled in for a storm configuration
    # storm.zookeeper.servers:
    #     - "server1"
    #     - "server2"
    storm.zookeeper.servers:
      - "192.168.3.161"
    
    storm.zookeeper.port: 2181
    
    #
    # nimbus.host: "nimbus"
    #
    nimbus.host: "192.168.3.160"
    nimbus.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true
    
    ui.childopts: -Xmx768m -Djava.net.preferIPv4Stack=true
    
    ui.host: 0.0.0.0
    ui.port: 8080
    
    supervisor.childopts: -Djava.net.preferIPv4Stack=true
    worker.childopts: -Xmx768m -Dfile.encoding=utf-8 -Djava.net.preferIPv4Stack=true
    
    supervisor.slots.ports:
      - 6700
      - 6701
      - 6702
      - 6703
    
    storm.local.dir: /data/cluster/storm
    
    storm.log.dir: /data/cluster/storm/logs
    
    logviewer.port: 8000
    #
    # ##### These may optionally be filled in:
    #
    ## List of custom serializations
    # topology.kryo.register:
    #     - org.mycompany.MyType
    #     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
    #
    ## List of custom kryo decorators
    # topology.kryo.decorators:
    #     - org.mycompany.MyDecorator
    #
    ## Locations of the drpc servers
    # drpc.servers:
    #     - "server1"
    #     - "server2"
    drpc.servers:
      - "192.168.3.160"
    
    ## Metrics Consumers
    # topology.metrics.consumer.register:
    #   - class: "backtype.storm.metric.LoggingMetricsConsumer"
    #     parallelism.hint: 1
    #   - class: "org.mycompany.MyMetricsConsumer"
    #     parallelism.hint: 1
    #     argument:
    #       - endpoint: "metrics-collector.mycompany.org"

  • This config file is picky about whitespace: every entry must start with the proper leading spaces, and every colon must be followed by a space, or Storm will not recognize the file. 
    A few notes: storm.local.dir is the local directory Storm uses for its state. nimbus.host says which machine is the master, i.e. nimbus. storm.zookeeper.servers lists the zookeeper machines, and storm.zookeeper.port must match the port configured for zookeeper, otherwise you get communication errors -- double-check this. supervisor.slots.ports sets the number of slots on a supervisor node, that is, the maximum number of worker processes it can run (each spout or bolt starts one worker by default, but this can be raised through the topology conf). 
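Because a missing space after a colon is the easiest way to break this file, a quick grep check (a hypothetical helper, not part of Storm) can flag offending top-level keys before you start the daemons:

```shell
# Write a deliberately broken sample, then flag keys with no space after the colon.
cat > /tmp/storm-yaml-sample <<'EOF'
storm.zookeeper.port:2181
nimbus.host: "192.168.3.160"
EOF
grep -nE '^[A-Za-z][^ ]*:[^ ]' /tmp/storm-yaml-sample
# → 1:storm.zookeeper.port:2181
```

Run the same grep against conf/storm.yaml; no output means every top-level key has a space after its colon.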

    supervisor.slots.ports: for each supervisor node, this configures how many workers the node may run. Every worker takes one dedicated port for receiving messages, and this option defines which ports are available to workers. By default each node can run 4 workers, on ports 6700, 6701, 6702 and 6703.

    • 4. Common commands
    # bin/storm nimbus (start the master node)
    # bin/storm supervisor (start a worker node)
    # bin/storm ui (start the UI; browse to ip:8080 to see the cluster state)
    #
    # bin/storm jar storm-demo-1.0.jar io.sterm.demo.topology.WordCountTopology (submits the jar to storm for execution; a trailing argument would set the topology name)
    # bin/storm kill word-count

    • Start nimbus in the background: nohup bin/storm nimbus &
    • Start the supervisor in the background: nohup bin/storm supervisor &
    • Start the UI in the background: nohup bin/storm ui &
    • You can then reach the web UI at http://ip:8080
    Step 5: test WordCount in local mode. 
    Download storm-starter, build it, and import it as an Eclipse project: 
    (http://blog.csdn.net/guoqiangma/article/details/7212677
    1. Fetch the storm-starter code: git clone https://github.com/nathanmarz/storm-starter.git 
    2. Build it with mvn -f m2-pom.xml package 
    3. Copy m2-pom.xml in the storm-starter directory to pom.xml, because Eclipse needs a pom.xml 
    4. Generate an Eclipse project with mvn eclipse:eclipse 
    5. In Eclipse, import the storm-starter path. After importing, check that the project builds; you may need to configure the M2_REPO variable. 
    To set M2_REPO: right-click the project -> Properties -> Java Build Path -> Libraries -> Add Variable -> Configure Variables -> New, 
    enter Name: M2_REPO, Path: your local repository path -> OK, then refresh the project; once there are no errors you can start developing. 
    6. When it compiles cleanly, run WordCountTopology under the storm.starter package locally; if it runs through, local mode works. 
    Use Eclipse's export function to produce the project jar, so the same logic can later be submitted to a distributed cluster. 
    Fix for a storm-starter build failure caused by missing twitter4j packages: 
    (http://www.cnblogs.com/zeutrap/archive/2012/10/11/2720528.html
    Edit storm-starter's pom file m2-pom.xml and change the dependency versions of the twitter4j-core and twitter4j-stream packages as follows:
    <dependency>
      <groupId>org.twitter4j</groupId>
      <artifactId>twitter4j-core</artifactId>
      <version>[2.2,)</version>
    </dependency>
    <dependency>
      <groupId>org.twitter4j</groupId>
      <artifactId>twitter4j-stream</artifactId>
      <version>[2.2,)</version>
    </dependency>


