Storm Installation, Deployment, and Usage (1)

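I have been using the Storm real-time computation framework recently, and this post summarizes what I have learned. Everything below is my own understanding; corrections are welcome.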

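What does Storm do? Storm is a real-time stream-processing framework. "Stream" means a pipeline: each record is handed from one processing step to the next as it arrives. "Real-time" means Storm can process a single record within milliseconds (the actual processing time depends on the business logic and on server performance).

Now for the practical part.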

1. Storm installation and deployment (cluster)

a: The first step is to install ZooKeeper. Download the ZooKeeper release archive from the Apache site, upload it to the server, and extract it:

# tar -zxvf zookeeper.x.xx.tar.gz

Then go into ZooKeeper's conf directory and copy zoo_sample.cfg to zoo.cfg:

# cd zookeeper.x.xx/conf/
# cp zoo_sample.cfg  zoo.cfg

Then edit zoo.cfg. (Note: configure one machine and reuse the same configuration file on the others; only the myid file differs per machine.)

# vim zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
#dataDir=/tmp/zookeeper
dataDir=/home/storm/zookeeper/data

# the port at which the clients will connect
clientPort=2181
server.1=xx.xx.xx.01:2888:3888
server.2=xx.xx.xx.02:2888:3888
server.3=xx.xx.xx.03:2888:3888

# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
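
Note: the entries to change or add (highlighted in red in the original post) are dataDir, the server.x lines, and the two autopurge settings.

The dataDir directory (/home/storm/zookeeper/data) must be created manually, and the user that runs ZooKeeper needs read and write permission on it.

autopurge.snapRetainCount=3 and autopurge.purgeInterval=1 make ZooKeeper purge old snapshots and transaction logs automatically: the purge task runs every hour and keeps only the three most recent.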

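In the dataDir directory, create a file named myid that contains the number N of this machine's server.N entry. For example, xx.xx.xx.01 is server.1, so its myid contains 1. Then start ZooKeeper on each machine and list the Java processes:

# nohup ./bin/zkServer.sh start &
# jps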

When jps shows a QuorumPeerMain process, ZooKeeper has started successfully.
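
The dataDir and myid steps above are easy to get wrong when repeated by hand on several machines, so here is a minimal sketch that derives the id from zoo.cfg. The function name write_myid and its two arguments are hypothetical, not part of ZooKeeper, and the sketch assumes the host is passed exactly as it is written in the server.N line.

```shell
#!/bin/sh
# Sketch: create dataDir and write this node's myid, both read from zoo.cfg.
# write_myid is a hypothetical helper name, not a ZooKeeper command.
write_myid() {
  cfg=$1    # path to zoo.cfg
  host=$2   # this node's address, as written in its server.N line

  # The uncommented dataDir= line names the directory that holds myid.
  data_dir=$(sed -n 's/^dataDir=[[:space:]]*//p' "$cfg" | head -n 1)

  # Find N in "server.N=host:2888:3888" (spaces after "=" are tolerated).
  id=$(awk -v h="$host" '
    /^server\.[0-9]+=/ {
      split($0, kv, "=")            # kv[1] = "server.N", kv[2] = address part
      gsub(/[ \t]/, "", kv[2])
      split(kv[2], addr, ":")       # addr[1] = host
      if (addr[1] == h) { sub(/^server\./, "", kv[1]); print kv[1]; exit }
    }' "$cfg")

  # Fail if zoo.cfg has no dataDir or no server entry for this host.
  [ -n "$data_dir" ] && [ -n "$id" ] || return 1

  mkdir -p "$data_dir" && printf '%s\n' "$id" > "$data_dir/myid"
}
```

On the first machine this would be called as write_myid conf/zoo.cfg xx.xx.xx.01 from the ZooKeeper directory. Once all nodes are started, ./bin/zkServer.sh status reports whether each node is running as leader or follower.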

2. Storm installation and configuration

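The configuration is identical on every server, so configure one machine and copy it to the others; I use three machines here.

a: Download the Storm release archive, upload it to the server, and extract it:

# tar -zxvf apache-storm-x.x.x.tar.gz

Go into Storm's configuration directory, back up storm.yaml, and then edit it:
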
# cd apache-storm-x.x.x/conf/
# cp storm.yaml  storm_bak.yaml
# vim storm.yaml
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "xx.xx.xx.01"
    - "xx.xx.xx.02"
    - "xx.xx.xx.03"

nimbus.host: "xx.xx.xx.01"
# nimbus.host: "nimbus"
#
storm.local.dir: "/home/storm_data"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
#  supervisor.childopts: "-Xmx1024m"
worker.childopts: "-Xmx2048m"
#  topology.state.synchronization.timeout.secs: 60
topology.message.timeout.secs: 150
#  topology.enable.message.timeouts: true
topology.max.spout.pending: 8000
#  topology.ackers: 0
#


# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
#     - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
#     - "server1"
#     - "server2"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "backtype.storm.metric.LoggingMetricsConsumer"
#     parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
#     parallelism.hint: 1
#     argument:
#       - endpoint: "metrics-collector.mycompany.org"

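Notes:

storm.zookeeper.servers lists the addresses of the ZooKeeper servers.

The storm.local.dir directory ("/home/storm_data") must be created manually.

supervisor.slots.ports (6700 through 6703 here) lists the ports that the worker processes on a machine listen on. Each port is one worker slot, so with these four ports a machine runs at most four worker processes.

worker.childopts: "-Xmx2048m" sets the JVM options for each worker process, here a maximum heap of 2048 MB.

topology.message.timeout.secs: 150 means that a message not acked within 150 seconds is treated as failed, so that the spout can resend it.

topology.max.spout.pending: 8000 throttles the spout: it is the maximum number of messages per spout instance that may be pending, that is, emitted but not yet acked or failed.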
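
Since each entry under supervisor.slots.ports is one worker slot, a quick sanity check is to count the entries in a storm.yaml file. The sketch below does that with awk; count_slots is a hypothetical helper name, and it assumes the port list is written one "- port" entry per line, as in the file above.

```shell
#!/bin/sh
# Sketch: count the worker slots declared in a storm.yaml file.
# count_slots is a hypothetical helper name, not a Storm command.
count_slots() {
  awk '
    # The key line opens the port list.
    /^supervisor\.slots\.ports:/ { in_list = 1; next }
    # Each "- 6700" style entry is one slot.
    in_list && /^[ \t]*-[ \t]*[0-9]+/ { n++; next }
    # The next uncommented top-level key closes the list.
    in_list && /^[^ \t#]/ { in_list = 0 }
    END { print n + 0 }
  ' "$1"
}
```

Called as count_slots conf/storm.yaml, it prints the number of slots on that machine. Multiplying by the number of supervisor machines gives the total number of workers the cluster can run at once.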

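Start Storm with the following commands. Run nimbus and ui on the Nimbus machine (xx.xx.xx.01 in the configuration above) and supervisor on every machine that should run workers:

# nohup storm nimbus > myout_nimbus.file 2>&1 &
# nohup storm supervisor > myout_sup.file 2>&1 &
# nohup storm ui > myout_ui.file 2>&1 &

When jps lists the Nimbus, Supervisor, and UI processes, the installation has succeeded.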
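
Checking the jps output by eye on every machine gets tedious, so here is a minimal sketch that reports which expected processes are missing. The helper name missing_daemons is hypothetical. The names jps prints for the Storm daemons differ between Storm versions, so the names in the usage line below are assumptions to adjust after looking at the real jps output.

```shell
#!/bin/sh
# Sketch: print the expected process names that are absent from jps output.
# missing_daemons is a hypothetical helper; it reads "PID NAME" lines on stdin
# and takes the expected names as arguments.
missing_daemons() {
  listing=$(cat)
  for name in "$@"; do
    # The second column of jps output is the main class name.
    if ! printf '%s\n' "$listing" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}
```

A possible use on the Nimbus machine is jps | missing_daemons QuorumPeerMain nimbus supervisor core, where nimbus, supervisor, and core are assumed process names. No output means every listed process was found.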