Setting Up a Storm Cluster on CentOS
The ZooKeeper ensemble runs on hadoop0, hadoop1, and hadoop2. hadoop0 serves as the Nimbus master node; hadoop1 and hadoop2 are Supervisor worker nodes.
1. Download the Storm release from the official site:
apache-storm-0.9.3.tar.gz
saved under /usr/softinstall/.
2. Extract the archive and rename the directory to storm:
tar -zxvf apache-storm-0.9.3.tar.gz
mv apache-storm-0.9.3 storm
3. Set the environment variables in /etc/profile.
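For example, the following lines could be appended to /etc/profile (the path assumes Storm was unpacked to /usr/softinstall/storm as in the previous step; run `source /etc/profile` afterwards to apply the change):

```shell
# Storm environment variables (install path assumed from the steps above)
export STORM_HOME=/usr/softinstall/storm
export PATH=$PATH:$STORM_HOME/bin
```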
4. Edit the configuration file /usr/softinstall/storm/conf/storm.yaml as follows:
storm.zookeeper.servers:
- "hadoop0"
- "hadoop1"
- "hadoop2"
nimbus.host: "hadoop0"
storm.local.dir: "/usr/softinstall/storm/workdir"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
The stock storm.yaml also lists optional settings, left commented out by default:
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"
## Metrics Consumers
# topology.metrics.consumer.register:
# - class: "backtype.storm.metric.LoggingMetricsConsumer"
# parallelism.hint: 1
# - class: "org.mycompany.MyMetricsConsumer"
# parallelism.hint: 1
# argument:
# - endpoint: "metrics-collector.mycompany.org"
5. Copy the configured directory to the other nodes:
scp -r storm hadoop1:/usr/softinstall/
scp -r storm hadoop2:/usr/softinstall/
6. Start the ZooKeeper cluster. On hadoop0, hadoop1, and hadoop2, run the following from the ZooKeeper directory:
bin/zkServer.sh start
7. Start the Storm daemons.
On the master node, start Nimbus:
bin/storm nimbus &
On each Storm worker node, start the Supervisor:
bin/storm supervisor &
On the Storm master node, start the UI:
bin/storm ui &
Once started, visit http://{nimbus host}:8080 to see worker resource usage, the status of running topologies, and other cluster information.
8. To view log messages, start the logviewer daemon in the background:
nohup bin/storm logviewer >/dev/null 2>&1 &
Once started, visit http://{host}:8000 to browse the logs. (The Nimbus node does not need a logviewer process; logviewer exists mainly for inspecting task execution logs, and those logs live on the Supervisor nodes.)
Setting up the Storm Java client environment
A word-count example.
1. Create a Maven project and add the Storm dependency:
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-core</artifactId>
<version>0.9.3</version>
</dependency>
2. Create SimpleDataSourceSpout, which reads data and emits it.
package com.east.storm.example;
import java.util.Map;
import java.util.Random;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
public class SimpleDataSourceSpout extends BaseRichSpout {
private SpoutOutputCollector collector;
private static String[] info = new String[]{
"hello world are you ok",
"let go to you home",
"we are frend yes or no",
"big china",
"I LOVE YOU",
"DO YOU LOVE ME",
"I LOVE china",
"YOU LOVE china",
"WE LOVE china",
};
Random random=new Random();
@Override
public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
this.collector = collector;
}
@Override
public void nextTuple() {
try {
String msg = info[random.nextInt(info.length)];
// emit the tuple downstream
collector.emit(new Values(msg));
// simulate work by sleeping 2000 ms
Thread.sleep(2000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("line"));
}
}
3. Create the SimpleSplitBolt class to split each line into words.
package com.east.storm.example;
import java.util.Map;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
public class SimpleSplitBolt extends BaseRichBolt {
private OutputCollector collector;
@Override
public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
this.collector = collector;
}
@Override
public void execute(Tuple input) {
String value = input.getStringByField("line");
// the spout's sentences are space-separated, so split on whitespace
String[] splits = value.split("\\s+");
for (String word : splits) {
this.collector.emit(new Values(word));
}
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("word"));
}
}
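Since the sentences the spout emits are separated by spaces, the split delimiter in the bolt matters: splitting must use whitespace, e.g. `split("\\s+")`. A quick standalone check of that tokenization (plain Java, no Storm dependencies; the class name SplitCheck is hypothetical):

```java
public class SplitCheck {
    public static void main(String[] args) {
        // a sample sentence like the ones SimpleDataSourceSpout emits
        String line = "hello world are you ok";
        // split on runs of whitespace, the delimiter the sentences actually use
        String[] words = line.split("\\s+");
        System.out.println(words.length); // prints 5
    }
}
```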
4. Create the SimpleSumBolt class to count word occurrences.
package com.east.storm.example;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;
public class SimpleSumBolt extends BaseRichBolt {
private OutputCollector collector;
@Override
public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
this.collector = collector;
}
HashMap<String, Integer> hashMap = new HashMap<String, Integer>();
@Override
public void execute(Tuple input) {
String word = input.getStringByField("word");
Integer value = hashMap.get(word);
if (value == null) {
value = 0;
}
value++;
hashMap.put(word, value);
System.out.println("========================================");
for (Entry<String, Integer> entry : hashMap.entrySet()) {
System.out.println(entry.getKey() + "===" + entry.getValue());
}
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// terminal bolt: it emits nothing downstream, so there are no fields to declare
}
}
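The tally SimpleSumBolt keeps can be exercised outside Storm as well; here is a minimal standalone sketch of the same HashMap counting logic (the class name WordCounter is hypothetical and not part of the project above):

```java
import java.util.HashMap;
import java.util.Map;

public class WordCounter {
    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    // mirror of SimpleSumBolt.execute(): bump the count for one word
    public void add(String word) {
        Integer value = counts.get(word);
        counts.put(word, value == null ? 1 : value + 1);
    }

    public int count(String word) {
        Integer value = counts.get(word);
        return value == null ? 0 : value;
    }

    public static void main(String[] args) {
        WordCounter wc = new WordCounter();
        for (String w : "I LOVE china YOU LOVE china".split("\\s+")) {
            wc.add(w);
        }
        System.out.println(wc.count("china")); // prints 2
        System.out.println(wc.count("LOVE"));  // prints 2
    }
}
```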
5. Create the Topology entry point, wiring the spout and bolts together.
package com.east.storm.example;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.AlreadyAliveException;
import backtype.storm.generated.InvalidTopologyException;
import backtype.storm.topology.TopologyBuilder;
public class SimpleWordCountTopology {
public static void main(String[] args) throws AlreadyAliveException, InvalidTopologyException {
TopologyBuilder topologyBuilder = new TopologyBuilder();
SimpleDataSourceSpout dataSourceSpout = new SimpleDataSourceSpout();
SimpleSplitBolt splitBolt = new SimpleSplitBolt();
SimpleSumBolt sumBolt = new SimpleSumBolt();
topologyBuilder.setSpout("a1", dataSourceSpout);
topologyBuilder.setBolt("b1", splitBolt).shuffleGrouping("a1");
topologyBuilder.setBolt("b2", sumBolt).shuffleGrouping("b1");
// note: with parallelism > 1, fieldsGrouping("b1", new Fields("word")) would be needed
// so that the same word always reaches the same counting task
// StormTopology topologybean = topologyBuilder.createTopology();
Config config = new Config();
config.setDebug(true);
if (args != null && args.length > 0) {
config.setNumWorkers(1);
StormSubmitter.submitTopology(args[0], config, topologyBuilder.createTopology());
} else {
// local-mode startup for testing
config.setMaxTaskParallelism(1);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("topology_name", config, topologyBuilder.createTopology());
// let the demo run for a minute, then shut the local cluster down cleanly
try { Thread.sleep(60000); } catch (InterruptedException e) { }
cluster.killTopology("topology_name");
cluster.shutdown();
}
}
}
Deploying to the Storm cluster
1. Package the Java project as a jar file and upload it to the Nimbus machine.
2. Launch the topology with the storm jar command:
storm jar <path-to-jar> <fully-qualified-main-class> [arg1 arg2 arg3]
For example, run this from the directory containing the jar file:
storm jar allmycode.jar org.me.MyTopology arg1 arg2 arg3
3. References:
Storm lessons learned: http://www.cnblogs.com/lwb314/articles/4757200.html
Storm in practice, WordCount: http://blog.itpub.net/28912557/viewspace-1450885/
Storm cluster resources: http://www.cnblogs.com/wjoyxt/p/4333194.html
Real-time big data analytics with Storm: http://www.cnblogs.com/405845829qq/p/4480510.html
Storm cluster installation and deployment steps: http://segmentfault.com/a/1190000000583408