Storm wordcount

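Introduction:

One spout supplies the source data.

Two bolts process it: the first splits each incoming sentence into words, and the second counts how often each word occurs.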
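Before wiring anything into Storm, it can help to see the same data flow in plain Java. The following is only a minimal sketch with no Storm classes involved; the class name WordCountFlowSketch and its methods are made up for illustration, with split() standing in for the first bolt and count() for the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of the spout -> split -> count data flow (no Storm involved).
class WordCountFlowSketch {

    // Plays the role of the splitting bolt: one sentence in, many words out.
    static List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String word : sentence.split(" ")) {
            words.add(word);
        }
        return words;
    }

    // Plays the role of the counting bolt: keeps a running total per word.
    static Map<String, Integer> count(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sentence : sentences) {
            for (String word : split(sentence)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two sentences standing in for what the spout would emit.
        List<String> sentences = Arrays.asList(
                "I have a dream",
                "my dream is to be a data analyst");
        System.out.println(count(sentences));
    }
}
```

In the Storm version, each of these steps becomes its own component so that the steps can run in parallel on different workers.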


Create a Java project and add the JARs from Storm's lib directory to the build path, or manage the dependencies with Maven.

Spout code:

package com.storm.stu01;

import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class WordSpout extends BaseRichSpout {

	private SpoutOutputCollector collector;
	private static final String[] msgs = new String[] { "I have a dream",
			"my dream is to be a data analyst",
			"you know do what you are dreaming", "don't give up your dreams",
			"it's just so so",
			"We need change the traditional ideas and practice boldly",
			"Storm enterprise real time calculation of actual combat",
			"you can be what uou want be" };
	private static final Random random = new Random();

	@Override
	public void open(Map conf, TopologyContext context,
			SpoutOutputCollector collector) {
		// Keep the collector so that nextTuple() can emit tuples.
		this.collector = collector;
	}

	@Override
	public void nextTuple() {
		// Emit one randomly chosen sentence per second.
		Utils.sleep(1000);
		String sentence = msgs[random.nextInt(msgs.length)];
		collector.emit(new Values(sentence));
	}

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		// Each emitted tuple has a single field named "sentence".
		declarer.declare(new Fields("sentence"));
	}

}

Sentence-splitting bolt code:

package com.storm.stu01;

import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.IBasicBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class SplitSentenceBolt implements IBasicBolt {

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		// Each emitted tuple has a single field named "word".
		declarer.declare(new Fields("word"));
	}

	@Override
	public Map<String, Object> getComponentConfiguration() {
		// No component-specific configuration.
		return null;
	}

	@Override
	public void prepare(Map stormConf, TopologyContext context) {
		// Nothing to initialize.
	}

	@Override
	public void execute(Tuple input, BasicOutputCollector collector) {
		// Split the sentence on single spaces and emit one tuple per word.
		String sentence = input.getString(0);
		for (String word : sentence.split(" ")) {
			collector.emit(new Values(word));
		}
	}

	@Override
	public void cleanup() {
		// Nothing to release.
	}

}
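
One detail of split(" ") is worth knowing: it produces empty strings when a sentence contains consecutive or leading spaces. The sample sentences here are cleanly spaced, so the bolt above is unaffected. For messier input, a possible alternative is to trim and split on runs of whitespace, as in this small sketch (TokenizeSketch and tokenize are hypothetical names, not part of the original project):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a more forgiving tokenizer than sentence.split(" ").
class TokenizeSketch {

    // Splits on runs of whitespace and never returns empty words.
    static List<String> tokenize(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String messy = "  you   know  ";
        // split(" ") keeps the empty strings between consecutive spaces.
        System.out.println(Arrays.toString(messy.split(" ")));
        // tokenize() returns only the actual words.
        System.out.println(tokenize(messy));
    }
}
```

Inside the bolt, the loop would iterate over the tokenized words instead of the raw split result; everything else stays the same.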


Word-count bolt code:

package com.storm.stu01;

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.IBasicBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Sketch of the counting bolt, inferred from the topology (fieldsGrouping on
// "word") and from the console output shown under the results; the output
// field names "word" and "count" are assumptions.
public class WordCountBolt implements IBasicBolt {

	// Running total per word, created in prepare().
	private Map<String, Integer> counts;

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		// Each emitted tuple carries the word and its current count.
		declarer.declare(new Fields("word", "count"));
	}

	@Override
	public Map<String, Object> getComponentConfiguration() {
		// No component-specific configuration.
		return null;
	}

	@Override
	public void prepare(Map stormConf, TopologyContext context) {
		counts = new HashMap<String, Integer>();
	}

	@Override
	public void execute(Tuple input, BasicOutputCollector collector) {
		// Increment the count for the incoming word.
		String word = input.getString(0);
		Integer count = counts.get(word);
		count = (count == null) ? 1 : count + 1;
		counts.put(word, count);
		// Print the whole map, then emit the updated (word, count) pair.
		System.out.println("====================count:" + counts);
		collector.emit(new Values(word, count));
	}

	@Override
	public void cleanup() {
		// Nothing to release.
	}

}


Code to build and submit the topology:

package com.storm.stu01;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class Tese {
	public static void main(String[] args) throws AlreadyAliveException,
			InvalidTopologyException, AuthorizationException {
		// Wire the topology: spout "1" -> split bolt "2" -> count bolt "3".
		TopologyBuilder builder = new TopologyBuilder();
		builder.setSpout("1", new WordSpout(), 1);
		// Sentences may go to any of the 10 split executors.
		builder.setBolt("2", new SplitSentenceBolt(), 10).shuffleGrouping("1");
		// Group by "word" so the same word always reaches the same task.
		builder.setBolt("3", new WordCountBolt(), 1).fieldsGrouping("2",
				new Fields("word"));

		Config config = new Config();
		config.setDebug(true);
		config.setNumWorkers(2);

//		// Local mode: uncomment these lines and comment out the
//		// StormSubmitter call below to run in-process.
//		LocalCluster cluster = new LocalCluster();
//		cluster.submitTopology("mywordcount", config, builder.createTopology());

		// Cluster mode
		StormSubmitter.submitTopology("wordcount", config,
				builder.createTopology());

	}
}
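
The fieldsGrouping call on "word" is what makes the counts correct if the counting bolt's parallelism is ever raised above 1: tuples with the same word are always routed to the same task, so each word's total lives in exactly one place. The idea can be illustrated in plain Java as below. This is a simplified illustration, not Storm's actual routing code, and the names FieldsGroupingSketch, chooseTask, and numTasks are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration of fields grouping; NOT Storm's actual routing code.
class FieldsGroupingSketch {

    // Derives a task index from the word's hash, so equal words map to the same task.
    static int chooseTask(String word, int numTasks) {
        return Math.floorMod(word.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4; // hypothetical parallelism for the counting bolt
        List<String> words = Arrays.asList("dream", "be", "dream", "you", "be");
        for (String word : words) {
            // Repeated words print the same task index every time.
            System.out.println(word + " -> task " + chooseTask(word, numTasks));
        }
    }
}
```

With shuffleGrouping, by contrast, the same word could land on different tasks, and each task would hold only a partial count.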


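How to run:

1. In local mode, run the main class directly. As listed, the local-mode lines in main() are commented out, so uncomment them and comment out the StormSubmitter call first.

2. In distributed mode, package the project as a JAR

and run the following on the Nimbus server:

./storm jar <path to the packaged JAR> <fully qualified main class> <topology name>

main() as listed hardcodes the topology name "wordcount" and ignores its arguments, so the last argument only takes effect if main() is changed to read it from args.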


Results:

In local mode, the console shows output like the following:

====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, up=1, what=2, actual=1}
13140 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [traditional, 1]
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [traditional]
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [traditional] TASK: 12 DELTA: 
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [ideas]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, up=1, what=2, actual=1}
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [ideas, 1]
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [ideas]
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [ideas] TASK: 12 DELTA: 
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [and]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, actual=1}
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [and, 1]
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [and]
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [and] TASK: 12 DELTA: 
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [practice]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, practice=1, actual=1}
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [practice, 1]
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [practice]
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [practice] TASK: 12 DELTA: 
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [boldly]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, boldly=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, practice=1, actual=1}
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [boldly, 1]
13146 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [boldly]
13146 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, strea

