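In Storm's collectors (for example SpoutOutputCollector), every collector.emit() overload ultimately calls:
    public List<Integer> emit(String streamId, List<Object> tuple, Object messageId) {
        return this._delegate.emit(streamId, tuple, messageId);
    }
So emit takes three parameters: streamId, tuple, and messageId. tuple and messageId are easy to understand: tuple is the List of values to send, and messageId is the ID that identifies the emitted tuple for reliability tracking (it is the value Storm later passes back to the spout's ack() or fail()). The code below explores what streamId is.
The following is a simple WordCount program. It contains only RandomSentenceSpout, which emits sentences, and SplitSentenceBolt, which processes them; here the processing simply prints each tuple it receives.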
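// Imports assume Storm 1.x or later (org.apache.storm.*); releases before 1.0 use backtype.storm.* instead.
import java.util.Map;
import java.util.Random;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WorkCountTopologySimple {
    // Spout that emits one random sentence per second without naming a stream.
    public static class RandomSentenceSpout extends BaseRichSpout {
        SpoutOutputCollector collector;
        Random rand;
        String[] sentences = null;

        public void open(Map map, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
            this.collector = spoutOutputCollector;
            rand = new Random();
            sentences = new String[]{
                "the cow jumped over the moon",
                "an apple a day keeps the doctor away",
                "four score and seven years ago",
                "snow white and the seven dwarfs",
                "i am at two with nature"
            };
        }

        public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
            outputFieldsDeclarer.declare(new Fields("sentence"));
        }

        public void nextTuple() {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            String sentence = sentences[rand.nextInt(sentences.length)];
            // No stream ID and no message ID are passed here.
            this.collector.emit(new Values(sentence));
        }
    }

    // Bolt that only prints each sentence it receives.
    public static class SplitSentenceBolt extends BaseRichBolt {
        private OutputCollector collector;

        public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
            collector = outputCollector;
        }

        public void execute(Tuple tuple) {
            String sentence = tuple.getStringByField("sentence");
            System.out.println(sentence);
        }

        public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
            outputFieldsDeclarer.declare(new Fields("word"));
        }
    }

    // throws Exception so the listing also compiles where LocalCluster declares checked exceptions.
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new RandomSentenceSpout(), 1);
        builder.setBolt("split", new SplitSentenceBolt(), 2).shuffleGrouping("spout");
        Config conf = new Config();
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("word-count", conf, builder.createTopology());
        Thread.sleep(60000);
        cluster.shutdown();
    }
}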
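Looking at the emit methods of SpoutOutputCollector, the call made by the spout above ends up in:
    public List<Integer> emit(List<Object> tuple, Object messageId) {
        return this.emit("default", tuple, messageId);
    }
Here "default" is the stream ID (in the Storm source this literal is the constant Utils.DEFAULT_STREAM_ID). From this we can infer that, unless a stream is named explicitly, the stream ID on which a spout emits to downstream bolts, and on which a bolt emits to downstream bolts or receives from upstream spouts and bolts, is default.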
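The delegation between the emit overloads can be sketched in plain Java without a Storm dependency. This is only an illustration of the pattern, not Storm's implementation: EmitOverloadSketch and RecordingCollector are hypothetical names, and the collector merely records what it was asked to send.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mimics how the shorter emit overloads fill in defaults.
class EmitOverloadSketch {
    static final String DEFAULT_STREAM_ID = "default";

    static class RecordingCollector {
        final List<String> log = new ArrayList<>();

        // The three-argument form that every other overload ends up calling.
        public List<Integer> emit(String streamId, List<Object> tuple, Object messageId) {
            log.add(streamId + " " + tuple + " " + messageId);
            return new ArrayList<>(); // a real collector returns the task ids the tuple was sent to
        }

        // No stream ID given: fall back to the default stream.
        public List<Integer> emit(List<Object> tuple, Object messageId) {
            return emit(DEFAULT_STREAM_ID, tuple, messageId);
        }

        // No message ID given: pass null.
        public List<Integer> emit(List<Object> tuple) {
            return emit(tuple, null);
        }

        // Stream ID given, message ID omitted.
        public List<Integer> emit(String streamId, List<Object> tuple) {
            return emit(streamId, tuple, null);
        }
    }

    public static void main(String[] args) {
        RecordingCollector collector = new RecordingCollector();
        List<Object> tuple = Arrays.<Object>asList("the cow");
        collector.emit(tuple);             // recorded as: default [the cow] null
        collector.emit("A-split", tuple);  // recorded as: A-split [the cow] null
        collector.emit(tuple, "msg-1");    // recorded as: default [the cow] msg-1
        collector.log.forEach(System.out::println);
    }
}
```

Whichever overload is called, the stream ID is always present by the time the three-argument form runs; it is just "default" when the caller did not name one.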
The code below specifies a stream ID explicitly:
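// Imports assume Storm 1.x or later (org.apache.storm.*); releases before 1.0 use backtype.storm.* instead.
import java.util.Map;
import java.util.Random;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WorkCountTopologySimple {
    // Spout that routes sentences starting with "A" to the "A-split" stream and the rest to the default stream.
    public static class RandomSentenceSpout extends BaseRichSpout {
        SpoutOutputCollector collector;
        Random rand;
        String[] sentences = null;

        public void open(Map map, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
            this.collector = spoutOutputCollector;
            rand = new Random();
            sentences = new String[]{
                "A the cow jumped over the moon",
                "B an apple a day keeps the doctor away",
                "four score and seven years ago",
                "snow white and the seven dwarfs",
                "i am at two with nature"
            };
        }

        public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
            // Declare the default stream and the named stream; both carry a single "sentence" field.
            outputFieldsDeclarer.declare(new Fields("sentence"));
            outputFieldsDeclarer.declareStream("A-split", new Fields("sentence"));
        }

        public void nextTuple() {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            String sentence = sentences[rand.nextInt(sentences.length)];
            if (sentence.startsWith("A")) {
                this.collector.emit("A-split", new Values(sentence));
            } else {
                this.collector.emit(new Values(sentence));
            }
        }
    }

    // Bolt that only prints each sentence it receives.
    public static class SplitSentenceBolt extends BaseRichBolt {
        private OutputCollector collector;

        public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
            collector = outputCollector;
        }

        public void execute(Tuple tuple) {
            String sentence = tuple.getStringByField("sentence");
            System.out.println(sentence);
        }

        public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
            outputFieldsDeclarer.declare(new Fields("word"));
        }
    }

    // throws Exception so the listing also compiles where LocalCluster declares checked exceptions.
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new RandomSentenceSpout(), 1);
        // Subscribe only to the "A-split" stream of "spout"; its default stream is not consumed.
        builder.setBolt("split", new SplitSentenceBolt(), 2).shuffleGrouping("spout", "A-split");
        Config conf = new Config();
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("word-count", conf, builder.createTopology());
        Thread.sleep(60000);
        cluster.shutdown();
    }
}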
The output shows that only sentences starting with A are delivered to SplitSentenceBolt.
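Why the other sentences never arrive can be illustrated with a toy lookup table keyed by component and stream. This is a plain-Java sketch under the assumption that delivery is decided purely by the (component, stream) pair a bolt subscribed to; StreamSubscriptionSketch and its methods are hypothetical names, not Storm classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: which bolts receive a tuple emitted on a given stream of a given component.
class StreamSubscriptionSketch {
    // key: "<componentId>/<streamId>", value: ids of the bolts subscribed to that stream
    private final Map<String, List<String>> subscribers = new HashMap<>();

    public void subscribe(String boltId, String componentId, String streamId) {
        subscribers.computeIfAbsent(componentId + "/" + streamId, k -> new ArrayList<>()).add(boltId);
    }

    // Returns the bolts that would receive a tuple emitted by componentId on streamId.
    public List<String> receivers(String componentId, String streamId) {
        return subscribers.getOrDefault(componentId + "/" + streamId, new ArrayList<>());
    }

    public static void main(String[] args) {
        StreamSubscriptionSketch topology = new StreamSubscriptionSketch();
        // Mirrors shuffleGrouping("spout", "A-split") from the listing above.
        topology.subscribe("split", "spout", "A-split");

        System.out.println(topology.receivers("spout", "A-split")); // [split]
        System.out.println(topology.receivers("spout", "default")); // []
    }
}
```

In this model, tuples emitted on the default stream have no subscriber, which matches the observation that the non-A sentences are never printed.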
Conclusion
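When you declare a bolt's input streams, you always subscribe to specific streams of another component. If you want to subscribe to all of another component's streams, you must subscribe to each one individually. InputDeclarer provides syntactic sugar for subscribing to streams declared on the default stream ID: calling declarer.shuffleGrouping("1") subscribes to the default stream on component "1" and is equivalent to declarer.shuffleGrouping("1", DEFAULT_STREAM_ID).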
When no stream ID is passed, emit sends on the stream ID default.
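The subscription rules in the conclusion can be sketched with a small fluent declarer in plain Java. It is a hypothetical stand-in (InputDeclarerSketch is not a Storm class) that only records which (component, stream) pairs were subscribed, assuming the one-argument form does nothing more than fill in the default stream ID.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical fluent declarer that records subscriptions as "<componentId>/<streamId>".
class InputDeclarerSketch {
    static final String DEFAULT_STREAM_ID = "default";
    private final Set<String> inputs = new LinkedHashSet<>();

    // Shorthand: subscribe to the default stream of the component.
    public InputDeclarerSketch shuffleGrouping(String componentId) {
        return shuffleGrouping(componentId, DEFAULT_STREAM_ID);
    }

    // Explicit form: subscribe to one named stream of the component.
    public InputDeclarerSketch shuffleGrouping(String componentId, String streamId) {
        inputs.add(componentId + "/" + streamId);
        return this;
    }

    public Set<String> inputs() {
        return inputs;
    }

    public static void main(String[] args) {
        Set<String> shorthand = new InputDeclarerSketch().shuffleGrouping("1").inputs();
        Set<String> explicit = new InputDeclarerSketch().shuffleGrouping("1", DEFAULT_STREAM_ID).inputs();
        System.out.println(shorthand.equals(explicit)); // true

        // Receiving every stream of "spout" takes one subscription per stream.
        Set<String> all = new InputDeclarerSketch()
                .shuffleGrouping("spout")
                .shuffleGrouping("spout", "A-split")
                .inputs();
        System.out.println(all); // [spout/default, spout/A-split]
    }
}
```

Applied to the topology in this article, the same idea would presumably mean chaining .shuffleGrouping("spout") and .shuffleGrouping("spout", "A-split") on the declarer returned by builder.setBolt, so that SplitSentenceBolt receives the sentences from both streams.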