IRichBolt and IBasicBolt in Storm
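In Storm, to guarantee that messages are processed reliably, every tuple a bolt handles must be either acked or failed. Storm tracks each tuple in memory, so if you do not ack or fail every tuple, you will eventually see an OutOfMemory error.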
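To make the memory cost concrete, here is a toy model of that bookkeeping. It is not Storm's code: PendingTracker and its methods are hypothetical names invented for this sketch, and the only assumption is that each tracked tuple holds an entry until it is acked or failed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: PendingTracker is a hypothetical stand-in, not a Storm class.
class PendingTrackerDemo {
    static class PendingTracker {
        private final Map<Long, String> pending = new HashMap<>();
        private long nextId = 0;

        // Register a tuple and return the id used to ack or fail it later.
        long emit(String payload) {
            long id = nextId++;
            pending.put(id, payload);
            return id;
        }

        // Both ack and fail release the entry held for the tuple.
        void ack(long id) { pending.remove(id); }
        void fail(long id) { pending.remove(id); }

        int pendingCount() { return pending.size(); }
    }

    public static void main(String[] args) {
        PendingTracker careless = new PendingTracker();
        PendingTracker careful = new PendingTracker();
        for (int i = 0; i < 1000; i++) {
            careless.emit("tuple-" + i);              // never acked or failed
            careful.ack(careful.emit("tuple-" + i));  // acked right away
        }
        System.out.println("careless pending: " + careless.pendingCount());
        System.out.println("careful pending: " + careful.pendingCount());
    }
}
```

Running it prints 1000 pending entries for the careless tracker and 0 for the careful one. In a long-running topology the first number keeps growing.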
For the SplitSentence step, an implementation based on IRichBolt looks like this (to write less code, extend BaseRichBolt instead):
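// Package names are those of Storm 1.0 and later; earlier releases use backtype.storm.
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class SplitSentence implements IRichBolt {
    OutputCollector _collector;

    public void prepare(Map conf,
                        TopologyContext context,
                        OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            // Anchor each word to the input tuple so that it is tracked.
            _collector.emit(tuple, new Values(word));
        }
        // Ack the input once all words have been emitted.
        _collector.ack(tuple);
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    // Required by IComponent; null means no component-specific configuration.
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}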
IRichBolt extends IBolt and emits tuples through an OutputCollector.
Two of the OutputCollector emit overloads matter here:
// Anchored: the new tuple is tied to the anchor, so downstream tasks report its ack or fail to the acker.
List<Integer> emit(String streamId, Tuple anchor, List<Object> tuple)
// Unanchored: the new tuple is not tracked, so downstream tasks send nothing to the acker for it.
List<Integer> emit(String streamId, List<Object> tuple)
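The practical difference between the two overloads can be sketched with a toy collector. Everything here is hypothetical and invented for illustration (ToyCollector, emitRoot, replayedRoots); integers stand in for tuples, and the model assumes only that a failure of a tracked tuple causes its root tuple to be replayed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Storm classes.
class AnchorDemo {
    static class ToyCollector {
        // Maps each tracked tuple id to the root (spout) tuple it descends from.
        private final Map<Integer, Integer> rootOf = new HashMap<>();
        private final List<Integer> replayedRoots = new ArrayList<>();
        private int nextId = 0;

        // A spout tuple is tracked and is its own root.
        int emitRoot() {
            int id = nextId++;
            rootOf.put(id, id);
            return id;
        }

        // Anchored emit: the new tuple joins the tuple tree of its anchor.
        // The value is ignored; only the tracking matters in this sketch.
        int emit(int anchor, String value) {
            int id = nextId++;
            Integer root = rootOf.get(anchor);
            if (root != null) {
                rootOf.put(id, root);
            }
            return id;
        }

        // Unanchored emit: the new tuple is not tracked at all.
        int emit(String value) {
            return nextId++;
        }

        // A tracked failure marks the root for replay; an untracked one is lost silently.
        void fail(int id) {
            Integer root = rootOf.get(id);
            if (root != null) {
                replayedRoots.add(root);
            }
        }

        List<Integer> replayedRoots() { return replayedRoots; }
    }

    public static void main(String[] args) {
        ToyCollector c = new ToyCollector();
        int sentence = c.emitRoot();
        int anchored = c.emit(sentence, "hello");
        int loose = c.emit("hello");
        c.fail(loose);
        System.out.println("after unanchored failure: " + c.replayedRoots());
        c.fail(anchored);
        System.out.println("after anchored failure: " + c.replayedRoots());
    }
}
```

The first line printed shows an empty list, because nothing knows the unanchored tuple belonged to the sentence. The second shows [0], the id of the root tuple that would be replayed.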
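Most bolts follow the same pattern: read a tuple, emit some new tuples, and ack the input tuple at the end of execute. Such bolts tend to be filters or simple functions. Storm packages this pattern in the IBasicBolt interface (with BaseBasicBolt as a convenience base class): emitted tuples are anchored to the input automatically, and the input is acked automatically when execute returns. With IBasicBolt, the SplitSentence above can be written like this: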
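// Package names are those of Storm 1.0 and later; earlier releases use backtype.storm.
import java.util.Map;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.IBasicBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class SplitSentence implements IBasicBolt {
    public void prepare(Map conf,
                        TopologyContext context) {
    }

    public void execute(Tuple tuple,
                        BasicOutputCollector collector) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            // No anchor argument and no explicit ack: both are handled for us.
            collector.emit(new Values(word));
        }
    }

    public void cleanup() {
    }

    public void declareOutputFields(
            OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    // Required by IComponent; null means no component-specific configuration.
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}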
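IBasicBolt emits tuples through a BasicOutputCollector. Its emit takes no anchor argument; internally it delegates to the anchored emit of OutputCollector (the first overload shown above), passing the tuple currently being processed: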
List<Integer> emit(String streamId, List<Object> tuple) {
    return out.emit(streamId, inputTuple, tuple);
}
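Putting the pieces together, the wrapping that makes a basic bolt reliable can be sketched as follows. This is a simplified stand-in, not Storm's implementation: RecordingCollector, BasicCollector, BasicBolt and run are hypothetical names, strings stand in for tuples, and treating any RuntimeException as a failure is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types mimic the shape of the pattern, not Storm's API.
class BasicWrapperDemo {
    // Stands in for the full collector and records what it receives.
    static class RecordingCollector {
        final List<String> log = new ArrayList<>();
        void emit(String anchor, String value) { log.add("emit " + value + " anchored to " + anchor); }
        void ack(String input) { log.add("ack " + input); }
        void fail(String input) { log.add("fail " + input); }
    }

    // Anchor-free facade handed to the bolt; it remembers the current input.
    static class BasicCollector {
        private final RecordingCollector out;
        private String inputTuple;
        BasicCollector(RecordingCollector out) { this.out = out; }
        void setContext(String input) { this.inputTuple = input; }
        void emit(String value) { out.emit(inputTuple, value); }
    }

    interface BasicBolt {
        void execute(String input, BasicCollector collector);
    }

    // Set the input as context, run the bolt, then ack; fail if the bolt throws.
    static void run(BasicBolt bolt, String input, RecordingCollector out) {
        BasicCollector basic = new BasicCollector(out);
        basic.setContext(input);
        try {
            bolt.execute(input, basic);
            out.ack(input);
        } catch (RuntimeException e) {
            out.fail(input);
        }
    }

    public static void main(String[] args) {
        RecordingCollector out = new RecordingCollector();
        BasicBolt split = (sentence, c) -> {
            for (String word : sentence.split(" ")) {
                c.emit(word);
            }
        };
        run(split, "hello storm", out);
        out.log.forEach(System.out::println);
    }
}
```

The bolt body never mentions an anchor or an ack, yet the log shows both words anchored to "hello storm" followed by a single ack of that input.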