storm-druid source code: https://github.com/apache/storm/tree/master/external/storm-druid/src/main/java/org/apache/storm/druid/trident

Transactional Storm + Druid integration works as follows. First, add the storm-druid dependency to pom.xml:
<!-- https://mvnrepository.com/artifact/org.apache.storm/storm-druid -->
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-druid</artifactId>
    <version>1.1.2</version>
</dependency>
Next, remember that the topology must be built with TridentTopology rather than a plain TopologyBuilder:
SimpleBatchSpout: note that a Trident spout differs from a plain Storm spout. In Trident, the spout is only an entry point that supplies a BatchCoordinator and an Emitter: the BatchCoordinator manages batch metadata, while the Emitter emits tuples in batches through a TridentCollector. To implement a transactional spout, implement the IBatchSpout interface:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.spout.IBatchSpout;
import org.apache.storm.tuple.Fields;
import org.joda.time.DateTime;

public class SimpleBatchSpout implements IBatchSpout {
private int batchSize;
private final Map<Long, List<List<Object>>> batches = new HashMap<>();
public SimpleBatchSpout(int batchSize) {
this.batchSize = batchSize;
}
@Override
public void open(Map<String, Object> conf, TopologyContext context) {
}
// Emit one batch; an un-acked batchId is replayed with the same cached tuples.
@Override
public void emitBatch(long batchId, TridentCollector collector) {
List<List<Object>> values;
if(batches.containsKey(batchId)) {
values = batches.get(batchId);
} else {
values = new ArrayList<>();
for (int i = 0; i < batchSize; i++) {
List<Object> value = new ArrayList<>();
Map<String, Object> event = new LinkedHashMap<>();
event.put("timestamp", new DateTime().toString());
event.put("publisher", "foo.com");
event.put("advertiser", "google.com");
event.put("click", i);
value.add(event);
values.add(value);
}
batches.put(batchId, values);
}
for (List<Object> value : values) {
collector.emit(value);
}
}
// The batch has been processed successfully; drop the cached copy.
@Override
public void ack(long batchId) {
batches.remove(batchId);
}
@Override
public void close() {
}
@Override
public Map<String, Object> getComponentConfiguration() {
Config conf = new Config();
conf.setMaxTaskParallelism(1);
return conf;
}
@Override
public Fields getOutputFields() {
return SimpleSpout.DEFAULT_FIELDS;
}
}
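The transactional guarantee above hinges on caching each batch under its batchId until it is acked, so a replayed batchId always re-emits exactly the same tuples. A minimal, Storm-free sketch of that pattern (the class and event values are illustrative, not part of storm-druid):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the cache-until-ack pattern used by SimpleBatchSpout:
// the same batchId yields the same tuples until ack() discards them.
public class BatchCache {
    private final int batchSize;
    private final Map<Long, List<List<Object>>> batches = new HashMap<>();

    public BatchCache(int batchSize) {
        this.batchSize = batchSize;
    }

    public List<List<Object>> emitBatch(long batchId) {
        // Replay: a pending (un-acked) batch is returned unchanged.
        return batches.computeIfAbsent(batchId, id -> {
            List<List<Object>> values = new ArrayList<>();
            for (int i = 0; i < batchSize; i++) {
                List<Object> value = new ArrayList<>();
                value.add("event-" + i);
                values.add(value);
            }
            return values;
        });
    }

    public void ack(long batchId) {
        batches.remove(batchId);
    }

    public static void main(String[] args) {
        BatchCache cache = new BatchCache(3);
        List<List<Object>> first = cache.emitBatch(1L);
        List<List<Object>> replayed = cache.emitBatch(1L); // same id -> same tuples
        System.out.println(first.equals(replayed)); // true: identical batch on replay
        cache.ack(1L);
        System.out.println(cache.emitBatch(1L) == first); // false: rebuilt after ack
    }
}
```

Without this cache, a failed batch would be regenerated with different contents on replay, breaking exactly-once semantics downstream.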
SampleDruidBoltTridentTopology: note again that the topology is built with TridentTopology. Persistence for a Trident topology is wired up by calling stream.partitionPersist(...) with a DruidBeamStateFactory, the output fields, and a DruidBeamStateUpdater:
public class SampleDruidBoltTridentTopology {
private static final Logger LOG = LoggerFactory.getLogger(SampleDruidBoltTridentTopology.class);
public static void main(String[] args) throws Exception {
if(args.length == 0) {
throw new IllegalArgumentException("There should be at least one argument. Run as `SampleDruidBoltTridentTopology <zk-url>`");
}
TridentTopology tridentTopology = new TridentTopology();
DruidBeamFactory druidBeamFactory = new SampleDruidBeamFactoryImpl(new HashMap<String, Object>());
ITupleDruidEventMapper<Map<String, Object>> eventMapper = new TupleDruidEventMapper<>(TupleDruidEventMapper.DEFAULT_FIELD_NAME);
final Stream stream = tridentTopology.newStream("batch-event-gen", new SimpleBatchSpout(10));
stream.peek(new Consumer() {
@Override
public void accept(TridentTuple input) {
LOG.info("########### Received tuple: [{}]", input);
}
}).partitionPersist(new DruidBeamStateFactory<Map<String, Object>>(druidBeamFactory, eventMapper), new Fields("event"), new DruidBeamStateUpdater());
Config conf = new Config();
conf.setDebug(true);
conf.put("druid.tranquility.zk.connect", args[0]);
if (args.length > 1) {
conf.setNumWorkers(3);
StormSubmitter.submitTopologyWithProgressBar(args[1], conf, tridentTopology.build());
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("druid-test", conf, tridentTopology.build());
Thread.sleep(30000);
cluster.shutdown();
System.exit(0);
}
}
}
Addendum: analysis of the storm-druid source.
DruidBeamStateFactory simply creates a DruidBeamState object:
public class DruidBeamStateFactory<E> implements StateFactory {
DruidBeamFactory beamFactory = null;
ITupleDruidEventMapper druidEventMapper = null;
public DruidBeamStateFactory(DruidBeamFactory<E> beamFactory, ITupleDruidEventMapper<E> druidEventMapper) {
this.beamFactory = beamFactory;
this.druidEventMapper = druidEventMapper;
}
@Override
public State makeState(Map conf, IMetricsContext metrics, int partitionIndex, int numPartitions) {
return new DruidBeamState(beamFactory.makeBeam(conf, metrics), druidEventMapper);
}
}
DruidBeamState implements the Trident State interface; its update method uses the Beam object to send events to Druid:
public class DruidBeamState<E> implements State {
private static final Logger LOG = LoggerFactory.getLogger(DruidBeamState.class);
private Beam<E> beam = null;
private ITupleDruidEventMapper<E> druidEventMapper = null;
public DruidBeamState(Beam<E> beam, ITupleDruidEventMapper<E> druidEventMapper) {
this.beam = beam;
this.druidEventMapper = druidEventMapper;
}
public List<E> update(List<TridentTuple> tuples, TridentCollector collector) {
List<E> events = new ArrayList<>(tuples.size());
for (TridentTuple tuple: tuples) {
events.add(druidEventMapper.getEvent(tuple));
}
LOG.info("Sending [{}] events", events.size());
scala.collection.immutable.List<E> scalaList = scala.collection.JavaConversions.collectionAsScalaIterable(events).toList();
Collection<Future<SendResult>> futureList = scala.collection.JavaConversions.asJavaCollection(beam.sendAll(scalaList));
List<E> discardedEvents = new ArrayList<>();
int index = 0;
for (Future<SendResult> future : futureList) {
try {
SendResult result = Await.result(future);
if (!result.sent()) {
discardedEvents.add(events.get(index));
}
} catch (Exception e) {
LOG.error("Failed in writing messages to Druid", e);
}
index++;
}
return discardedEvents;
}
public void close() {
try {
Await.result(beam.close());
} catch (Exception e) {
LOG.error("Error while closing Druid beam client", e);
}
}
@Override
public void beginCommit(Long txid) {
}
@Override
public void commit(Long txid) {
}
}
DruidBeamStateUpdater extends BaseStateUpdater. It overrides the updateState method, which delegates to DruidBeamState's update method to write the received tuples to Druid through the beam:
public class DruidBeamStateUpdater<E> extends BaseStateUpdater<DruidBeamState<E>> {
private static final Logger LOG = LoggerFactory.getLogger(DruidBeamStateUpdater.class);
@Override
public void updateState(DruidBeamState<E> state, List<TridentTuple> tuples, TridentCollector collector) {
List<E> discardedTuples = state.update(tuples, collector);
processDiscardedTuples(discardedTuples);
}
/**
* Users can override this method to process the discarded Tuples
* @param discardedTuples
*/
protected void processDiscardedTuples(List<E> discardedTuples) {
LOG.debug("discarded messages : [{}]" , discardedTuples);
}
}
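As updateState shows, events Druid refuses are not failed back to the spout; they are handed to processDiscardedTuples, which by default only logs them. A subclass can override that hook with a real policy. A self-contained sketch of one option, bounded in-memory retry (the class name, retry limit, and dead-letter list are all illustrative assumptions, not storm-druid API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative retry policy for events Druid did not accept. In a real
// topology this logic would live in a DruidBeamStateUpdater subclass's
// processDiscardedTuples override.
public class DiscardedEventRetrier<E> {
    private final int maxAttempts;
    private final Map<E, Integer> attempts = new HashMap<>();
    private final List<E> deadLetter = new ArrayList<>();

    public DiscardedEventRetrier(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Returns the events worth re-sending; the rest are dead-lettered. */
    public List<E> handleDiscarded(List<E> discarded) {
        List<E> retry = new ArrayList<>();
        for (E event : discarded) {
            int n = attempts.merge(event, 1, Integer::sum); // bump attempt count
            if (n <= maxAttempts) {
                retry.add(event);       // try again with a later batch
            } else {
                deadLetter.add(event);  // give up; park for offline inspection
                attempts.remove(event);
            }
        }
        return retry;
    }

    public List<E> deadLetter() {
        return deadLetter;
    }

    public static void main(String[] args) {
        DiscardedEventRetrier<String> retrier = new DiscardedEventRetrier<>(2);
        System.out.println(retrier.handleDiscarded(Arrays.asList("evt-1"))); // [evt-1]
        retrier.handleDiscarded(Arrays.asList("evt-1"));                     // attempt 2
        System.out.println(retrier.handleDiscarded(Arrays.asList("evt-1"))); // []
        System.out.println(retrier.deadLetter());                            // [evt-1]
    }
}
```

Whatever policy is chosen, it should stay bounded: unbounded retries against a Druid indexing task whose time window has closed would block the batch forever.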