2021SC@SDUSC
The Bolt Executor in Trident
Much like the coordinated Bolt in a transactional Topology, Trident uses TridentBoltExecutor to run its SubTopologyBolt instances.
ITridentBatchBolt.java
public interface ITridentBatchBolt extends IComponent {
    void prepare(Map<String, Object> conf, TopologyContext context, BatchOutputCollector collector);

    void execute(BatchInfo batchInfo, Tuple tuple);

    void finishBatch(BatchInfo batchInfo);

    Object initBatchState(String batchGroup, Object batchId);

    void cleanup();
}
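Compared with the IBatchBolt interface, this interface adds the initBatchState and cleanup methods. This lets Trident avoid deserializing a new Bolt object for every batch: when the first message of a transaction arrives for a node group, initBatchState is called to initialize that transaction's state, and finishBatch is called when the transaction ends, which keeps transactions isolated from one another.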
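The lifecycle described above can be sketched without any Storm dependency. The following is a minimal illustration, not Storm code: SimpleBatchBolt, CountingBolt, and BatchLifecycleDriver are hypothetical stand-ins, and tuples are plain strings. It shows one long-lived bolt instance serving two interleaved batches, with the per-batch state created on the first tuple and consumed when the batch finishes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver playing the role of the executor (not a Storm class).
class BatchLifecycleDriver {
    private final SimpleBatchBolt bolt;
    private final Map<Object, Object> states = new HashMap<>();

    BatchLifecycleDriver(SimpleBatchBolt bolt) {
        this.bolt = bolt;
    }

    void onTuple(Object batchId, String tuple) {
        // The first tuple of a batch creates its state; later tuples reuse it.
        Object state = states.computeIfAbsent(batchId, id -> bolt.initBatchState("group0", id));
        bolt.execute(state, tuple);
    }

    void onBatchEnd(Object batchId) {
        Object state = states.remove(batchId);
        if (state != null) {
            bolt.finishBatch(state);
        }
    }

    public static void main(String[] args) {
        CountingBolt bolt = new CountingBolt();
        BatchLifecycleDriver driver = new BatchLifecycleDriver(bolt);
        driver.onTuple(1L, "a");
        driver.onTuple(2L, "x"); // a second batch interleaves with the first
        driver.onTuple(1L, "b");
        driver.onBatchEnd(1L);
        driver.onBatchEnd(2L);
        System.out.println(bolt.finishedCounts); // prints [2, 1]
    }
}

// Hypothetical, simplified stand-in for ITridentBatchBolt.
interface SimpleBatchBolt {
    Object initBatchState(String batchGroup, Object batchId);

    void execute(Object batchState, String tuple);

    void finishBatch(Object batchState);
}

// Counts tuples per batch. The counter lives in the batch state, so one bolt
// instance can serve many batches without their counts mixing.
class CountingBolt implements SimpleBatchBolt {
    final List<Integer> finishedCounts = new ArrayList<>();

    @Override
    public Object initBatchState(String batchGroup, Object batchId) {
        return new int[] {0};
    }

    @Override
    public void execute(Object batchState, String tuple) {
        ((int[]) batchState)[0]++;
    }

    @Override
    public void finishBatch(Object batchState) {
        finishedCounts.add(((int[]) batchState)[0]);
    }
}
```

Keeping the counter inside the batch state rather than in a field of the bolt is what allows a single bolt object to be reused across batches.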
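The TrackedBatch class
A SubTopologyBolt node may process messages from several transactions at the same time. The TrackedBatch class tracks each transaction that is currently being processed in the Bolt.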
public static class TrackedBatch {
    int attemptId;
    BatchInfo info;
    CoordCondition condition;
    int reportedTasks = 0;
    int expectedTupleCount = 0;
    int receivedTuples = 0;
    Map<Integer, Integer> taskEmittedTuples = new HashMap<>();
    boolean failed = false;
    boolean receivedCommit;
    Tuple delayedAck = null;

    public TrackedBatch(BatchInfo info, CoordCondition condition, int attemptId) {
        this.info = info;
        this.condition = condition;
        this.attemptId = attemptId;
        receivedCommit = condition.commitStream == null;
    }
}
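Each TrackedBatch records the attemptId it was created for, and the execute method compares it with the attempt ID of every incoming tuple. A minimal sketch of that comparison follows, assuming a hypothetical AttemptTracker class that keeps only the attempt number per transaction ID (the real executor stores a full TrackedBatch).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: decides what to do with a tuple
// based on the attempt id already tracked for its transaction.
class AttemptTracker {
    enum Decision { START_NEW, CONTINUE, IGNORE_STALE }

    // transaction id -> attempt id currently being tracked
    private final Map<Long, Integer> attempts = new HashMap<>();

    Decision onTuple(long txId, int attemptId) {
        Integer tracked = attempts.get(txId);
        if (tracked == null || attemptId > tracked) {
            // Unknown transaction, or a newer attempt: drop old state and start over.
            attempts.put(txId, attemptId);
            return Decision.START_NEW;
        }
        if (attemptId < tracked) {
            // A tuple from an older attempt carries no useful work.
            return Decision.IGNORE_STALE;
        }
        return Decision.CONTINUE;
    }

    public static void main(String[] args) {
        AttemptTracker tracker = new AttemptTracker();
        System.out.println(tracker.onTuple(7L, 0)); // prints START_NEW
        System.out.println(tracker.onTuple(7L, 0)); // prints CONTINUE
        System.out.println(tracker.onTuple(7L, 1)); // prints START_NEW
        System.out.println(tracker.onTuple(7L, 0)); // prints IGNORE_STALE
    }
}
```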
TridentBoltExecutor
Data members and the prepare method
Map<GlobalStreamId, String> batchGroupIds;
Map<String, CoordSpec> coordSpecs;
Map<String, CoordCondition> coordConditions;
ITridentBatchBolt bolt;
long messageTimeoutMs;
long lastRotate;
RotatingMap<Object, TrackedBatch> batches;
OutputCollector collector;
CoordinatedOutputCollector coordCollector;
BatchOutputCollector coordOutputCollector;
TopologyContext context;
@Override
public void prepare(Map<String, Object> conf, TopologyContext context, OutputCollector collector) {
    messageTimeoutMs = context.maxTopologyMessageTimeout() * 1000L;
    lastRotate = System.currentTimeMillis();
    batches = new RotatingMap<>(2);
    this.context = context;
    this.collector = collector;
    coordCollector = new CoordinatedOutputCollector(collector);
    coordOutputCollector = new BatchOutputCollectorImpl(new OutputCollector(coordCollector));
    coordConditions = (Map) context.getExecutorData("__coordConditions");
    if (coordConditions == null) {
        coordConditions = new HashMap<>();
        for (String batchGroup : coordSpecs.keySet()) {
            CoordSpec spec = coordSpecs.get(batchGroup);
            CoordCondition cond = new CoordCondition();
            cond.commitStream = spec.commitStream;
            cond.expectedTaskReports = 0;
            for (String comp : spec.coords.keySet()) {
                CoordType ct = spec.coords.get(comp);
                if (ct.equals(CoordType.single())) {
                    cond.expectedTaskReports += 1;
                } else {
                    cond.expectedTaskReports += context.getComponentTasks(comp).size();
                }
            }
            cond.targetTasks = new HashSet<>();
            for (String component : Utils.get(context.getThisTargets(),
                                              coordStream(batchGroup),
                                              new HashMap<String, Grouping>()).keySet()) {
                cond.targetTasks.addAll(context.getComponentTasks(component));
            }
            coordConditions.put(batchGroup, cond);
        }
        context.setExecutorData("coordConditions", coordConditions);
    }
    bolt.prepare(conf, context, coordOutputCollector);
}
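batchGroupIds:
Maps each global stream to the ID of the node group (batch group) it belongs to.
coordSpecs:
Stores, for each node group, which components its nodes receive coordination messages from.
coordConditions:
Stores the coordination condition for each node group.
batches:
Tracks the transactions this node is currently processing.
bolt:
The Bolt that TridentBoltExecutor delegates to; currently of type SubTopologyBolt.
messageTimeoutMs, lastRotate:
Control message timeouts. messageTimeoutMs is taken from the topology's message timeout setting, and lastRotate records when batches was last rotated.
prepare method:
Mainly initializes the member variables. Tasks in the same Executor belong to the same component, so their coordination settings are identical. Trident stores these settings in the Executor's shared data. Because the Tasks in an Executor are initialized one after another, the remaining Tasks are meant to fetch the settings directly instead of recomputing them. Note that in the listing above the key used for the lookup ("__coordConditions") differs from the key used when storing ("coordConditions"), so as written the stored value would not be found under the lookup key.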
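The way prepare derives expectedTaskReports can be shown in isolation. This is a minimal sketch under stated assumptions: SimpleCoordType is a hypothetical stand-in for CoordType, the taskCounts map stands in for context.getComponentTasks(comp).size(), and the component names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm.
class ExpectedReports {
    // coords: upstream component -> coordination type
    // taskCounts: component -> number of tasks of that component
    static int expectedTaskReports(Map<String, SimpleCoordType> coords, Map<String, Integer> taskCounts) {
        int expected = 0;
        for (Map.Entry<String, SimpleCoordType> entry : coords.entrySet()) {
            if (entry.getValue() == SimpleCoordType.SINGLE) {
                expected += 1; // exactly one task of this component reports
            } else {
                expected += taskCounts.get(entry.getKey()); // every task reports
            }
        }
        return expected;
    }

    public static void main(String[] args) {
        Map<String, SimpleCoordType> coords = new LinkedHashMap<>();
        coords.put("coordinator", SimpleCoordType.SINGLE); // invented component name
        coords.put("upstream-bolt", SimpleCoordType.ALL);  // invented component name

        Map<String, Integer> taskCounts = new LinkedHashMap<>();
        taskCounts.put("coordinator", 1);
        taskCounts.put("upstream-bolt", 4);

        System.out.println(expectedTaskReports(coords, taskCounts)); // prints 5
    }
}

// Hypothetical stand-in for CoordType: SINGLE expects one report from the
// component, ALL expects one report per task of the component.
enum SimpleCoordType { SINGLE, ALL }
```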
Analysis of the main member methods
execute
@Override
public void execute(Tuple tuple) {
    if (TupleUtils.isTick(tuple)) {
        long now = System.currentTimeMillis();
        if (now - lastRotate > messageTimeoutMs) {
            batches.rotate();
            lastRotate = now;
        }
        return;
    }
    String batchGroup = batchGroupIds.get(tuple.getSourceGlobalStreamId());
    if (batchGroup == null) {
        coordCollector.setCurrBatch(null);
        bolt.execute(null, tuple);
        collector.ack(tuple);
        return;
    }
    IBatchID id = (IBatchID) tuple.getValue(0);
    TrackedBatch tracked = (TrackedBatch) batches.get(id.getId());
    if (tracked != null) {
        if (id.getAttemptId() > tracked.attemptId) {
            batches.remove(id.getId());
            tracked = null;
        } else if (id.getAttemptId() < tracked.attemptId) {
            // no reason to try to execute a previous attempt than we've already seen
            return;
        }
    }
    if (tracked == null) {
        tracked =
            new TrackedBatch(new BatchInfo(batchGroup, id, bolt.initBatchState(batchGroup, id)), coordConditions.get(batchGroup),
                             id.getAttemptId());
        batches.put(id.getId(), tracked);
    }
    coordCollector.setCurrBatch(tracked);
    //System.out.println("TRACKED: " + tracked + " " + tuple);
    TupleType t = getTupleType(tuple, tracked);
    if (t == TupleType.COMMIT) {
        tracked.receivedCommit = true;
        checkFinish(tracked, tuple, t);
    } else if (t == TupleType.COORD) {
        int count = tuple.getInteger(1);
        tracked.reportedTasks++;
        tracked.expectedTupleCount += count;
        checkFinish(tracked, tuple, t);
    } else {
        tracked.receivedTuples++;
        boolean success = true;
        try {
            bolt.execute(tracked.info, tuple);
            if (tracked.condition.expectedTaskReports == 0) {
                success = finishBatch(tracked, tuple);
            }
        } catch (FailedException e) {
            failBatch(tracked, e);
        }
        if (success) {
            collector.ack(tuple);
        } else {
            collector.fail(tuple);
        }
    }
    coordCollector.setCurrBatch(null);
}
if (TupleUtils.isTick(tuple)):
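The Bolt has to track every transaction that is in progress. Because some transactions fail, a tracked transaction may never be cleaned up, and over time this would leak memory. This branch handles that case, much like the message timeout mechanism in a Spout node: when a message arrives on the tick stream and more than messageTimeoutMs has passed since the last rotation, the older entries in batches are expired.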
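The tick-driven cleanup can be sketched with a small bucketed map. This is an illustration in the spirit of a rotating map with two buckets, not Storm's RotatingMap implementation: TwoBucketMap and TickCleanupDemo are hypothetical names, and timestamps are passed in explicitly so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of the tick branch: rotate only after a full timeout.
class TickCleanupDemo {
    private final long timeoutMs;
    private long lastRotate;
    final TwoBucketMap<Long, String> batches = new TwoBucketMap<>();

    TickCleanupDemo(long timeoutMs, long now) {
        this.timeoutMs = timeoutMs;
        this.lastRotate = now;
    }

    boolean onTick(long now) {
        if (now - lastRotate > timeoutMs) {
            batches.rotate();
            lastRotate = now;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TickCleanupDemo demo = new TickCleanupDemo(1000L, 0L);
        demo.batches.put(42L, "tracked batch");
        demo.onTick(500L);  // too early, nothing rotates
        demo.onTick(1001L); // first rotation: the entry is now in the old bucket
        System.out.println(demo.batches.get(42L)); // prints tracked batch
        demo.onTick(2002L); // second rotation: the old bucket is dropped
        System.out.println(demo.batches.get(42L)); // prints null
    }
}

// Simplified two-bucket map: new entries go to the newest bucket, and each
// rotation discards the oldest bucket.
class TwoBucketMap<K, V> {
    private final Deque<Map<K, V>> buckets = new ArrayDeque<>();

    TwoBucketMap() {
        buckets.addFirst(new HashMap<>());
        buckets.addFirst(new HashMap<>());
    }

    void put(K key, V value) {
        // Remove any older copy so the entry lives only in the newest bucket.
        for (Map<K, V> bucket : buckets) {
            bucket.remove(key);
        }
        buckets.peekFirst().put(key, value);
    }

    V get(K key) {
        for (Map<K, V> bucket : buckets) {
            if (bucket.containsKey(key)) {
                return bucket.get(key);
            }
        }
        return null;
    }

    // Drops the oldest bucket, opens a fresh one, and returns what expired.
    Map<K, V> rotate() {
        Map<K, V> expired = buckets.removeLast();
        buckets.addFirst(new HashMap<>());
        return expired;
    }
}
```

With two buckets, an entry that is never touched again survives at least one rotation and is gone after the second, so stale batches are released within roughly one to two timeout periods.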
if (batchGroup == null):
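If the source stream of the message does not belong to any node group, the message is not tracked. Instead, the execute method of the delegate Bolt is called directly and the input message is acked.
if (t == TupleType.COMMIT):
If the message is a transaction commit message, receivedCommit is set to true; this is one of the conditions for the transaction to be considered finished.
After a control message is received, the checkFinish method is called to check whether processing of the transaction has finished: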
private void checkFinish(TrackedBatch tracked, Tuple tuple, TupleType type) {
    if (tracked.failed) {
        failBatch(tracked);
        collector.fail(tuple);
        return;
    }
    CoordCondition cond = tracked.condition;
    boolean delayed = tracked.delayedAck == null
                      && (cond.commitStream != null && type == TupleType.COMMIT
                          || cond.commitStream == null);
    if (delayed) {
        tracked.delayedAck = tuple;
    }
    boolean failed = false;
    if (tracked.receivedCommit && tracked.reportedTasks == cond.expectedTaskReports) {
        if (tracked.receivedTuples == tracked.expectedTupleCount) {
            finishBatch(tracked, tuple);
        } else {
            //TODO: add logging that not all tuples were received
            failBatch(tracked);
            collector.fail(tuple);
            failed = true;
        }
    }
    if (!delayed && !failed) {
        collector.ack(tuple);
    }
}
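The completion test at the heart of checkFinish can be isolated as follows. This is a minimal sketch, assuming a hypothetical BatchProgress class that carries only the counters involved; acking, delayed acks, and the failed flag are left out.

```java
// Hypothetical reduction of the completion check, not Storm code.
class BatchProgress {
    enum Outcome { WAIT, FINISH, FAIL }

    boolean receivedCommit;
    int expectedTaskReports;
    int reportedTasks;
    int expectedTupleCount;
    int receivedTuples;

    // Called when an upstream task reports how many tuples it sent here.
    void onCoord(int tuplesSentByTask) {
        reportedTasks++;
        expectedTupleCount += tuplesSentByTask;
    }

    Outcome check() {
        if (receivedCommit && reportedTasks == expectedTaskReports) {
            // Every upstream task has reported, so the counts must now agree.
            return receivedTuples == expectedTupleCount ? Outcome.FINISH : Outcome.FAIL;
        }
        return Outcome.WAIT; // still waiting for a commit or for more reports
    }

    public static void main(String[] args) {
        BatchProgress progress = new BatchProgress();
        progress.receivedCommit = true; // as when the group has no commit stream
        progress.expectedTaskReports = 2;
        progress.receivedTuples = 5;

        progress.onCoord(3);
        System.out.println(progress.check()); // prints WAIT
        progress.onCoord(2);
        System.out.println(progress.check()); // prints FINISH
    }
}
```

If the second task had reported 3 tuples instead of 2, the expected count would be 6 against 5 received, and the check would return FAIL, which corresponds to the branch that calls failBatch.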