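When writing a Spark Streaming job, we sometimes need to monitor the job itself, for example the processing time and delay of each batch. The web UI that Spark provides (port 4040 by default) does show these details, but it is not very convenient. Instead, we can send all of this information to our own custom metrics and aggregate it further from there. The implementation is as follows: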
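JavaStreamingContext javaStreamingContext =
        new JavaStreamingContext(sparkConfig, batchInterval);
// ... define the input streams and output operations here ...
// Register the listener before the context is started.
javaStreamingContext.addStreamingListener(new JobListener());
javaStreamingContext.start();
try {
    javaStreamingContext.awaitTermination();
} catch (InterruptedException e) {
    // Pass the exception itself so the logger prints the full stack trace.
    logger.error("awaitTermination interrupted", e);
}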
As shown above, adding a JobListener is all it takes. JobListener is implemented as follows:
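private static class JobListener implements StreamingListener {
    @Override
    public void onBatchCompleted(StreamingListenerBatchCompleted batchCompleted) {
        try {
            // Delays are reported in milliseconds. Each one is a Scala Option, and
            // get() throws if it is empty; the catch block below covers that case.
            String totalDelay = batchCompleted.batchInfo().totalDelay().get().toString();
            String processingDelay = batchCompleted.batchInfo().processingDelay().get().toString();
            String schedulingDelay = batchCompleted.batchInfo().schedulingDelay().get().toString();
            long numRecords = batchCompleted.batchInfo().numRecords();
            // Replace the println calls with writes to your own metrics.
            System.out.println("totalDelay=" + totalDelay + " processingDelay=" + processingDelay
                    + " schedulingDelay=" + schedulingDelay + " numRecords=" + numRecords);
            Map<Object, OutputOperationInfo> map = JavaConverters.mapAsJavaMapConverter(batchCompleted.batchInfo().outputOperationInfos()).asJava();
            for (OutputOperationInfo outputOperationInfo : map.values()) {
                // duration() is the run time of one output operation, in milliseconds.
                double duration = Double.valueOf(outputOperationInfo.duration().get().toString());
                System.out.println(outputOperationInfo.name() + ": " + duration);
            }
        } catch (Exception e) {
            logger.error("JobListener onBatchCompleted", e);
        }
    }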
    @Override
    public void onReceiverStarted(StreamingListenerReceiverStarted receiverStarted) {
    }
    @Override
    public void onReceiverError(StreamingListenerReceiverError receiverError) {
    }
    @Override
    public void onReceiverStopped(StreamingListenerReceiverStopped receiverStopped) {
    }
    @Override
    public void onBatchSubmitted(StreamingListenerBatchSubmitted batchSubmitted) {
    }
    @Override
    public void onBatchStarted(StreamingListenerBatchStarted batchStarted) {
    }
    @Override
    public void onOutputOperationCompleted(StreamingListenerOutputOperationCompleted arg0) {
    }
    @Override
    public void onOutputOperationStarted(StreamingListenerOutputOperationStarted arg0) {
    }
}
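To make the three delay values read in onBatchCompleted easier to interpret, here is a minimal standalone sketch of how they relate to each other. It assumes the usual definitions (scheduling delay is the wait between submission and start of processing, processing delay is the run time, total delay is their sum). The class name BatchDelays and the timestamps are made up for illustration and are not part of Spark.

```java
// Hypothetical helper, not a Spark class: shows how the three delays relate.
final class BatchDelays {
    /** Time the batch waited before processing started, in milliseconds. */
    static long schedulingDelay(long submissionTime, long processingStartTime) {
        return processingStartTime - submissionTime;
    }

    /** Time spent actually processing the batch, in milliseconds. */
    static long processingDelay(long processingStartTime, long processingEndTime) {
        return processingEndTime - processingStartTime;
    }

    /** Time from submission to completion, in milliseconds. */
    static long totalDelay(long submissionTime, long processingEndTime) {
        return processingEndTime - submissionTime;
    }

    public static void main(String[] args) {
        // Made-up timestamps in epoch milliseconds.
        long submitted = 1_000L;
        long started = 1_250L;
        long ended = 3_250L;

        long scheduling = schedulingDelay(submitted, started);
        long processing = processingDelay(started, ended);
        long total = totalDelay(submitted, ended);

        // The total delay is the sum of the other two.
        System.out.println("scheduling=" + scheduling
                + " processing=" + processing
                + " total=" + total);
    }
}
```

A scheduling delay that keeps growing from batch to batch usually means batches are queuing up, that is, processing takes longer than the batch interval, which is why it is worth tracking separately from the processing delay.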
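The listener above does its work in onBatchCompleted, which monitors at the batch level. There is also a finer-grained approach: onOutputOperationCompleted, which is called for each output operation (job) within a batch. Its body could look like this: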
try {
    MetricsWriter metricsWriter = new MetricsWriter();
    // duration() is in milliseconds; dividing by 1000 reports it in seconds.
    metricsWriter.addRequest(arg0.outputOperationInfo().name(), Double.valueOf(arg0.outputOperationInfo().duration().get().toString()) / 1000);
    metricsWriter.send();
} catch (Exception e) {
    logger.error("JobListener onOutputOperationCompleted", e);
}
Note that the code above is only a sketch of the idea and needs further changes: you have to plug in your own metrics implementation.
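As one possible starting point, here is a minimal sketch of what a MetricsWriter with the addRequest and send methods used above might look like. Everything in it is an assumption made for illustration: the "name=value" line format, the pluggable sink, and the default of printing to standard output. A real version would forward the buffered entries to whatever metrics system you use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: buffers metric entries and hands them to a sink on send().
final class MetricsWriter {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> sink;

    /** Default sink prints each entry; swap in a real metrics client here. */
    MetricsWriter() {
        this(lines -> lines.forEach(System.out::println));
    }

    /** Lets callers (and tests) decide where the entries go. */
    MetricsWriter(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    /** Buffers one metric as a "name=value" line (format is an assumption). */
    void addRequest(String name, double value) {
        pending.add(name + "=" + value);
    }

    /** Passes a copy of the buffered entries to the sink, then clears the buffer. */
    void send() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        MetricsWriter writer = new MetricsWriter();
        // The operation name is made up; the value is a duration in seconds.
        writer.addRequest("foreachRDD at Job.java:42", 1.5);
        writer.send();
    }
}
```

Passing the sink in through the constructor keeps the listener code unchanged when you later replace standard output with a real backend.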