StreamSets (3.22.2) Pipeline Execution: Source Code Walkthrough
Execution call chain
MetricSafeScheduledExecutorService:run
- StandaloneRunner:start()
- ProductionPipelineRunnable.java:run()
- ProductionPipelineRunner.java:run()
- Pipeline.java:run()
- PipelineRunner.java:run()
- ProductionPipelineRunner.java:run()
- StagePipe:process()
- StageRuntime:execute()
- Source:produce()
- Processor:process()
- Executor:write()
- Target:write()
A pipeline is made up of stages. Stages fall into the following four categories:
Source
Processor
Target
Executor
1. Stage
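Each of the four component types above has a corresponding interface in the code, and all of them ultimately derive from the Stage interface.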
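Origins: represent the source of a pipeline. A pipeline can have only one origin. The corresponding interface is Source, and its main method is produce.
produce: when the pipeline runs, Data Collector calls this method on the Source stage to obtain a batch of records to process. If there is no data, a Source stage should not block indefinitely in this method. It should have an internal timeout, after which it produces an empty batch. Doing so gives the other stages in the pipeline a chance to learn that the pipeline is still healthy but has no data, and possibly allows external systems to be notified.

Processors: represent one type of data processing to perform. A pipeline can have any number of processors. The corresponding interface is Processor, and its main method is process.
process: when the pipeline runs, Data Collector calls this method on the Processor stage to process a batch of records.

Destinations: represent a target of a pipeline. A pipeline can have any number of destinations. The corresponding interface is Target, and its main method is write.
write: when the pipeline runs, Data Collector calls this method on the Target stage to write a batch of records to an external system.

Executors: Executor extends Target and is triggered when it receives an event. Executors are used as part of a dataflow trigger in an event stream to perform event-driven, pipeline-related tasks, such as moving a fully written file when the destination closes it. An Executor is a special case of a Data Collector target. It does not persist records the way a normal target does. Instead, it performs operations on an external system based on the data present in the incoming records. The corresponding interface is Executor, and its main method is write, the same as for Target.
write: when the pipeline runs, Data Collector calls this method on the stage to write a batch of records to an external system.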
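The "do not block forever in produce" rule can be illustrated with a small sketch. The types below are simplified, hypothetical stand-ins (SketchSource, QueueSource), not the real StreamSets interfaces, whose produce takes a BatchMaker and can throw StageException. The sketch assumes records arrive on an in-memory queue and uses a plain counter as the offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for a Source; not the real StreamSets signature.
interface SketchSource {
    // Adds up to maxBatchSize records to `out` and returns the offset for the next call.
    String produce(String lastOffset, int maxBatchSize, List<String> out);
}

// A source that waits at most maxWaitMillis per call, then returns whatever it has,
// which may be an empty batch.
class QueueSource implements SketchSource {
    private final BlockingQueue<String> queue;
    private final long maxWaitMillis;

    QueueSource(BlockingQueue<String> queue, long maxWaitMillis) {
        this.queue = queue;
        this.maxWaitMillis = maxWaitMillis;
    }

    @Override
    public String produce(String lastOffset, int maxBatchSize, List<String> out) {
        long offset = lastOffset == null ? 0L : Long.parseLong(lastOffset);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (out.size() < maxBatchSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // internal timeout reached: return the (possibly empty) batch
            }
            try {
                String record = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (record == null) {
                    break; // nothing arrived before the deadline
                }
                out.add(record);
                offset++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return Long.toString(offset);
    }
}

class QueueSourceDemo {
    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        QueueSource source = new QueueSource(queue, 50);

        List<String> first = new ArrayList<>();
        String offset = source.produce(null, 10, first);
        System.out.println(first + " offset=" + offset);

        // The queue is now empty, so this call times out and yields an empty batch.
        List<String> second = new ArrayList<>();
        offset = source.produce(offset, 10, second);
        System.out.println(second + " offset=" + offset);
    }
}
```

The second call returns after roughly the configured wait with no records, which is the behaviour the text describes: the rest of the pipeline still sees a batch and can tell that the pipeline is alive.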
2. Pipeline
2.1 Method
init
@SuppressWarnings("unchecked")
public List<Issue> init(boolean productionExecution) {
  PipeContext pipeContext = new PipeContext();
  this.runner.setRuntimeConfiguration(pipeContext, pipelineConf, pipelineBean.getConfig());
  List<Issue> issues = new ArrayList<>();

  // Publish the lineage START event
  if (productionExecution) {
    LineageEvent event = createLineageEvent(LineageEventType.START, runner.getRuntimeInfo().getBaseHttpUrl(true));
    lineagePublisherTask.publishEvent(event);
  }

  // Error-record handling and statistics aggregation
  try {
    issues.addAll(badRecordsHandler.init(pipeContext));
  } catch (Exception ex) {
    LOG.warn(ContainerError.CONTAINER_0700.getMessage(), ex.toString(), ex);
    issues.add(IssueCreator.getStage(badRecordsHandler.getInstanceName()).create(ContainerError.CONTAINER_0700, ex.toString()));
  }
  if (statsAggregationHandler != null) {
    try {
      issues.addAll(statsAggregationHandler.init(pipeContext));
    } catch (Exception ex) {
      LOG.warn(ContainerError.CONTAINER_0703.getMessage(), ex.toString(), ex);
      issues.add(IssueCreator.getStage(statsAggregationHandler.getInstanceName()).create(ContainerError.CONTAINER_0703, ex.toString()));
    }
  }

  // Pipeline lifecycle start event
  if (startEventStage != null) {
    IssueCreator issueCreator = IssueCreator.getStage(startEventStage.getInfo().getInstanceName());
    boolean validationSuccessful = false;

    // Initialize
    try {
      List<Issue> startIssues = startEventStage.init();
      validationSuccessful = startIssues.isEmpty();
      issues.addAll(startIssues);
    } catch (Exception ex) {
      LOG.warn(ContainerError.CONTAINER_0790.getMessage(), ex.toString(), ex);
      issues.add(issueCreator.create(ContainerError.CONTAINER_0790, ex.toString()));
    }

    // Run only in production mode
    try {
      if (productionExecution && validationSuccessful) {
        LOG.info("Processing lifecycle start event with stage");
        runner.runLifecycleEvent(createStartEvent(), startEventStage);
      }
    } catch (Exception ex) {
      LOG.warn(ContainerError.CONTAINER_0791.getMessage(), ex.toString(), ex);
      issues.add(issueCreator.create(ContainerError.CONTAINER_0791, ex.toString()));
    }
  }

  if (stopEventStage != null) {
    IssueCreator issueCreator = IssueCreator.getStage(stopEventStage.getInfo().getInstanceName());

    // Initialize
    try {
      issues.addAll(stopEventStage.init());
      stopEventStageInitialized = issues.isEmpty();
    } catch (Exception ex) {
      LOG.warn(ContainerError.CONTAINER_0790.getMessage(), ex.toString(), ex);
      issues.add(issueCreator.create(ContainerError.CONTAINER_0790, ex.toString()));
    }
  }

  // Initialize the origin
  issues.addAll(initPipe(originPipe, pipeContext));
  int runnerCount = 1;

  // For a push source, the remaining source-less pipes must be initialized as well
  if (originPipe.getStage().getStage() instanceof PushSource) {
    Preconditions.checkArgument(pipes.size() == 1, "There are already more runners then expected");

    // Effective number of runners: the number of source threads or the user-defined value, whichever is smaller
    runnerCount = ((PushSource) originPipe.getStage().getStage()).getNumberOfThreads();
    int pipelineRunnerCount = pipelineBean.getConfig().maxRunners;
    if (pipelineRunnerCount > 0) {
      runnerCount = Math.min(runnerCount, pipelineRunnerCount);
    }

    // Make sure it does not exceed the configured threshold
    int sdcRunnerMax = configuration.get(MAX_RUNNERS_CONFIG_KEY, MAX_RUNNERS_DEFAULT);
    boolean createAdditionalRunners = true;
    if (runnerCount > sdcRunnerMax) {
      createAdditionalRunners = false;
      issues.add(IssueCreator.getPipeline().create(ContainerError.CONTAINER_0705, runnerCount, sdcRunnerMax));
    }

    // Create the additional runners unless the requested number is invalid
    if (createAdditionalRunners) {
      try {
        for (int runnerId = 1; runnerId < runnerCount; runnerId++) {
          List<Issue> localIssues = new ArrayList<>();

          // Create the list of stage beans
          PipelineStageBeans beans = PipelineBeanCreator.get().duplicatePipelineStageBeans(
            stageLib, pipelineBean.getPipelineStageBeans(), interceptorContextBuilder,
            originPipe.getStage().getConstants(), userContext.getUser(), connections, localIssues);

          // If creation reported problems, stop here
          if (!localIssues.isEmpty()) {
            issues.addAll(localIssues);
            // Creating the beans already acquired class loaders, so they must be released
            // (otherwise they leak, because the bean objects are not persisted anywhere).
            beans.getStages().forEach(StageBean::releaseClassLoader);
            break;
          }

          // Initialize and convert to a source-less pipeline runner
          pipes.add(createSourceLessRunner(
            stageLib, name, rev, configuration, pipelineConf, runner, stageInfos, userContext,
            pipelineBean, originPipe.getStage(), runnerId, beans, observer, scheduledExecutor,
            runnerSharedMaps, startTime, blobStore, lineagePublisherTask, statsCollector));
        }
      } catch (PipelineRuntimeException e) {
        LOG.error("Can't create additional source-less pipeline runner number {}: {}", runnerCount, e.toString(), e);
        issues.add(IssueCreator.getPipeline().create(ContainerError.CONTAINER_0704, e.toString()));
      }
    }
  }

  // Initialize all source-less pipeline runners
  final int finalRunnerCount = runnerCount;
  for (PipeRunner pipeRunner : pipes) {
    pipeRunner.forEach("Starting", pipe -> {
      ((StageContext) pipe.getStage().getContext()).setPipelineFinisherDelegate((PipelineFinisherDelegate) runner);
      ((StageContext) pipe.getStage().getContext()).setRunnerCount(finalRunnerCount);
      issues.addAll(initPipe(pipe, pipeContext));
    });
  }
  ((StageContext) originPipe.getStage().getContext()).setRunnerCount(runnerCount);
  ((StageContext) originPipe.getStage().getContext()).setPipelineFinisherDelegate((PipelineFinisherDelegate) runner);

  return issues;
}
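The runner-count rules buried in init are easier to see in isolation. The following is a minimal sketch of just that arithmetic, using hypothetical names (RunnerCountSketch, effectiveRunnerCount, additionalRunners) and plain strings in place of Issue objects. It assumes, as in the code above, that one runner already exists before any extra ones are created.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors only the runner-count arithmetic from init().
class RunnerCountSketch {

    // Effective runners: the origin's thread count, capped by the pipeline's
    // maxRunners when that setting is positive (0 or less means "no cap").
    static int effectiveRunnerCount(int originThreads, int pipelineMaxRunners) {
        int runnerCount = originThreads;
        if (pipelineMaxRunners > 0) {
            runnerCount = Math.min(runnerCount, pipelineMaxRunners);
        }
        return runnerCount;
    }

    // Number of extra source-less runners to create. If the count exceeds the
    // Data Collector wide limit, an issue is recorded and none are created.
    static int additionalRunners(int runnerCount, int sdcRunnerMax, List<String> issues) {
        if (runnerCount > sdcRunnerMax) {
            issues.add("Requested " + runnerCount + " runners, limit is " + sdcRunnerMax);
            return 0;
        }
        // One runner already exists, so ids 1 .. runnerCount - 1 remain to be created.
        return Math.max(0, runnerCount - 1);
    }

    public static void main(String[] args) {
        List<String> issues = new ArrayList<>();
        int count = effectiveRunnerCount(8, 4);
        int extra = additionalRunners(count, 50, issues);
        System.out.println(count + " runners, " + extra + " additional, issues=" + issues);
    }
}
```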
run
public void run() throws StageException, PipelineRuntimeException {
  this.running = true;
  try {
    runner.setObserver(observer);
    // A pipeline has exactly one origin (originPipe); pipes holds the one or more
    // runners that contain the processors and targets
    runner.run(originPipe, pipes, badRecordsHandler, statsAggregationHandler);
  } finally {
    this.running = false;
  }
}
destroy
// Destroys the pipeline. Because stages are allowed to generate events during destroy,
// destroy() is delegated to the runner (ProductionPipelineRunner and PreviewPipelineRunner),
// which means one last batch is run to make sure all events are propagated and processed properly.
public void destroy(
  SourcePipe originPipe,
  List<PipeRunner> pipes,
  BadRecordsHandler badRecordsHandler,
  StatsAggregationHandler statsAggregationHandler
) throws StageException, PipelineRuntimeException;
3. PipelineRunner
PipelineRunner is responsible for executing the pipeline. Implementations of this runner drive the pipeline in terms of pipes. There are currently two main implementations:
PreviewPipelineRunner: used for preview
ProductionPipelineRunner: used for production execution (shared by standalone and cluster modes)
The rest of this section uses ProductionPipelineRunner as the example.
// This is the method that Pipeline invokes as
// runner.run(originPipe, pipes, badRecordsHandler, statsAggregationHandler);
@Override
public void run(
  SourcePipe originPipe,
  List<PipeRunner> pipes,
  BadRecordsHandler badRecordsHandler,
  StatsAggregationHandler statsAggregationHandler
) throws StageException, PipelineRuntimeException {
  this.originPipe = originPipe;
  this.pipes = pipes;
  this.badRecordsHandler = badRecordsHandler;
  this.statsAggregationHandler = statsAggregationHandler;
  this.runnerPool = new RunnerPool<>(pipes, pipeContext.getRuntimeStats(), runnersHistogram);

  // The pipeline starts running
  this.running = true;
  try {
    LOG.debug("Staring pipeline with offset: {}", offsetTracker.getOffsets());
    // Dispatch on the type of origin
    if (originPipe.getStage().getStage() instanceof PushSource) {
      // Push source
      runPushSource();
    } else {
      // Poll source
      runPollSource();
    }
  } catch (Throwable throwable) {
    LOG.error("Pipeline execution failed", throwable);
    // Send the pipeline error notification request
    sendPipelineErrorNotificationRequest(throwable);
    // Error notification
    errorNotification(originPipe, pipes, throwable);
    // Rethrow
    Throwables.propagateIfInstanceOf(throwable, StageException.class);
    Throwables.propagateIfInstanceOf(throwable, PipelineRuntimeException.class);
    Throwables.propagate(throwable);
  }
  if (resetOffset) {
    // Have the offset tracker reset the offset
    offsetTracker.resetOffset();
  }
}

// --------------------------------------------------------------------------
// Push source
private void runPushSource() throws StageException, PipelineRuntimeException {
  // This object receives the delegated calls from the push source callbacks
  originPipe.getStage().setPushSourceContextDelegate(this);

  // Configured maximum batch size
  int batchSize = configuration.get(Constants.MAX_BATCH_SIZE_KEY, Constants.MAX_BATCH_SIZE_DEFAULT);
  // While a snapshot capture is pending, use the snapshot batch size instead
  if (batchesToCapture > 0) {
    batchSize = snapshotBatchSize;
  }

  try {
    // A push origin blocks in this call until all data has been consumed or the pipeline stops
    originPipe.process(offsetTracker.getOffsets(), batchSize, this);
  } catch (Throwable ex) {
    LOG.error("Origin Pipe failed", ex);
    // Execution failed, but this may be a "dependent" exception caused by a failure in another pipeline runner
    if (exceptionFromExecution == null) {
      exceptionFromExecution = ex;
    }
  }

  // If an exception caused the execution to fail, propagate it upwards
  if (exceptionFromExecution != null) {
    // Record the error code of the exception
    if (statsCollector != null) {
      if (exceptionFromExecution instanceof StageException) {
        statsCollector.errorCode(((StageException) exceptionFromExecution).getErrorCode());
      }
      if (exceptionFromExecution instanceof PipelineRuntimeException) {
        statsCollector.errorCode(((PipelineRuntimeException) exceptionFromExecution).getErrorCode());
      }
    }
    Throwables.propagateIfInstanceOf(exceptionFromExecution, StageException.class);
    Throwables.propagateIfInstanceOf(exceptionFromExecution, PipelineRuntimeException.class);
    Throwables.propagate(exceptionFromExecution);
  }
}

// --------------------------------------------------------------------------
// Poll source
public void runPollSource() throws StageException, PipelineException {
  while (!offsetTracker.isFinished() && !stop && !finished) {
    if (threadHealthReporter != null) {
      threadHealthReporter.reportHealth(ProductionPipelineRunnable.RUNNABLE_NAME, -1, System.currentTimeMillis());
    }
    for (BatchListener batchListener : batchListenerList) {
      batchListener.preBatch();
    }
    if (observer != null) {
      observer.reconfigure();
    }

    // Start executing the batch
    long start = System.currentTimeMillis();
    FullPipeBatch pipeBatch = createFullPipeBatch(
      Source.POLL_SOURCE_OFFSET_KEY,
      offsetTracker.getOffsets().get(Source.POLL_SOURCE_OFFSET_KEY));

    // Run the origin
    Map<String, Long> memoryConsumedByStage = new HashMap<>();
    Map<String, Object> stageBatchMetrics = new HashMap<>();
    processPipe(originPipe, pipeBatch, false, null, null, memoryConsumedByStage, stageBatchMetrics);

    // The origin has just run, so the FullPipeBatch now holds a new offset
    String newOffset = pipeBatch.getNewOffset();

    try {
      // Run the rest of the pipeline
      runSourceLessBatch(start, pipeBatch, Source.POLL_SOURCE_OFFSET_KEY, newOffset, memoryConsumedByStage, stageBatchMetrics);
    } catch (Throwable t) {
      // Create a partial batch on failure: when the pipeline hits an unrecoverable error,
      // a special batch is built by salvaging the in-memory structures.
      createFailureBatch(pipeBatch);
      // Record the error code
      if (statsCollector != null) {
        if (exceptionFromExecution instanceof StageException) {
          statsCollector.errorCode(((StageException) exceptionFromExecution).getErrorCode());
        }
        if (exceptionFromExecution instanceof PipelineRuntimeException) {
          statsCollector.errorCode(((PipelineRuntimeException) exceptionFromExecution).getErrorCode());
        }
      }
      Throwables.propagateIfInstanceOf(t, StageException.class);
      Throwables.propagateIfInstanceOf(t, PipelineRuntimeException.class);
      Throwables.propagate(t);
    }

    for (BatchListener batchListener : batchListenerList) {
      batchListener.postBatch();
    }
  }
}
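Stripped of metrics, listeners, and error handling, the poll loop above reduces to "run the origin, run the rest of the pipeline, advance the offset, repeat". The following is a minimal sketch of that shape only. All names (PollLoopSketch, PollOrigin, listOrigin) are hypothetical, records are plain strings, and the sketch assumes a null offset returned by the origin means there is no more data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical reduction of the poll loop; not the real ProductionPipelineRunner.
class PollLoopSketch {

    // Simplified origin: fills `out` and returns the next offset, or null when exhausted.
    interface PollOrigin {
        String produce(String lastOffset, int maxBatchSize, List<String> out);
    }

    // Runs batches until the origin reports it is finished; returns the batch count.
    static int run(PollOrigin origin, Consumer<List<String>> restOfPipeline, int batchSize) {
        String offset = null; // assumption: null means "start from the beginning"
        boolean finished = false;
        int batches = 0;
        while (!finished) {
            List<String> batch = new ArrayList<>();
            String newOffset = origin.produce(offset, batchSize, batch); // run the origin
            restOfPipeline.accept(batch);                                // run the remaining stages
            offset = newOffset;                                          // advance only after the batch ran
            finished = (newOffset == null);
            batches++;
        }
        return batches;
    }

    // Example origin that pages through an in-memory list, using the index as offset.
    static PollOrigin listOrigin(List<String> data) {
        return (lastOffset, max, out) -> {
            int from = lastOffset == null ? 0 : Integer.parseInt(lastOffset);
            int to = Math.min(from + max, data.size());
            out.addAll(data.subList(from, to));
            return to >= data.size() ? null : Integer.toString(to);
        };
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        int batches = run(listOrigin(Arrays.asList("a", "b", "c", "d", "e")), written::addAll, 2);
        System.out.println(batches + " batches, written=" + written);
    }
}
```

A push source inverts this control flow: instead of the runner calling the origin once per batch, the origin's own threads call back into the runner, which is why runPushSource makes a single blocking call.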
4. StageRuntime
Pipeline execution call chain:
Pipeline.run -> PipeRunner.processPipe -> StagePipe.process -> StageRuntime.execute -> Stage.produce|process|write
public String execute(
  final String previousOffset,
  final int batchSize,
  final Batch batch,
  final BatchMaker batchMaker,
  ErrorSink errorSink,
  EventSink eventSink,
  ProcessedSink processedSink,
  SourceResponseSink sourceResponseSink
) throws StageException {
  Callable<String> callable = () -> {
    String newOffset = null;
    switch (getDefinition().getType()) {
      // origin
      case SOURCE:
        newOffset = ((Source) getStage()).produce(previousOffset, batchSize, batchMaker);
        break;
      // processor
      case PROCESSOR:
        ((Processor) getStage()).process(batch, batchMaker);
        break;
      // executor and target
      case EXECUTOR:
      case TARGET:
        ((Target) getStage()).write(batch);
        break;
      default:
        throw new IllegalStateException(Utils.format("Unknown stage type: '{}'", getDefinition().getType()));
    }
    return newOffset;
  };
  return execute(callable, errorSink, eventSink, processedSink, sourceResponseSink);
}
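produce: when the pipeline runs, Data Collector calls this method on the Source stage to obtain a batch of records to process. If there is no data, a Source stage should not block indefinitely in this method. It should have an internal timeout, after which it produces an empty batch. Doing so gives the other stages in the pipeline a chance to learn that the pipeline is still healthy but has no data, and possibly allows external systems to be notified.
process: when the pipeline runs, Data Collector calls this method on the Processor stage to process a batch of records.
write: when the pipeline runs, Data Collector calls this method on the Target or Executor stage to write a batch of records to an external system.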
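To see the three methods work together, here is a minimal, self-contained sketch that pushes one batch through a source, a processor, and a target using the same type-based dispatch. Everything in it is hypothetical (StageDispatchSketch, Src, Proc, Tgt); records are plain strings and the sinks, class loaders, and error handling of the real StageRuntime are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical miniature of type-based stage dispatch; not the StreamSets API.
class StageDispatchSketch {
    enum StageType { SOURCE, PROCESSOR, EXECUTOR, TARGET }

    interface SketchStage { StageType type(); }

    interface Src extends SketchStage {
        default StageType type() { return StageType.SOURCE; }
        String produce(String lastOffset, int maxBatchSize, List<String> out);
    }

    interface Proc extends SketchStage {
        default StageType type() { return StageType.PROCESSOR; }
        void process(List<String> in, List<String> out);
    }

    // Executors would share this shape, since an executor is a special target.
    interface Tgt extends SketchStage {
        default StageType type() { return StageType.TARGET; }
        void write(List<String> in);
    }

    // Calls the one method that matches the stage type. Only a source yields an offset.
    static String execute(SketchStage stage, String previousOffset, int batchSize,
                          List<String> in, List<String> out) {
        String newOffset = null;
        switch (stage.type()) {
            case SOURCE:
                newOffset = ((Src) stage).produce(previousOffset, batchSize, out);
                break;
            case PROCESSOR:
                ((Proc) stage).process(in, out);
                break;
            case EXECUTOR:
            case TARGET:
                ((Tgt) stage).write(in);
                break;
            default:
                throw new IllegalStateException("Unknown stage type: " + stage.type());
        }
        return newOffset;
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        Src source = (offset, max, out) -> { out.add("a"); out.add("b"); return "2"; };
        Proc upper = (in, out) -> { for (String r : in) { out.add(r.toUpperCase()); } };
        Tgt target = in -> sink.addAll(in);

        List<String> produced = new ArrayList<>();
        String offset = execute(source, null, 10, Collections.<String>emptyList(), produced);
        List<String> processed = new ArrayList<>();
        execute(upper, offset, 10, produced, processed);
        execute(target, offset, 10, processed, new ArrayList<String>());
        System.out.println("offset=" + offset + " sink=" + sink);
    }
}
```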