https://www.cnblogs.com/fangxuanlang/category/1633463.html
1. Pipeline Concepts and Design
1.1. Designing the Data Flow
You can branch and merge streams within a pipeline.
1.1.1. Branching Streams
When you connect a stage to multiple stages, all data passes to all connected stages. You can configure required fields for a stage to discard records before they enter the stage, but by default all records are passed.
For example, in the pipeline below, all data from the directory passes to both branches for processing. You can, however, configure required fields on each branch to discard records that the branch does not need.
To route data based on more complex conditions, use the Stream Selector processor.
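The difference between plain branching and Stream Selector routing can be modeled in a few lines of Python. This is a conceptual sketch only, not the StreamSets API; the function names, the required-fields check, and the predicate-based router are illustrative assumptions.

```python
# Conceptual model (NOT the StreamSets API).
# Branching: when a stage connects to multiple downstream stages,
# every record passes to every branch; a per-branch "required fields"
# setting silently discards records that lack those fields.
def branch(records, branches):
    out = {name: [] for name in branches}
    for record in records:
        for name, required in branches.items():
            if all(field in record for field in required):
                out[name].append(record)
    return out

# Stream Selector-style routing: each record goes to the FIRST branch
# whose condition matches; unmatched records go to the default stream.
def stream_selector(records, conditions, default="default"):
    out = {name: [] for name in conditions}
    out.setdefault(default, [])
    for record in records:
        for name, predicate in conditions.items():
            if predicate(record):
                out[name].append(record)
                break
        else:
            out[default].append(record)
    return out

records = [{"id": 1, "type": "a"}, {"id": 2}, {"id": 3, "type": "b"}]

# Branching: both branches receive every record unless a required
# field is missing ("typed_only" drops the record without "type").
branched = branch(records, {"all": [], "typed_only": ["type"]})

# Routing: only records matching the condition reach "a_stream";
# everything else falls through to the default stream.
routed = stream_selector(records, {"a_stream": lambda r: r.get("type") == "a"})
```

Note the key contrast: branching copies each record to every branch, while the selector sends each record to exactly one stream.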
Some stages generate events that pass to event streams. Event streams originate from an event-generating stage, such as an origin or destination, and pass from the stage through an event stream output, as follows:
For more information about the event framework and event streams, see Dataflow Triggers Overview.
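The separation of data output and event output can be sketched as follows. This is a conceptual illustration, not the StreamSets API; the `directory_origin` function and the `finished-file` event shape are assumptions made for the example.

```python
# Conceptual sketch (NOT the StreamSets API): an event-generating
# stage has two distinct outputs, one for data records and one for
# event records; events travel only along the event stream.
def directory_origin(paths):
    # Hypothetical origin: emits one data record per file, plus one
    # event record per file on a separate event output.
    data, events = [], []
    for path in paths:
        data.append({"file": path, "payload": "row data"})
        events.append({"event_type": "finished-file", "file": path})
    return data, events

data_records, event_records = directory_origin(["a.csv", "b.csv"])
# data_records flow on to processors and destinations;
# event_records flow only to destinations or executors.
```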
1.1.2. Merging Streams
You can merge two or more streams into a single downstream stage in a pipeline. When merging data, Data Collector channels all of the streams into the same stage, but does not merge records.
For example, in the pipeline below, the Stream Selector stage sends records with null values to the Field Replacer stage:
The data from the Stream Selector default stream and all data from Field Replacer pass to Expression Evaluator for further processing, but in no particular order and with no record merging.
Important: Pipeline validation does not prevent duplicate data. To avoid writing duplicate data to destinations, configure the pipeline logic to remove duplicate data or to prevent the generation of duplicate data.
Note that you cannot merge event streams with data streams. Event records must stream from the event-generating stage to destinations or executors without merging with data streams. For more information about the event framework and event streams, see Dataflow Triggers Overview.
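The merge behavior described above, interleaving without record merging, and the duplicate-data caveat can be sketched in Python. This is a conceptual model, not the StreamSets API; `merge_streams` and `deduplicate` are illustrative names.

```python
# Conceptual model (NOT the StreamSets API): two upstream streams
# feed the same downstream stage. Records arrive in no guaranteed
# order and are never combined, so a record present on both streams
# reaches the destination twice unless your pipeline logic removes it.
def merge_streams(*streams):
    merged = []
    for stream in streams:
        merged.extend(stream)   # one arbitrary interleaving
    return merged

default_stream = [{"id": 1}, {"id": 2}]
replaced_stream = [{"id": 2}, {"id": 3}]   # id 2 appears in both streams

merged = merge_streams(default_stream, replaced_stream)
# Four records reach the next stage; {"id": 2} is duplicated.

# Deduplication is pipeline logic you must add yourself, since
# pipeline validation does not prevent duplicate data:
def deduplicate(records, key):
    seen, unique = set(), []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique
```

Running `deduplicate(merged, "id")` keeps one record each for ids 1, 2, and 3, matching the advice to remove duplicates before writing to destinations.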
(For the rest of the series, see the URL on the first line.)