Aggregator Transformation

The Aggregator is an active transformation that changes the number of rows in the pipeline. The Aggregator
transformation has the following components and options:
¨ Aggregate cache. The Integration Service stores data in the aggregate cache until it completes aggregate
calculations. The Integration Service stores group values in an index cache and it stores row data in the data
cache.
¨ Aggregate expression. Enter an expression in an output port. The expression can include nonaggregate
expressions and conditional clauses.
¨ Group by port. Indicate how to create groups. You can configure an input, input/output, output, or variable port
for the group. When grouping data, the Aggregator transformation outputs the last row of each group unless
otherwise specified.
¨ Sorted input. Select this option to improve session performance. To use sorted input, you must pass data to
the Aggregator transformation sorted by group by port, in ascending or descending order.

 

You can select multiple group by ports to create a new group for each unique combination. The Integration Service then performs the defined aggregation for each group.可以给input, input/output, output, and variable ports选上group by选项,Aggregator Transformation为所有选上group by的ports的唯一组合
生成唯一的一行数据,没有选择group by的有聚合函数的port按聚合函数计算,没有选择group by的没有聚合函数的port默认按LAST聚合函数计算。

(也可以所有port不选择group by则是对所有行聚合)

When you group values, the Integration Service produces one row for each group. If you do not group values, the
Integration Service returns one row for all input rows. The Integration Service typically returns the last row of each
group (or the last row received) with the result of the aggregation
. However, if you specify a particular row to be
returned (for example, by using the FIRST function), the Integration Service then returns the specified row.

Aggregator Transformation可以用来去掉重复的行 distinct
比如:所有的port都是input/output port并且选上group by选项。

经过Aggregator Transformation所有行经过了分组处理。

The Designer allows aggregate expressions only in the Aggregator transformation. An aggregate expression can
include conditional clauses and nonaggregate functions. The expression can also include one aggregate function
within another aggregate function, such as:
MAX( COUNT( ITEM ))
CEIL(AVG(POINTS)) IIF( MAX( QUANTITY ) > 0, MAX( QUANTITY ), 0))
SUM( COMMISSION, COMMISSION > QUOTA )

Null Values in Aggregate Functions
When you configure the Integration Service, you can choose how you want the Integration Service to handle null
values in aggregate functions. You can choose to treat null values in aggregate functions as NULL or zero. By
default, the Integration Service treats null values as NULL in aggregate functions.

 

Using Sorted Input


To use sorted input, you pass sorted data through the Aggregator.
Data must be sorted in the following ways:
By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
If you use sorted input and do not presort data correctly, you receive unexpected results.

For relational and file sources, use the Sorter transformation to sort data in the mapping before passing it to the
Aggregator transformation. You can place the Sorter transformation anywhere in the mapping prior to the
Aggregator if no transformation changes the order of the sorted data. Group by columns in the Aggregator
transformation must be in the same order as they appear in the Sorter transformation.

且Sorter transformation中的排序列要么全选升序要么全选降序。

If the session uses relational sources, you can also use the Number of Sorted Ports option in the Source Qualifier
transformation
to sort group by columns in the source database. Group by columns must be in the same order in
both the Aggregator and Source Qualifier transformations.

Aggregator transformations如果不选中Sorted Input,它会重新分组组织数据,如果你希望它尊重upstream的排序顺序,需要选中Sorted Input.

 

Sorter Transformation是一个block Transformation,等它处理完它的数据才会往下走(downstream)

When you do not use sorted input, the Integration Service performs aggregate calculations as it reads. Since the
data is not sorted, the Integration Service stores data for each group until it reads the entire source to ensure all
aggregate calculations are accurate.

When you use sorted input, the Integration Service assumes all data is sorted by group and itperforms aggregate calculations as it reads rows for a group.

 


 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值