INCREMENTAL AGGREGATION IN INFORMATICA

Incremental Aggregation is the process of capturing the changes in the source and calculating the aggregations in a session. This process makes the integration service to update the target incrementally and avoids the process of calculating the aggregations on the entire source. Consider the below sales table as an example and see how the incremental aggregation works.

Source:

YEAR PRICE
----------
2010 100
2010 200
2010 300
2011 500
2011 600
2012 700

For simplicity, I have used only the year and price columns of sales table. We need to do aggregation and find the total price in each year.

When you run the session for the first time using the incremental aggregation, then integration service process the entire source and stores the data in two file, index and data file. The integration service creates the files in the cache directory specified in the aggregator transformation properties.

After the aggregation, the target table will have the below data.

Target:

YEAR PRICE
----------
2010 600
2011 1100
2012 700

Now assume that the next day few more rows are added into the source table.

Source:

YEAR PRICE
----------
2010 100
2010 200
2010 300
2011 500
2011 600
2012 700

2010 400
2011 100
2012 200
2013 800

Now for the second run, you have to pass only the new data changes to the incremental aggregation. So, the source will contain the last four records. The incremental aggregation uses the data stored in the cache and calculates the aggregation. Once the aggregation is done, the integration service writes the changes to the target and the cache. The target table will contains the below data.

Target:

YEAR PRICE
----------
2010 1000
2011 1200
2012 900
2013 800

Points to remember
  1. When you use incremental aggregation, first time you have to run the session with complete source data and in the subsequent runs you have to pass only the changes in the source data.
  2. Use incremental aggregation only if the target is not going to change significantly. If the incremental aggregation process changes more than hhalf of the data in target, then the session perfromance many not benfit. In this case go for normal aggregation.

Note:  The integration service creates a new aggregate cache when
  • A new version of mapping is saved
  • Configure the session to reinitialize the aggregate cache
  • Moving or deleting the aggregate files
  • Decreasing the number of partitions

Configuring the mapping for incremental aggregation

Before enabling the incremental aggregation option, make sure that you capture the changes in the source data. You can use lookup transformation or stored procedure transformation to remove the data which is already processed. You can also create a trigger on the source database and can read only the source changes in the mapping.

Hi,
Is incremental aggregation so simple? If we implement d idea of incremental load or CDC, and by default aggregator has caching property...why do i need to excercise incremental aggregation as separate option.

What is the advantage of using this over normal map. (using cdc and not using incremental aggregation property). Please explain.

Normal aggregator also caches the data. However, this cache will be cleared when the session run completes. In case of incremental aggregation the cache will not be cleared and it is reused in the next session run.

If you want to use normal aggregation, every time you run the session you have to pass the complete source data to calculate the aggregation. In case of incremental aggregation, as the processed data is stored in the cache, you just need to pass only the changes in the source. This way the data in cache and the changes form the complete source.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值