Informatica PowerCenter Tuning

Reposted 2013-12-06 11:49:37

Mapping Tuning

1) Filter data in the Source Qualifier whenever possible.

2) When joining, use the table with fewer rows as the master table, and prefer integer columns as the join keys.

3) When using an Aggregator or Joiner, enable Sorted Input if you can confirm the incoming data is already sorted (the data must be sorted in the order of the group-by ports). If the data volume is small, a Sorter transformation can be placed before the Aggregator; otherwise do the sort outside Informatica. Sorted Input generally pays off once the input exceeds 250,000 rows.

4) Use the || operator rather than the CONCAT function to concatenate strings.

5) A Transaction Control transformation reduces the efficiency of downstream transformations with the All Input transformation scope (Aggregator, Joiner, Rank, Sorter) and of the target, because those transformations drop the incoming transaction boundaries.

6) Avoid aggregate functions in SQL overrides where possible; selecting large amounts of data with aggregates may fill up the temp space in Oracle.

7) Optimize lookups:

Cache lookups in general, but when the input has relatively few rows (e.g. 5,000) and the lookup table is large (e.g. 5,000,000 rows), the lookup table should not be cached.

Static cache (the default): built when the first lookup request is processed, and not updated during the session even if rows are inserted.

Dynamic cache: use it when the lookup table is also the target table, so that the lookup cache is updated as rows are written.

Persistent cache: use it when you want to save and reuse the cache files, i.e. when the lookup table does not change between session runs.

Remove unused ports from the Lookup transformation.

Place an equality (=) condition first in the lookup condition.

By default the Lookup transformation adds an ORDER BY on every lookup port; you can comment out some of those columns in the lookup SQL override to improve performance, or add an index on the ORDER BY columns.
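The ORDER BY trick above can be sketched as a lookup override (the table and column names here are hypothetical):

```sql
-- PowerCenter appends "ORDER BY <all lookup ports>" to the override SQL.
-- Ending the override with an explicit ORDER BY on just the condition
-- column, followed by "--", comments out the generated clause.
SELECT CUST_ID, CUST_NAME, CUST_STATUS
FROM   CUST_DIM
ORDER BY CUST_ID --
```

An index on CUST_ID then lets the database satisfy the remaining ORDER BY cheaply.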




Identify the bottlenecks in the stages below:

Methods:
1) Test the throughput
To verify the database tuning, create a pipeline with one partition and measure the reader throughput in the Workflow Monitor; then add more partitions and verify that the throughput scales linearly. With two partitions, the reader throughput should be roughly twice as fast. If the reader throughput stays slow, the database side needs tuning.

Optimizing the Target
1) Drop indexes and key constraints before the load, then recreate them after the load, via pre-/post-session SQL commands or pre-load and post-load stored procedures.

2) Minimize the number of relational database connections.

3) Use bulk loading.

4) If the target is a flat file, create it on the Informatica server, then FTP it to another server if needed.
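Point 1 can be sketched as a pair of pre-/post-session SQL commands (the index and table names are hypothetical):

```sql
-- Pre-session SQL command: drop the index so the load is not slowed
-- down by row-by-row index maintenance.
DROP INDEX IDX_SALES_FACT_DT;

-- Post-session SQL command: rebuild the index once the load is done.
CREATE INDEX IDX_SALES_FACT_DT ON SALES_FACT (SALE_DATE);
```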

Optimizing the Source
1) Prefer joining multiple sources in the Source Qualifier instead of using a Joiner transformation.

2) Create an index on the ORDER BY or GROUP BY columns.

3) Use a conditional filter in the Source Qualifier to filter out unnecessary rows.
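Points 1 and 3 combined might look like this Source Qualifier override (table and column names are hypothetical):

```sql
-- Join the two sources in the database and filter early, instead of
-- pulling both tables through a Joiner and a Filter transformation.
SELECT o.ORDER_ID, o.ORDER_DATE, c.CUST_NAME
FROM   ORDERS o
JOIN   CUSTOMERS c ON c.CUST_ID = o.CUST_ID
WHERE  o.ORDER_DATE >= DATE '2013-01-01'
```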

Optimizing the LookUp
1) Consider disabling the lookup cache when the lookup table is huge and there is no efficient SQL override, but also weigh the number of rows that will probe the lookup table. An uncached lookup can be made efficient by creating an index on the columns used in the lookup condition.

2) Use a conditional filter in the SQL override to reduce the number of rows cached.

3) Do not fetch unnecessary fields.

4) Remove the default ORDER BY clause, or create an index on the columns it uses.

5) Place the Lookup transformation toward the end of the mapping; this avoids carrying extra ports through the pipeline.
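For point 1, an uncached lookup whose condition is, say, PROD_CODE = IN_PROD_CODE could be supported by an index like this (names are hypothetical):

```sql
-- With the cache disabled, each input row issues one query against the
-- lookup table, so an index on the lookup-condition column turns a
-- full scan per row into a single indexed probe.
CREATE INDEX IDX_PROD_DIM_CODE ON PROD_DIM (PROD_CODE);
```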

Optimizing the Aggregator
1) Add a Sorter transformation first and enable the Sorted Input option. The ORDER BY can also be done in the Source Qualifier.

2) If possible, use numeric ports for the group-by columns.
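Doing the ORDER BY in the Source Qualifier might look like this (names are hypothetical); the ORDER BY columns must match the Aggregator's group-by ports in both identity and order:

```sql
-- Pre-sort the rows for an Aggregator that groups by REGION_ID and
-- STORE_ID and has Sorted Input enabled.
SELECT REGION_ID, STORE_ID, SALE_AMT
FROM   SALES
ORDER BY REGION_ID, STORE_ID
```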

Optimizing the Joiner
1) Mark the source with the smaller number of rows as the master.

2) If possible, use numeric ports in the join condition.

Optimizing the Sorter
1) If the amount of source data exceeds the Sorter cache size, the Integration Service requires more than twice that amount of disk space. For optimal performance, allocate a sorter cache of at least 16 MB of physical memory.

Optimizing the Expression
1) Use operators instead of functions in expressions.

2) Uncheck the Output option on ports that are not required as output.
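As a sketch of point 1, in the expression language the || operator replaces nested CONCAT calls (the port names are hypothetical):

```sql
-- Cheaper: operator form
FIRST_NAME || ' ' || LAST_NAME

-- More expensive: the equivalent nested function calls
CONCAT(CONCAT(FIRST_NAME, ' '), LAST_NAME)
```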

Optimizing in General
1) Remove unnecessary ports.

2) Do not use Verbose Data tracing.

3) Avoid unnecessary data type conversions.

4) When updating targets, link only the necessary ports.

5) Use shortcuts to shared folders, and reusable components such as mapplets and reusable transformations.

Optimizing in Session
1) A workflow runs faster when it is not configured to write session and workflow log files. Workflows and sessions always create binary logs; to additionally write a text session or workflow log file, the Integration Service must write each logging event twice.

Optimizing by Partition
1) In the task window, under the Config Object tab, there is a Partitioning Options section with a Dynamic Partitioning attribute; set it to "Based on number of nodes in grid".
2) This dynamically uses the nodes available on the Informatica servers: if Dev has two nodes the session runs with two partitions, and if Prod has five nodes it runs with five.
3) Merge the output files (sequential merge) when using dynamic partitioning.
4) It is usually best to select the target instances and the Source Qualifier as partition points, using pass-through partitioning for them.
5) If the mapping involves an Aggregator, add a partition point before it; it is advisable to select the Sorter transformation as that partition point, with hash auto-keys partitioning.

1. Write SQL overrides in the mapping only, then refresh the session in the workflow.
2. Be careful with data types and lengths.
3. Follow naming conventions and description standards.
4. Check the result option when using the Lookup transformation.
5. Every port in the Source Qualifier should be bound to the source.

6. Check for unused targets or sources in the mapping; they can cause the error "Failed to prepare the task", with no session log produced even though the mapping and workflow are valid.

7. In the Lookup transformation, take care with the data types in the database, e.g. CHAR vs. VARCHAR.

8. The Lookup transformation will compare NULL values and return the matched record.

9. Configure the $Source/$Target connection for the Lookup transformation in the Mapping Designer, so that it is consistent with the session properties.



1. Lookup override SQL: use AS aliases, and keep the number and order of the ports the same as in the SQL statement.
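This rule can be sketched as a lookup override (the table and column names are hypothetical):

```sql
-- Each alias must match a lookup port name, and the select list must
-- keep the same number and order of columns as the lookup ports.
SELECT C.CUSTOMER_KEY  AS CUST_ID,
       C.CUSTOMER_NAME AS CUST_NAME
FROM   CUSTOMER_DIM C
```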

1. Joiner: make the smaller file the master.
2. In general, unconnected lookups should be used where possible, since less data is cached.
3. Use Sorted Input in Aggregator and Joiner transformations when the input exceeds 250,000 rows.
