Informatica PowerCenter Tuning

Reposted 2013-12-06 11:49:37

Mapping Tuning

1) Filter data in the Source Qualifier whenever possible.

2) When joining, use the table with fewer rows as the master table, and prefer integer columns as the join keys.

3) When using an Aggregator or Joiner, enable the Sorted Input option if you can confirm the input is already sorted (the data must be sorted in the order of the 'group by' ports). If the data volume is small, a Sorter transformation can be placed before the Aggregator; otherwise, sort the data outside Informatica. When the input exceeds 250,000 rows, it is better to use Sorted Input.

4) Use the || operator instead of the CONCAT function to concatenate strings.

5) A Transaction Control transformation hurts the performance of downstream transformations (Aggregator, Joiner, Rank, Sorter, i.e. any transformation with an All Input transformation scope) and targets, because those transformations drop the incoming transaction boundaries.

6) Avoid aggregate functions in SQL overrides where possible (they may fill up the Oracle temp space when selecting large amounts of data).

7) Optimize lookups

Cache lookups, but when the number of input rows is small (e.g. 5,000) and the lookup table is large (e.g. 5,000,000 rows), the lookup table should not be cached.

Static cache (the default): it is built when the first lookup request is processed, and it is not updated during the session even if rows are inserted.

Dynamic cache: use it when the lookup table is also the target table, so that the lookup cache is kept up to date.

Persistent cache: use it when you want to save and reuse the cache files, i.e. when the lookup table does not change between session runs.

Remove unused ports from the Lookup transformation.

Place the = (equality) operator first in the lookup condition.

By default the Lookup transformation adds an ORDER BY over every lookup port; you can comment out some of the columns in the lookup SQL override to improve performance, or add an index on the ORDER BY columns.
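As a sketch of the override technique above (the table and column names are hypothetical), a trailing comment marker keeps the generated ORDER BY from taking effect, leaving only the columns you chose:

```sql
-- Hypothetical lookup SQL override on a CUSTOMER_DIM table.
-- The trailing "--" comments out the ORDER BY clause that the
-- Integration Service appends automatically, so only the explicit
-- ORDER BY on the condition column remains.
SELECT CUSTOMER_ID,
       CUSTOMER_NAME
FROM   CUSTOMER_DIM
ORDER BY CUSTOMER_ID --
```

Pairing this with an index on CUSTOMER_ID makes the cache-build query cheap on the database side.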

 

===================================================================================

 

Identify the bottlenecks in the following stages:
Target
Source
Mapping
Session
System

Methods:
1) Test the throughput
To verify the database tuning, create a pipeline with one partition and measure the reader throughput in the Workflow Monitor; then add more partitions and verify that the throughput scales linearly. If the session has two partitions, the reader throughput should be twice as fast. If the reader throughput is slow, the database has to be tuned.

Optimizing the Target
1) Drop indexes and key constraints before the load, then recreate them after the load, using pre-session and post-session SQL commands or pre-load and post-load stored procedures.

2) Minimize the number of relational database connections.

3) Use bulk loading.

4) If the target is a flat file, create it on the Informatica server itself, then FTP it to another server if needed.
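A minimal sketch of point 1, assuming an Oracle target; the table and index names are illustrative:

```sql
-- Pre-session SQL command: drop the index so the bulk load
-- does not maintain it row by row.
DROP INDEX IDX_SALES_DATE;

-- Post-session SQL command: rebuild the index once the load completes.
CREATE INDEX IDX_SALES_DATE ON SALES_FACT (SALE_DATE);
```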

Optimizing the Source
1) It is better to join multiple sources in the Source Qualifier than to use a Joiner transformation.

2) Create an index on the ORDER BY or GROUP BY columns.

3) Use a conditional filter in the Source Qualifier to filter out unnecessary rows.
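Points 1 and 3 can be combined in a single Source Qualifier SQL override; the schema below is hypothetical:

```sql
-- Hypothetical override: ORDERS and CUSTOMERS live in the same source
-- database, so the join and the row filter are pushed down to the
-- database instead of using Joiner and Filter transformations.
SELECT o.ORDER_ID,
       o.ORDER_DATE,
       c.CUSTOMER_NAME
FROM   ORDERS o
JOIN   CUSTOMERS c ON c.CUSTOMER_ID = o.CUSTOMER_ID
WHERE  o.ORDER_DATE >= DATE '2013-01-01'
```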

Optimizing the LookUp
1) Consider disabling the lookup cache if the lookup table is huge and there is no efficient SQL override, but also consider the number of rows that will probe the lookup table. An uncached lookup can be made efficient by creating an index on the columns used in the lookup condition.

2) Use a conditional filter in the SQL override to reduce the number of rows cached.

3) Do not fetch unnecessary fields.

4) Remove the default ORDER BY clause, and create an index on the columns used in the ORDER BY clause.

5) Use the Lookup transformation towards the end of the mapping; this eliminates carrying extra ports through the pipeline.
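For the uncached case in point 1, the supporting index is a one-liner (names are illustrative):

```sql
-- With no cache, every input row issues a query against the lookup
-- table; an index on the lookup-condition column turns each probe
-- into an indexed point query instead of a full table scan.
CREATE INDEX IDX_PROD_CODE ON PRODUCT_DIM (PRODUCT_CODE);
```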

Optimizing the Aggregator
1) Use a Sorter transformation first, then enable the Sorted Input option. The ORDER BY can also be done in the Source Qualifier.

2) If possible, use numeric ports for the GROUP BY columns.
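If the sort is pushed into the Source Qualifier instead of a Sorter, the override's ORDER BY must list the Aggregator's GROUP BY ports in the same order (schema hypothetical):

```sql
-- Hypothetical Source Qualifier override feeding an Aggregator that
-- groups by REGION_ID, STORE_ID with Sorted Input enabled; the sort
-- order must match the 'group by' ports exactly.
SELECT REGION_ID,
       STORE_ID,
       SALE_AMOUNT
FROM   SALES_FACT
ORDER BY REGION_ID, STORE_ID
```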

Optimizing the Joiner
1) Mark the source with the smaller number of rows as the master.

2) If possible, use numeric ports in the join condition.

Optimizing the Sorter
1) If the amount of source data is greater than the Sorter cache size, the Integration Service requires more than twice that amount of disk space. For optimal performance, allocate a sort cache size of at least 16 MB of physical memory.

Optimizing the Expression
1) Use operators instead of functions in expressions.

2) Uncheck the output option for ports that are not required as output.

Optimizing in General
1) Remove unnecessary ports.

2) Do not use verbose data tracing.

3) Avoid unnecessary data type conversions.

4) When updating targets, link only the necessary ports.

5) Use shortcuts in shared folders, and reusable components such as mapplets and reusable transformations.

Optimizing in Session
1) A workflow runs faster when you do not configure it to write session and workflow log files. Workflows and sessions always create binary logs; to write the session or workflow log file as well, the Integration Service has to write the logging events twice.

Optimizing by Partition
1) In the task window, under the Config Object tab, there is an option called Partitioning Options with an attribute called Dynamic Partitioning; select the value "Based on number of nodes in grid".
2) This dynamically uses the nodes available on the Informatica servers: if Dev has two nodes the session runs with two partitions, and if Prod has five nodes the session runs with five partitions.
3) Merge the output files when using dynamic partitioning (sequential merge).
4) It is usually best to select the target instances and the Source Qualifier as partition points; for these transformations, pass-through partitioning can be used.
5) If the mapping involves an Aggregator, select a partition point before the Aggregator; it is advised to select the Sorter transformation as the partition point, using Hash Auto Keys partitioning.


1. Write SQL in the mapping only, then refresh it to the workflow.
2. Be careful with data types and lengths.
3. Follow the naming convention and description standard.
4. Check the result option when using the Lookup transformation.
5. Each port in the Source Qualifier should be bound to the source.

6. Check whether there are unused targets or sources in the mapping; they can cause the error "Failed to prepare the task", with no session log available even though the mapping and workflow are valid.

7. In the Lookup transformation, take care with the data types in the database, e.g. CHAR vs. VARCHAR.

8. The Lookup transformation will compare NULL values and return the matched record.

9. Configure the $Source/$Target connections for the Lookup transformation in the Mapping Designer, so they are consistent with the session properties.

 

 

tips
1. Lookup override SQL: use 'AS' aliases; the number and order of the ports must match the SQL statement.
2. Configure the target load plan.

performance
1. Joiner: make the smaller file the master.
2. In general, unconnected lookups should be used where possible, since less data is cached.
3. Use Sorted Input in Aggregator and Joiner transformations when the input exceeds 250,000 rows.

