Joiner transformation 使用注意点:
1)使用场景: 不同数据源( 异构源)
2)Join type: Normal Join, Master Outer Join, Detail Outer Join or Full Outer Join.
3) 用行数少的重复记录少的表为 master, 反之为detail
4) 对输入项做排序
5) 理解 blocking transformation,cache
Sorted Input
Use the sorted input option in the joiner properties tab when both the master and detail are sorted on the ports specified in the join condition. You can improve the performance by using the sorted input option as the integration service performs the join by minimizing the number of disk IOs. you can see good performance when worked with large data sets.
Steps to follow for configuring the sorted input option
- Sort the master and detail source either by using the source qualifier transformation or sorter transformation.
- Sort both the source on the ports to be used in join condition either in ascending or descending order.
- Specify the Sorted Input option in the joiner transformation properties tab.
Why joiner transformation is called as blocking transformation
The integration service blocks and unblocks the source data depending on whether the joiner transformation is configured for sorted input or not.
Unsorted Joiner Transformation
In case of unsorted joiner transformation, the integration service first reads all the master rows before it reads the detail rows. The integration service blocks the detail source while it caches the all the master rows. Once it reads all the master rows, then it unblocks the detail source and reads the details rows.
Sorted Joiner Transformation
Blocking logic may or may not possible in case of sorted joiner transformation. The integration service uses blocking logic if it can do so without blocking all sources in the target load order group. Otherwise, it does not use blocking logic.
Joiner Transformation Performance Improve Tips
To improve the performance of a joiner transformation follow the below tips
- If possible, perform joins in a database. Performing joins in a database is faster than performing joins in a session.
- You can improve the session performance by configuring the Sorted Input option in the joiner transformation properties tab.
- Specify the source with fewer rows and with fewer duplicate keys as the master and the other source as detail.