SSIS Conditional Split Overview


SQL Server Integration Services (SSIS) is an ETL tool used to extract, transform, and load data from heterogeneous data sources into different databases. After data is extracted from the various sources, it usually needs a number of transformations. One of the most frequently used is the SSIS Conditional Split.

Scenario

Let us assume we have a set of employees with different payment types, such as Permanent, Temporary, and Commission-based. Different payment types need different calculations. Assume the following is the data set.

Sample data set to demonstrate SSIS Conditional Split

Now the requirement is to perform the calculation for each payment type. This means that you need to split the data set by payment type and then perform the relevant calculation on each part.

SSIS Implementation

Let us implement this in SSIS.


First, create an SSIS project in Visual Studio and open the existing DTSX package.


Since this is a data flow task, drag and drop a Data Flow Task onto the control flow, as shown in the image below.

Data Flow task in the Control Flow.

Then double-click the Data Flow Task, which opens it in the data flow pane.

Since we are extracting data from a text file, let us create a connection to the text file and create a source for it using the Flat File Source.

Setting up flat file connection

Since this text file holds comma-separated values, the Delimited format is selected. This is the default in the flat file connection, and the other settings are also left at their defaults.

The following sample data set can be seen after setting up the flat file connection.

Data set view in the Flat File Connection Manager
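To make the delimited format concrete outside the designer, here is a minimal Python sketch of reading comma-separated text with a header row. The column names (`EmployeeID`, `Name`, `PayType`) and the three rows are hypothetical; the article's real data set is only shown in the image and is not reproduced here.

```python
import csv
import io

# Hypothetical sample text; not the article's actual data set.
SAMPLE_TEXT = """EmployeeID,Name,PayType
1,Alice,Permanent
2,Bob,Temporary
3,Carol,Commission
"""


def read_delimited(text, delimiter=","):
    """Parse delimited text whose first row holds the column names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


rows = read_delimited(SAMPLE_TEXT)
for row in rows:
    print(row["EmployeeID"], row["Name"], row["PayType"])
```

Each parsed row is a dictionary keyed by column name, which is roughly how the later steps refer to columns when building split conditions.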

Now we are ready to separate the incoming data set into the different payment types.

SSIS Conditional Split Control

The SSIS Conditional Split control can be found in the SSIS Toolbox under the Common category, as shown in the image below.

SSIS Conditional Split in the SSIS Toolbox.

Drag and drop the SSIS Conditional Split control onto the data flow and connect it to the Flat File Source as shown below.

Connecting the SSIS Conditional Split control to the Data Source.

As shown in the above image, the SSIS Conditional Split control is renamed to Split for Different Pay Types for better readability.

Next, configure the SSIS Conditional Split control by double-clicking it.

Configuration of SSIS Conditional Split Transformation.

The screenshot above shows the most important settings of the SSIS Conditional Split control. On this configuration page, you provide the conditions used to split the data set. Conditions are written in the SSIS expression language rather than VBScript, and each condition must be an expression that evaluates to true or false.

In the above configuration, three conditions are configured, so the data set is divided into three outputs. Developers can drag and drop column names from the list at the top of the editor into a condition, which makes writing conditions easier. Conditions can also use the built-in string functions, mathematical functions, date/time functions, and NULL functions.

Though the above configuration is fairly straightforward, split conditions can be complex. With complex conditions, one record may satisfy more than one condition. Conditions are evaluated in the configured order, so once a record satisfies a condition it is sent to that output and is not evaluated against the remaining conditions.

In the above configuration, the default output is named Other. It receives all records that do not satisfy any of the conditions. This means that every record coming into the SSIS Conditional Split control is output from the control.
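The evaluation order and the default output can be illustrated with a small Python simulation. This is only a sketch of the routing behaviour described above, not SSIS code: the function `conditional_split`, the `PayType` column, and the sample rows are all hypothetical names chosen for illustration.

```python
def conditional_split(rows, conditions, default_output="Other"):
    """Route each row to the first output whose condition is true.

    `conditions` is an ordered list of (output_name, predicate) pairs;
    the list order plays the role of the priority order in the editor.
    """
    outputs = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break  # first match wins; later conditions are skipped
        else:
            # No condition matched, so the row goes to the default output.
            outputs[default_output].append(row)
    return outputs


# Hypothetical conditions and rows, assumed for this sketch only.
conditions = [
    ("Permanent", lambda r: r["PayType"] == "Permanent"),
    ("Temporary", lambda r: r["PayType"] == "Temporary"),
    ("Commission", lambda r: r["PayType"] == "Commission"),
]
rows = [
    {"EmployeeID": 1, "PayType": "Permanent"},
    {"EmployeeID": 2, "PayType": "Temporary"},
    {"EmployeeID": 3, "PayType": "Commission"},
    {"EmployeeID": 4, "PayType": "Contract"},
]
outputs = conditional_split(rows, conditions)
for name, matched in outputs.items():
    print(name, [r["EmployeeID"] for r in matched])
```

Because every row ends up in exactly one list, the total number of rows across all outputs equals the number of input rows, which mirrors the point that all incoming records leave the control.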

Next, configure the outputs of the SSIS Conditional Split.

Output Paths for SSIS Conditional Split Transformation.

As you can see in the above screenshot, four paths come out of the SSIS Conditional Split control: one for each of the three conditions, plus the default output.

Configuration of SSIS Conditional Split control outputs.

As seen in the above screenshot, the output is split into four paths, and from this point each path is independent of the others. This means that different transformations can be applied to different paths, as seen in the following image taken after executing the package.

Execution of SSIS Package with SSIS Conditional Split transformation.

As shown in the above screenshot, all eight records are split across the four paths. The records on a path can be viewed by enabling a data viewer on that path.
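Since each path is independent after the split, each one can carry its own calculation. The Python sketch below illustrates that idea only. The pay formulas and column names (`MonthlySalary`, `HoursWorked`, `HourlyRate`, `Sales`, `CommissionRate`) are assumptions invented for this example; the article does not specify how each payment type is calculated.

```python
# Hypothetical pay rules, one per output path (not taken from the article).
def pay_permanent(row):
    return row["MonthlySalary"]


def pay_temporary(row):
    return row["HoursWorked"] * row["HourlyRate"]


def pay_commission(row):
    return row["Sales"] * row["CommissionRate"]


CALCULATIONS = {
    "Permanent": pay_permanent,
    "Temporary": pay_temporary,
    "Commission": pay_commission,
}


def process_paths(outputs):
    """Apply the calculation registered for each path to that path's rows."""
    results = {}
    for path, rows in outputs.items():
        calc = CALCULATIONS.get(path)
        if calc is None:
            continue  # e.g. the default path has no calculation here
        results[path] = [calc(row) for row in rows]
    return results


# Hypothetical rows, already split into paths.
outputs = {
    "Permanent": [{"MonthlySalary": 5000}],
    "Temporary": [{"HoursWorked": 10, "HourlyRate": 20}],
    "Commission": [{"Sales": 1000, "CommissionRate": 0.5}],
    "Other": [{"PayType": "Contract"}],
}
results = process_paths(outputs)
print(results)
```

Keeping one function per path means a change to one payment type's rule does not touch the others, which is the practical benefit of splitting before transforming.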

Best Practices

Developers often forget to configure the Other path shown in the above example and only configure the split conditions their requirement calls for. Over time, however, when a new kind of record (for example, a new payment type) appears in the data source, no condition will handle it. A better option is to at least send the default path to an audit log, so that it can be reviewed periodically to decide whether those records need to be handled and the conditions modified accordingly.
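As a rough illustration of auditing the default path, the Python sketch below logs every unmatched row and summarises which values were not covered by any condition. The logger name, the `audit_default_output` function, and the sample rows are hypothetical; in a real package the default output would typically be connected to a destination such as an audit table.

```python
import logging

# Hypothetical logger name chosen for this sketch.
audit_logger = logging.getLogger("conditional_split.audit")


def audit_default_output(rows, key="PayType"):
    """Log each unmatched row and count the unmatched values of `key`."""
    counts = {}
    for row in rows:
        value = row.get(key)
        counts[value] = counts.get(value, 0) + 1
        audit_logger.warning("Unmatched row: %r", row)
    return counts


# Hypothetical rows that fell through to the default output.
unmatched = [
    {"EmployeeID": 4, "PayType": "Contract"},
    {"EmployeeID": 9, "PayType": "Contract"},
    {"EmployeeID": 12, "PayType": None},
]
summary = audit_default_output(unmatched)
print(summary)
```

A summary like this makes it easy to spot a value that keeps recurring, which is the signal that a new condition may be needed.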

Conclusion

Always remember to configure the error output. This applies to every SSIS transformation that provides one. Since you are dealing with data that you do not control, you do not know what data will come in. Therefore, it is always better to configure the error output and redirect failing rows to a different target.
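The idea of redirecting failing rows instead of stopping the whole run can be sketched in Python as follows. This is a loose analogy rather than SSIS behaviour reproduced exactly: the function `split_with_error_output` and the `Amount` column are hypothetical, and a Python exception stands in for a row whose condition cannot be evaluated.

```python
def split_with_error_output(rows, conditions, default_output="Other"):
    """Split rows by the first matching condition; redirect failing rows.

    A row whose condition raises an exception is collected in `errors`
    together with the error text, instead of aborting the whole run.
    """
    outputs = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    errors = []
    for row in rows:
        try:
            for name, predicate in conditions:
                if predicate(row):
                    outputs[name].append(row)
                    break
            else:
                outputs[default_output].append(row)
        except Exception as exc:
            errors.append({"row": row, "error": repr(exc)})
    return outputs, errors


# Hypothetical condition and rows; the None value makes the comparison fail.
conditions = [("High", lambda r: r["Amount"] > 1000)]
rows = [{"Amount": 2000}, {"Amount": None}, {"Amount": 10}]
outputs, errors = split_with_error_output(rows, conditions)
print(len(outputs["High"]), len(outputs["Other"]), len(errors))
```

Storing the error text next to the offending row gives whoever reviews the redirected rows enough context to decide whether the data or the condition needs fixing.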

Translated from: https://www.sqlshack.com/ssis-conditional-split-overview/
