ssis 包_SSIS包中的错误处理概述

ssis 包

This article explains the process of configuring Error handling in SSIS package.

本文介绍了在SSIS程序包中配置错误处理的过程。

介绍 (Introduction)

In the last article, Overview of SSIS Package Logging, we explored a method to log error in the package execution for different destination providers such as SQL table, XML and text files. The SSIS packages also known as Extract Transform Load (ETL) packages. Usually, we get data from different sources, and we integrate those data on a daily basis in SQL tables.

在上一篇文章SSIS包日志概述中 ,我们探索了一种方法,用于记录针对不同目标提供程序(例如SQL表,XML和文本文件)的包执行中的错误。 SSIS包也称为提取转换负载(ETL)包。 通常,我们从不同来源获取数据,并且每天将这些数据集成到SQL表中。

Suppose we get a flat file from a third party daily and that file contains millions of records. We designed an SSIS package to import that data, and it is running fine. Occasionally, we get bad data (invalid character, special character, invalid length or data type) in the flat files and package execution fails. It is a complicated task to identify which particular row caused this issue. We want our SSIS package to take care of this and do the followings:

假设我们每天从第三方获得一个平面文件,并且该文件包含数百万条记录。 我们设计了一个SSIS包来导入该数据,并且运行良好。 有时,我们在平面文件中收到错误的数据(无效字符,特殊字符,无效长度或数据类型),并且包执行失败。 确定哪个特定的行导致了此问题是一项复杂的任务。 我们希望我们的SSIS包能够解决此问题,并执行以下操作:

  • The SSIS package should execute successfully if all data is in order

    如果所有数据顺序正确,则SSIS包应成功执行
  • In case of any bad data, SSIS package should process all useful data, and bad data should be sent in a separate file. We can open the file later and see what the bad records it contains. It would help us to process the rest of data and do not fail the package

    如果有任何不良数据,SSIS包应处理所有有用的数据,并且不良数据应在单独的文件中发送。 我们稍后可以打开文件,查看其中包含的不良记录。 这将帮助我们处理其余数据,并且不会使包失败

错误处理要求的实际方案 (Practical scenarios for error handling requirement)

Let’s understand this with the following example. We have a CSV file that contains sales records products. If you look at the following image, in the [Total Amount] column, it should have numeric value but records, it contains text format.

让我们用下面的例子来理解这一点。 我们有一个CSV文件,其中包含销售记录产品。 如果查看下面的图像,则在[总计]列中,该图像应具有数字值,但有记录,其中包含文本格式。

Sample Data

创建SSIS包以进行错误处理 (Create SSIS package for error handling )

Firstly, we create a package for error handling in SSIS package and see what happens if we execute the package directly.

首先,我们在SSIS包中创建一个用于错误处理的包,然后看看如果直接执行该包会发生什么。

Let’s open the Visual Studio for data tools and create a new integration service project. In the control flow area, add a data flow task and rename it as [Learn Error Handling].

让我们打开Visual Studio for data工具并创建一个新的Integration Service项目。 在控制流区域中,添加一个数据流任务,并将其重命名为[Learn Error Handling]。

Add a Data Flow task

Right-click on [Learn Error Handling] task and edit. It moves to data flow configuration page. Add a Flat file destination.

右键单击[学习错误处理]任务并进行编辑。 移至数据流配置页面。 添加平面文件目标。

add a flat file source

You can notice the three things in the below image:

您可以在下图中注意到三件事:

  • Blue color arrow

    蓝色箭头
  • Red color arrow

    红色箭头
  • Red cross on Excel source

    Excel来源上的红十字会

Red Cross on Flat file source shows that we need to configure before using it. We use it for error handling in SSIS package.

平面上的红十字会文件来源表明,使用它之前我们需要进行配置。 我们将其用于SSIS包中的错误处理。

Double-click on flat file source and configure a new connection by providing the location of the source file.

双击平面文件源,然后通过提供源文件的位置来配置新连接。

It automatically takes the configuration such as text qualifier, header row delimiter, header rows to skip. Let’s go with the default values.

它会自动采用文本限定符,标题行定界符,标题行之类的配置来跳过。 让我们使用默认值。

Flat File Connection manager editor

Click on a column to verify the data in the source text file and available columns.

单击列以验证源文本文件和可用列中的数据。

View columns and their data for error handling in SSIS

In the Advanced page, we can change column properties such as column names, column delimiter, data type, output column width.

在“高级”页面中,我们可以更改列属性,例如列名,列定界符,数据类型,输出列宽。

Advanced page to view all columns and their properties

We do not want to change any values at this point. Click OK, and it moves to data flow page.

我们现在不想更改任何值。 单击确定,它将移至数据流页面。

Configured SSIS flat file source

In the next step, add a data conversion task from the SSIS toolbox. We use a data conversion task to change the data type of the source column as per our requirements.

在下一步中,从SSIS工具箱中添加数据转换任务。 根据我们的要求,我们使用数据转换任务来更改源列的数据类型。

Add a data conversion task

We do not see any red cross icon on the data conversion task, but still, it requires a configuration. Double-click on data conversion to open Data conversion transformation editor.

我们在数据转换任务上看不到任何红叉图标,但是仍然需要进行配置。 双击数据转换以打开数据转换转换编辑器。

Select all source columns and change the data type as per the following list:

选择所有源列,然后根据以下列表更改数据类型:

  • [Orderdate] data type should be string date[DT_DATE]

    [Orderdate]数据类型应为字符串date [DT_DATE]
  • [Region],[Name],[items] data type should be string[DT_STR]

    [地区],[名称],[项目]数据类型应为字符串[DT_STR]
  • [Units] data type should be four-byte signed integer [DT_I4]

    [单位]数据类型应为四字节有符号整数[DT_I4]
  • [Unit cost] and [Total amount] data type should be currency [DT_CY]

    [单位成本]和[总计]数据类型应为货币[DT_CY]

In the following screenshot, you can also see Output alias for each column. We can give a proper name for each column alias or ignore it.

在以下屏幕截图中,您还可以看到每列的输出别名。 我们可以为每个列别名指定一个适当的名称,也可以忽略它。

Data conversion transformation editor

In the next step, add an OLE DB destination. This OLE DB destination will contain the table name in which we want to insert data.

在下一步中,添加一个OLE DB目标。 该OLE DB目标将包含我们要在其中插入数据的表名。

add a destination table

Double click on destination table, and it opens the OLE DB destination editor.

双击目标表,它会打开OLE DB目标编辑器。

OLE DB destination editor

We do not have an existing table; therefore, click on New to create a new table. It automatically gives the script for creating a new table.

我们没有现有的表格; 因此,单击“新建”以创建一个新表。 它会自动提供用于创建新表的脚本。

Look at the script carefully; we have the columns with their original name and alias name. Remove the original columns from the script, rename the alias columns and specify a valid name (table name should be unique in a database). We do not remove the alias column because these columns contain the correct data type we require in the SQL table.

仔细看一下脚本; 我们有带有原始名称和别名的列。 从脚本中删除原始列,重命名别名列并指定一个有效名称(表名称在数据库中应该是唯一的)。 我们不删除别名列,因为这些列包含我们在SQL表中所需的正确数据类型。

Create a new table to hold data

The updated script should look like the one below.

更新后的脚本应类似于以下脚本。

Remove unwanted columns and create table

Click OK, and it returns to OLE DB destination editor. In this window, verify the mapping between source and destination table.

单击“确定”,它返回到OLE DB目标编辑器。 在此窗口中,验证源表和目标表之间的映射。

OLE DB destination editor

In the above image, you can see the input and destination columns. You might think that mapping is correct, but we need to change the mapping. We use the data conversion operator, and it gives output column names starting with Copy*. We require these copy columns as input columns.

在上图中,您可以看到输入和目标列。 您可能会认为映射是正确的,但是我们需要更改映射。 我们使用数据转换运算符,它给出以Copy *开头的输出列名称。 我们需要这些复制列作为输入列。

Select the columns from the drop-down list, and you see the following configuration.

从下拉列表中选择列,您将看到以下配置。

change the mapping between input and destination columns

The overall SSIS package looks like the following image.

整个SSIS程序包如下图所示。

configured SSIS package with source, data conversion and destination table

Press F5 to execute the package and it fails at data conversion. We have not configured error handling in SSIS package as of now.

按F5执行该程序包,它在数据转换时失败。 到目前为止,我们尚未在SSIS包中配置错误处理。

Failed SSIS package at data conversion with error handling in SSIS

Imagine a situation that the SSIS package failed in a SQL agent job and now you need to find out the reason. If you have a million records in the flat file, it is a complicated task to identify the problematic row and column. You might need to do a lot of manual tasks, in this case.

想象一下SSIS包在SQL代理作业中失败的情况,现在您需要找出原因。 如果您在平面文件中有一百万条记录,那么确定有问题的行和列是一项复杂的任务。 在这种情况下,您可能需要执行很多手动任务。

SSIS comes as a good friend in this case. We can configure an error output. For each task, we see two arrows, blue and red arrow:

在这种情况下,SSIS成为好朋友。 我们可以配置一个错误输出。 对于每个任务,我们看到两个箭头,蓝色和红色箭头:

  • Blue arrow works for successful operation. If the task is executed successfully, it moves the data to the next step

    蓝色箭头表示成功操作。 如果任务成功执行,它将数据移至下一步
  • The red arrow is for the error handling in SSIS package, and we can redirect errors to required destinations

    红色箭头用于SSIS包中的错误处理,我们可以将错误重定向到所需的目标

Data conversion editor

Drag a flat-file destination and join the red arrow with the flat file. It opens the following window to configure error output.

拖动平面文件目标并将红色箭头与平面文件连接。 它打开以下窗口来配置错误输出。

Configure error output

We can use all the columns list and error configurations for error handling in SSIS package. We have following configuration available.

我们可以将所有列列表和错误配置用于SSIS包中的错误处理。 我们提供以下配置。

组件失败 (Failed Component)

If the data flow task fails, it fails the package. It is the default option.

如果数据流任务失败,则它将使包失败。 这是默认选项。

忽略故障 (Ignore failure)

If we change the error mode to ignore failure, it ignores the error message and completes the execution. For example, in our case, it will ignore the rows with bad data and continue working with other rows. It does not throw any error message. We should not use it in the production environment.

如果我们将错误模式更改为忽略失败,则它将忽略错误消息并完成执行。 例如,在我们的例子中,它将忽略数据错误的行,并继续处理其他行。 它不会引发任何错误消息。 我们不应该在生产环境中使用它。

重定向行 (Redirect rows)

It is a useful configuration for this article. We can redirect the failed rows to a configured destination and view the bad data rows in that file. We can configure it for the truncation or the error message.

这是本文的有用配置。 我们可以将失败的行重定向到配置的目标,并查看该文件中的错误数据行。 我们可以将其配置为截断或错误消息。

For this article, we want to configure for the error message only. Make the changes as per the following screenshot.

对于本文,我们只想配置错误消息。 根据以下屏幕截图进行更改。

change error to rediect row

Click OK and configure the flat file destination by specifying the file location. This file will get the data redirected by the error handling in SSIS package.

单击确定,然后通过指定文件位置来配置平面文件目标。 该文件将通过SSIS包中的错误处理来重定向数据。

The package configuration is complete, and it looks like the one below.

程序包配置完成,看起来像下面的一样。

add a flat file destination for the error data

Execute the package now, and it shows successful now. In the following screenshot, you can note the following:

立即执行该程序包,它现在显示成功。 在以下屏幕截图中,您可以注意以下几点:

  • Destination table output got 15 rows out of 18 rows available in the source file

    目标表输出在源文件中的18行中获得了15行
  • The flat file destination (error file) gets remaining three rows. The error handling in SSIS package redirects three rows to the flat file

    平面文件目标(错误文件)剩余三行。 SSIS包中的错误处理将三行重定向到平面文件

package execution

In the sample data shown initially, we had three incorrect data rows. It matches with the output of error handling in SSIS package.

在最初显示的示例数据中,我们有三个错误的数据行。 它与SSIS包中错误处理的输出匹配。

Let’s go to the flat file destination path and open the error file. We can see the bad data rows written into this file.

让我们转到平面文件目标路径并打开错误文件。 我们可以看到不良数据行已写入此文件。

verify bad data rows in the flat file

It is easy for us to look at these wrong data rows and fix them. We do not need to go through all rows containing both good and bad data rows. The SSIS package execution also does not hamper, and it executes successfully with the good data rows.

我们很容易查看这些错误的数据行并进行修复。 我们不需要遍历所有包含好数据行和坏数据行的行。 SSIS包的执行也不会妨碍,它会在数据行正确的情况下成功执行。

结论 (Conclusion)

Error handling in SSIS package the life of a DBA and developer more accessible by diverting the failed or wrong data rows to a separate destination. We can quickly troubleshoot the issues in failed packages. You should have these error handlings in SSIS package.

通过将失败或错误的数据行转移到单独的目标位置,SSIS中的错误处理使DBA和开发人员的生活更加容易访问。 我们可以快速排除故障包装中的问题。 您应该在SSIS包中进行这些错误处理。

翻译自: https://www.sqlshack.com/an-overview-of-error-handling-in-ssis-package/

ssis 包

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值