ssis sql
介绍 (Introduction)
SQL Server Integration Services provides a wide variety of features that helps developers to build a robust Extract, Transform and Load process. After many years contributing to SSIS related tags on Stackoverflow.com, I can say that many developers have some misunderstanding about SSIS features (SSIS OLE DB Source, SSIS Expressions, SQL Server destination …) especially those which are very similar and have some common usability.
SQL Server集成服务提供了多种功能,可帮助开发人员构建健壮的提取,转换和加载过程。 经过多年为Stackoverflow.com上的SSIS相关标签做出贡献之后,我可以说许多开发人员对SSIS功能(SSIS OLE DB源代码,SSIS表达式,SQL Server目标…)有一些误解,尤其是那些非常相似且具有某些通用性的功能。
This article is a part of a series called “SSIS features face-to-face”, which aim to remove any confusion and to illustrate some of the differences between similar features provided by SQL Server Integration Services.
本文是称为“ SSIS功能面对面 ”的系列文章的一部分,该文章旨在消除任何混淆并说明SQL Server Integration Services提供的相似功能之间的某些区别。
When using a SSIS OLE DB Source, many Access modes can be used to read from the source:
使用SSIS OLE DB源时,可以使用许多访问模式从源中读取:
- Table or View: Select a Table or View from a drop-down list 表格或视图:从下拉列表中选择表格或视图
- Table name or View name variable: Choose a SSIS variable that contains the table or view name
- 表名或视图名变量:选择包含表名或视图名的SSIS变量
- SQL Command: Write your own SQL query (single table query or complex query) SQL命令:编写您自己SQL查询(单表查询或复杂查询)
- SQL Command from variable: Select a SSIS variable that contains a SQL command :选择包含SQL命令的SSIS变量
Many times I was asked on the difference between (1) using Table or View access mode and selecting specific Input Columns from the SSIS OLE DB Source and (2) using a SQL Command to select some columns from one table.
很多时候,我被问及(1)使用表或视图访问模式与从SSIS OLE DB源中选择特定的输入列,以及(2)使用SQL命令从一个表中选择某些列之间的区别。
In order to learn more about SSIS OLE DB Source, you can refer to the following official documentation provided by Microsoft:
为了了解有关SSIS OLE DB Source的更多信息,可以参考Microsoft提供的以下官方文档:
The main reason to run this experiment was a question posted on Stack Exchange Database Administrators website. Due to its importance, I decided to write a related article with more details in order to give more explanation.
进行此实验的主要原因是在Stack Exchange Database Administrators网站上发布了一个问题 。 由于它的重要性,我决定写一篇有关更多细节的相关文章,以便给出更多的解释。
实验 (Experiments)
In order to check the difference between these two options, I will run a small experiment using SQL Server profiler to check what is the command sent to a database in each method. The experiments will be conducted on the AdventureWorks database. In this section, I will describe each step in these experiments and illustrate the results.
为了检查这两个选项之间的差异,我将使用SQL Server事件探查器运行一个小实验,以检查每种方法中发送到数据库的命令是什么。 实验将在AdventureWorks数据库上进行。 在本节中,我将描述这些实验的每个步骤并说明结果。
- You can download the AdventureWorks database from the following page: 您可以从以下页面下载AdventureWorks数据库: AdventureWorks Installation and configurationAdventureWorks安装和配置. 。
创建测试包
(Creating Test Packages
)
First of all, I will create two packages; each one contains a Data Flow Task where I will set up a SSIS OLE DB Source and a Row Count component.
首先,我将创建两个包: 每一个都包含一个数据流任务,在这里我将设置一个SSIS OLE DB源和一个行计数组件。
Before adding these components, we have to create a new OLE DB connection manager in order to set up a connection with the database. Just right-click on the Connection Managers tab and click on New OLE DB connection manager:
在添加这些组件之前,我们必须创建一个新的OLE DB连接管理器以建立与数据库的连接。 只需右键单击“连接管理器”选项卡,然后单击“新建OLE DB连接管理器”:
Then you have to set up your connection by choosing the server name, authentication parameters and the database name (AdventureWorks) as shown in the image below:
然后,您必须通过选择服务器名称,身份验证参数和数据库名称(AdventureWorks)来建立连接,如下图所示:
After configuring the OLE DB Connection manager, we must add a Data Flow Task. And within the Data Flow Task, we have to add a SSIS OLE DB Source and a Row Count Transformation (dummy task) as we mentioned above.
配置OLE DB连接管理器之后,我们必须添加一个数据流任务。 在数据流任务中,我们必须如上所述添加一个SSIS OLE DB源和一个行数转换(虚拟任务)。
In the first package, we will configure the SSIS OLE DB Source to read from [Sales].[Customer] Table using Table or View Data access mode, and I will select CustomerID and AccountNumber columns as Input.
在第一个包中,我们将配置SSIS OLE DB源以使用“ 表”或“查看数据”访问模式从[Sales]。[Customer]表读取,我将选择CustomerID和AccountNumber列作为输入。
In the other package, I will configure the SSIS OLE DB Source to read from the following SQL Command:
在另一个程序包中,我将配置SSIS OLE DB源以从以下SQL命令读取:
SELECT CustomerID, AccountNumber
FROM [Sales].[Customer]
创建并启动探查器跟踪
(Create and Start a Profiler Trace
)
Before executing the created packages, we have to create and configure a Trace using SQL profiler in order to monitor all commands executed on the AdventureWorks database.
在执行创建的包之前,我们必须使用SQL事件探查器创建和配置跟踪,以便监视在AdventureWorks数据库上执行的所有命令。
If you are interested to learn more about SQL profiler, you can refer to the following Microsoft articles:
如果您想了解有关SQL事件探查器的更多信息,可以参考以下Microsoft文章:
From the SQL Server Management Studio, Go to Tools menu strip and click on SQL Server Profiler:
在SQL Server Management Studio中,转到“工具”菜单栏,然后单击“ SQL Server Profiler”:
Then a connection dialog appears where you have to set up a connection with the same Instance you already connected in SQL Server Management Studio.
然后将出现一个连接对话框,您必须在其中设置与SQL Server Management Studio中已连接的同一实例的连接。
After connecting to the Server, we have to configure the Trace as the following:
连接到服务器后,我们必须将跟踪配置如下:
First, we have to select the Tuning Template and to set the trace destination to a new Table called OLEDBSourceTest as mentioned in the images below:
首先,我们必须选择调整模板,并将跟踪目标设置为一个名为OLEDBSourceTest的新表,如下图所示:
After that we must go to the Event Selection Tab and click on Column filters button, then we have to filter on AdventureWorks2017 database as shown in the images below:
之后,我们必须转到“ 事件选择”选项卡,然后单击“ 列过滤器”按钮,然后我们必须对AdventureWorks2017数据库进行过滤,如下图所示:
Now, we have to run the Trace.
现在,我们必须运行跟踪。
Note – it is more preferable to create the trace destination table in a different database to prevent showing unwanted queries in the trace log.
注–最好在其他数据库中创建跟踪目标表,以防止在跟踪日志中显示不必要的查询。
结果 (Results)
After executing the first package (Table or View approach), the following command is logged by the SQL profiler:
在执行第一个包(表或视图方法)之后,SQL事件探查器将记录以下命令:
SELECT * FROM [Sales].[Customer]
Moreover, after executing the second package the following command is logged by the SQL profiler:
此外,在执行第二个程序包之后,SQL事件探查器将记录以下命令:
SELECT CustomerID, AccountNumber FROM [Sales].[Customer]
讨论与结论 (Discussion and Conclusion)
After running this experiment, we can figure that even if we select some specific columns while using Table or View data access mode in a SSIS OLE DB Source, a SELECT * command is executed on the database engine and columns are filtered within the SSIS package which is different from using a SQL command to select specific columns since the second choice will only read these columns from the database engine and not filters are applied in SSIS.
运行此实验后,我们可以发现,即使在SSIS OLE DB源中使用表或视图数据访问模式时选择了一些特定的列, 选择 * 命令在数据库引擎上执行,并且在SSIS包中对列进行过滤,这与使用SQL命令选择特定的列不同,因为第二种选择将仅从数据库引擎读取这些列,而不在SSIS中应用过滤器。
Based on that, we can conclude that using the SQL command is more efficient from the performance perspective since it will decrease the memory usage and the amount of data retrieved. But make sure that the SQL command has its needed indexes created. Also, it will be more efficient to have the primary key columns included even if they are not needed.
基于此,我们可以得出结论,从性能的角度来看,使用SQL命令会更有效,因为它将减少内存使用量和检索到的数据量。 但是请确保已创建SQL命令所需的索引。 同样,即使不需要主键列,也将更加有效。
Also, these results can be applied to Lookups Transformations since they also use the same Data access methods, and be aware that in lookups selecting needed columns is more critical since it affects performance more, especially when handling a huge volume of data.
同样,这些结果可以应用于查找转换,因为它们也使用相同的数据访问方法,并且请注意,在查找中选择所需的列更为关键,因为它对性能的影响更大,尤其是在处理大量数据时。
有用的网址 (Helpful Links)
You can refer to the following links to learn more about monitoring the SQL Server Database Engine and some SSIS best practices:
您可以参考以下链接,以了解有关监视SQL Server数据库引擎和某些SSIS最佳实践的更多信息:
- Is this the end of SQL Profiler? 这是SQL Profiler的结尾吗?
- Integration Services Performance Best Practices – Data Flow Optimization 集成服务性能最佳实践–数据流优化
- Integration Services Performance Best Practices – Writing to the Destination Integration Services性能最佳实践–写目标
目录 (Table of contents)
翻译自: https://www.sqlshack.com/ssis-ole-db-source-sql-command-vs-table-or-view/
ssis sql