金属零件图像数据集
Whilst discussing the evolution of SQL Server Data Tools in my previous article, I touched on a new feature, Package Parts, which is available in SSDT 2015. This article takes an in-depth analysis of the Package Parts feature, particularly, as it relates to Data Warehouse development.
在上一篇文章中讨论SQL Server数据工具的演变时,我谈到了SSDT 2015中可用的新功能Package Parts。本文对Package Parts功能进行了深入分析,尤其是与之相关的到数据仓库开发。
背景 (Background)
There are various approaches to designing a reusable SSIS package, for instance, you could create your own package template, implementing parent-child package execution using Execute Package Task, or simply copy and paste an existing package. However, just because your SSIS solution is reusable doesn’t necessarily mean that it’s easily maintainable too. Sometimes as an ETL developer, you need to make changes to your SSIS solution because of structural changes from source, change of connection string etc. and depending on how you have designed your solution, this process of refactoring your ETL could range from bearable to agonising. In the following sections of this article, we discuss and illustrate how Package Parts can be used to create reusable and maintainable SSIS solutions that could later make your ETL refactoring experience, very pleasant.
设计可重复使用的SSIS包的方法有多种,例如,您可以创建自己的包模板,使用“执行包任务”实现父子包执行,或者简单地复制并粘贴现有包。 但是,仅仅因为您的SSIS解决方案是可重用的,并不一定意味着它也易于维护。 有时,作为ETL开发人员,由于源代码的结构更改,连接字符串的更改等,您需要对SSIS解决方案进行更改,并且根据您设计解决方案的方式,重构ETL的过程可能从可承受到痛苦。 。 在本文的以下各节中,我们讨论并说明如何使用“打包部件”来创建可重用和可维护的SSIS解决方案,这些解决方案以后可以使您的ETL重构体验非常愉快。
传统包装模板 (The Traditional Package Template)
In order to understand and appreciate the Package Part feature in SSIS 2016, I will demonstrate some of the limitations we encountered in the traditional package template. A package template allowed you to create a package with shared controls that can later be used as a template for creating other packages. Say for instance you have a requirement to setup a data mart that stores fruit-sales data. Such a data mart might contain the following fact and dimensions:
为了理解和欣赏SSIS 2016中的Package Part功能,我将演示我们在传统包模板中遇到的一些限制。 包模板允许您创建带有共享控件的包,这些控件以后可以用作创建其他包的模板。 假设您需要设置一个存储水果销售数据的数据集市。 这样的数据集市可能包含以下事实和维度:
- Fruits dimension 水果尺寸
- Customer dimension 客户规模
- Method Of Payment dimension 付款方式尺寸
- Fruit Category 水果分类
- Fruit Sales Fact Table 水果销售情况表
Populating the abovementioned dimensions may involve similar processes (i.e. use Execute SQL Task to clear staging table, generate surrogate keys, send out error emails) in which case creating a package template would be useful. A typical package template that can be used to load the fruit-sales data mart is shown in Figure 1.
填充上述维度可能涉及类似的过程(即,使用Execute SQL Task清除登台表,生成代理密钥,发送错误电子邮件),在这种情况下,创建程序包模板将很有用。 图1显示了可用于加载水果销售数据集市的典型包装模板。
The next part involves saving and copying the package template developed into a drive location where Visual Studio is installed. The folder that you need to copy the template to is called DataTransformationItems and depending on your version of Visual Studio installed (in my case, I am using Visual Studio 2013), the path of the drive location should be as follows:
下一部分涉及将开发的包模板保存并复制到安装Visual Studio的驱动器位置。 您需要将模板复制到的文件夹称为DataTransformationItems,并且取决于安装的Visual Studio版本(在我的情况下,我使用的是Visual Studio 2013),驱动器位置的路径应如下所示:
{Your Drive Letter}:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\PrivateAssemblies\ProjectItems \DataTransformationProject\DataTransformationItems
{您的驱动器号}:\ Program Files(x86)\ Microsoft Visual Studio 14.0 \ Common7 \ IDE \ PrivateAssemblies \ ProjectItems \ DataTransformationProject \ DataTransformationItems
Once you have located the correct drive location, then copy and paste the package template into the folder as shown in Figure 2.
找到正确的驱动器位置后,然后将包模板复制并粘贴到该文件夹中, 如图2所示。
After you have closed and re-launched SSDT, the next time you add a new item into an SSIS project, the package template you added in Figure 2 will come up as one of the template options for your new item as shown in Figure 3.
关闭并重新启动SSDT之后,下次将新项目添加到SSIS项目中时,在图2中添加的包模板将作为新项目的模板选项之一出现, 如图3所示。
Now that I have a template, we can use it to create the rest of the packages that will load our fact and dimensions for the fruit-sales data mart. The complete ETL for loading our data mart is shown in Figure 4.
现在我有了一个模板,我们可以使用它来创建其余的程序包,这些程序包将为水果销售数据集市加载我们的事实和维度。 图4中显示了用于加载数据集市的完整ETL。
The greatest limitation with the traditional package template comes in when we need to refactor our SSIS solution. Say for instance we later realise that for some reasons our staging database has been changed from AdvetureWorks2014 to ReportServer. Well, to ensure that any packages created using the template has the correct database reference, we will update our package template such that the OLE DB connection shows the correct database as shown in Figure 5.
当我们需要重构我们的SSIS解决方案时,传统软件包模板的最大局限就出现了。 例如,我们稍后意识到由于某些原因,我们的登台数据库已从AdvetureWorks2014更改为ReportServer 。 好了,为了确保使用该模板创建的所有软件包都具有正确的数据库引用,我们将更新我们的软件包模板,以使OLE DB连接显示正确的数据库, 如图5所示。
To check that the ensure is effective, I created a new dimension called SomeNewDimension.dtsx and as it can be seen in Figure 6, SomeNewDimension.dtsx has ReportServer connection name instead of AdvetureWorks2014.
为了检查确保有效,我创建了一个名为SomeNewDimension.dtsx的新维度,如图6所示, SomeNewDimension.dtsx具有ReportServer连接名称而不是AdvetureWorks2014 。
But, what happens to packages created with an older version of the template? Will they still have references to the old connection? The answer is, unfortunately, yes – as shown in Figure 7.
但是,使用较旧版本的模板创建的包会怎样? 他们还会参考旧的连接吗? 不幸的是,答案是肯定的- 如图7所示。
This means that in order to ensure that the old packages have the correct connection, we will have to open each individual package and update the connection (that’s where the pain comes in, as in a case of a solution with 100 packages would mean opening and editing each package). Another alternative would be to delete all old packages and recreate them using the new template – either way, you will have to manually work on those 100 packages!
这意味着,为了确保旧程序包具有正确的连接,我们将不得不打开每个单独的程序包并更新连接(这就是麻烦所在,例如,对于具有100个程序包的解决方案而言,这意味着打开和编辑每个程序包)。 另一种选择是删除所有旧软件包,然后使用新模板重新创建它们-任一种方式,您都必须手动处理这100个软件包!
包装零件 (Package Parts)
Package Parts enable us to produce package templates that are easier to manage and reuse. As from SSDT 2015, every SSIS project has a new Package Parts node, which in turn, has a Control Flow sub-node, as shown in Figure 8.
套件零件使我们能够生产易于管理和重复使用的套件模板。 从SSDT 2015开始,每个SSIS项目都有一个新的Package Parts节点,该节点又具有一个Control Flow子节点, 如图8所示。
It is in this Control Flow sub-node that we get to create Package Parts by either adding an existing file-based Package Part or simply create one from scratch, as per options shown in Figure 9.
通过在这个Control Flow子节点中,我们可以通过添加现有的基于文件的Package Part或仅从头开始创建一个Package Part来创建Package Parts, 如图9所示。
Working on Package Parts is similar to the way you work on SSIS Packages except that Package Parts have their own file extension, dtsxp. In Figure 10, we have created a Package Part that mirrors the traditional package template that was created earlier on.
处理软件包部件的方法与处理SSIS软件包的方法类似,不同之处在于软件包部件具有自己的文件扩展名dtsxp 。 在图10中 ,我们创建了一个Package Part,它反映了之前创建的传统Package模板。
Now that we have created out Package Part, we have to add it in an SSIS Package. Unfortunately, you cannot just click and drag the Package Part from Solution Explorer into the SSIS Package’s Control Flow area, instead as soon as you have created and saved the Package Part, it will come up under Package Parts node in SSIS Toolbox, as shown in Figure 11 – wherein you can click and drag it onto the SSIS Package’s Control Flow area.
现在我们已经创建了Package Part,我们必须将其添加到SSIS Package中。 不幸的是,您不能仅从“解决方案资源管理器”中单击并拖动“包装零件”到SSIS包装的“控制流”区域,而是一旦创建并保存了包装零件,它就会出现在SSIS工具箱中的“包装零件”节点下,如下图所示。 图11 –您可以单击并将其拖动到SSIS包的“控制流”区域。
Figure 12 shows an SSIS Package that uses a Package Part. As it can be seen, Package Part’s component have a letter P on the right hand corner of that component’s name.
图12显示了使用Package Part的SSIS Package。 可以看出,“包装部件”的组件在该组件名称的右上角有一个字母P。
Now to demonstrate as to how changes to the Package Parts gets applied across packages referencing it, we will attempt to make changes to the Package Parts’ connection name. The existing version of the Package Part has the AdvetureWorks2014 connection name, as can be seen in Figure 13 from one of the SSIS Packages that are referencing our Package Part, PackageTemplate.dtsxp.
现在,为了演示如何在引用它的包中应用对包部件的更改,我们将尝试对包部件的连接名称进行更改。 软件包部件的现有版本具有AdvetureWorks2014连接名称, 如图13所示 ,其中一个SSIS软件包引用了我们的软件包部件PackageTemplate.dtsxp 。
Once the connection name of PackageTemplate.dtsxp is changed to ReportServer, we can see that such a change has been applied on SSIS Packages referencing the Package Part as shown in Figure 14.
将PackageTemplate.dtsxp的连接名称更改为ReportServer后 ,我们可以看到这种更改已应用到引用Package Part的SSIS Packages上, 如图14所示。
包装件的局限性 (Limitations of Package Parts)
One of the things you would have noticed when we replicated the traditional package template into Package Parts was that in the Package Part we had grouped all the executables into a Sequence Container. This is because, although you can get away with adding multiple executables into a package Part without using a Sequence Container, when you attempt to add them from SSIS Toolbox you run into an error message displayed in Figure 15 as apparently Package Parts with multiple executables cannot be added into an SSIS Package.
当我们将传统的程序包模板复制到“程序包部件”中时,您会注意到的一件事是,在“程序包部件”中,我们将所有可执行文件分组到一个“序列容器”中。 这是因为,尽管无需使用序列容器即可将多个可执行文件添加到程序包部件中,但是当您尝试从SSIS工具箱中添加它们时,却遇到了如图15所示的错误消息,因为带有多个可执行程序的程序包部件显然无法被添加到SSIS包中。
I can almost hear what you are thinking, “So Microsoft, why allow me to create such a Package Part with multiple executables outside the Sequence Container, if you are going to prevent me from using it?” Personally, I think this is a bug!
我几乎可以听到您的想法:“那么,Microsoft,为什么要阻止我使用这样的程序包部件,使其在Sequence Container之外具有多个可执行文件?” 我个人认为这是一个错误!
Remember how a change to the Package Part gets applied across all packages who references that Package Part, well, if one of the SSIS Packages that references the Package Part is opened at the time of editing the Package Part, then when you switch context back to that SSIS Package, you will run into a message displayed in Figure 16. It basically means that you will have to manually refresh all SSIS Packages that were opened during updating of the Package Part. So make sure that you close all packages prior to making changes to Package Parts.
请记住,如果在编辑“包装件”时打开了一个引用“包装件”的SSIS包,那么对“包装件”所做的更改将如何应用到所有引用该“包装件”的包中。该SSIS包,您将遇到图16中显示的消息。 基本上,这意味着您必须手动刷新在更新软件包部件期间打开的所有SSIS软件包。 因此,请确保在更改“包装零件”之前关闭所有包装。
The rest of the limitations of Package Parts include the following:
封装件的其他限制包括:
- Project connections are not visible to Package Parts, thus can’t be used 项目连接对于“包装零件”不可见,因此无法使用
- You also cannot parameterise you connections within Package Parts 您也无法参数化套件零件中的连接
- Unless you use variables, some Package Parts’ executables are difficult to customize directly from SSIS Packages i.e. data flow task, script task etc. 除非使用变量,否则某些Package Parts的可执行文件很难直接从SSIS Packages中进行自定义,即数据流任务,脚本任务等。
- You cannot debug a Package Part from the designer 您不能从设计器调试包装零件
结论 (Conclusion)
Package Parts introduces another alternative to creating reusable and maintainable SSIS solutions. Unfortunately, it has its own sets of shortcomings which I hope will be resolved in subsequent releases of SQL Server.
套件零件介绍了创建可重用和可维护的SSIS解决方案的另一种方法。 不幸的是,它有其自身的缺点,我希望可以在后续版本SQL Server中解决。
金属零件图像数据集