ssis 循环导入数据_使用集成服务(SSIS)包从Amazon S3 SSIS存储桶导入数据

ssis 循环导入数据

This article explores data import in SQL Server from a CSV file stored in the Amazon S3 SSIS bucket.

本文探讨了从Amazon S3 SSIS存储桶中存储的CSV文件在SQL Server中的数据导入。

介绍 (Introduction)

The SSIS package is an ETL tool for data extraction, transformation and loading data from various sources such as OLE DB, ODBC, Ado.NET, flat-file connections. Nowadays, many organizations have hybrid infrastructures model. It is a combination of both On-premise and cloud infrastructure. It is essential for a DBA and developer to be familiar with cloud solutions as well.

SSIS包是一个ETL工具,用于从各种来源(例如OLE DB,ODBC,Ado.NET,平面文件连接)提取,转换和加载数据。 如今,许多组织都有混合基础架构模型。 它是内部部署和云基础架构的组合。 DBA和开发人员也必须熟悉云解决方案。

Recently I have a requirement in that to import S3 bucket property and file data with the following source and destination:

最近,我的要求是要导入具有以下源和目标的S3存储桶属性和文件数据:

  • Source: CSV file stored in AWS S3 bucket 来源 :存储在AWS S3存储桶中的CSV文件
  • Destination: On-premise SQL Server database table 目标 :本地SQL Server数据库表

First, let’s take an overview of the AWS S3 bucket.

首先,让我们概述一下AWS S3存储桶。

Amazon S3概述 (Overview of Amazon S3 )

AWS S3 is an acronym of Amazon Web Service Simple Storage service. It is a highly scalable and cost-effective cloud storage for data storage, archival. We can upload any file extension in the S3 bucket. Single file size cannot exceed beyond five TB. Users can control the file properties, accessibility such as public or private accessibility. We can store unlimited data in the S3 bucket. The best part is that you only pay for the used storage.

AWS S3是Amazon Web Service简单存储服务的首字母缩写。 它是用于数据存储,归档的高度可扩展且经济高效的云存储。 我们可以在S3存储桶中上传任何文件扩展名。 单个文件的大小不能超过5 TB。 用户可以控制文件属性,可访问性,例如公共或私有可访问性。 我们可以在S3存储桶中存储无限的数据。 最好的部分是,您只需为已用的存储空间付费。

AWS免费套餐帐户 (AWS Free Tier account)

The starting point of the AWS journey is the AWS Console and creates an AWS free tier account. Go to this URL, and it opens the following free tier page of AWS:

AWS旅程的起点是AWS控制台,并创建一个AWS免费套餐帐户。 转到此URL ,它将打开以下AWS的免费层页面:

AWS Free Tier account

We can create a free account, and it is 12 months free. You can use it as a starting point to learn AWS:

我们可以创建一个免费帐户,并且12个月免费。 您可以将其用作学习AWS的起点:

AWS Free Tier account usage

Few vital features of free tier account in AWS are as following:

AWS中免费套餐帐户的一些重要功能如下:

  • 750 hours per month free usage of Amazon EC2 instances

    每月750小时免费使用Amazon EC2实例
  • 750 hours per month free usage of Amazon RDS t2.micro server

    每月可免费使用Amazon RDS t2.micro服务器750小时
  • 30 GB per month usage of Amazon elastic storage

    每月30 GB的Amazon Elastic Storage使用量
  • 5 GB free AWS S3 storage

    5 GB的免费AWS S3存储

You can go through the URL to read more about the free account in AWS. Click on Create a Free Account and go through the process by entering your personal information such as name, country, email address, etc.

您可以浏览URL以了解有关AWS中免费帐户的更多信息。 单击创建免费帐户 ,然后输入您的个人信息(例如姓名,国家/地区,电子邮件地址等)来完成此过程。

Here is a catch; you need to provide a valid debit or credit card. It deducts 2 rupees from your credit or debit card. Do not worry; you will also get it back in a few days. It is a mandatory step to sign up for a free account. Amazon does not deduct any money until you have not exhausted the limit of the free tier account.

这是一个陷阱。 您需要提供有效的借记卡或信用卡。 它从您的信用卡或借记卡中扣除2卢比。 别担心; 您也将在几天后将其取回。 注册免费帐户是必不可少的步骤。 除非您没有耗尽免费套餐帐户的限额,否则亚马逊不会扣除任何款项。

Amazon S3配置 (Amazon S3 configuration)

To create an S3 bucket do the following:

要创建S3存储桶,请执行以下操作:

  • Open the AWS web console and login to the account using IAM user having permissions to create, access, upload, and import data into the S3 bucket

    打开AWS Web控制台并使用具有创建,访问,上传数据并将其导入S3存储桶的权限的IAM用户登录到该帐户
  • Navigate to Services, Click on S3 in the Storage section:

    导航到服务,在“存储”部分中单击S3

    Amazon S3 configuration

    It lists out all existing AWS S3 Bucket. We can see bucket name, access level, bucket region and the created data of the S3 bucket

    它列出了所有现有的AWS S3存储桶。 我们可以看到存储桶名称,访问级别,存储桶区域以及S3存储桶的创建数据

  • Click on Create bucket and provide the S3 bucket name:

    单击创建存储桶并提供S3存储桶名称:

    Create a new bucket

  • Amazon S3 is a region-specific resource; therefore; you need to select the region from the drop-down list. Then, click on Create:

    Amazon S3是特定于区域的资源。 因此; 您需要从下拉列表中选择区域。 然后,点击创建

    Specify bucket and region

    It creates the S3 bucket with default configurations and shows in the bucket list. This bucket belongs to Asia Pacific (Mumbai) region:

    它使用默认配置创建S3存储桶,并显示在存储桶列表中。 此存储桶属于亚太地区(孟买):

    S3 bucket properties

  • Open this S3 bucket, drag the CSV file from the source system, drop in the S3 bucket, and click the Upload button:

    打开此S3存储桶,将CSV文件从源系统中拖放到S3存储桶中,然后单击“ 上传”按钮:

    Upload file into S3 bicket

    You can see the CSV file in the S3 bucket:

    您可以在S3存储桶中看到CSV文件:

    Uploaded file properties

Amazon S3 SSIS数据上传 (Amazon S3 SSIS data upload)

As described earlier, we require data import from CSV file (stored in AWS S3 bucket) into the SQL server table.

如前所述,我们需要将数据从CSV文件(存储在AWS S3存储桶中)导入SQL Server表中。

Question: How do you connect an SSIS package with an AWS S3 bucket?

问题: 如何将SSIS软件包与AWS S3存储桶连接?

By default, the SSIS package does not allow you to connect with the AWS S3 bucket. It does not mean that we do have any mechanism to do so.

默认情况下,SSIS软件包不允许您与AWS S3存储桶连接。 这并不意味着我们有任何机制可以这样做。

We can browse Visual studio Marketplace and use Amazon S3 SSIS Components (SSIS Productivity Pack) for our requirements:

我们可以浏览Visual Studio Marketplace并使用Amazon S3 SSIS组件 (SSIS生产力包)来满足我们的要求:

Amazon S3 SSIS Components

It enables SSIS package and Amazon S3 SSIS bucket integration without writing any specific code. You can easily use this SSIS productivity pack to download complete the task effectively and efficiently.

它无需编写任何特定代码即可启用SSIS软件包和Amazon S3 SSIS存储桶集成。 您可以轻松地使用此SSIS生产力包来高效地下载完成任务。

Note: It is trial version software, and you can use this to test out software functionality using development tools such as SSDT. You do not require a license in this case. Read more about licensing and available licenses using the link:

注意:这是试用版软件,您可以使用它来使用开发工具(例如SSDT)测试软件功能。 在这种情况下,您不需要许可证。 使用链接阅读有关许可和可用许可的更多信息:

Amazon S3 SSIS Components licensing

Download the appropriate 64-bit or 32-bit. Double-click on the installer and follow up the installation wizard for installing SSIS productivity pack with default configurations:

下载适当的64位或32位。 双击安装程序,然后遵循安装向导以默认配置安装SSIS生产力包:

Download appropriate version

Amazon S3 SSIS软件包配置 (Amazon S3 SSIS package configuration)

Once you installed SSIS Productivity Pack, launch Visual Studio (SSDT). I am using SSDT Visual Studio 2015 in this article.

安装SSIS生产力包后,启动Visual Studio(SSDT)。 我在本文中使用SSDT Visual Studio 2015。

S3铲斗连接 (S3 bucket connection)

The first step is to create a connection pointing to the Amazon S3 bucket. Right-click on the connection manager and choose New connection:

第一步是创建指向Amazon S3存储桶的连接。 右键单击连接管理器,然后选择新建连接

Create a new connection

It opens a list of available connection managers. We require connecting with the AWS S3 bucket; therefore, select the Amazon S3 and click on Add. It opens the Amazon S3 Connection Manager.

它打开可用连接管理器的列表。 我们需要连接到AWS S3存储桶; 因此,选择Amazon S3并单击Add 。 它将打开Amazon S3连接管理器。

In General page of the connection manager, we specify connection properties and login credentials to the AWS S3 bucket:

在连接管理器的“常规”页面中,我们为AWS S3存储桶指定连接属性和登录凭证:

  • Access and secret key: We specify access and secret key on an IAM user. We get these keys from the IAM console at AWS web services

    访问和密钥:我们在IAM用户上指定访问和密钥。 我们从AWS Web Services的IAM控制台获取这些密钥

    • Logon to AWS Console and navigate to IAM in the services menu

      登录到AWS控制台并在服务菜单中导航到IAM
    • Navigate to Users and click on a specific user

      导航到用户,然后单击特定用户
    • In the user, properties go to the Security credentials tab

      在用户中,属性转到“安全凭据”选项卡
    • Choose the access key and download the secret key in a CSV file

      选择访问密钥并将秘密密钥下载为CSV文件

    Security credentials

  • You need to store the access key and secret key in a secured location. Specify it in the Amazon S3 Connection Manager. It authenticates user account in Amazon services using these keys, and you can see the bucket name in the drop-down list

    您需要将访问密钥和秘密密钥存储在安全的位置。 在Amazon S3 Connection Manager中指定它。 它使用这些密钥对Amazon服务中的用户帐户进行身份验证,并且您可以在下拉列表中看到存储桶名称。

    Amazon S3 connection Manager

    Click on Test Connection, and it checks for connection to the Amazon S3 SSIS bucket. Once the connection is successful, you get the following message:

    单击Test Connection ,它会检查与Amazon S3 SSIS存储桶的连接。 连接成功后,您会收到以下消息:

    Test connection

    Click OK, and we can see AWS S3 Connection in the connection manager:

    单击OK ,我们可以在连接管理器中看到AWS S3 Connection:

    New configured connection

添加数据流任务 (Add a Data Flow task)

In the next step, add a data flow task in the SSIS package for the Amazon S3 SSIS bucket:

在下一步中,在SSIS包中为Amazon S3 SSIS存储桶添加数据流任务:

data flow task

Rename the data flow task as AWS S3 Data Import:

将数据流任务重命名为AWS S3 Data Import:

Rename the data flow task as AWS S3 Data Import

Double-click on this task, and it takes you to the data flow screen. In the SSIS toolbox on the data flow tab, you can see options for Amazon S3 bucket:

双击此任务,它将带您到数据流屏幕。 在“数据流”选项卡上的SSIS工具箱中,您可以看到Amazon S3存储桶的选项:

  • Amazon S3 Destination

    Amazon S3目标
  • Amazon S3 Source

    Amazon S3来源
  • Premium Flat File source

    高级平面文件源
  • Premium Flat File destination

    高级平面文件目标

Drag Premium Flat File Source in the data flow task from the SSIS toolbox:

从SSIS工具箱的数据流任务中拖动Premium Flat File Source:

Premium Flat file source

Double-click on Premium Flat File Source and it opens the editor window for configuration:

双击Premium Flat File Source ,它会打开编辑器窗口进行配置:

  • Connection Manager: Select the existing connection to the S3 bucket from the drop-down. If you do not have an existing connection to the S3 bucket, click on New. You require specifying keys (access and security key) information, as shown in previous steps 连接管理器:从下拉列表中选择与S3存储桶的现有连接。 如果您没有与S3存储桶的现有连接,请单击“新建”。 您需要指定密钥(访问和安全密钥)信息,如先前步骤所示
  • Source File Path: It is the CSV file path in the S3 bucket. Click on eclipse and browse to the CSV file path. You might have multiple files in the S3 bucket. The source file path should point to a valid CSV file 源文件路径:这是S3存储桶中的CSV文件路径。 单击eclipse,然后浏览到CSV文件路径。 S3存储桶中可能有多个文件。 源文件路径应指向有效的CSV文件

Click on eclipse (…) on the source file path and browse to the CSV file path:

在源文件路径上单击eclipse(…),然后浏览到CSV文件路径:

Specify the path of CSV Amazon S3 SSIS

We can see the CSV file in the Source item path column.

我们可以在“源项目路径”列中看到CSV文件。

  • The first row has the header: My CSV file first column contains column headers. Remove the check from here if the CSV file does not contains headers 第一行包含标题:我的CSV文件的第一列包含列标题。 如果CSV文件不包含标题,请从此处取消选中
  • Skip Empty Rows: CSV file might contain empty rows. Put a check on this to skip the empty rows 跳过空行: CSV文件可能包含空行。 对此进行检查以跳过空行

Premium flat file source editor

Click on columns tab to view excel file columns. We may uncheck the columns if we do not want to import them into SQL Server tables. It also shows the data type and length automatically. We can change the column data types and their properties in this window. For this demonstration, let’s go with default columns properties in the Amazon S3 SSIS package:

单击列选项卡以查看excel文件列。 如果我们不想将其导入到SQL Server表中,则可以取消选中这些列。 它还会自动显示数据类型和长度。 我们可以在此窗口中更改列数据类型及其属性。 对于此演示,让我们使用Amazon S3 SSIS包中的默认列属性:

Columns and their properties

Click OK and add an OLE DB destination connection. This OLE DB destination should point to the SQL Server instance as per our requirement. Right-click and configure the destination with the SQL Server connection and SQL Server table. You also need to do the mapping between source and destination columns:

单击确定,然后添加一个OLE DB目标连接。 根据我们的要求,此OLE DB目标应指向SQL Server实例。 右键单击并使用SQL Server连接和SQL Server表配置目标。 您还需要在源列和目标列之间进行映射:

OLE DB destination connection

Once the OLE DB destination connection is successful, you can see the package as shown below:

OLE DB目标连接成功后,您将看到如下所示的包:

Configured SSIS package

Before executing the Amazon S3 SSIS package, rename the tasks as follows:

在执行Amazon S3 SSIS软件包之前,请按以下步骤重命名任务:

  • Premium Flat File Source -> AWS S3 Bucket CSV file

    高级平面文件源-> AWS S3 Bucket CSV文件
  • OLE DB Destination editor: SQL Server destination table

    OLE DB目标编辑器:SQL Server目标表

    Rename the SSIS package tasks

Execute the SSIS package, and it transfers 100 rows from the source CSV file to SQL Server table.

执行SSIS程序包,并将其从源CSV文件传输100行到SQL Server表。

Execute the SSIS package

Verify the records in the SQL Server table.

验证SQL Server表中的记录。

Verify the records in SQL Server

结论 (Conclusion)

In this article, we explored the method to integrate the SSIS package and the Amazon S3 SSIS bucket. We can use the component specified in this article for data import, export between On-premise and cloud systems as well.

在本文中,我们探讨了将SSIS软件包和Amazon S3 SSIS存储桶集成的方法。 我们也可以将本文中指定的组件用于内部和云系统之间的数据导入和导出。

翻译自: https://www.sqlshack.com/data-import-from-amazon-s3-ssis-bucket-using-an-integration-service-ssis-package/

ssis 循环导入数据

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值