在SQL Server中导入和使用CSV文件

介绍 (Introduction)

CSV (comma separated values) is one of the most popular formats for datasets used in machine learning and data science. MS Excel can be used for basic manipulation of data in CSV format. We often need to execute complex SQL queries on CSV files, which is not possible with MS Excel. See this article for what is possible with Power BI.

CSV(逗号分隔值)是机器学习和数据科学中使用的数据集最流行的格式之一。 MS Excel可用于CSV格式的基本数据处理。 我们经常需要对CSV文件执行复杂SQL查询,而MS Excel无法做到。 有关Power BI的功能,请参阅本文。

However, before we can execute complex SQL queries on CSV files, we need to convert CSV files to data tables.

但是,在可以对CSV文件执行复杂SQL查询之前,我们需要将CSV文件转换为数据表。

There are many methods of converting CSV data into a database table format. One of the ways is to create a new table and copy all the data from the CSV file to the table. However, copy and pasting data can be extremely cumbersome and time-consuming if the dataset is very large.

有很多方法可以将CSV数据转换为数据库表格式。 一种方法是创建一个新表并将所有数据从CSV文件复制到该表。 但是,如果数据集很大,则复制和粘贴数据可能会非常繁琐且耗时。

Another way is to write a script that reads the CSV data and inserts it into the corresponding data table. This method is faster than copy-pasting but it still requires a manual script. There is a better way to achieve this if you are lucky enough to use SQL Server. In SQL Server, you can simply import and convert CSV files into data tables with a few mouse clicks using SSMS (SQL Server Management Studio).

另一种方法是编写一个脚本,以读取CSV数据并将其插入相应的数据表中。 此方法比复制粘贴要快,但仍需要手动脚本。 如果您足够幸运地使用SQL Server,则有更好的方法来实现此目的。 在SQL Server中,只需单击几下,即可使用SSMS(SQL Server Management Studio)将CSV文件导入并转换为数据表。

In this article, we will see how we can import CSV data into SQL server using SSMS, convert it into data tables, and execute SQL queries on the tables that we generate.

在本文中,我们将看到如何使用SSMS将CSV数据导入SQL Server,如何将其转换为数据表以及如何对生成的表执行SQL查询。

导入CSV文件 (Importing a CSV file)

The file that we are going to import contains 178 records of wine information. The file can be downloaded from this Github repository. I have named the file “wine.csv”, you can give it any name.

我们将要导入的文件包含178条葡萄酒信息记录。 可以从此Github存储库下载文件。 我已将文件命名为“ wine.csv”,您可以使用任何名称。

You can import a CSV file into a specific database. Let’s first create a dummy database named ‘Bar’ and try to import the CSV file into the Bar database.

您可以将CSV文件导入特定的数据库。 让我们首先创建一个名为“ Bar”的虚拟数据库,然后尝试将CS​​V文件导入Bar数据库。

In the SSMS, execute the following script to create the database:

在SSMS中,执行以下脚本来创建数据库:

CREATE DATABASE Bar

Now follow these steps to import CSV file into SQL Server Management Studio. This process is valid for all different types of flat files, however for the sake of this demonstration we will use a CSV file.

现在,请按照以下步骤将CSV文件导入SQL Server Management Studio。 此过程对所有不同类型的平面文件均有效,但是出于演示目的,我们将使用CSV文件。

Step 1: Selecting the database

步骤1:选择数据库

The first step is to go to the Object Explorer -> [Database] -> Tasks and click the “Import Flat File …” as shown in the following figure:

第一步是转到对象资源管理器-> [数据库]->任务,然后单击“导入平面文件…”,如下图所示:

Once you select the “Import Flat File …” option, you will see a new window containing brief information about how to import the flat files, as shown below:

选择“导入平面文件...”选项后,您将看到一个新窗口,其中包含有关如何导入平面文件的简要信息,如下所示:

Simply click the “Next” button.

只需单击“下一步”按钮。

Step 2: Select the file to import

步骤2:选择要导入的文件

When you click the “Next” button, a window will appear, prompting you to select the file to import and to specify the corresponding table name. Select the file and enter the name that you want to give to the table that will contain data from your CSV file. I named the table as “wine” (which is the default name) as shown in the figure below:

当您单击“下一步”按钮时,将出现一个窗口,提示您选择要导入的文件并指定相应的表名。 选择文件,然后输入要提供给表的名称,该表将包含来自CSV文件的数据。 我将表命名为“ wine”(这是默认名称),如下图所示:

Click the “Next” button again.

再次单击“下一步”按钮。

If your file is already open or being used by any other application, you will see the following error:

如果您的文件已经打开或正在被任何其他应用程序使用,您将看到以下错误:

Simply close the CSV file and/or any other application that has access to the CSV file, go to the previous screen and click the “Next” button again to fix this.

只需关闭CSV文件和/或有权访问CSV文件的任何其他应用程序,转到上一个屏幕,然后再次单击“下一步”按钮即可解决此问题。

Step 3: Preview the data

步骤3:预览资料

If your file has loaded successfully, a new window will appear containing a preview of your file, as shown in the following figure:

如果文件已成功加载,则会出现一个新窗口,其中包含文件预览,如下图所示:

You can see that the preview contains data from our CSV file. Click the “Next” button.

您可以看到预览中包含来自CSV文件的数据。 点击“下一步”按钮。

Step 4: Modify columns

步骤4:修改栏

Once you click the “Next” button on the “Preview Data” screen, you will see the “Modify Columns” screen where you can modify the details of the columns of your table, as shown in the following screenshot:

单击“预览数据”屏幕上的“下一步”按钮后,将看到“修改列”屏幕,您可以在其中修改表的列的详细信息,如以下屏幕截图所示:

In the above figure, you can see that each column in the CSV file has been assigned a default data type. The default types should work in the most cases. However, if you want, you can change the data type by simply clicking the drop-down list in front of each column as shown below:

在上图中,您可以看到CSV文件中的每一列都被分配了默认数据类型。 在大多数情况下,默认类型应适用。 但是,如果需要,您可以通过单击每列前面的下拉列表来更改数据类型,如下所示:

Furthermore, you can also set a column as the Primary key and allow Null values for any column. Once you are done with modifying the columns, click the “Next” button.

此外,您还可以将一列设置为主键,并为任何列允许Null值。 修改完列后,单击“下一步”按钮。

Step 5: Viewing the summary

步骤5:查看摘要

A new window displaying the summary of the operation including the name of the Server, Database, Table and the file that is being imported will be displayed. This is shown in the following screenshot:

将显示一个新窗口,其中显示操作摘要,包括服务器名称,数据库名称,表名称和要导入的文件。 在下面的屏幕快照中显示了此内容:

Click the “Finish” button.

点击“完成”按钮。

Step 6: Process completion

步骤6:流程完成

Depending upon the size of the file, the SQL Server will take some time to import the file. Once the file is successfully imported, without any errors, you will see the following screen:

根据文件的大小,SQL Server将花费一些时间来导入文件。 成功导入文件后,没有任何错误,您将看到以下屏幕:

Working with a CSV file

使用CSV文件

We have successfully imported our CSV file to our SQL Server data table. We can now execute SQL queries on this table.

我们已成功将CSV文件导入到SQL Server数据表中。 现在,我们可以在此表上执行SQL查询。

Since we imported the file into our “Bar” database in the “wine” table, the “Bar” database should contain a table named “wine”. Go to Object Explorer-> Databases -> Bar-> Tables, here you should see your data table “dbo.wine”, as shown below:

由于我们已将文件导入“葡萄酒”表中的“酒吧”数据库,因此“酒吧”数据库应包含一个名为“葡萄酒”的表。 转到“对象资源管理器”->“数据库”->“栏”->“表”,在这里您应该看到数据表“ dbo.wine”,如下所示:

Execute the following query to SELECT all the records from the newly created “wine” table.

执行以下查询,以从新创建的“葡萄酒”表中选择所有记录。

USE Bar
SELECT * FROM wine

The output looks like this:

输出看起来像这样:

Similarly, we can filter records based on the value in one more column. Let’s filter all those records where “Phenols” has a value greater than 3.5.

同样,我们可以根据另一栏中的值过滤记录。 让我们过滤所有那些“苯酚”值大于3.5的记录。

USE Bar
SELECT * FROM wine
WHERE Phenols > 3.5

The output looks like this:

输出看起来像这样:

You can see that only those records are retrieved where the value of “Phenols” column is greater than 3.5.

您可以看到只有“酚”列的值大于3.5的那些记录才被检索到。

In short, you can perform all the operations on the imported “wine” table that you perform on any SQL table, which you create from scratch in SQL Server.

简而言之,您可以对在任何SQL表上执行的导入“ wine”表执行所有操作,该表是在SQL Server中从头开始创建的。

结论 (Conclusion)

CSV is a very common format, especially for machine learning and data science datasets. SQL Server can be used to preprocess CSV data more effectively than MS Excel. In this article, we saw how a CSV file can be imported into SQL Server via SSMS and how basic SQL operations can be performed on the table that is created as a result of importing the CSV file.

CSV是一种非常常见的格式,尤其是对于机器学习和数据科学数据集而言。 与MS Excel相比,SQL Server可更有效地预处理CSV数据。 在本文中,我们了解了如何通过SSMS将CSV文件导入到SQL Server中,以及如何对由于导入CSV文件而创建的表执行基本SQL操作。

本的其他精彩文章 (Other great articles from Ben)

Machine Learning Services – Configuring R Services in SQL Server
Importing and Working with CSV Files in SQL Server
机器学习服务–在SQL Server中配置R服务
在SQL Server中导入和使用CSV文件

翻译自: https://www.sqlshack.com/importing-and-working-with-csv-files-in-sql-server/

  • 3
    点赞
  • 21
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值