Connecting to SQL Server in Python

“SQL is an important tool for any data scientist” — is my entry for the understatement of the year.

As information jockeys, data scientists use SQL to query established databases for the data they need for analysis. Depending on the demands of your position, you might even be asked to help build a database from your employer's data sources, and SQL is helpful there too. Selecting only the information pertinent to the current project at the outset is an easy habit to learn, and it saves time over the life of the project.

Early in my journey to learn data science, I wanted to use Python with data pulled from a Microsoft SQL Server database. After learning about sqlite3 in Python, I figured out how to connect to the server directly and pull data straight into a Jupyter notebook, which streamlined my work. Continue reading if you would like to learn how to do this!

If you're not familiar with the sqlite3 Python module, it provides a SQL interface to SQLite databases. There is plenty of literature on how to use SQLite in Python, but I found this blog post to be most helpful.

If you aren't building your own database with SQLite and you want to connect to a database that is already established, you can use the pyodbc module. Pyodbc lets you connect to a database management system (DBMS) through an ODBC driver. A quick pip install should get you up and running, but you can find full installation docs here. Once you have pyodbc installed (full documentation for pyodbc can be found here), you will need to install the ODBC driver for the DBMS you want to connect to; a quick Google search should direct you to the appropriate download.

Now I'll walk through an example of connecting to a Microsoft SQL Server database and running a query against it.

As with the sqlite3 module, to connect to the database you will need to establish a connection object to represent it. Pyodbc passes an ODBC connection string to the local driver manager, which in turn calls the relevant database driver, which in turn calls the database to request the connection.

[Image: Generate a connection object]

For Microsoft SQL Server 2017, there are a number of connection strings to choose from, depending on whether you are connecting with standard security, using a trusted connection, or enabling Multiple Active Result Sets (MARS). I used the following for a trusted connection:

'Driver={ODBC Driver 13 for SQL Server};Server=MyServerAddress;Database=myDataBase;Trusted_Connection=yes;'
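As a minimal sketch of how that string might be turned into a connection object: the helper names `build_connection_string` and `connect` below are hypothetical, and the server and database names are the placeholders from the string above. The pyodbc import sits inside `connect` so that the string-building part can be tried without the driver installed.

```python
def build_connection_string(server, database,
                            driver="ODBC Driver 13 for SQL Server"):
    """Assemble a trusted-connection ODBC string for SQL Server.

    Hypothetical helper: the keys mirror the connection string in the text.
    """
    return (
        f"Driver={{{driver}}};"      # doubled braces emit literal { and }
        f"Server={server};"
        f"Database={database};"
        "Trusted_Connection=yes;"
    )


def connect(server, database):
    """Open a pyodbc connection (requires pyodbc and the ODBC driver)."""
    import pyodbc  # third-party: pip install pyodbc
    return pyodbc.connect(build_connection_string(server, database))


# Placeholder names; replace them with your own server and database.
conn_str = build_connection_string("MyServerAddress", "myDataBase")
print(conn_str)
```

Calling `connect("MyServerAddress", "myDataBase")` would then return the connection object used in the rest of the walkthrough, assuming the named driver is installed on your machine.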

Find other connection strings configured for numerous DBMSs here.

Once you have your connection established, there are two ways to run a query and retrieve its results. The first is to create a cursor object. A cursor enables traversal over the rows of a result set pulled from a database, and it lets you process individual rows by managing the context of fetch operations. Calling the connection's .cursor() method returns a DB-API style cursor that supports operations such as execute(), fetchone(), fetchmany(), and fetchall().

[Image: Fetch all rows to dataframe]

After executing a query, you can choose how much of the result set to pull: fetchone() returns just the next row, fetchall() returns all remaining rows, and fetchmany(x) returns up to the next x remaining rows as a list of tuple-like row objects.
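The fetch methods can be sketched as follows. Because pyodbc and sqlite3 both follow the Python DB-API, this example uses an in-memory SQLite database as a stand-in so it runs without a server; with pyodbc, only the connect call should need to change. The `sales` table and its rows are made up for the demonstration.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs without a server

# With pyodbc this line would instead be: conn = pyodbc.connect(conn_str)
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table and data, created only for this demonstration.
cursor.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
cursor.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "East", 10.0), (2, "West", 20.0), (3, "East", 30.0), (4, "North", 40.0)],
)

cursor.execute("SELECT id, region, amount FROM sales ORDER BY id")

# Column names are available from the cursor once a SELECT has run.
columns = [col[0] for col in cursor.description]

first_row = cursor.fetchone()    # the next single row
next_two = cursor.fetchmany(2)   # up to two of the remaining rows
remaining = cursor.fetchall()    # everything that is left

print(columns)
print(first_row, next_two, remaining)

conn.close()
```

Each fetch call picks up where the previous one stopped, which is why the three results do not overlap.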

The second method is to use the pandas.read_sql function, a convenience wrapper around pandas.read_sql_table and pandas.read_sql_query. It takes a written query or a database table name, along with the connection object, and feeds the result directly into a dataframe (reading by table name requires a SQLAlchemy connection rather than a raw DB-API one).
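A minimal sketch of the pandas route, assuming pandas is installed: `query_to_dataframe` is a hypothetical wrapper name, and the table in the commented usage is a placeholder. Recent pandas versions may emit a warning when given a raw pyodbc connection, since they officially test only SQLAlchemy connectables and sqlite3 connections, though the query generally still runs.

```python
def query_to_dataframe(sql, conn):
    """Run a SQL query on an open connection and return a DataFrame.

    Hypothetical helper wrapping pandas.read_sql; conn is a DB-API
    connection such as the one returned by pyodbc.connect().
    """
    import pandas as pd  # third-party: pip install pandas
    return pd.read_sql(sql, conn)


# Hypothetical usage with a pyodbc connection (names are placeholders):
#   conn = pyodbc.connect(conn_str)
#   df = query_to_dataframe("SELECT TOP 10 * FROM dbo.MyTable", conn)
#   df.head()
```

Compared with the cursor approach, this skips the manual fetch step and picks up the column names from the query automatically.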

[Image: Read a query straight into a dataframe]

And just like that you have your data ready to use in a dataframe!

Some helpful tips to keep in mind:

  • Queries can be written in Python exactly as they would be in a DBMS.
  • Convert data types directly in the query to reduce conversion operations.
  • When writing long queries, wrap the query string in triple quote marks. This allows the string to span multiple lines, so you can indent for readability.
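The tips above can be sketched together in one example: a triple-quoted, indented query that converts a column's type with CAST inside the SQL itself. As an assumption for runnability, it uses an in-memory SQLite database in place of SQL Server, and the `orders` table is invented for the demonstration; the same query string could be passed to a pyodbc cursor.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs without a server

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table where totals were stored as text.
cursor.execute("CREATE TABLE orders (order_id INTEGER, total TEXT)")
cursor.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "19.99"), (2, "5.00")],
)

# Triple quotes let the query span several lines with indentation.
# CAST converts the text column to a number inside the query, so no
# conversion step is needed in Python afterwards.
query = """
    SELECT
        order_id,
        CAST(total AS REAL) AS total
    FROM orders
    WHERE CAST(total AS REAL) > 10
    ORDER BY order_id
"""

rows = cursor.execute(query).fetchall()
print(rows)

conn.close()
```

The extra whitespace inside the string is ignored by the database, so the indentation is purely for the reader.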

Hopefully this helps you streamline your data sourcing!

Translated from: https://medium.com/@zachary.a.zazueta/connecting-to-sql-server-in-python-bc3c6b7aafb7
