用python导入数据(三)

关系型数据库

1、启动SQL引擎。 将创建一个引擎以连接到工作目录中的SQLite数据库'Chinook.sqlite'。

# Import necessary module
from sqlalchemy import create_engine

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

这里,'sqlite:///Chinook.sqlite'被称为SQLite数据库Chinook.sqlite的连接字符串。

2、使用引擎上的方法table_names()可以知道数据库里面有哪些表。

# Import necessary module
from sqlalchemy import create_engine

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Save the table names to a list: table_names
table_names = engine.table_names()

# Print the table names to the shell
print(table_names)

<script.py> output:
    ['Album', 'Artist', 'Customer', 'Employee', 'Genre', 'Invoice', 'InvoiceLine', 'MediaType', 'Playlist', 'PlaylistTrack', 'Track']

3、使用python对数据库进行查询操作。(方法一)

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Open engine connection: con
con = engine.connect()

# Perform query: rs
rs = con.execute('select * from Album')

# Save results of the query to DataFrame: df
df = pd.DataFrame(rs.fetchall())

# Close connection
con.close()

# Print head of DataFrame df
print(df.head())

<script.py> output:
       0                                      1  2
    0  1  For Those About To Rock We Salute You  1
    1  2                      Balls to the Wall  2
    2  3                      Restless and Wild  2
    3  4                      Let There Be Rock  1
    4  5                               Big Ones  3

自定义查询    fetchmany(size=3)—— size:要返回的行数,  fetchall()——取全部

使用 df.columns = rs.keys() 将列索引 换为 真实的索引。

from sqlalchemy import create_engine
import pandas as pd

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute('select LastName,Title from Employee')
    df = pd.DataFrame(rs.fetchmany(size=3))
    df.columns = rs.keys()

# Print the length of the DataFrame df
print(len(df))

# Print the head of the DataFrame df
print(df.head())

<script.py> output:
    3
      LastName                Title
    0    Adams      General Manager
    1  Edwards        Sales Manager
    2  Peacock  Sales Support Agent

不使用df.columns = rs.keys() 结果:

   0                    1
0    Adams      General Manager
1  Edwards        Sales Manager
2  Peacock  Sales Support Agent

使用SQL的WHERE过滤数据库记录

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute('select * from Employee where EmployeeId >= 6')
    df = pd.DataFrame(rs.fetchall())
    df.columns = rs.keys()

# Print the head of the DataFrame df
print(df.head())

<script.py> output:
       EmployeeId  LastName FirstName       Title  ReportsTo            BirthDate  \
    0           6  Mitchell   Michael  IT Manager          1  1973-07-01 00:00:00   
    1           7      King    Robert    IT Staff          6  1970-05-29 00:00:00   
    2           8  Callahan     Laura    IT Staff          6  1968-01-09 00:00:00   
    
                  HireDate                      Address        City State Country  \
    0  2003-10-17 00:00:00         5827 Bowness Road NW     Calgary    AB  Canada   
    1  2004-01-02 00:00:00  590 Columbia Boulevard West  Lethbridge    AB  Canada   
    2  2004-03-04 00:00:00                  923 7 ST NW  Lethbridge    AB  Canada   
    
      PostalCode              Phone                Fax                    Email  
    0    T3B 0C5  +1 (403) 246-9887  +1 (403) 246-9899  michael@chinookcorp.com  
    1    T1K 5N8  +1 (403) 456-9986  +1 (403) 456-8485   robert@chinookcorp.com  
    2    T1H 1Y8  +1 (403) 467-3351  +1 (403) 467-8772    laura@chinookcorp.com

使用ORDER BY排序SQL记录

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Open engine in context manager
with engine.connect() as con:
    rs = con.execute('select * from Employee order by BirthDate')
    df = pd.DataFrame(rs.fetchall())

    # Set the DataFrame's column names
    df.columns = rs.keys()

# Print head of DataFrame
print(df.head())

<script.py> output:
       EmployeeId  LastName FirstName                Title  ReportsTo  \
    0           4      Park  Margaret  Sales Support Agent        2.0   
    1           2   Edwards     Nancy        Sales Manager        1.0   
    2           1     Adams    Andrew      General Manager        NaN   
    3           5   Johnson     Steve  Sales Support Agent        2.0   
    4           8  Callahan     Laura             IT Staff        6.0   
    
                 BirthDate             HireDate              Address        City  \
    0  1947-09-19 00:00:00  2003-05-03 00:00:00     683 10 Street SW     Calgary   
    1  1958-12-08 00:00:00  2002-05-01 00:00:00         825 8 Ave SW     Calgary   
    2  1962-02-18 00:00:00  2002-08-14 00:00:00  11120 Jasper Ave NW    Edmonton   
    3  1965-03-03 00:00:00  2003-10-17 00:00:00         7727B 41 Ave     Calgary   
    4  1968-01-09 00:00:00  2004-03-04 00:00:00          923 7 ST NW  Lethbridge   
    
      State Country PostalCode              Phone                Fax  \
    0    AB  Canada    T2P 5G3  +1 (403) 263-4423  +1 (403) 263-4289   
    1    AB  Canada    T2P 2T3  +1 (403) 262-3443  +1 (403) 262-3322   
    2    AB  Canada    T5K 2N1  +1 (780) 428-9482  +1 (780) 428-3457   
    3    AB  Canada    T3B 1Y7   1 (780) 836-9987   1 (780) 836-9543   
    4    AB  Canada    T1H 1Y8  +1 (403) 467-3351  +1 (403) 467-8772   
    
                          Email  
    0  margaret@chinookcorp.com  
    1     nancy@chinookcorp.com  
    2    andrew@chinookcorp.com  
    3     steve@chinookcorp.com  
    4     laura@chinookcorp.com

4、直接使用pandas 查询 利用pandas的强大功能将SQL查询的结果写入一个DataFrame 

pd.read_sql_query('select * from Employee ',engine)

如下代码为两种方法对比:

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine("sqlite:///Chinook.sqlite")

# Execute query and store records in DataFrame: df
df = pd.read_sql_query('select * from Album', engine)

# Print head of DataFrame
print(df.head())

# Open engine in context manager and store query result in df1
with engine.connect() as con:
    rs = con.execute("SELECT * FROM Album")
    df1 = pd.DataFrame(rs.fetchall())
    df1.columns = rs.keys()

# Confirm that both methods yield the same result
print(df.equals(df1))

<script.py> output:
       AlbumId                                  Title  ArtistId
    0        1  For Those About To Rock We Salute You         1
    1        2                      Balls to the Wall         2
    2        3                      Restless and Wild         2
    3        4                      Let There Be Rock         1
    4        5                               Big Ones         3
    True

使用where 和order by

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Execute query and store records in DataFrame: df
df = pd.read_sql_query('select * from Employee where EmployeeId >=6 order by BirthDate',engine)

# Print head of DataFrame
print(df.head())

SQL的强大之处在于表之间的关系:INNER JOIN

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///Chinook.sqlite')

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute('select Title,Name from Album inner join Artist on Album.ArtistID = Artist.ArtistID')
    df = pd.DataFrame(rs.fetchall())
    df.columns =rs.keys()

# Print head of DataFrame df
print(df.head())

过滤你的INNER JOIN

df = pd.read_sql_query('select * from PlaylistTrack inner join Track on PlaylistTrack.TrackId =Track.TrackId where Milliseconds <250000',engine)

 

  • 0
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值