Python Data Analysis (Working with Databases)

This article covers the following topics:

  • Lightweight access with sqlite3
  • Accessing databases through pandas
  • Installing and setting up SQLAlchemy
  • Populating a database with SQLAlchemy
  • Querying a database with SQLAlchemy
  • Pony ORM
  • Dataset: databases for lazy people
  • PyMongo and MongoDB
  • Storing data in Redis
  • Apache Cassandra

1. Lightweight Access with sqlite3

SQLite is a very popular relational database. Because it is so lightweight, it is embedded in a huge number of applications. The sqlite3 module ships with the standard Python distribution and can be used to work with SQLite databases. A database can be stored in a file or kept in memory; here we keep it in memory.
Code:
import sqlite3

with sqlite3.connect(":memory:") as con:
    c = con.cursor()  # create a cursor
    # create a table; text and real are the string and numeric column types
    c.execute('''CREATE TABLE sensors(date text, city text, code text, sensor_id real, temperature real)''')
    for table in c.execute("SELECT name FROM sqlite_master WHERE type='table'"):
        print("Table", table[0])
    c.execute("INSERT INTO sensors VALUES ('2016-11-05','Utrecht','Red',42,15.14)")
    c.execute("SELECT * FROM sensors")
    print(c.fetchone())  # print the inserted record
    con.execute("DROP TABLE sensors")  # drop the table
    print("# of tables", c.execute("SELECT COUNT(*) FROM sqlite_master WHERE type='table'").fetchone()[0])
c.close()
Run result:
Table sensors
('2016-11-05', 'Utrecht', 'Red', 42.0, 15.14)
# of tables 0
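The database above lives only in memory and disappears when the program exits. Passing a file path instead of ":memory:" persists it on disk. A minimal sketch (the file name sensors_demo.db is arbitrary), also using a parameterized INSERT, which is the safer way to splice values into SQL:

```python
import os
import sqlite3

DB_FILE = "sensors_demo.db"  # arbitrary file name; created on first connect

con = sqlite3.connect(DB_FILE)
con.execute("CREATE TABLE sensors(date text, city text, code text, sensor_id real, temperature real)")
# parameterized INSERT: the ? placeholders handle quoting and prevent SQL injection
con.execute("INSERT INTO sensors VALUES (?, ?, ?, ?, ?)",
            ("2016-11-05", "Utrecht", "Red", 42, 15.14))
con.commit()
con.close()

# the data survives in the file and is visible to a fresh connection
con = sqlite3.connect(DB_FILE)
print(con.execute("SELECT city, temperature FROM sensors").fetchone())  # ('Utrecht', 15.14)
con.close()

os.remove(DB_FILE)  # clean up
```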

2. Accessing Databases Through pandas

First install the statsmodels library:
pip install statsmodels
This library includes the sunspot cycle dataset.
Code:
import sqlite3

import statsmodels.api as sm
from pandas import read_sql

with sqlite3.connect(":memory:") as con:
    c = con.cursor()
    data_loader = sm.datasets.sunspots.load_pandas()  # load the sunspots dataset
    df = data_loader.data
    rows = [tuple(x) for x in df.values]

    con.execute("CREATE TABLE sunspots(year, sunactivity)")
    con.executemany("INSERT INTO sunspots(year, sunactivity) VALUES (?, ?)", rows)  # run the INSERT once per row
    c.execute("SELECT COUNT(*) FROM sunspots")  # count the rows in the table
    print(c.fetchone())
    # rowcount is the number of affected rows
    print("Deleted", con.execute("DELETE FROM sunspots WHERE sunactivity > 20").rowcount, "rows")
    # given a database connection, pandas' read_sql runs a query and returns a DataFrame
    print(read_sql("SELECT * FROM sunspots WHERE year < 1732", con))
    con.execute("DROP TABLE sunspots")
c.close()
Run result:
(309,)
Deleted 217 rows
      year  sunactivity
0   1700.0          5.0
1   1701.0         11.0
2   1702.0         16.0
3   1707.0         20.0
4   1708.0         10.0
5   1709.0          8.0
6   1710.0          3.0
7   1711.0          0.0
8   1712.0          0.0
9   1713.0          2.0
10  1714.0         11.0
11  1723.0         11.0
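The manual CREATE TABLE / executemany steps can also be delegated to pandas itself: DataFrame.to_sql writes a frame straight into a table, and read_sql brings it back. A minimal sketch with a small hand-made frame (the sunspots DataFrame would work the same way):

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({"year": [1700.0, 1701.0, 1702.0],
                   "sunactivity": [5.0, 11.0, 16.0]})

with sqlite3.connect(":memory:") as con:
    # write the DataFrame to a new table; index=False skips the row-index column
    df.to_sql("sunspots", con, index=False)
    out = pd.read_sql("SELECT * FROM sunspots WHERE sunactivity > 10", con)
    print(out)
```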

3. Installing and Setting Up SQLAlchemy

SQLAlchemy is known for its object-relational mapping (ORM), which is based on a design pattern that maps Python classes to database tables. In practice this adds an extra abstraction layer: we talk to the database through the SQLAlchemy API instead of issuing SQL commands. The benefit is that SQLAlchemy takes care of many details behind the scenes; the trade-off is that we have to learn its API, and performance can suffer.
Install SQLAlchemy with the following command:
pip install SQLAlchemy
We will use a small database with two tables: the first describes weather stations, and the second describes the sensors inside those stations. The code goes in the file alchemy_entities.py. You do not need to run this file directly, because it is imported by other scripts. Code:
from sqlalchemy import Column, ForeignKey, Integer, Float, String
from sqlalchemy.orm import declarative_base, relationship

# declarative base class
Base = declarative_base()

# weather station table
class Station(Base):
    __tablename__ = 'station'  # table name
    id = Column(Integer, primary_key=True)
    name = Column(String(14), nullable=False, unique=True)  # station name

    def __repr__(self):
        return "Id=%d name=%s" % (self.id, self.name)

# sensor table
class Sensor(Base):
    __tablename__ = 'sensor'  # table name
    id = Column(Integer, primary_key=True)
    last = Column(Integer)
    multiplier = Column(Float)
    station_id = Column(Integer, ForeignKey('station.id'))  # foreign key
    station = relationship(Station)

    def __repr__(self):
        return "Id=%d last=%d multiplier=%.1f station_id=%d" % (self.id, self.last, self.multiplier, self.station_id)

if __name__ == "__main__":
    print("This script is used by another script. Run python alchemy_query.py")
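To see exactly what DDL such a mapped class turns into, SQLAlchemy can compile a table definition to its CREATE TABLE statement. A small stand-alone sketch (it defines its own copy of the Station model rather than importing alchemy_entities.py):

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base
from sqlalchemy.schema import CreateTable

Base = declarative_base()

class Station(Base):
    __tablename__ = 'station'
    id = Column(Integer, primary_key=True)
    name = Column(String(14), nullable=False, unique=True)

# compile the mapped table to the CREATE TABLE statement the ORM would emit
ddl = str(CreateTable(Station.__table__))
print(ddl)
```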

4. Populating a Database with SQLAlchemy

Creating the tables is covered in the next section; here we prepare a script that populates the database. Don't run this script directly either, because it is imported by the script in the next section. Rows are inserted through a DBSession object, which in turn needs an engine; creating the engine is also covered in the next section. File: populate_db.py
Code:
from sqlalchemy.orm import sessionmaker
from alchemy_entities import Base, Sensor, Station

def populate(engine):
    # create a session bound to the engine
    DBSession = sessionmaker(bind=engine)
    session = DBSession()

    # create two stations
    de_bilt = Station(name='De Bilt')
    session.add(de_bilt)
    session.add(Station(name='Utrecht'))
    session.commit()
    print('Station', de_bilt)

    # add a sensor record
    temp_sensor = Sensor(last=20, multiplier=.1, station=de_bilt)
    session.add(temp_sensor)
    session.commit()
    print("Sensor", temp_sensor)

if __name__ == "__main__":
    print("This script is used by another script. Run python alchemy_query.py")

5. Querying a Database with SQLAlchemy

Put the following code in the file alchemy_query.py:

import os

from pandas import read_sql
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from alchemy_entities import Base, Station, Sensor
from populate_db import populate

# create the engine
engine = create_engine('sqlite:///demo.db')
# drop the tables if they exist
Base.metadata.drop_all(engine)
# create the tables
Base.metadata.create_all(engine)
populate(engine)
DBSession = sessionmaker(bind=engine)
session = DBSession()

# query the first row of the station table
station = session.query(Station).first()

# query all stations
print('all station', session.query(Station).all())
# query all sensors
print('all sensor', session.query(Sensor).all())
# query the first station's sensor
print('query sensor by station', session.query(Sensor).filter(Sensor.station == station).one())
# query with pandas' read_sql
print('read_sql all station', read_sql("SELECT * FROM station", engine))

# remove the database file; this may fail if the file is still open
try:
    os.remove('demo.db')
    print('Delete demo.db')
except OSError as e:
    # e.g. [WinError 32] The process cannot access the file because it is being used by another process
    print(e)
Run result:
Station Id=1 name=De Bilt
Sensor Id=1 last=20 multiplier=0.1 station_id=1
all station [Id=1 name=De Bilt, Id=2 name=Utrecht]
all sensor [Id=1 last=20 multiplier=0.1 station_id=1]
query sensor by station Id=1 last=20 multiplier=0.1 station_id=1
read_sql all station    id     name
0   1  De Bilt
1   2  Utrecht
[WinError 32] The process cannot access the file because it is being used by another process: 'demo.db'
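Beyond first(), all(), and one(), the session supports filtered and aggregated queries. A self-contained sketch (it defines its own tiny, illustrative model instead of importing alchemy_entities.py):

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class City(Base):
    __tablename__ = 'city'
    id = Column(Integer, primary_key=True)
    name = Column(String(14))

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add_all([City(name='Utrecht'), City(name='De Bilt')])
session.commit()

# filter_by matches keyword arguments against columns; count() issues SELECT COUNT(*)
utrecht = session.query(City).filter_by(name='Utrecht').one()
n = session.query(City).count()
print(utrecht.name, n)
```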

6. Pony ORM

Code:
import statsmodels.api as sm
from pony.orm import Database, db_session

db = Database('sqlite', ':memory:')  # an SQLite database kept in memory
with db_session:
    data_loader = sm.datasets.sunspots.load_pandas()
    df = data_loader.data
    df.to_sql("sunspots", db.get_connection())  # write the DataFrame to a new table
    print(db.select("count(*) FROM sunspots"))
Run result:
[309]

7. Dataset: Databases for Lazy People

dataset is a Python library that is essentially a wrapper around SQLAlchemy. It was designed to be as easy to use as possible, that is, to keep lazy people happy.
Code:
import dataset
import statsmodels.api as sm
from pandas import read_sql

db = dataset.connect('sqlite:///:memory:')
table = db["books"]  # tables are created on first use
table.insert(dict(title="Numpy Beginner's Guide", author="Ivan Idris"))
table.insert(dict(title="Numpy Cookbook", author="Ivan Idris"))
table.insert(dict(title="Learning Numpy", author="Ivan Idris"))
print(read_sql('SELECT * FROM books', db.executable))

data_loader = sm.datasets.sunspots.load_pandas()
df = data_loader.data
df.to_sql("sunspots", db.executable)  # db.executable is the underlying SQLAlchemy engine
table = db["sunspots"]

for row in table.find(_limit=5):
    print(row)
print("Tables", db.tables)


Run result:

   id      author                   title
0   1  Ivan Idris  Numpy Beginner's Guide
1   2  Ivan Idris          Numpy Cookbook
2   3  Ivan Idris          Learning Numpy
OrderedDict([('index', 0), ('YEAR', 1700.0), ('SUNACTIVITY', 5.0)])
OrderedDict([('index', 1), ('YEAR', 1701.0), ('SUNACTIVITY', 11.0)])
OrderedDict([('index', 2), ('YEAR', 1702.0), ('SUNACTIVITY', 16.0)])
OrderedDict([('index', 3), ('YEAR', 1703.0), ('SUNACTIVITY', 23.0)])
OrderedDict([('index', 4), ('YEAR', 1704.0), ('SUNACTIVITY', 36.0)])
Tables ['books', 'sunspots']

8. PyMongo and MongoDB

MongoDB is a document-oriented NoSQL database that stores documents in BSON, a JSON-like binary format. On Windows you can download the installer from the official website. For example, I installed it under D:\MongoDB and created the folders data, etc, and logs there. Then open a command prompt in D:\MongoDB\Server\3.4\bin and run `mongod --dbpath D:\MongoDB\data` to start the database server. PyMongo is the Python driver for MongoDB. Test code:
import json

import pandas as pd
import statsmodels.api as sm
from pymongo import MongoClient

client = MongoClient()  # connect to the local server
db = client.test_database
data_loader = sm.datasets.sunspots.load_pandas()
df = data_loader.data
# round-trip through JSON to get one dict per row
rows = list(json.loads(df.T.to_json()).values())
db.sunspots.insert_many(rows)
cursor = db['sunspots'].find({})
df = pd.DataFrame(list(cursor))
print(df)
db.drop_collection('sunspots')
Run result:
     SUNACTIVITY    YEAR                       _id
0           57.1  1916.0  5a1e979cb2faff280d618a1f
1          103.9  1917.0  5a1e979cb2faff280d618a20
2            9.6  1914.0  5a1e979cb2faff280d618a21
3           47.4  1915.0  5a1e979cb2faff280d618a22
4            3.6  1912.0  5a1e979cb2faff280d618a23
5            1.4  1913.0  5a1e979cb2faff280d618a24
6           18.6  1910.0  5a1e979cb2faff280d618a25
7            5.7  1911.0  5a1e979cb2faff280d618a26
8           30.5  1865.0  5a1e979cb2faff280d618a27
9           10.2  1964.0  5a1e979cb2faff280d618a28
10          15.1  1965.0  5a1e979cb2faff280d618a29
11          80.6  1918.0  5a1e979cb2faff280d618a2a
12          63.6  1919.0  5a1e979cb2faff280d618a2b
13           8.5  1833.0  5a1e979cb2faff280d618a2c
14          27.5  1832.0  5a1e979cb2faff280d618a2d
15          47.8  1831.0  5a1e979cb2faff280d618a2e
16          70.9  1830.0  5a1e979cb2faff280d618a2f
17         138.3  1837.0  5a1e979cb2faff280d618a30
18         121.5  1836.0  5a1e979cb2faff280d618a31
19          56.9  1835.0  5a1e979cb2faff280d618a32
20          13.2  1834.0  5a1e979cb2faff280d618a33
21          85.7  1839.0  5a1e979cb2faff280d618a34
22         103.2  1838.0  5a1e979cb2faff280d618a35
23          16.3  1866.0  5a1e979cb2faff280d618a36
24          21.0  1724.0  5a1e979cb2faff280d618a37
25          40.0  1725.0  5a1e979cb2faff280d618a38
26          78.0  1726.0  5a1e979cb2faff280d618a39
27         122.0  1727.0  5a1e979cb2faff280d618a3a
28          28.0  1720.0  5a1e979cb2faff280d618a3b
29          26.0  1721.0  5a1e979cb2faff280d618a3c
..           ...     ...                       ...
279         98.5  1847.0  5a1e979cb2faff280d618b36
280         15.0  1844.0  5a1e979cb2faff280d618b37
281         40.1  1845.0  5a1e979cb2faff280d618b38
282         24.2  1842.0  5a1e979cb2faff280d618b39
283         10.7  1843.0  5a1e979cb2faff280d618b3a
284         64.6  1840.0  5a1e979cb2faff280d618b3b
285         36.7  1841.0  5a1e979cb2faff280d618b3c
286         43.9  1909.0  5a1e979cb2faff280d618b3d
287         48.5  1908.0  5a1e979cb2faff280d618b3e
288        124.7  1848.0  5a1e979cb2faff280d618b3f
289         96.3  1849.0  5a1e979cb2faff280d618b40
290         92.5  1777.0  5a1e979cb2faff280d618b41
291         19.8  1776.0  5a1e979cb2faff280d618b42
292          7.0  1775.0  5a1e979cb2faff280d618b43
293         30.6  1774.0  5a1e979cb2faff280d618b44
294         34.8  1773.0  5a1e979cb2faff280d618b45
295         66.5  1772.0  5a1e979cb2faff280d618b46
296         81.6  1771.0  5a1e979cb2faff280d618b47
297        100.8  1770.0  5a1e979cb2faff280d618b48
298         92.5  1978.0  5a1e979cb2faff280d618b49
299        125.9  1779.0  5a1e979cb2faff280d618b4a
300        154.4  1778.0  5a1e979cb2faff280d618b4b
301         27.9  1963.0  5a1e979cb2faff280d618b4c
302        134.7  1949.0  5a1e979cb2faff280d618b4d
303         37.6  1962.0  5a1e979cb2faff280d618b4e
304         53.9  1961.0  5a1e979cb2faff280d618b4f
305         83.9  1950.0  5a1e979cb2faff280d618b50
306        112.3  1960.0  5a1e979cb2faff280d618b51
307         93.8  1967.0  5a1e979cb2faff280d618b52
308         47.0  1966.0  5a1e979cb2faff280d618b53
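The JSON round trip (df.T.to_json() then json.loads) produces one dict per row, but on the Python 2 interpreter used for the output above the intermediate dict did not preserve row order, which is why the _id column is out of year order. DataFrame.to_dict('records') yields the rows in order directly, without the JSON detour, and feeds insert_many just as well. A pandas-only sketch with a small illustrative frame:

```python
import pandas as pd

df = pd.DataFrame({"YEAR": [1700.0, 1701.0, 1702.0],
                   "SUNACTIVITY": [5.0, 11.0, 16.0]})

# one plain dict per row, in row order; ready for pymongo's insert_many
rows = df.to_dict('records')
print(rows[0])  # {'YEAR': 1700.0, 'SUNACTIVITY': 5.0}
```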



