SQLAlchemy这个Python软件包,就像其名字一样(SQL的炼金术),非常神奇。
本人看中的是其统一命令方式,可访问大多数的企业级关系型数据库:
- PostgreSQL
- MySQL
- SQLite
- Oracle
- Microsoft SQL Server
Ubuntu下的csvkit
集成了SQLAlchemy
,可以很方便地在命令行中select
.下述代码
$ sql2csv --db 'sqlite:///iris.db' --query 'select * from iris'\
> ' where sepal_length > 7.5' | csvlook
| sepal_length | sepal_width | petal_length | petal_width | species |
| ------------ | ----------- | ------------ | ----------- | -------------- |
| 7.6 | 3.0 | 6.6 | 2.1 | Iris-virginica |
| 7.7 | 3.8 | 6.7 | 2.2 | Iris-virginica |
| 7.7 | 2.6 | 6.9 | 2.3 | Iris-virginica |
| 7.7 | 2.8 | 6.7 | 2.0 | Iris-virginica |
| 7.9 | 3.8 | 6.4 | 2.0 | Iris-virginica |
| 7.7 | 3.0 | 6.1 | 2.3 | Iris-virginica |
Python环境下,尝试一下。
>>> import sqlalchemy as db
>>> import pandas as pd
>>>
>>> engine = db.create_engine('sqlite:///iris.db')
>>> connection = engine.connect()
>>> iris = db.Table('iris', metadata, autoload=True, autoload_with=engine)
>>> query = db.select([iris.columns.species.distinct()])
>>> ResultProxy = connection.execute(query)
>>> ResultSet = ResultProxy.fetchall()
>>> ResultSet
[('Iris-setosa',), ('Iris-versicolor',), ('Iris-virginica',)]
>>> query = db.select([iris]).where(iris.columns.species == 'Iris-setosa')
>>> ResultProxy = connection.execute(query)
>>> ResultSet = ResultProxy.fetchall()
>>> ResultSet[:3]
[(5.1, 3.5, 1.4, 0.2, 'Iris-setosa'), (4.9, 3.0, 1.4, 0.2, 'Iris-setosa'), (4.7, 3.2, 1.3, 0.2, 'Iris-setosa')]
>>> query = db.select([iris.columns.sepal_length, iris.columns.sepal_width]).where(iris.columns.species.in_(['Iris-versicolor', 'Iris-virginica']))
>>> ResultProxy = connection.execute(query)
>>> ResultSet = ResultProxy.fetchall()
>>> ResultSet[:10]
[(7.0, 3.2), (6.4, 3.2), (6.9, 3.1), (5.5, 2.3), (6.5, 2.8), (5.7, 2.8), (6.3, 3.3), (4.9, 2.4), (6.6, 2.9), (5.2, 2.7)]
>>> df = pd.DataFrame(ResultSet)
>>> df.columns = ResultSet[0].keys()
>>> df.head()
sepal_length sepal_width
0 7.0 3.2
1 6.4 3.2
2 6.9 3.1
3 5.5 2.3
4 6.5 2.8
Reference: SQLAlchemy — Python Tutorial