准备工作 导包
import pandas as pd
读取csv文件
pandas的 read_excel() 方法
fpath = './datas/ml-latest-small/ratings.csv'
ratings = pd.read_csv(fpath, engine='python', encoding='utf-8')
查看DataFrame的具体结构
ratings.head()
| userId | movieId | rating | timestamp |
---|
0 | 1 | 1 | 4.0 | 964982703 |
1 | 1 | 3 | 4.0 | 964981247 |
2 | 1 | 6 | 4.0 | 964982224 |
3 | 1 | 47 | 5.0 | 964983815 |
4 | 1 | 50 | 5.0 | 964982931 |
查看结构
ratings.shape
(100836, 4)
查看所有列名
ratings.columns
Index(['userId', 'movieId', 'rating', 'timestamp'], dtype='object')
index
ratings.index
RangeIndex(start=0, stop=100836, step=1)
ratings.dtypes
userId int64
movieId int64
rating float64
timestamp int64
dtype: object
读取txt文件
fpath = './datas/crazyant/access_pvuv.txt'
同样调用read_csv()方法
- 设置切割对象为 \t
- 没有头部
- 设置列名为 a,b,c
txt_demo = pd.read_csv(
fpath,
sep='\t',
header=None,
names=['a','b','c']
)
txt_demo.head()
| a | b | c |
---|
0 | 2019-09-10 | 139 | 92 |
1 | 2019-09-09 | 185 | 153 |
2 | 2019-09-08 | 123 | 59 |
3 | 2019-09-07 | 65 | 40 |
4 | 2019-09-06 | 157 | 98 |
读取xls文件
fpath = './datas/crazyant/access_pvuv.xlsx'
pandas的 read_excel() 方法
xlsx_demo = pd.read_excel(fpath, engine='python', encoding='utf-8')
xlsx_demo
| 日期 | PV | UV |
---|
0 | 2019-09-10 | 139 | 92 |
1 | 2019-09-09 | 185 | 153 |
2 | 2019-09-08 | 123 | 59 |
3 | 2019-09-07 | 65 | 40 |
4 | 2019-09-06 | 157 | 98 |
5 | 2019-09-05 | 205 | 151 |
6 | 2019-09-04 | 196 | 167 |
7 | 2019-09-03 | 216 | 176 |
8 | 2019-09-02 | 227 | 148 |
9 | 2019-09-01 | 105 | 61 |
读取数据库数据
import pandas as pd
from sqlalchemy import create_engine
engin = create_engine("mysql+mysqlconnector://root:123456@localhost:3306/rabbit_test?charset=utf8")
sql = '''select * from student'''
df = pd.read_sql_query(sql, engin)
print(df)
数据库插入数据
data = {
'id': [17, 18, 19],
"name": ["yanxu", "shuaishuai", "xiaolong"],
"age": [20, 22, 23],
"class_id": [1, 2, 3]
}
df = pd.DataFrame(data)
df.to_sql("student", engin, index=False, if_exists="append")
print(df)