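Organizing tabular data with Python
Because the epidemic dataset is large and covers a long period, from 2020.1.23 to 2020.2.2, the rows can be filtered by the '日期' (date) column:
First, a Python environment and the third-party library pandas are required.
The implementation is as follows: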
import pandas as pd

data = []  # collects the filtered DataFrame for each day
# Raw strings (r'...') keep the backslashes in Windows paths literal
dfd = pd.read_excel(r'F:\gbh\python\practice\大创\data\武汉疫情数据\迁徙.xlsx')
# dfd.head(10)
# Convert the '日期' column to datetime so it can be compared with date strings
dfd['日期'] = pd.to_datetime(dfd['日期'].astype('str'))
time = ['2020-01-24','2020-01-25','2020-01-26','2020-01-27','2020-01-28','2020-01-29','2020-01-30','2020-01-31','2020-02-01','2020-02-02','2020-02-03']
# Iterate over the dates directly so the loop cannot run past the end of the list
for day in time:
    data_china = dfd[dfd['日期'] == day]
    print(data_china)
    data.append(data_china)
    # Name each file by month.day, e.g. 迁徙1.24.csv ... 迁徙2.03.csv
    stamp = pd.Timestamp(day)
    data_china.to_csv(rf'E:\data_tengxun\迁徙{stamp.month}.{stamp.day:02d}.csv', index=False, encoding='utf_8_sig')
The code above filters the data in batch and writes one CSV file per date.
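As a possible alternative to a hand-written list of dates, the same split can be sketched with `groupby`, which visits every day present in the data exactly once. The helper name `split_by_date`, the `城市` column, the sample rows and the temporary output directory are all assumptions made for illustration; only the `日期` column and the `迁徙<month>.<day>.csv` naming come from the code above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def split_by_date(df, out_dir, date_col="日期"):
    """Write one CSV per calendar day found in `date_col`; return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dates = pd.to_datetime(df[date_col].astype(str))
    written = []
    # Grouping by the normalized date yields each day once, in sorted order,
    # so no explicit list of dates has to be maintained.
    for day, group in df.groupby(dates.dt.normalize()):
        path = out_dir / f"迁徙{day.month}.{day.day:02d}.csv"
        group.to_csv(path, index=False, encoding="utf_8_sig")
        written.append(path)
    return written


# Hypothetical sample rows standing in for 迁徙.xlsx
sample = pd.DataFrame({
    "日期": ["2020-01-24", "2020-01-24", "2020-02-03"],
    "城市": ["武汉", "北京", "武汉"],
})
paths = split_by_date(sample, Path(tempfile.mkdtemp()))
print(len(paths))
```

With the real data, `sample` would be replaced by the DataFrame returned from `pd.read_excel` and the temporary directory by the actual output folder.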
To filter a single date only, the following code can be used:
import pandas as pd

data = []
# Raw strings (r'...') keep the backslashes in Windows paths literal
dfd = pd.read_excel(r'F:\gbh\python\practice\大创\data\武汉疫情数据\迁徙.xlsx')
# dfd.head(10)
# Convert the '日期' column to datetime so it can be compared with a date string
dfd['日期'] = pd.to_datetime(dfd['日期'].astype('str'))
# Keep only the rows for the chosen date
data_china = dfd[dfd['日期'] == '2020-02-03']
print(data_china)
data.append(data_china)
data_china.to_csv(r'E:\data_tengxun\迁徙2.03.csv', index=False, encoding='utf_8_sig')
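The single-date filter could also be wrapped in a small function. The sketch below is only an illustration: `filter_by_date`, the `城市` column and the sample rows are hypothetical. It additionally assumes the `日期` values might carry a time of day, in which case a plain `==` against `'2020-02-03'` would match only rows stamped exactly at midnight, so the dates are normalized first.

```python
import pandas as pd


def filter_by_date(df, day, date_col="日期"):
    """Return the rows whose `date_col` falls on the calendar day `day`."""
    dates = pd.to_datetime(df[date_col].astype(str))
    # normalize() resets the time of day to midnight, so a row stamped
    # '2020-02-03 08:30' still matches the day '2020-02-03'.
    return df[dates.dt.normalize() == pd.Timestamp(day)]


# Hypothetical sample rows standing in for 迁徙.xlsx
sample = pd.DataFrame({
    "日期": ["2020-02-02 09:00", "2020-02-03 08:30", "2020-02-03 21:15"],
    "城市": ["武汉", "武汉", "北京"],
})
data_china = filter_by_date(sample, "2020-02-03")
print(len(data_china))
```

If the column holds plain dates only, the normalization step changes nothing and the result is the same as the direct comparison.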
Result of running the code