Input 1 (today), as a DataFrame:
  code date_time
0   10      1100
1   20      2200
2   30      3300
3   40      4400
Input 2 (yesterday), as a DataFrame:
  code date_time
0   10      1000
1   20      2500
2   50      5000
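Goal:
Using one column as the join key (here code), compute the rate of change of a value column (here date_time) in today's table relative to yesterday's. The result is stored in the 变化率 ("rate of change") column. The formula is (today's count - yesterday's count) / yesterday's count.
If a key did not appear yesterday, its count is treated as 0, which makes the denominator 0; in that case the rate of change is set to 1.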
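The rule above can be sketched for a single pair of values before applying it to whole tables. The helper name `change_rate` is hypothetical and is only used here for illustration; the sample calls use the values for codes 10, 20 and 30 from the input tables.

```python
def change_rate(today_count, yesterday_count):
    """Rate of change relative to yesterday; 1 when yesterday's count is 0."""
    if yesterday_count == 0:
        # Key absent yesterday: the denominator would be 0, so return 1 by convention
        return 1.0
    return (today_count - yesterday_count) / yesterday_count


print(change_rate(1100, 1000))  # code 10
print(change_rate(2200, 2500))  # code 20
print(change_rate(3300, 0))     # code 30, absent yesterday
```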
Expected output, as a DataFrame:
  code  date_time_x  date_time_y   变化率
0   10         1100         1000  0.10
1   20         2200         2500 -0.12
2   30         3300            0  1.00
3   40         4400            0  1.00
Python implementation:
import numpy as np
import pandas as pd
# Build the sample data (values are stored as strings and converted to integers later)
dt1 = {'code': ['10', '20', '30', '40'], 'date_time': ['1100', '2200', '3300', '4400']}
dataframe_today = pd.DataFrame(dt1)
print(dataframe_today)
dt2 = {'code': ['10', '20', '50'], 'date_time': ['1000', '2500', '5000']}
dataframe_yesterday = pd.DataFrame(dt2)
print(dataframe_yesterday)
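# Compute the rate of change
# A left join keeps every code from today; codes missing yesterday get NaN in date_time_y
data_merge = pd.merge(dataframe_today, dataframe_yesterday, on='code', how='left')
# Convert the string values to integers, treating a missing yesterday value as 0
cols = ['date_time_x', 'date_time_y']
data_merge[cols] = data_merge[cols].apply(pd.to_numeric).fillna(0).astype('int')
data_merge['变化率'] = (data_merge['date_time_x'] - data_merge['date_time_y']) / data_merge['date_time_y']
# Division by zero yields inf; assign through .loc so data_merge itself is updated
data_merge.loc[np.isinf(data_merge['变化率']), '变化率'] = 1
# data_merge = data_merge.sort_values(by='变化率', ascending=False)  # optional: sort by rate of change, descending
print(data_merge)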
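The same steps can also be wrapped in a reusable function. The following is a minimal sketch rather than the original author's method: the function name `compute_change_rate` and its `key` and `value` parameters are assumptions made for illustration. Instead of patching inf afterwards, it masks zero denominators before dividing, so a row where both today's and yesterday's values are 0 would also get a rate of 1.

```python
import numpy as np
import pandas as pd


def compute_change_rate(today, yesterday, key='code', value='date_time'):
    """Left-join today's table with yesterday's and add a 变化率 column."""
    merged = pd.merge(today, yesterday, on=key, how='left')
    # pandas appends _x / _y to the overlapping value column by default
    x, y = value + '_x', value + '_y'
    # Values may arrive as strings; keys missing yesterday become 0
    merged[x] = pd.to_numeric(merged[x])
    merged[y] = pd.to_numeric(merged[y]).fillna(0).astype('int64')
    # Mask zero denominators as NaN so the division never produces inf,
    # then set the rate for those rows to 1
    denominator = merged[y].where(merged[y] != 0)
    merged['变化率'] = ((merged[x] - merged[y]) / denominator).fillna(1)
    return merged


today = pd.DataFrame({'code': ['10', '20', '30', '40'],
                      'date_time': ['1100', '2200', '3300', '4400']})
yesterday = pd.DataFrame({'code': ['10', '20', '50'],
                          'date_time': ['1000', '2500', '5000']})
result = compute_change_rate(today, yesterday)
print(result)
```

Calling `result.sort_values(by='变化率', ascending=False)` afterwards would order the rows by rate of change, largest first.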