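I'm a bit reluctant to answer because it looks like @chrisb may have correctly answered the original question, which was later changed. However, Chris hasn't updated his answer in a few days, and this answer takes a different approach, so I'm going to +1 Chris's answer and add this one.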
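First, create a new dataframe from the original in which 'index' = 'date2': take the rows where the two columns differ and overwrite 'index' with 'date2'. These are the rows that will be appended to the existing dataframe (note that 'index' is a column here, not the dataframe's index):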
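import numpy as np
# Rows where 'index' differs from 'date2'; .copy() avoids SettingWithCopyWarning
df2 = df[df['index'] != df['date2']].copy()
df2['index'] = df2['date2']  # the new rows sit at date2
df2['value'] = np.nan        # values are forward-filled later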
       index      date2  id  value
0 2006-01-26 2006-01-26   3    NaN
1 2006-01-26 2006-01-26   1    NaN
2 2006-01-26 2006-01-26   2    NaN
4 2006-02-26 2006-02-26   4    NaN
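To run this step on its own you need an input frame, which this answer does not show. The sketch below assumes a `df` reconstructed from the outputs printed in this answer, so treat the sample data as an assumption rather than the asker's real data. It also builds `df2` with `assign`, which returns a new frame instead of modifying a slice; that is a stylistic alternative, not a change in logic.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    'index': pd.to_datetime(['2006-01-24', '2006-01-25', '2006-01-25',
                             '2006-01-26', '2006-01-27']),
    'date2': pd.to_datetime(['2006-01-26', '2006-01-26', '2006-01-26',
                             '2006-01-26', '2006-02-26']),
    'id': [3, 1, 2, 2, 4],
    'value': [3.0, 1.0, 2.0, 2.1, 4.0],
})

# Keep rows where 'index' differs from 'date2', then overwrite 'index'
# with 'date2' and blank out 'value'. assign() returns a new frame.
df2 = df[df['index'] != df['date2']].assign(
    index=lambda d: d['date2'],
    value=np.nan,
)
print(df2)
```

The row for id = 2 that already has 'index' equal to 'date2' (label 3) is excluded by the filter, which is why the labels jump from 2 to 4.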
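Now just append these rows to the original, dropping the ones we don't need (any case where a row with 'index' = 'date2' already exists, which here is id = 2), then sort and forward-fill the values: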
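import pandas as pd
# pd.concat replaces DataFrame.append (removed in pandas 2.0)
df3 = pd.concat([df, df2])
df3 = df3.drop_duplicates(['index', 'date2', 'id'])
# sort_values replaces the old DataFrame.sort
df3 = df3.reset_index(drop=True).sort_values(['id', 'index', 'date2'])
df3['value'] = df3['value'].ffill()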
       index      date2  id  value
1 2006-01-25 2006-01-26   1    1.0
6 2006-01-26 2006-01-26   1    1.0
2 2006-01-25 2006-01-26   2    2.0
3 2006-01-26 2006-01-26   2    2.1
0 2006-01-24 2006-01-26   3    3.0
5 2006-01-26 2006-01-26   3    3.0
4 2006-01-27 2006-02-26   4    4.0
7 2006-02-26 2006-02-26   4    4.0
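For reference, here is the whole approach as one runnable sketch. It rests on two assumptions: the input `df` is reconstructed from the outputs above, and the helper name `add_date2_rows` is made up for illustration. It also makes one deliberate change: the forward fill is done per 'id' with `groupby`. A plain forward fill over the whole sorted frame gives the same result on this data, because every added row sorts directly after an existing row with the same id. If some row had 'index' later than 'date2', the added row would sort first within its id and a plain fill could pull in a value from the previous id; the grouped fill leaves it as NaN instead.

```python
import numpy as np
import pandas as pd

def add_date2_rows(df):
    """Hypothetical helper: add a row at index == date2 for each (date2, id)."""
    # Candidate rows: copy of every row whose 'index' differs from 'date2'.
    extra = df[df['index'] != df['date2']].copy()
    extra['index'] = extra['date2']
    extra['value'] = np.nan

    out = (
        pd.concat([df, extra])
        # Original rows come first, so an existing row wins over its NaN twin.
        .drop_duplicates(['index', 'date2', 'id'])
        .reset_index(drop=True)
        .sort_values(['id', 'index', 'date2'])
    )
    # Forward-fill within each id so values never leak across ids.
    out['value'] = out.groupby('id')['value'].ffill()
    return out

# Assumed input, reconstructed from the outputs shown in this answer.
df = pd.DataFrame({
    'index': pd.to_datetime(['2006-01-24', '2006-01-25', '2006-01-25',
                             '2006-01-26', '2006-01-27']),
    'date2': pd.to_datetime(['2006-01-26', '2006-01-26', '2006-01-26',
                             '2006-01-26', '2006-02-26']),
    'id': [3, 1, 2, 2, 4],
    'value': [3.0, 1.0, 2.0, 2.1, 4.0],
})

result = add_date2_rows(df)
print(result)
```

On the sample data this gives the same eight rows, in the same order, as the table above.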