Original data:
import pandas as pd

df = pd.DataFrame({
    'A': ['a'] * 7,
    'B': ['b1', 'b1', 'b1', 'b2', 'b2', 'b2', 'b2'],
    'C': ['c'] * 7,
    'D': [2001, 2003, 2005, 2001, 2002, 2003, 2004],
})
df
'''
   A   B  C     D
0  a  b1  c  2001
1  a  b1  c  2003
2  a  b1  c  2005
3  a  b2  c  2001
4  a  b2  c  2002
5  a  b2  c  2003
6  a  b2  c  2004
'''
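Code:
(
    df.groupby(['A', 'B', 'C'])
      # join each group's D values into one comma-separated string
      .apply(lambda x: ','.join(x.D.astype(str)))
      # split the string into one column per value; shorter groups are padded with None
      .str.split(',', expand=True)
)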
'''
           0     1     2     3
A B  C
a b1 c  2001  2003  2005  None
  b2 c  2001  2002  2003  2004
'''
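Note that the join/split round trip leaves every cell as a string. If the years should stay numeric, one possible alternative (a sketch, not the approach used above) is to number the rows inside each group with `cumcount` and then `unstack` on that number. The column name `pos` and the variable names `keys` and `wide` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['a'] * 7,
    'B': ['b1', 'b1', 'b1', 'b2', 'b2', 'b2', 'b2'],
    'C': ['c'] * 7,
    'D': [2001, 2003, 2005, 2001, 2002, 2003, 2004],
})

keys = ['A', 'B', 'C']

wide = (
    # 'pos' (hypothetical name) is each row's 0-based position within its group
    df.assign(pos=df.groupby(keys).cumcount())
      # move the group keys and the position into the index, keep only D
      .set_index(keys + ['pos'])['D']
      # spread the positions out into columns; missing slots become NaN
      .unstack('pos')
      # nullable integer dtype, so the years are not shown as floats
      .astype('Int64')
)
print(wide)
```

The result has the same shape as the output above (one row per `A`/`B`/`C` group, columns 0 to 3), but the cells are integers and the empty slot for `b1` is a missing value rather than the Python object `None`.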