Merging data in pandas
concat
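Data processing with pandas often requires combining several DataFrames into one. pd.concat does this by stacking them along a chosen axis.
axis (merge direction)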
import numpy as np
import pandas as pd

# Three 3x4 frames with the same columns, filled with 0, 1 and 2
df1 = pd.DataFrame(np.ones((3,4))*0,columns=['a','b','c','d'])
df2 = pd.DataFrame(np.ones((3,4))*1,columns=['a','b','c','d'])
df3 = pd.DataFrame(np.ones((3,4))*2,columns=['a','b','c','d'])
res = pd.concat([df1,df2,df3],axis=0)
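- axis=0 is the default, so leaving axis out gives the same result: the frames are stacked vertically (row by row). Each frame keeps its own index labels, which is why 0, 1, 2 repeat in the output.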
Run:
a b c d
0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0
0 1.0 1.0 1.0 1.0
1 1.0 1.0 1.0 1.0
2 1.0 1.0 1.0 1.0
0 2.0 2.0 2.0 2.0
1 2.0 2.0 2.0 2.0
2 2.0 2.0 2.0 2.0
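For comparison, here is a minimal sketch of the other direction. With axis=1 the same frames are placed side by side, aligned on the row index. The names cols, tall and wide are only illustrative and do not come from the original notes.

```python
import numpy as np
import pandas as pd

cols = ['a', 'b', 'c', 'd']
df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=cols)
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=cols)
df3 = pd.DataFrame(np.ones((3, 4)) * 2, columns=cols)

# axis=0 stacks rows: 3 + 3 + 3 rows, 4 columns
tall = pd.concat([df1, df2, df3], axis=0)
# axis=1 places the frames side by side: 3 rows, 4 + 4 + 4 columns
wide = pd.concat([df1, df2, df3], axis=1)

print(tall.shape)  # (9, 4)
print(wide.shape)  # (3, 12)
```

Note that with axis=1 the column labels a, b, c, d repeat three times, just as the row labels repeat with axis=0.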
Resetting the index (ignore_index)
res = pd.concat([df1,df2,df3],axis=0,ignore_index=True)
Run:
a b c d
0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0
3 1.0 1.0 1.0 1.0
4 1.0 1.0 1.0 1.0
5 1.0 1.0 1.0 1.0
6 2.0 2.0 2.0 2.0
7 2.0 2.0 2.0 2.0
8 2.0 2.0 2.0 2.0
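A short sketch of why resetting the index matters, using two of the frames above. The names kept and reset are illustrative. With repeated labels, a label-based lookup such as .loc[0] matches more than one row.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['a', 'b', 'c', 'd'])

kept = pd.concat([df1, df2], axis=0)                      # original labels kept
reset = pd.concat([df1, df2], axis=0, ignore_index=True)  # labels renumbered

print(kept.index.tolist())   # [0, 1, 2, 0, 1, 2]
print(reset.index.tolist())  # [0, 1, 2, 3, 4, 5]

# Label 0 occurs twice in `kept`, so .loc[0] returns a two-row DataFrame;
# in `reset` it is unique, so .loc[0] returns a single row as a Series.
print(kept.loc[0].shape)     # (2, 4)
print(reset.loc[0].shape)    # (4,)
```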
join (how columns are combined)
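# The two frames share columns b, c, d; a appears only in df1, e only in df2
df1 = pd.DataFrame(np.ones((3,4))*0,columns=['a','b','c','d'])
df2 = pd.DataFrame(np.ones((3,4))*1,columns=['b','c','d','e'])
res = pd.concat([df1,df2],axis=0,join='outer')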
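- join='outer' is the default. The rows are stacked vertically and the columns are aligned by label: the result contains the union of all column labels, and any cell whose frame has no such column is filled with NaN.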
Run:
a b c d e
0 0.0 0.0 0.0 0.0 NaN
1 0.0 0.0 0.0 0.0 NaN
2 0.0 0.0 0.0 0.0 NaN
0 NaN 1.0 1.0 1.0 1.0
1 NaN 1.0 1.0 1.0 1.0
2 NaN 1.0 1.0 1.0 1.0
res = pd.concat([df1,df2],axis=0,join='inner')
- With join='inner', only the columns common to all frames are kept; the others are dropped.
Run:
b c d
0 0.0 0.0 0.0
1 0.0 0.0 0.0
2 0.0 0.0 0.0
0 1.0 1.0 1.0
1 1.0 1.0 1.0
2 1.0 1.0 1.0
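The two join modes can be compared directly. This is a minimal sketch reusing df1 and df2 from above; the names outer and inner are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'])

outer = pd.concat([df1, df2], axis=0, join='outer')  # union of columns
inner = pd.concat([df1, df2], axis=0, join='inner')  # intersection of columns

print(list(outer.columns))  # ['a', 'b', 'c', 'd', 'e']
print(list(inner.columns))  # ['b', 'c', 'd']

# outer introduces NaN: 3 cells in column e (rows from df1)
# plus 3 cells in column a (rows from df2)
print(int(outer.isna().sum().sum()))  # 6
print(int(inner.isna().sum().sum()))  # 0
```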
append (adding rows)
df1 = pd.DataFrame(np.ones((3,4))*0,columns=['a','b','c','d'])
df2 = pd.DataFrame(np.ones((3,4))*1,columns=['b','c','d','e'])
s1 = pd.Series([1,2,3,4],index=['a','b','c','d'])
res = df1.append(df2,ignore_index=True)
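- append only stacks vertically (row-wise); it has no axis argument. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this code runs only on older versions; on current versions use pd.concat.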
Run:
a b c d e
0 0.0 0.0 0.0 0.0 NaN
1 0.0 0.0 0.0 0.0 NaN
2 0.0 0.0 0.0 0.0 NaN
3 NaN 1.0 1.0 1.0 1.0
4 NaN 1.0 1.0 1.0 1.0
5 NaN 1.0 1.0 1.0 1.0
res = df1.append(s1,ignore_index=True)
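- Appending a Series adds s1 to df1 as one new row, matching the Series index (a, b, c, d) to the column labels; ignore_index=True renumbers the resulting index.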
Run:
a b c d
0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0
3 1.0 2.0 3.0 4.0
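On pandas 2.0 and later, where DataFrame.append no longer exists, the same two results can be produced with pd.concat. This is a minimal sketch and not part of the original notes; the names res_df and res_s are illustrative, and converting the Series with to_frame().T is one of several possible ways to turn it into a single-row frame.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'])
s1 = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# Counterpart of df1.append(df2, ignore_index=True)
res_df = pd.concat([df1, df2], ignore_index=True)

# Counterpart of df1.append(s1, ignore_index=True):
# turn the Series into a one-row DataFrame first, then concatenate
res_s = pd.concat([df1, s1.to_frame().T], ignore_index=True)

print(res_df.shape)             # (6, 5)
print(res_s.shape)              # (4, 4)
print(res_s.iloc[-1].tolist())  # [1.0, 2.0, 3.0, 4.0]
```

Passing s1 directly to pd.concat along axis=0 would not add it as a row, because a Series is treated as a single column there; that is why the sketch reshapes it first.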