1. Merging data with merge
import pandas as pd
import numpy as np
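1.1 Default merge behavior
By default, merge performs an inner join (how='inner') on the columns the two DataFrames have in common, which here is 'fruit'. Only keys present in both tables are kept, and a key that appears twice on one side (orange) produces one output row per match.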
price = pd.DataFrame({'fruit':['apple','grape','orange','orange'],'price':[8,7,9,11]})
amount = pd.DataFrame({'fruit':['apple','grape','orange'],'amount':[5,11,8]})
display(price,amount,pd.merge(price,amount))
#------------------------------------------------------------------------
    fruit  price
0   apple      8
1   grape      7
2  orange      9
3  orange     11

    fruit  amount
0   apple       5
1   grape      11
2  orange       8

    fruit  price  amount
0   apple      8       5
1   grape      7      11
2  orange      9       8
3  orange     11       8
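The default call infers the join key from the shared column name. A minimal sketch of the same merge with the defaults spelled out; the `stock` table, its `qty` column, and the 'banana' row are hypothetical and exist only to show a key with no match being dropped:

```python
import pandas as pd

price = pd.DataFrame({'fruit': ['apple', 'grape', 'orange', 'orange'],
                      'price': [8, 7, 9, 11]})
# Hypothetical table: 'banana' has no counterpart in price
stock = pd.DataFrame({'fruit': ['apple', 'grape', 'orange', 'banana'],
                      'qty': [5, 11, 8, 3]})

implicit = pd.merge(price, stock)                           # key inferred: 'fruit'
explicit = pd.merge(price, stock, on='fruit', how='inner')  # same join, spelled out

print(implicit.equals(explicit))
print(explicit)
```

Naming `on` explicitly is usually the safer habit: if the two tables later gain another column with the same name, the inferred key set would change silently.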
1.2 Left and right joins
pd.merge(price,amount,how = 'left')  # left join: keep every row of price
pd.merge(price,amount,how = 'right')  # right join: keep every row of amount
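With the tables above, every fruit appears on both sides, so left and right joins return the same rows as the inner join. The sketch below uses a hypothetical `stock` table whose keys only partly overlap with `price`, so the difference becomes visible; `indicator=True` is an optional merge argument that records where each row came from:

```python
import pandas as pd

price = pd.DataFrame({'fruit': ['apple', 'grape', 'orange', 'orange'],
                      'price': [8, 7, 9, 11]})
# Hypothetical table: 'grape' is missing here, 'banana' is missing from price
stock = pd.DataFrame({'fruit': ['apple', 'orange', 'banana'],
                      'qty': [5, 8, 3]})

left = pd.merge(price, stock, how='left')    # all rows of price; qty is NaN for grape
right = pd.merge(price, stock, how='right')  # all rows of stock; price is NaN for banana
# An outer join keeps the keys of both sides; '_merge' marks each row's origin
outer = pd.merge(price, stock, how='outer', indicator=True)

print(left)
print(right)
print(outer)
```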
1.3 Merging on multiple keys
left = pd.DataFrame({'key1':['one','one','two'],'key2':['a','b','a'],'value1':range(3)})
right = pd.DataFrame({'key1':['one','one','two','two'],'key2':['a','a','a','b'],'value2':range(4)})
display(left,right,pd.merge(left,right,on = ['key1','key2'],how = 'left'))
#--------------------------------------------------------------------------------
  key1 key2  value1
0  one    a       0
1  one    b       1
2  two    a       2

  key1 key2  value2
0  one    a       0
1  one    a       1
2  two    a       2
3  two    b       3

  key1 key2  value1  value2
0  one    a       0     0.0
1  one    a       0     1.0
2  one    b       1     NaN
3  two    a       2     2.0
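To see why both keys are listed in `on`, it can help to compare against a merge on `key1` alone. In this sketch (the suffix names `_left` and `_right` are illustrative choices), `key2` exists on both sides but is not a join key, so pandas keeps both copies and renames them with the given suffixes:

```python
import pandas as pd

left = pd.DataFrame({'key1': ['one', 'one', 'two'],
                     'key2': ['a', 'b', 'a'],
                     'value1': range(3)})
right = pd.DataFrame({'key1': ['one', 'one', 'two', 'two'],
                      'key2': ['a', 'a', 'a', 'b'],
                      'value2': range(4)})

# Join on key1 only: every 'one' row pairs with every 'one' row, and so on
by_key1 = pd.merge(left, right, on='key1', suffixes=('_left', '_right'))
print(by_key1.columns.tolist())

# Join on both keys: a pair of rows matches only when key1 AND key2 agree
by_both = pd.merge(left, right, on=['key1', 'key2'])
print(len(by_key1), len(by_both))
```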
2. Concatenating data with concat
By default, pd.concat stacks data row-wise (axis=0); to join along columns instead, set axis=1.
2.1 Concatenating Series
s1 = pd.Series([0,1],index = ['a','b'])
s2 = pd.Series([2,3,4],index = ['a','d','e'])
s3 = pd.Series([5,6],index = ['f','g'])
print(pd.concat([s1,s2,s3])) # stack the Series row-wise; the duplicate label 'a' is kept
#----------------------------------------------
a    0
b    1
a    2
d    3
e    4
f    5
g    6
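A short sketch of the column-wise case mentioned above, reusing s1 and s2. With axis=1 each Series becomes a column and the row index is the union of the input indexes, with NaN where a Series has no value. The `keys` argument is a standard concat parameter; the labels 'first' and 'second' are illustrative:

```python
import pandas as pd

s1 = pd.Series([0, 1], index=['a', 'b'])
s2 = pd.Series([2, 3, 4], index=['a', 'd', 'e'])

# axis=1: one column per Series, rows aligned on the index labels
wide = pd.concat([s1, s2], axis=1)
print(wide)

# keys= tags each input, so the two rows labelled 'a' stay distinguishable
stacked = pd.concat([s1, s2], keys=['first', 'second'])
print(stacked.loc['second'])
```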
2.2 Concatenating two DataFrames
data1 = pd.DataFrame(np.arange(6).reshape(2,3),columns = list('abc'))
data2 = pd.DataFrame(np.arange(20,26).reshape(2,3),columns = list('ayz'))
data = pd.concat([data1,data2],axis = 0,sort=False)
display(data1,data2,data)
#--------------------------------------------------------------------------
   a  b  c
0  0  1  2
1  3  4  5

    a   y   z
0  20  21  22
1  23  24  25

    a    b    c     y     z
0   0  1.0  2.0   NaN   NaN
1   3  4.0  5.0   NaN   NaN
0  20  NaN  NaN  21.0  22.0
1  23  NaN  NaN  24.0  25.0
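The result above repeats the row labels 0 and 1 and fills non-shared columns with NaN. Two standard concat options address this; a minimal sketch using the same data1 and data2:

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame(np.arange(6).reshape(2, 3), columns=list('abc'))
data2 = pd.DataFrame(np.arange(20, 26).reshape(2, 3), columns=list('ayz'))

# ignore_index=True discards the original row labels and renumbers from 0
renumbered = pd.concat([data1, data2], ignore_index=True)
# join='inner' keeps only the columns present in every input (here just 'a')
shared = pd.concat([data1, data2], join='inner', ignore_index=True)

print(renumbered)
print(shared)
```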
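3. Combining data with combine_first
If the two DataFrames being combined have overlapping index labels, you can use combine_first. It keeps the caller's values, fills the caller's missing values (NaN) from the other object, and returns a result covering the union of both indexes.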
s6.combine_first(s5)
#-----------------------
     0    1
a  0.0  0.0
b  1.0  5.0
f  NaN  5.0
g  NaN  6.0
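Since s5 and s6 are not defined in this section, here is a self-contained sketch of the same behavior. The frames `a` and `b` and their values are hypothetical and are not the data behind the output above:

```python
import numpy as np
import pandas as pd

# Hypothetical frames whose indexes overlap on 'r2' and 'r3'
a = pd.DataFrame({'x': [1.0, np.nan, 3.0], 'y': [np.nan, 5.0, 6.0]},
                 index=['r1', 'r2', 'r3'])
b = pd.DataFrame({'x': [10.0, 20.0, 30.0], 'y': [40.0, 50.0, 60.0]},
                 index=['r2', 'r3', 'r4'])

# Values in a take priority; NaN cells in a, and rows only in b, come from b
combined = a.combine_first(b)
print(combined)
```

A cell stays NaN only when neither frame has a value for it, as with row 'r1', column 'y' here.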