0 M
1 F
10 NaN
11 NaN
Name: Gender, dtype: object 0 NaN
1 NaN
10 M
11 F
Name: Gender, dtype: object
0 173.0
1 192.0
10 NaN
11 NaN
Name: Height, dtype: float64 0 NaN
1 NaN
10 161.0
11 175.0
Name: Height, dtype: float64
df1 = pd.DataFrame({'A':['A0','A1'],'B':['B0','B1']},
index =[0,1])
df2 = pd.DataFrame({'A':['A2','A3'],'B':['B2','B3']},
index =[2,3])
df3 = pd.DataFrame({'A':['A1','A3'],'D':['D1','D3'],'E':['E1','E3']},
index =[1,3])
left = pd.DataFrame({'key1':['K0','K0','K1','K2'],'key2':['K0','K1','K0','K1'],'A':['A0','A1','A2','A3'],'B':['B0','B1','B2','B3']})
right = pd.DataFrame({'key1':['K0','K1','K1','K2'],'key2':['K0','K0','K0','K0'],'C':['C0','C1','C2','C3'],'D':['D0','D1','D2','D3']})
right2 = pd.DataFrame({'key1':['K0','K1','K1','K2'],'key2':['K0','K0','K0','K0'],'C':['C0','C1','C2','C3']})
left = pd.DataFrame({'A':[1,2],'B':[2,2]})
right = pd.DataFrame({'A':[4,5,6],'B':[2,3,4]})#pd.merge(left, right, on='B', how='outer',validate='one_to_one') #报错
left = pd.DataFrame({'A':[1,2],'B':[2,1]})
pd.merge(left, right, on='B', how='outer',validate='one_to_one')
left = pd.DataFrame({'A':['A0','A1','A2'],'B':['B0','B1','B2']},
index=['K0','K1','K2'])
right = pd.DataFrame({'C':['C0','C2','C3'],'D':['D0','D2','D3']},
index=['K0','K2','K3'])
left.join(right)
A
B
C
D
K0
A0
B0
C0
D0
K1
A1
B1
NaN
NaN
K2
A2
B2
C2
D2
对于many_to_one模式下的合并,往往join更为方便
同样可以指定key:
left = pd.DataFrame({'A':['A0','A1','A2','A3'],'B':['B0','B1','B2','B3'],'key':['K0','K1','K0','K1']})
right = pd.DataFrame({'C':['C0','C1'],'D':['D0','D1']},
index=['K0','K1'])
left.join(right, on='key')
A
B
key
C
D
0
A0
B0
K0
C0
D0
1
A1
B1
K1
C1
D1
2
A2
B2
K0
C0
D0
3
A3
B3
K1
C1
D1
多层key:
left = pd.DataFrame({'A':['A0','A1','A2','A3'],'B':['B0','B1','B2','B3'],'key1':['K0','K0','K1','K2'],'key2':['K0','K1','K0','K1']})
index = pd.MultiIndex.from_tuples([('K0','K0'),('K1','K0'),('K2','K0'),('K2','K1')],names=['key1','key2'])
right = pd.DataFrame({'C':['C0','C1','C2','C3'],'D':['D0','D1','D2','D3']},
index=index)
left.join(right, on=['key1','key2'])