7.Pandas合并merge

0 引言

Pandas中不仅可以调用 .concat 函数对多个表格进行合并,还可以调用 .merge 函数进行多个表格的合并。

1 .merge合并DataFrame表格

import pandas as pd

生成多个DataFrame表格

left = pd.DataFrame({'key':['K0','K1','K2','K3'],
                     'A':['A0','A1','A2','A3'],
                     'B':['B0','B1','B2','B3']})
                          
right = pd.DataFrame({'key':['K0','K1','K2','K3'],
                     'C':['C0','C1','C2','C3'],
                     'D':['D0','D1','D2','D3']})

print(left)
print(right)
  key   A   B
0  K0  A0  B0
1  K1  A1  B1
2  K2  A2  B2
3  K3  A3  B3
  key   C   D
0  K0  C0  D0
1  K1  C1  D1
2  K2  C2  D2
3  K3  C3  D3

掉用 .merge 函数并依据’key’进行两个表格的合并

res = pd.merge(left,right,on='key')
res
keyABCD
0K0A0B0C0D0
1K1A1B1C1D1
2K2A2B2C2D2
3K3A3B3C3D3
left = pd.DataFrame({'key1':['K0','K0','K1','K2'],
                     'key2':['K0','K1','K0','K1'],
                     'A':['A0','A1','A2','A3'],
                     'B':['B0','B1','B2','B3']})
                          
right = pd.DataFrame({'key1':['K0','K1','K1','K3'],
                      'key2':['K0','K0','K0','K0'],
                      'C':['C0','C1','C2','C3'],
                      'D':['D0','D1','D2','D3']})

print(left)
print(right)
  key1 key2   A   B
0   K0   K0  A0  B0
1   K0   K1  A1  B1
2   K1   K0  A2  B2
3   K2   K1  A3  B3
  key1 key2   C   D
0   K0   K0  C0  D0
1   K1   K0  C1  D1
2   K1   K0  C2  D2
3   K3   K0  C3  D3

调用 .merge,并依据多个key,设置how为’outer’,结果就是两个表格的位置一样的值就完全合并,不一样的值就设置为空值(NaN)

# how = ['left','right','inner','outer']
res = pd.merge(left,right,on=['key1','key2'],how='outer')# how默认inner
res
key1key2ABCD
0K0K0A0B0C0D0
1K0K1A1B1NaNNaN
2K1K0A2B2C1D1
3K1K0A2B2C2D2
4K2K1A3B3NaNNaN
5K3K0NaNNaNC3D3

how 设置为 ‘inner’ ,有空值的行会直接删掉

# how = ['left','right','inner','outer']
res = pd.merge(left,right,on=['key1','key2'],how='inner')# how默认inner
res
key1key2ABCD
0K0K0A0B0C0D0
1K1K0A2B2C1D1
2K1K0A2B2C2D2

how 设置为’left‘,只会考虑左边的DataFrame

# how = ['left','right','inner','outer']
res = pd.merge(left,right,on=['key1','key2'],how='left')# how默认inner
res
key1key2ABCD
0K0K0A0B0C0D0
1K0K1A1B1NaNNaN
2K1K0A2B2C1D1
3K1K0A2B2C2D2
4K2K1A3B3NaNNaN

继续在 .merge 函数中添加参数 indicator ,显示merge信息

# how = ['left','right','inner','outer']
res = pd.merge(left,right,on=['key1','key2'],how='outer',indicator=True)# 显示merge信息
res
key1key2ABCD_merge
0K0K0A0B0C0D0both
1K0K1A1B1NaNNaNleft_only
2K1K0A2B2C1D1both
3K1K0A2B2C2D2both
4K2K1A3B3NaNNaNleft_only
5K3K0NaNNaNC3D3right_only

也可以为参数 indicator 设置名字

# how = ['left','right','inner','outer']
res = pd.merge(left,right,on=['key1','key2'],how='outer',indicator='indicator_column')# 显示merge信息
res
key1key2ABCDindicator_column
0K0K0A0B0C0D0both
1K0K1A1B1NaNNaNleft_only
2K1K0A2B2C1D1both
3K1K0A2B2C2D2both
4K2K1A3B3NaNNaNleft_only
5K3K0NaNNaNC3D3right_only
left = pd.DataFrame({'A':['A0','A1','A2','A3'],
                     'B':['B0','B1','B2','B3']},
                     index = ['K0','K1','K2','K3'],)
                          
right = pd.DataFrame({'C':['C0','C1','C2','C3'],
                      'D':['D0','D1','D2','D3']},
                      index = ['K0','K1','K2','K3'],)

print(left)
print(right)
     A   B
K0  A0  B0
K1  A1  B1
K2  A2  B2
K3  A3  B3
     C   D
K0  C0  D0
K1  C1  D1
K2  C2  D2
K3  C3  D3

调用 .merge 也可以像 .concat 一样依据索引 index 进行合并

res = pd.merge(left,right,left_index=True,right_index=True,how='outer')
res
ABCD
K0A0B0C0D0
K1A1B1C1D1
K2A2B2C2D2
K3A3B3C3D3
boys = pd.DataFrame({'k':['K0','K1','K2'],'age':[1,2,3]})
girls = pd.DataFrame({'k':['K0','K0','K3'],'age':[4,5,6]})

print(boys)
print(girls)
    k  age
0  K0    1
1  K1    2
2  K2    3
    k  age
0  K0    4
1  K0    5
2  K3    6

参数suffixes 可以为合并后的表格重新修改表头,区分不同表格的表头

res = pd.merge(boys,girls,on='k',suffixes=['_boy','_girl'],how='outer')
res
kage_boyage_girl
0K01.04.0
1K01.05.0
2K12.0NaN
3K23.0NaN
4K3NaN6.0
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

ZPILOTE

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值