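merge() method
The syntax is as follows (pd refers to the pandas library):
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'))
left: the left DataFrame in the merge
right: the right DataFrame in the merge
how: the merge type; the default is 'inner', and the other options include 'outer', 'left', and 'right'
on: the column name(s) to join on; they must exist in both DataFrames
left_on: the column(s) of the left DataFrame to use as the join key
right_on: the column(s) of the right DataFrame to use as the join key
left_index, right_index: whether to use the index of the left or right DataFrame as its join key; the default is False
sort: whether to sort the merged result by the join key; the default is False
suffixes: suffixes appended to overlapping column names that are not join keys; the default is ('_x', '_y')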
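The left_on, right_on, and right_index parameters described above can be sketched as follows. The frames students and scores, and their column names, are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical frames; the key column is named differently on each side.
students = pd.DataFrame({'student_id': [1, 2, 3], 'name': ['a', 'b', 'c']})
scores = pd.DataFrame({'sid': [2, 3, 4], 'score': [90, 85, 70]})

# The key columns have different names, so name each side explicitly.
by_column = pd.merge(students, scores, left_on='student_id', right_on='sid')

# Alternatively, use the right frame's index as its key instead of a column.
scores_indexed = scores.set_index('sid')
by_index = pd.merge(students, scores_indexed,
                    left_on='student_id', right_index=True)

print(by_column)
print(by_index)
```

Both calls use the default how='inner', so only the keys present on both sides (2 and 3) remain. With left_on and right_on both key columns are kept in the result, while with right_index the right-hand key lives in the index and does not appear as a column.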
Using merge():
import pandas as pd
df1 = pd.DataFrame({'ID':[1,2,3],'class':['A0','A1','A2'],'name':['a','b','c']})
df2 = pd.DataFrame({'ID':[2,3,4],'class':['A1','A2','A3'],'name':['b','c','d']})
print(pd.merge(df1,df2,on = 'ID',suffixes = ('_left','_right')))
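To see how the how parameter changes the result, the same df1 and df2 can be merged in all four ways. This is a minimal sketch; the results dictionary is only a helper introduced for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'class': ['A0', 'A1', 'A2'], 'name': ['a', 'b', 'c']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'class': ['A1', 'A2', 'A3'], 'name': ['b', 'c', 'd']})

# Collect the IDs that survive each merge type.
results = {}
for how in ('inner', 'outer', 'left', 'right'):
    merged = pd.merge(df1, df2, on='ID', how=how, suffixes=('_left', '_right'))
    results[how] = merged['ID'].tolist()
    print(how, results[how])
```

'inner' keeps only the IDs found in both frames (2 and 3), 'left' keeps every ID from df1, 'right' keeps every ID from df2, and 'outer' keeps the union of both. Rows without a match on the other side are filled with NaN in the suffixed columns.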
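join() method
The join() method is similar to merge(); the syntax is as follows:
df.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
df: the calling DataFrame
other: the object to join with
on: a column (or index level name) of df to match against the index of other; if omitted, the join is index-on-index
how: the same options as in merge(), but the default is 'left'
lsuffix: the suffix for overlapping column names from the left table (df)
rsuffix: the suffix for overlapping column names from the right table (other)
sort: whether to sort the joined result by the join key; the default is False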
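The on parameter of join() can be sketched as follows: a column of the calling frame is matched against the index of other. The frames measurements and colors, and their contents, are hypothetical data for this illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
measurements = pd.DataFrame({
    'Sepal length': [5, 6, 3],
    'Species': ['setosa', 'virginica', 'unknown'],
})
colors = pd.DataFrame({'Petal color': ['white', 'purple']},
                      index=['setosa', 'virginica'])

# on='Species' matches the 'Species' column of measurements
# against the index of colors; how='left' is the default.
joined = measurements.join(colors, on='Species')
print(joined)
```

Because the default is how='left', every row of measurements is kept, and the row whose species has no entry in colors gets NaN in the new column.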
Using join():
df1 = pd.DataFrame({'Sepal length':[5,6,3,7,8],'Species':['Test','Test','Test','Test','Test'],'index':['a','b','c','d','e']})
df1 = df1.set_index('index')  # use the 'index' column as the index
print(df1.join(df1,on='index',lsuffix='_left',rsuffix='_right'))
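The example above joins a frame with itself, so every row finds a match. A minimal sketch with two different frames shows why the suffixes are needed and how the how parameter affects an index-on-index join. The frames left and right are hypothetical.

```python
import pandas as pd

# Two hypothetical frames that share a column name but only partly share index labels.
left = pd.DataFrame({'Sepal length': [5, 6, 3]}, index=['a', 'b', 'c'])
right = pd.DataFrame({'Sepal length': [7, 8]}, index=['b', 'd'])

# Overlapping column names need suffixes; without them join() raises ValueError.
try:
    left.join(right)
    overlap_error = None
except ValueError as exc:
    overlap_error = exc
print('error without suffixes:', overlap_error)

# With suffixes, the overlapping columns are renamed and the join succeeds.
inner = left.join(right, how='inner', lsuffix='_left', rsuffix='_right')
outer = left.join(right, how='outer', lsuffix='_left', rsuffix='_right')
print(inner)
print(outer)
```

The inner join keeps only the shared label 'b', while the outer join keeps all four labels and fills the missing side with NaN.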