我认为您需要按set_index创建的系列创建map,如果某些值不匹配,则会得到NaN:
#change data for match
print (df1)
UniProtID NAME
0 O75015 PPP2R5B
1 P00734 PPP2R1B
2 P63151 PPP2R2A
df2['UniProt Name'] = df2['UniProtID'].map(df1.set_index('UniProtID')['NAME'])
print (df2)
DrugBankID Name Type UniProtID UniProt Name
0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B
1 DB00002 Cetuximab BiotechDrug P00533 NaN
2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B
如果相反,NaN需要原始值:
df2['UniProt Name'] = df2['UniProtID'].map(df1.set_index('UniProtID')['NAME'])
.fillna(df2['UniProt Name'])
print (df2)
DrugBankID Name Type UniProtID \n0 DB00001 Lepirudin BiotechDrug P00734
1 DB00002 Cetuximab BiotechDrug P00533
2 DB00002 Cetuximab BiotechDrug O75015
UniProt Name
0 PPP2R1B
1 Epidermal growth factor receptor
2 PPP2R5B
df = pd.merge(df2, df1, on="UniProtID", how='left')
df['UniProt Name'] = df['NAME'].fillna(df['UniProt Name'])
#alternative
#df['UniProt Name'] = df['NAME'].combine_first(df['UniProt Name'])
df.drop('NAME', axis=1, inplace=True)
print (df)
DrugBankID Name Type UniProtID \n0 DB00001 Lepirudin BiotechDrug P00734
1 DB00002 Cetuximab BiotechDrug P00533
2 DB00002 Cetuximab BiotechDrug O75015
UniProt Name
0 PPP2R1B
1 Epidermal growth factor receptor
2 PPP2R5B
df = pd.merge(df2, df1, on="UniProtID", how='left')
df = df.drop('UniProt Name', axis=1).rename(columns={'NAME':'UniProt Name'})
print (df)
DrugBankID Name Type UniProtID UniProt Name
0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B
1 DB00002 Cetuximab BiotechDrug P00533 NaN
2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B