Replacing names in Python: replacing thousands of rows of ID names with their corresponding names in Python...

I think you need map with a Series created by set_index. If some values do not match, you get NaN:

#sample data changed so that some IDs match

print (df1)

UniProtID NAME

0 O75015 PPP2R5B

1 P00734 PPP2R1B

2 P63151 PPP2R2A

df2['UniProt Name'] = df2['UniProtID'].map(df1.set_index('UniProtID')['NAME'])

print (df2)

DrugBankID Name Type UniProtID UniProt Name

0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B

1 DB00002 Cetuximab BiotechDrug P00533 NaN

2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B
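
For anyone who wants to run the lookup above end to end, here is a minimal self-contained sketch. The frames are rebuilt by hand from the printed output, and the variable name lookup is only an illustration, not part of the original answer.

```python
import pandas as pd

# Lookup table: UniProt ID -> name (values copied from the printed df1 above)
df1 = pd.DataFrame({
    "UniProtID": ["O75015", "P00734", "P63151"],
    "NAME": ["PPP2R5B", "PPP2R1B", "PPP2R2A"],
})

# Table whose IDs should be translated (values copied from the printed df2 above)
df2 = pd.DataFrame({
    "DrugBankID": ["DB00001", "DB00002", "DB00002"],
    "Name": ["Lepirudin", "Cetuximab", "Cetuximab"],
    "Type": ["BiotechDrug"] * 3,
    "UniProtID": ["P00734", "P00533", "O75015"],
})

# set_index turns df1 into a Series keyed by ID; map looks up each ID in it
lookup = df1.set_index("UniProtID")["NAME"]
df2["UniProt Name"] = df2["UniProtID"].map(lookup)

print(df2)
```

As far as I know, map expects the lookup Series to have a unique index, so if df1 may contain the same ID more than once, dropping duplicates first (for example df1.drop_duplicates('UniProtID')) is probably needed.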

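If instead you want to keep the original values where the lookup gives NaN (this assumes df2 already has a 'UniProt Name' column):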

df2['UniProt Name'] = df2['UniProtID'].map(df1.set_index('UniProtID')['NAME']).fillna(df2['UniProt Name'])

print (df2)

DrugBankID Name Type UniProtID UniProt Name

0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B

1 DB00002 Cetuximab BiotechDrug P00533 Epidermal growth factor receptor

2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B
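
A runnable sketch of the map plus fillna variant follows. The source only shows the original 'UniProt Name' value for P00533, so the two "placeholder" strings are made-up stand-ins for whatever the column held before; everything else is copied from the printed output.

```python
import pandas as pd

df1 = pd.DataFrame({
    "UniProtID": ["O75015", "P00734", "P63151"],
    "NAME": ["PPP2R5B", "PPP2R1B", "PPP2R2A"],
})

df2 = pd.DataFrame({
    "UniProtID": ["P00734", "P00533", "O75015"],
    # "placeholder A/B" are hypothetical old values, not from the source
    "UniProt Name": ["placeholder A",
                     "Epidermal growth factor receptor",
                     "placeholder B"],
})

# Look up the new name; where the ID is unknown (NaN), fall back to the old value.
# fillna aligns on the row index, so each gap is filled from the same row.
mapped = df2["UniProtID"].map(df1.set_index("UniProtID")["NAME"])
df2["UniProt Name"] = mapped.fillna(df2["UniProt Name"])

print(df2)
```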

#same result with merge: left join, then fill missing names from the original column

df = pd.merge(df2, df1, on="UniProtID", how='left')

df['UniProt Name'] = df['NAME'].fillna(df['UniProt Name'])

#alternative

#df['UniProt Name'] = df['NAME'].combine_first(df['UniProt Name'])

df.drop('NAME', axis=1, inplace=True)

print (df)

DrugBankID Name Type UniProtID UniProt Name

0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B

1 DB00002 Cetuximab BiotechDrug P00533 Epidermal growth factor receptor

2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B
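
The merge route can be sketched in a self-contained form as well. Two things here are my additions rather than part of the answer: the validate="many_to_one" argument, which asks pandas to raise if df1 contains duplicate IDs (a left join would otherwise silently multiply rows), and the "placeholder" strings, which stand in for old values the source does not show.

```python
import pandas as pd

df1 = pd.DataFrame({
    "UniProtID": ["O75015", "P00734", "P63151"],
    "NAME": ["PPP2R5B", "PPP2R1B", "PPP2R2A"],
})

df2 = pd.DataFrame({
    "UniProtID": ["P00734", "P00533", "O75015"],
    # "placeholder A/B" are hypothetical old values, not from the source
    "UniProt Name": ["placeholder A",
                     "Epidermal growth factor receptor",
                     "placeholder B"],
})

# Left join keeps every row of df2; validate guards against duplicate IDs in df1
df = pd.merge(df2, df1, on="UniProtID", how="left", validate="many_to_one")

# Prefer the looked-up NAME, fall back to the old value where NAME is NaN
df["UniProt Name"] = df["NAME"].combine_first(df["UniProt Name"])
df = df.drop(columns="NAME")

print(df)
```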

#or replace the whole column with the looked-up names (unmatched IDs become NaN)

df = pd.merge(df2, df1, on="UniProtID", how='left')

df = df.drop('UniProt Name', axis=1).rename(columns={'NAME':'UniProt Name'})

print (df)

DrugBankID Name Type UniProtID UniProt Name

0 DB00001 Lepirudin BiotechDrug P00734 PPP2R1B

1 DB00002 Cetuximab BiotechDrug P00533 NaN

2 DB00002 Cetuximab BiotechDrug O75015 PPP2R5B
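
With thousands of rows it can help to list which IDs found no match before overwriting anything. The sketch below is one possible way to do that with the indicator option of merge; it is an addition to the answer, and the names checked and unmatched are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "UniProtID": ["O75015", "P00734", "P63151"],
    "NAME": ["PPP2R5B", "PPP2R1B", "PPP2R2A"],
})
df2 = pd.DataFrame({"UniProtID": ["P00734", "P00533", "O75015"]})

# indicator=True adds a "_merge" column: "both" for matches, "left_only" otherwise
checked = pd.merge(df2, df1, on="UniProtID", how="left", indicator=True)

# IDs from df2 that have no entry in df1
unmatched = checked.loc[checked["_merge"] == "left_only", "UniProtID"].tolist()
print(unmatched)
```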
