Example 1
import pandas as pd
df = pd.DataFrame({'address': {0: '71 Pilgrim Avenue, Chevy Chase, MD',
1: '72 Main St, Chevy Chase, MD'},
'id': {0: 'a', 1: 'b'}})
print(df)
# If the address format is consistent (always "street, city, state"), a plain split on the comma is enough.
df2 = df.join(pd.DataFrame(df.address.str.split(',').tolist(),columns=['street', 'city', 'state']))
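# Splitting on ',' leaves a leading space in city and state; uncomment the next line to strip it.
# (applymap is deprecated since pandas 2.1; use DataFrame.map there.)
#df2 = df2.applymap(lambda x: x.strip())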
df2
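A possible variant of the split above, as a minimal sketch that assumes every address has exactly three comma-separated parts. It uses `str.split` with `expand=True` to get the columns directly and `str.strip` to remove the space left after each comma. The name `parts` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'address': ['71 Pilgrim Avenue, Chevy Chase, MD',
                               '72 Main St, Chevy Chase, MD'],
                   'id': ['a', 'b']})

# expand=True returns a DataFrame with one column per split piece
parts = df['address'].str.split(',', expand=True)
parts.columns = ['street', 'city', 'state']

# strip the whitespace left after each comma, column by column
parts = parts.apply(lambda col: col.str.strip())

# join on the index, as in the original example
df2 = df.join(parts)
print(df2)
```

If some addresses had a different number of commas, the column assignment would fail, so this only fits consistently formatted data.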
Raw data
I now need to split out the parts before and after the `_`.
import pandas as pd
data = pd.read_csv("./barcodes.tsv",sep="\t",header=None)
data.head()
Split the strings
data2 = pd.DataFrame(data[0].str.split("_").tolist(),columns=["cell_class","cell_ID"])
data2.head()
data2["cell_class"].value_counts()
Join the pieces back together
data_all = data.join(data2)
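The steps above can be tried without the file, as a rough sketch. The barcode values below are invented stand-ins for `./barcodes.tsv`, and the use of `n=1` is an added assumption: it splits only at the first `_`, so a barcode containing extra underscores would still produce exactly two columns.

```python
import pandas as pd

# Hypothetical barcodes standing in for ./barcodes.tsv (values invented for illustration)
data = pd.DataFrame({0: ['Tcell_AAACCTG', 'Bcell_AAACGGG', 'Tcell_AAAGATG']})

# n=1 splits only at the first "_", so there are always exactly two columns
data2 = data[0].str.split('_', n=1, expand=True)
data2.columns = ['cell_class', 'cell_ID']

# count how many barcodes fall into each class
counts = data2['cell_class'].value_counts()

# join on the index to put the new columns next to the original barcode
data_all = data.join(data2)
print(data_all)
print(counts)
```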