Please help me understand my error.
I'm trying to change one column in my .csv file.
The .csv file looks like this:
sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401
I want to replace the "k" character with "_" in the "sku" column.
I wrote this code:
import sys
import pandas as pd
import numpy as np
import datetime
df = pd.read_csv('cat0.csv')
for r in df['sku']:
    r1 = r.replace('k', '_')
    df['sku'] = r1
print (df)
But the code inserts the last value in every row of the "sku" column. So I get:
  sku name  code
0  _5  aaa   886
1  _5  bbb   898
2  _5  ccc   342
3  _5  ddd   503
4  _5  eee   401
I want to get the following:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
Solution
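Your loop assigns the single string r1 to the entire "sku" column on every pass, so pandas fills every row with it and the last value wins. Instead, you can use str.replace on the whole column: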
from io import StringIO
import pandas as pd
data = """sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401"""
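file = StringIO(data)  # in-memory stand-in for cat0.csv
df = pd.read_csv(file)
# Replace 'k' in every value of the column at once; regex=False treats 'k' as a literal string
df['sku'] = df['sku'].str.replace('k', '_', regex=False)
print(df)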
This yields:
  sku name  code
0  _1  aaa   886
1  _2  bbb   898
2  _3  ccc   342
3  _4  ddd   503
4  _5  eee   401
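If you would rather keep an explicit loop, the following is a minimal sketch of one way it could work, using the same sample data. The names `demo`, `fixed` and `df_loop` are only illustrative and do not come from the question. It first shows that assigning a single string to a column fills every row, which is what happened in the original loop, and then collects all the new values before assigning them once:

```python
from io import StringIO
import pandas as pd

data = """sku,name,code
k1,aaa,886
k2,bbb,898
k3,ccc,342
k4,ddd,503
k5,eee,401"""

df_loop = pd.read_csv(StringIO(data))

# Assigning one string to a column broadcasts it to every row.
# The original loop did this on each pass, so the last value remained.
demo = df_loop.copy()
demo['sku'] = '_5'

# Loop-based alternative: build the full list of new values first,
# then assign the list to the column a single time.
fixed = [r.replace('k', '_') for r in df_loop['sku']]
df_loop['sku'] = fixed
print(df_loop)
```

The vectorized str.replace version is usually the shorter choice. To save the result back to disk, something like df.to_csv('cat0.csv', index=False) should write the frame without the index column.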