Python: creating columns with .iterrows()

Latest recommended article published on 2024-07-04 11:11:47

weixin_39946657


I am trying to use a loop function to create a matrix of whether a product was seen in a particular week.

Each row in the df (representing a product) has a Close_date_wk (the week in which the product closed) and a week_diff (the number of weeks the product was listed).

import pandas

mydata = [{'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
          {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
          {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2}]

df = pandas.DataFrame(mydata)

My goal is to see how many alternative products were listed alongside each product in each week of its date range.

I have set up the following loop:

for index, row in df.iterrows():
    i = 0
    max_range = row['Close_date_wk']
    min_range = int(row['Close_date_wk'] - row['week_diff'])
    for i in range(min_range, max_range):
        col_head = 'job_week_' + str(i)
        row[col_head] = 1

Can you please help explain why the "row[col_head] = 1" line neither adds a column nor adds a value to that column for that row?

For example, if:

row A has date range 1,2,3
row B has date range 2,3
row C has date range 3,4,5

then ideally I would like to end up with

row A has 0 alternative products in week 1
          1 alternative product in week 2
          2 alternative products in week 3
row B has 1 alternative product in week 2
          2 alternative products in week 3
etc.

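Solution

You can't add a new column to the df by assigning to row here: iterrows yields each row as a separate Series, so a new key added to it never reaches the original frame. Instead, assign through the original df, for example with .loc or .iloc (older pandas versions also offered .ix, which has since been removed). Example: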

In [29]:
import numpy as np
import pandas as pd

df = pd.DataFrame(columns=list('abc'), data=np.random.randn(5, 3))
df

Out[29]:
          a         b         c
0 -1.525011  0.778190 -1.010391
1  0.619824  0.790439 -0.692568
2  1.272323  1.620728  0.192169
3  0.193523  0.070921  1.067544
4  0.057110 -1.007442  1.706704

In [30]:
for index, row in df.iterrows():
    df.loc[index, 'd'] = np.random.randint(0, 10)
df

Out[30]:
          a         b         c  d
0 -1.525011  0.778190 -1.010391  9
1  0.619824  0.790439 -0.692568  9
2  1.272323  1.620728  0.192169  1
3  0.193523  0.070921  1.067544  0
4  0.057110 -1.007442  1.706704  9
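Applied to the loop from the question, the same idea might look like the sketch below. It assumes the question's convention that a product is listed in the weeks from Close_date_wk - week_diff up to, but not including, Close_date_wk; the name week_cols and the final zero-fill step are illustrative additions, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame([
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
])

for index, row in df.iterrows():
    max_range = int(row['Close_date_wk'])
    min_range = max_range - int(row['week_diff'])
    for week in range(min_range, max_range):
        # Assign through df (not through row) so the column is created on the frame.
        df.loc[index, 'job_week_' + str(week)] = 1

# Cells that were never assigned are NaN; treat them as "not listed" (0).
week_cols = [c for c in df.columns if c.startswith('job_week_')]
df[week_cols] = df[week_cols].fillna(0).astype(int)
```

Each job_week_N column then holds 1 for the products listed in week N and 0 otherwise.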

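In this example, assigning to an existing column through row did change the df, but the pandas documentation warns that iterrows may return a copy rather than a view, so this is not guaranteed to work: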

In [31]:
# reset the df by slicing
df = df[list('abc')]
for index, row in df.iterrows():
    row['b'] = np.random.randint(0, 10)
df

Out[31]:
          a  b         c
0 -1.525011  8 -1.010391
1  0.619824  2 -0.692568
2  1.272323  8  0.192169
3  0.193523  2  1.067544
4  0.057110  3  1.706704
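Whether a write through row reaches the df depends on the column dtypes, among other things. As a small sketch (the frame and the name unchanged are made up for illustration): when the columns have mixed dtypes, as in the question's data, each row is built as a fresh object-dtype Series, so assigning to it leaves the df untouched.

```python
import pandas as pd

# Mixed dtypes (string and float), similar in spirit to the question's frame.
df = pd.DataFrame({'name': ['x', 'y'], 'b': [1.5, 2.5]})

for index, row in df.iterrows():
    # row is a copy here, so this assignment is silently lost.
    row['b'] = 0.0

unchanged = df['b'].tolist()
```

For that reason, writing through df.loc[index, column] is the safer habit even for existing columns.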

But adding a new column using row won't work:

In [35]:
df = df[list('abc')]
for index, row in df.iterrows():
    row['d'] = np.random.randint(0, 10)
df

Out[35]:
          a  b         c
0 -1.525011  8 -1.010391
1  0.619824  2 -0.692568
2  1.272323  8  0.192169
3  0.193523  2  1.067544
4  0.057110  3  1.706704
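For the goal stated in the question (counting alternative products per week), one possible approach avoids iterrows altogether. The sketch below is an assumption about what is wanted, not part of the original answer: it treats a product as listed in the weeks from Close_date_wk - week_diff up to, but not including, Close_date_wk, and the names weeks, long, listed and alternatives are illustrative.

```python
import pandas as pd

df = pd.DataFrame([
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
])

# One (product, week) pair for every week in which the product was listed.
weeks = [
    (subid, week)
    for subid, close, diff in zip(df['subid'], df['Close_date_wk'], df['week_diff'])
    for week in range(int(close) - int(diff), int(close))
]
long = pd.DataFrame(weeks, columns=['subid', 'week'])

# Indicator matrix: 1 if the product was listed in that week, else 0.
listed = pd.crosstab(long['subid'], long['week'])

# Alternatives = products listed that week minus the product itself,
# kept only for the weeks in which the product was listed (NaN elsewhere).
alternatives = listed.mul(listed.sum(axis=0) - 1, axis=1).where(listed == 1)
```

With this data, product A has no alternatives in weeks 22 and 23 and one alternative (B) in week 24; weeks in which a product was not listed show up as NaN.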
