Python pandas map function: using map correctly to apply a function to a DataFrame

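This article discusses two ways of applying functions in Python pandas, map and apply, and how to choose between them when an operation involves several columns. It points out that the Series map method is the more idiomatic pandas choice and is more efficient. When data from both the current row and the previous row is needed, apply with a lambda expression can be used. It ends with two working solutions: using apply, and calling the function directly.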

I've been searching for a while now and can't find anything concrete on this. I'm looking for a best-practice answer. My code works, but I'm not sure if I'm introducing problems.

# df['Action'] = list(map(my_function, df.param1))  # Works but older
# I think?
df['Action'] = df['param1'].map(my_function)

Both of these produce the same VISIBLE result. I'm not entirely sure how the first, commented out line works, but it is an example I found on the internets that I applied here and it worked. Most other uses of map I've found are like the 2nd line, where it is called from the Series object.

So first question, which of these is better practice and what exactly is the first one doing?

2nd and final question. This is the more important of the two.

Map, apply, applymap - not sure which to use here.

The first commented out line of code does NOT work, while the second gives me exactly what I want.

def my_function(param1, param2, param3):
    return param1 * param2 * param3  # example

# Can't get this df.map function to work?
# Error: map is not an attribute of DataFrame
# df['New_Col'] = df.map(my_function, df.param1, df.param1.shift(1),
#                        df.param2.shift(1))

# TypeError: my_function takes 3 positional args, but 4 were given
# df['New_Col'] = df.apply(my_function, args=(df.param1, df.param1.shift(1),
#                                             df.param2.shift(1)))

# This works, not sure why
df['New_Col'] = list(map(my_function, df.param1, df.param1.shift(1),
                         df.param2.shift(1)))

I'm trying to compute a result that is based off of two columns of the df, from the current and previous rows. I've tried variations on map and apply when called from the df directly (df.map, df.apply) and haven't had success. But if I use the list(map(...)) notation it works great.

Is list(map(...)) acceptable? Which is best practice? Is there a correct way to use apply or map directly from the df object?

Thanks guys, appreciated.

EDIT: MaxU's response below works also. As it is, both of these work:

df['New_Col'] = list(map(my_function, df.param1, df.param1.shift(1),
                         df.param2.shift(1)))

df['New_Col'] = my_function(df.param1, df.param1.shift(1), df.param2.shift(1))

# This does NOT work
df['New_Col'] = df.apply(my_function, axis=1, args=(df.param1,
                         df.param1.shift(1), df.param2.shift(1)))

# Also does not work
# AttributeError: ("'float' object has no attribute 'shift'",
#                  'occurred at index 2000-01-04 00:00:00')
# Will work if I remove the shift(), but not what I need.
df['New_Col'] = df.apply(lambda x: my_function(x.param1, x.param1.shift(1),
                                               x.param2.shift(1)))

I'm still unclear on the proper syntax for apply here, and on whether any of these 3 methods is superior to the others (I'm guessing list(map(...)) is the "worst" of the 3, since it iterates and isn't vectorized).
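The AttributeError above can be reproduced on a small frame. The sketch below uses made-up sample data (only the column names come from the question) and a hypothetical helper, inspect_row, to record what a row-wise apply actually hands to the function: a Series for one row, whose fields are plain scalars with no shift method.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {"param1": [1.0, 2.0, 3.0], "param2": [10.0, 20.0, 30.0]},
    index=pd.date_range("2000-01-03", periods=3),
)

seen = []

def inspect_row(row):
    # With axis=1, `row` is a Series holding ONE row, so row.param1 is a scalar.
    seen.append((type(row).__name__, hasattr(row.param1, "shift")))
    return row.param1

result = df.apply(inspect_row, axis=1)
print(seen)
```

Because each row is processed in isolation, the previous row is simply not visible from inside the lambda, which is why any shift has to happen on whole columns before apply is called.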

Solution

"So first question, which of these is better practice and what exactly is the first one doing?"

df['Action'] = df['param1'].map(my_function)

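is much more idiomatic and more reliable: it returns a Series that keeps the original index. It is often faster as well, although with an arbitrary Python function it still calls that function once per element, so it is not vectorized in the NumPy sense.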
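As for what the list(map(...)) line is doing: it uses Python's built-in map, which simply iterates over the Series values and produces a plain list with no index. A minimal sketch of the difference, using an invented one-argument my_function and invented data:

```python
import pandas as pd

def my_function(value):
    # Hypothetical one-argument mapping function, for illustration only.
    return "buy" if value > 0 else "sell"

s = pd.Series([1.5, -2.0, 0.5], index=["a", "b", "c"], name="param1")

# Built-in map: iterates over the values and yields a plain Python list.
as_list = list(map(my_function, s))

# Series.map: same per-element calls, but the result is a Series
# that carries the original index.
as_series = s.map(my_function)

print(as_list)
print(as_series)
```

Both give the same visible values. The practical difference is that a list is assigned to a column purely by position, while a Series is aligned on its index, which is the safer behaviour if the frame has been filtered or reordered.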

"2nd and final question. This is the more important of the two. Map, apply, applymap - not sure which to use here. The first commented out line of code does NOT work, while the second gives me exactly what I want."

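When this was written, pandas did NOT have DataFrame.map(), only Series.map(). (pandas 2.1 later added DataFrame.map() as the new name for applymap(); it works element by element, so it does not help here either.) If you need to access multiple columns in your mapping function, you can use DataFrame.apply() with axis=1.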
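This also explains the "takes 3 positional args, but 4 were given" error in the question: DataFrame.apply passes each column (or each row, with axis=1) as the first argument and then appends whatever is in args. A small sketch with invented data; the constant 2.0 in the row-wise call is only a placeholder third argument:

```python
import pandas as pd

df = pd.DataFrame({"param1": [1.0, 2.0, 3.0], "param2": [4.0, 5.0, 6.0]})

def my_function(param1, param2, param3):
    return param1 * param2 * param3

# apply calls my_function(column, *args): one column plus three extra
# arguments makes four, hence the TypeError from the question.
try:
    df.apply(my_function, args=(df.param1, df.param1.shift(1), df.param2.shift(1)))
    error = None
except TypeError as exc:
    error = exc

# Row-wise use: each row arrives as a Series, so the lambda picks out
# the fields it needs (2.0 is an arbitrary placeholder here).
row_product = df.apply(lambda row: my_function(row.param1, row.param2, 2.0), axis=1)

print(error)
print(row_product)
```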

Demo:

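# Inside a row-wise apply each x.param1 is a scalar, so shift the columns first
# (tmp, prev1 and prev2 are illustrative names)
tmp = df.assign(prev1=df.param1.shift(1), prev2=df.param2.shift(1))
df['New_Col'] = tmp.apply(lambda x: my_function(x.param1, x.prev1, x.prev2),
                          axis=1)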

or just:

df['New_Col'] = my_function(df.param1, df.param1.shift(1), df.param2.shift(1))
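The direct call works because my_function only uses operators (here *) that pandas applies to whole Series at once. A quick check on invented data that it matches the list(map(...)) version; prev1 and prev2 are illustrative names:

```python
import numpy as np
import pandas as pd

def my_function(param1, param2, param3):
    return param1 * param2 * param3

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"param1": [1.0, 2.0, 3.0], "param2": [4.0, 5.0, 6.0]})

prev1 = df.param1.shift(1)  # previous row's param1 (NaN for the first row)
prev2 = df.param2.shift(1)  # previous row's param2 (NaN for the first row)

# Whole-column arithmetic: one call, operating on Series.
vectorized = my_function(df.param1, prev1, prev2)

# Element-by-element: one Python call per row.
elementwise = pd.Series(list(map(my_function, df.param1, prev1, prev2)),
                        index=df.index)

print(vectorized)
print(elementwise)
```

The first row is NaN in both cases because there is no previous row to shift in. Note that the direct call is only an option when the function body consists of operations that accept Series; a function that branches on individual values with if/else would need apply or map instead.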
