Python pass-by-value or pass-by-reference: is a pandas DataFrame passed by value or by reference?

If I pass a dataframe to a function and modify it inside the function, is it pass-by-value or pass-by-reference?

I ran the following code:

    import pandas as pd

    a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

    def letgo(df):
        df = df.drop('b', axis=1)

    letgo(a)

The value of a does not change after the function call. Does that mean it is pass-by-value?

I also tried the following

    import numpy as np

    xx = np.array([[1, 2], [3, 4]])

    def letgo2(x):
        x[1, 1] = 100

    def letgo3(x):
        x = np.array([[3, 3], [3, 3]])

It turns out letgo2() does change xx and letgo3() does not. Why is it like this?

Solution

The short answer is, Python always does pass-by-value, but every Python variable is actually a pointer to some object, so sometimes it looks like pass-by-reference.

In Python every object is either mutable or immutable. For example, lists, dicts, modules and pandas DataFrames are mutable, while ints, strings and tuples are immutable. Mutable objects can be changed in place (e.g., by adding an element to a list), but immutable objects cannot.
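A minimal sketch of that distinction, using only built-in types so it runs without pandas or NumPy. The variable names (numbers, point and so on) are made up for illustration:

```python
# A list is mutable: append() changes the existing object in place.
numbers = [1, 2]
id_before = id(numbers)
numbers.append(3)
still_same_object = id(numbers) == id_before

# A tuple is immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_was_changed = True
except TypeError:
    tuple_was_changed = False

print(numbers, still_same_object, tuple_was_changed)
```

The list keeps its identity while its contents change, whereas the tuple refuses the change outright.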

As I said at the start, you can think of every Python variable as a pointer to an object. When you pass a variable to a function, the variable (pointer) within the function is always a copy of the variable (pointer) that was passed in. So if you assign something new to the internal variable, all you are doing is changing the local variable to point to a different object. This doesn't alter (mutate) the original object that the variable pointed to, nor does it make the external variable point to the new object. At this point, the external variable still points to the original object, but the internal variable points to a new object.
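The rebinding behaviour described above can be checked with the `is` operator. This is an illustrative sketch with a plain list; the function name rebind and the variable names are hypothetical:

```python
def rebind(items):
    received = items       # remember which object was actually passed in
    items = [0, 0, 0]      # rebinds the local name only; creates a new list
    return received, items

outer = [1, 2, 3]
received, rebound = rebind(outer)

# The function received the very same object the caller holds...
print(received is outer)
# ...but the assignment inside the function produced a different object
# and left the caller's list alone.
print(rebound is outer, outer)
```

The same reasoning applies to df = df.drop(...) in letgo(): drop() returns a new DataFrame, and only the local name df is pointed at it.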

If you want to alter the original object (only possible with mutable data types), you have to do something that alters the object without assigning a completely new value to the local variable. This is why letgo() and letgo3() leave the external item unaltered, but letgo2() alters it.

As @ursan pointed out, if letgo() used something like this instead, then it would alter (mutate) the original object that df points to, which would change the value seen via the global a variable:

    def letgo(df):
        df.drop('b', axis=1, inplace=True)

    a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

    letgo(a)  # will alter a

In some cases, you can completely hollow out the original variable and refill it with new data, without actually doing a direct assignment, e.g. this will alter the original object that v points to, which will change the data seen when you use v later:

    def letgo3(x):
        x[:] = np.array([[3, 3], [3, 3]])

    v = np.empty((2, 2))

    letgo3(v)  # will alter v

Notice that I'm not assigning something directly to x; I'm assigning something to the entire internal range of x.
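The difference between assigning to the name and assigning to its contents can be shown with a plain list as well. This is a sketch with hypothetical function names (refill, rebind). Note that list slice assignment may change the list's length, whereas the NumPy version needs a value that fits the array's shape:

```python
def refill(items):
    items[:] = [3, 3, 3]   # replaces the contents of the existing list

def rebind(items):
    items = [3, 3, 3]      # only points the local name at a new list

v = [1, 2]
w = [1, 2]
v_id = id(v)

refill(v)
rebind(w)

print(v, id(v) == v_id)  # v has new contents but is the same object
print(w)                 # w is unchanged
```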

If you absolutely must create a completely new object and make it visible externally (which is sometimes the case with pandas), you have two options. The 'clean' option would be just to return the new object, e.g.,

    def letgo(df):
        df = df.drop('b', axis=1)
        return df

    a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

    a = letgo(a)

Another option would be to reach outside your function and directly alter a global variable. This changes a to point to a new object, and any function that refers to a afterward will see that new object:

    def letgo():
        global a
        a = a.drop('b', axis=1)

    a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

    letgo()  # will alter a!

Directly altering global variables is usually a bad idea, because anyone who reads your code will have a hard time figuring out how a got changed. (I generally use global variables for shared parameters used by many functions in a script, but I don't let them alter those global variables.)
