pandas indexing: copy or view?
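We often pass a DataFrame into a function and worry that code inside the function will modify it, so my usual habit is to call df.copy() first. It is tempting to assume that indexing with ix, iloc, or loc already hands back a copy, but that is only sometimes true. (Note that ix was removed in pandas 1.0; use loc or iloc instead.) The IPython session further down walks through an example.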
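As a minimal sketch of the defensive-copy habit described above: the function copies its argument before writing to it, so the caller's frame stays intact. The function name `add_total` and the `total` column are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd


def add_total(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a new frame with a row-wise 'total' column."""
    out = df.copy()                 # defensive copy: later writes go to `out`, not `df`
    out["total"] = out.sum(axis=1)  # sum across the columns of each row
    return out


df = pd.DataFrame(np.arange(10).reshape(2, 5), columns=list("abcde"), index=list("mn"))
result = add_total(df)

print(list(df.columns))      # the caller's frame still has only a..e
print(list(result.columns))  # the returned copy has the extra 'total' column
```

Copying inside the function rather than at every call site keeps the guarantee in one place, at the cost of one extra copy per call.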
In [52]: df = pd.DataFrame(np.arange(10).reshape(2,5), columns=list('abcde'), index=list('mn'))
In [53]: df
Out[53]:
   a  b  c  d  e
m  0  1  2  3  4
n  5  6  7  8  9
In [54]: aa = df.loc[:, ['a','b']]
In [55]: aa
Out[55]:
   a  b
m  0  1
n  5  6
In [56]: aa['a'][0] = 3
In [57]: df
Out[57]:
   a  b  c  d  e
m  0  1  2  3  4
n  5  6  7  8  9
In [58]: aa
Out[58]:
   a  b
m  3  1
n  5  6
In [59]: aa = df.iloc[:, [0,1]]
In [60]: aa
Out[60]:
   a  b
m  0  1
n  5  6
In [61]: aa['a'][0] = 3
C:\Apps\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py:2910: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
exec(code_obj, self.user_global_ns, self.user_ns)
In [66]: aa
Out[66]:
m    0
n    5
Name: a, dtype: int32
In [67]: aa[0] = 3
In [68]: df
Out[68]:
   a  b  c  d  e
m  3  1  2  3  4
n  5  6  7  8  9
In [69]: aa = df.loc[:, 'b']
In [70]: aa[0] = 3
In [71]: df
Out[71]:
   a  b  c  d  e
m  3  3  2  3  4
n  5  6  7  8  9
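In this session, selecting more than one column (with a list) returned a copy: writing to aa left df unchanged. Selecting a single column did not: aa was a view onto df, so assigning to it changed df as well. Be especially careful when assigning to a single-column selection. Whether you get a copy or a view also depends on the pandas version, so the safest option is still to call copy() explicitly.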
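A minimal sketch of how to sidestep the view-versus-copy question, assuming the same example frame as above: take an explicit copy when an independent object is wanted, and write to the original through a single .loc call when the change is intended. The names `col` and `unchanged` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(2, 5), columns=list("abcde"), index=list("mn"))

# An explicit copy is independent of df, however many columns are selected.
col = df.loc[:, "b"].copy()
col.loc["m"] = 3               # label-based write into the copy only
unchanged = df.loc["m", "b"]   # value in df after writing to the copy
print(unchanged)               # 1: df was not touched

# To change df itself on purpose, write in one .loc call instead of chaining.
df.loc["m", "b"] = 30
print(df.loc["m", "b"])        # 30
```

Writing through one `.loc[row, column]` call avoids the chained `aa['a'][0] = 3` pattern that triggered the SettingWithCopyWarning in the session.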