Situation
We start with three tables: f0221a, f0221b, and f0221c.
Next, join them:
df1 = pd.merge(f0221a,f0221b,how = 'inner',on = 'classid')
df2 = pd.merge(df1,f0221c,how = 'inner',on = 'stuid')
df3 = df2[['stuname','classname','course','score']]
df4 = df3.loc[df3['classname'] == '一班']
Finally, adding a new column triggers a SettingWithCopyWarning:
df4['rk'] = df4.groupby('course')['score'].rank()
#<ipython-input-33-6cf912642ce9>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df4['rk'] = df4.groupby('course')['score'].rank()
Oddly enough, the rk column was still added successfully.
Solution
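When this warning appears, the modification sometimes takes effect and sometimes does not. df4 was produced by filtering df3, so pandas cannot tell whether the write is meant for df4 alone or for the DataFrame it was sliced from, and it warns instead of guessing. In this case the boolean filter returned a copy, so rk was added to df4 and df3 was left untouched, which is why the result looked correct. To write reliably, do not filter out a sub-DataFrame first and then assign to it: either select and assign in a single step, or make the filtered result an explicit copy.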
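To make the "sometimes succeeds, sometimes fails" behavior concrete, here is a minimal sketch on a toy DataFrame. The column names a and b and the values are made up for illustration and are not part of the original data. It contrasts a chained write, which lands on a temporary copy, with a single-step .loc write. Warnings are silenced only to keep the output clean.

```python
import warnings

import pandas as pd

# Toy data; the column names "a" and "b" are hypothetical.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Chained assignment: df[mask] builds a temporary copy, and the
    # write to "b" lands on that temporary object, not on df.
    df[df["a"] > 1]["b"] = 0

b_after_chained = df["b"].tolist()  # df itself is unchanged

# Single-step write: rows and column are selected in one .loc call,
# so the assignment goes straight to df.
df.loc[df["a"] > 1, "b"] = 0
b_after_loc = df["b"].tolist()

print(b_after_chained)
print(b_after_loc)
```

The first write is silently lost, while the second one modifies df as intended.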
There are two ways to fix this:
(1) Use .loc to combine the row filter and the column selection into a single step on the source DataFrame
df3 = df2.loc[df2['classname'] == '一班',['stuname','classname','course','score']]
df3['rk'] = df3.groupby('course')['score'].rank(ascending = False)
(2) Call .copy() on df4 from above to get a new, independent DataFrame, then carry out the next step on it
df5 = df4.copy()
df5['rk'] = df5.groupby('course')['score'].rank(ascending = False)
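The two fixes can also be combined. The sketch below assumes a small stand-in for the joined table df2, since the original tables are not shown; the student names, courses, and scores are invented for illustration. It filters with a single .loc call, takes an explicit copy, and then adds the rank column.

```python
import pandas as pd

# Hypothetical stand-in for the joined table df2.
df2 = pd.DataFrame({
    "stuname":   ["A", "B", "A", "B", "C"],
    "classname": ["一班", "一班", "一班", "一班", "二班"],
    "course":    ["math", "math", "english", "english", "math"],
    "score":     [90, 80, 70, 85, 95],
})

cols = ["stuname", "classname", "course", "score"]

# One .loc call for rows and columns, followed by an explicit copy,
# so the result is clearly independent of df2.
df4 = df2.loc[df2["classname"] == "一班", cols].copy()

# Rank scores within each course, highest score first.
df4["rk"] = df4.groupby("course")["score"].rank(ascending=False)

print(df4)
```

Because df4 is an explicit copy, the new column is added only to df4, and df2 keeps its original columns.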