Translated from: Delete column from pandas DataFrame
When deleting a column in a DataFrame I use:
del df['column_name']
And this works great. Why can't I use the following?
del df.column_name
Since you can access the column/Series as df.column_name, I expected this to work.
Reply #1
Reference: https://stackoom.com/question/uGxE/从pandas-DataFrame删除列
Reply #2
It's good practice to always use the [] notation. One reason is that attribute notation (df.column_name) does not work for numbered indices:
In [1]: df = DataFrame([[1, 2, 3], [4, 5, 6]])
In [2]: df[1]
Out[2]:
0 2
1 5
Name: 1
In [3]: df.1
File "<ipython-input-3-e4803c0d1066>", line 1
df.1
^
SyntaxError: invalid syntax
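Bracket notation also covers deletion of such columns. A minimal sketch, assuming a small frame with default integer column labels (the data is only illustrative):

```python
import pandas as pd

# Default integer column labels: 0, 1, 2
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# Bracket notation selects the column labelled 1 ...
middle = df[1]

# ... and deletes it too; `df.1` would be a SyntaxError in either case
del df[1]

remaining = list(df.columns)
print(remaining)  # [0, 2]
```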
Reply #3
As you've guessed, the right syntax is
del df['column_name']
It's difficult to make del df.column_name work, simply as a result of syntactic limitations in Python. del df[name] gets translated to df.__delitem__(name) under the covers by Python, whereas del df.column_name is dispatched to df.__delattr__('column_name') instead.
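To see the two hooks side by side, here is a minimal sketch using a hypothetical toy class `Table` (not pandas itself) that only records which special method Python invokes for each form of `del`:

```python
class Table:
    """Toy container (not pandas) that records which deletion hook is called."""

    def __init__(self):
        self.calls = []

    def __delitem__(self, key):
        # Reached by: del obj[key]
        self.calls.append(("__delitem__", key))

    def __delattr__(self, name):
        # Reached by: del obj.name
        self.calls.append(("__delattr__", name))


t = Table()
del t["column_name"]   # Python calls t.__delitem__("column_name")
del t.column_name      # Python calls t.__delattr__("column_name")
print(t.calls)
# [('__delitem__', 'column_name'), ('__delattr__', 'column_name')]
```

The sketch only illustrates Python's dispatch rules; it makes no claim about how pandas implements either method.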
Reply #4
The best way to do this in pandas is to use drop:
df = df.drop('column_name', axis=1)
where 1 is the axis number (0 for rows and 1 for columns). Pass it by keyword: pandas 2.0 and later no longer accept the axis as a second positional argument.
To delete the column without having to reassign df, you can do:
df.drop('column_name', axis=1, inplace=True)
Finally, to drop by column number instead of by column label, try this to delete, e.g., the 1st, 2nd and 4th columns:
df = df.drop(df.columns[[0, 1, 3]], axis=1) # df.columns is zero-based pd.Index
A list of column labels ("text" syntax) works as well:
df.drop(['column_nameA', 'column_nameB'], axis=1, inplace=True)
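The variants above can be compared on one small frame. This is a sketch with made-up column names `a` to `d`; the `columns=` keyword shown alongside is an alternative spelling of `axis=1` available in reasonably recent pandas versions:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Single label; the result is a new frame and df itself is unchanged
no_a = df.drop("a", axis=1)

# Same operation using the `columns` keyword instead of axis=1
no_a_kw = df.drop(columns="a")

# Several labels at once
only_cd = df.drop(["a", "b"], axis=1)

# By position: df.columns[[0, 1, 3]] looks up the labels at those positions
only_c = df.drop(df.columns[[0, 1, 3]], axis=1)

print(list(only_c.columns))  # ['c']
```

Because positions are converted to labels first, a frame with duplicate column names would lose every column sharing one of those labels.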
Reply #5
Use:
columns = ['Col1', 'Col2', ...]
df.drop(columns, inplace=True, axis=1)
This will delete one or more columns in-place. Note that inplace=True was added in pandas v0.13 and won't work on older versions. You'd have to assign the result back in that case:
df = df.drop(columns, axis=1)
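A short sketch of the difference between the two forms, using illustrative labels `Col1` to `Col3` (assumed here, not taken from the answer):

```python
import pandas as pd

columns = ["Col1", "Col2"]  # illustrative labels
df = pd.DataFrame({"Col1": [1], "Col2": [2], "Col3": [3]})

# Reassignment style: drop returns a new frame, the original is untouched
trimmed = df.drop(columns, axis=1)

# In-place style: df itself is modified and the call returns None
result = df.drop(columns, axis=1, inplace=True)

print(result)            # None
print(list(df.columns))  # ['Col3']
```

Since the in-place call returns None, writing `df = df.drop(..., inplace=True)` would replace the frame with None.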
Reply #6
Drop by index
Delete first, second and fourth columns:
df.drop(df.columns[[0,1,3]], axis=1, inplace=True)
Delete first column:
df.drop(df.columns[[0]], axis=1, inplace=True)
The optional inplace parameter modifies the original DataFrame directly instead of returning a modified copy.
Pop
See "Column selection, addition, deletion" in the pandas documentation.
Delete column column-name:
df.pop('column-name')
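Examples:
import pandas as pd
# DataFrame.from_items was removed in pandas 1.0; from_dict with orient='index' builds the same frame
df = pd.DataFrame.from_dict({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, orient='index', columns=['one', 'two', 'three'])
print(df)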
   one  two  three
A    1    2      3
B    4    5      6
C    7    8      9
df.drop(df.columns[[0]], axis=1, inplace=True)
print(df)
   two  three
A    2      3
B    5      6
C    8      9
three = df.pop('three')
print(df)
   two
A    2
B    5
C    8
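Unlike drop, pop hands the removed column back to the caller. A minimal sketch, rebuilding the two-column frame from the previous step with a plain dict constructor (an assumption made to keep the snippet self-contained):

```python
import pandas as pd

df = pd.DataFrame({"two": [2, 5, 8], "three": [3, 6, 9]}, index=["A", "B", "C"])

# pop removes the column from df and returns it as a Series
three = df.pop("three")

print(three.tolist())    # [3, 6, 9]
print(list(df.columns))  # ['two']
```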