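What do the axes of a Python pandas DataFrame mean, and what do axis=0 and axis=1 stand for?
Generally speaking, axis=0 refers to rows and axis=1 refers to columns.
1. axis from the perspective of drop operations:
As an example, create a DataFrame variable df: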
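import numpy as np
import pandas as pd

# 4x4 frame holding 0..15, with string row labels '1'..'4' and column labels 'A'..'D'
df = pd.DataFrame(np.arange(16).reshape(4, 4),
                  index=list('1234'),
                  columns=list('ABCD'))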
df has the following structure:
    A   B   C   D
1   0   1   2   3
2   4   5   6   7
3   8   9  10  11
4  12  13  14  15
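Now:
df.drop(['A'], axis=1, inplace=True)  # Deletes the column labeled 'A', so axis=1 refers to columns. The result is:
    B   C   D
1   1   2   3
2   5   6   7
3   9  10  11
4  13  14  15
df.drop(['1'], axis=0, inplace=True)  # Deletes the row labeled '1', so axis=0 refers to rows. The result is:
    B   C   D
2   5   6   7
3   9  10  11
4  13  14  15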
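The same two drops can also be sketched without inplace=True, which leaves the original frame untouched and makes each step easier to inspect. This is only a minimal sketch: the variable names (without_a, without_row, and so on) are illustrative and do not come from the original post, and the columns= and index= keywords are shown as an alternative spelling of axis=1 and axis=0.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4),
                  index=list('1234'),
                  columns=list('ABCD'))

# axis=1 (or axis='columns'): the label 'A' is looked up among the column labels.
without_a = df.drop(['A'], axis=1)

# axis=0 (or axis='index'): the label '1' is looked up among the row labels.
without_row = df.drop(['1'], axis=0)

# Keyword forms that avoid the axis number altogether.
same_as_without_a = df.drop(columns=['A'])
same_as_without_row = df.drop(index=['1'])

print(without_a.columns.tolist())   # ['B', 'C', 'D']
print(without_row.index.tolist())   # ['2', '3', '4']
print(df.shape)                     # (4, 4), the original is unchanged
```

For drop, then, axis simply says which set of labels to search: the row labels for axis=0, the column labels for axis=1.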
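2. axis from the perspective of statistics:
Here things look a little different: when computing a mean or a sum, axis=0 produces one result per column rather than one per row.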
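Calling df.sum(axis=0) adds the values down the rows and returns one sum for each column.
Calling df.sum(axis=1) adds the values across the columns and returns one sum for each row.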
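To make the two calls concrete, here is a small sketch that rebuilds the original 4x4 frame (before any rows or columns were dropped) and sums it in both directions. The names col_sums and row_sums are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4),
                  index=list('1234'),
                  columns=list('ABCD'))

# axis=0: walk down the rows, so the row axis is collapsed
# and one value remains per column.
col_sums = df.sum(axis=0)

# axis=1: walk across the columns, so the column axis is collapsed
# and one value remains per row.
row_sums = df.sum(axis=1)

print(col_sums.index.tolist(), col_sums.tolist())  # ['A', 'B', 'C', 'D'] [24, 28, 32, 36]
print(row_sums.index.tolist(), row_sums.tolist())  # ['1', '2', '3', '4'] [6, 22, 38, 54]
```

The result of axis=0 is indexed by the column labels, and the result of axis=1 is indexed by the row labels, which is why the two cases can look reversed at first sight.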
On Stack Overflow, the answer is given as follows:
+------------+---------+--------+
|            |    A    |    B   |
+------------+---------+--------+
|      0     | 0.626386| 1.52325|----axis=1----->
+------------+---------+--------+
             |         |
             | axis=0  |
             ↓         ↓
“It specifies the axis along which the means are computed. By default axis=0. This is consistent with the numpy.mean usage when axis is specified explicitly (in numpy.mean, axis==None by default, which computes the mean value over the flattened array), in which axis=0 along the rows (namely, index in pandas), and axis=1 along the columns. For added clarity, one may choose to specify axis='index' (instead of axis=0) or axis='columns' (instead of axis=1).”
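If that is still not entirely clear, think of it this way: with sum, axis=0 means the operation moves through the data row by row, so what comes out is a result for each column. In both cases axis names the axis the operation acts along: drop removes labels from that axis, while sum collapses it.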
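The two claims in the quoted answer, that axis='index' and axis='columns' are synonyms for 0 and 1 and that pandas agrees with numpy.mean when an axis is given, can be checked with a short sketch. It reuses the 4x4 frame from the first section; the variable names are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4),
                  index=list('1234'),
                  columns=list('ABCD'))

# The numeric and the named spelling of the same axis.
by_number = df.mean(axis=0)
by_name = df.mean(axis='index')

# The same reduction on the underlying NumPy array.
numpy_result = np.mean(df.to_numpy(), axis=0)

print(by_number.tolist())                               # [6.0, 7.0, 8.0, 9.0]
print(by_name.equals(by_number))                        # True
print(np.allclose(numpy_result, by_number.to_numpy()))  # True

# axis='columns' is the named form of axis=1: one mean per row.
row_means = df.mean(axis='columns')
print(row_means.tolist())                               # [1.5, 5.5, 9.5, 13.5]
```

Writing axis='index' or axis='columns' is longer, but it removes the need to remember which number means which direction.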