import pandas as pd
import numpy as np
from pandas import Series,DataFrame
3.7 Summary Calculations and Descriptive Statistics
- pandas Series and DataFrame objects come with built-in methods such as:
- xxx.sum()
- xxx.mean()
- xxx.max()
- xxx.add()
- and many more
df1=DataFrame(
[
[3,2,np.nan],
[2,7,-5],
[5,np.nan,np.nan],
[7,6,4]
]
)
df1
   0    1    2
0  3  2.0  NaN
1  2  7.0 -5.0
2  5  NaN  NaN
3  7  6.0  4.0
df1.sum()
0 17.0
1 15.0
2 -1.0
dtype: float64
- The default is axis=0, which sums down each column
df1.sum(axis=0)
0 17.0
1 15.0
2 -1.0
dtype: float64
- axis=0 can also be written as axis='index'
df1.sum(axis='index')
0 17.0
1 15.0
2 -1.0
dtype: float64
df1.sum(axis='columns')
0 5.0
1 4.0
2 5.0
3 17.0
dtype: float64
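The axis argument is easy to mix up, so here is a small sketch that checks the equivalences directly. It rebuilds the same data under the name df (a name used only in this example) so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Same values as df1 above, rebuilt so the example is self-contained
df = pd.DataFrame([[3, 2, np.nan], [2, 7, -5], [5, np.nan, np.nan], [7, 6, 4]])

col_totals = df.sum(axis=0)  # collapses the rows: one value per column
row_totals = df.sum(axis=1)  # collapses the columns: one value per row

# The string aliases give the same results as the integer forms
same_as_index = col_totals.equals(df.sum(axis='index'))
same_as_columns = row_totals.equals(df.sum(axis='columns'))

print(col_totals.tolist())
print(row_totals.tolist())
print(same_as_index, same_as_columns)
```

One way to remember it: the axis you name is the one that disappears from the result.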
- Mean (NaN values are skipped by default)
print(df1)
df1.mean(skipna=True)
   0    1    2
0  3  2.0  NaN
1  2  7.0 -5.0
2  5  NaN  NaN
3  7  6.0  4.0
0 4.25
1 5.00
2 -0.50
dtype: float64
- Let NaN take part in the calculation: any column that contains a NaN then yields NaN
df1.mean(skipna=False)
0 4.25
1 NaN
2 NaN
dtype: float64
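To see why the two results differ, it can help to count the non-NaN values in each column. The sketch below rebuilds the same data under the name df (used only in this example) and compares the two settings of skipna.

```python
import numpy as np
import pandas as pd

# Same values as df1 above, rebuilt so the example is self-contained
df = pd.DataFrame([[3, 2, np.nan], [2, 7, -5], [5, np.nan, np.nan], [7, 6, 4]])

skipped = df.mean(skipna=True)   # NaN ignored; divides by the non-NaN count
strict = df.mean(skipna=False)   # a single NaN makes the whole column NaN
counts = df.count()              # number of non-NaN values per column

print(counts.tolist())
print(skipped.tolist())
print(strict.isna().tolist())
```

With skipna=True, column 1 is averaged over 3 values and column 2 over 2 values, not over all 4 rows.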
- In many cases, NaN values need to be handled first, and the calculation is done afterwards
df1.fillna(0).mean(0)
0 4.25
1 3.75
2 -0.25
dtype: float64
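Filling with 0 changes the result because the former NaN cells now count toward the denominator. As a hedged comparison, the sketch below contrasts it with filling each column with its own mean, which is one possible alternative and not something the original text prescribes. The name df is used only in this example.

```python
import numpy as np
import pandas as pd

# Same values as df1 above, rebuilt so the example is self-contained
df = pd.DataFrame([[3, 2, np.nan], [2, 7, -5], [5, np.nan, np.nan], [7, 6, 4]])

# Fill NaN with 0: every column is now averaged over all 4 rows
filled_zero = df.fillna(0).mean()

# Alternative (an assumption for illustration): fill each column's NaN
# with that column's mean, which leaves the column means unchanged
filled_mean = df.fillna(df.mean()).mean()

print(filled_zero.tolist())
print(filled_mean.tolist())
```

Which fill value is appropriate depends on what a missing entry means in the data at hand.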
Cumulative sum
df2=DataFrame(np.random.randint(1,10,(4,6)))