Pandas: common statistical analysis methods (describe, quantile, sum, mean, median, count, max, min, idxmax, idxmin, mad, var, std, cumsum)

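Theory:

describe() gives a quick statistical summary of each numeric column. It reports the following statistics:

count: number of non-null values

mean: arithmetic mean

std: standard deviation

min: minimum

25%: first quartile, i.e. the 25th percentile

50%: second quartile, i.e. the 50th percentile (the median)

75%: third quartile, i.e. the 75th percentile

max: maximum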
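As a minimal sketch of what describe() reports, the snippet below builds a small toy DataFrame (the column names and values are made up for illustration and are not taken from the article's CSV files) and reads individual statistics out of the summary table.

```python
import pandas as pd

# Hypothetical toy data; "score" contains one missing value.
df = pd.DataFrame({
    "score": [1.0, 2.0, 3.0, 4.0, None],
    "visits": [10, 20, 30, 40, 50],
})

summary = df.describe()

# The summary is itself a DataFrame indexed by statistic name.
print(list(summary.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# count only includes the non-null entries of "score".
print(summary.loc["count", "score"])   # 4.0

# The 50% row is the median of the non-null values.
print(summary.loc["50%", "score"])     # 2.5
```

Because the result is an ordinary DataFrame, single statistics can be picked out with .loc as shown.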

quantile(q)

Returns the value at the given quantile. q defaults to 0.5 and must lie in the range [0, 1].
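The following sketch illustrates quantile(q) on a hypothetical Series (the values are invented for the example). It also shows that a list of q values can be passed, and that pandas interpolates linearly by default when the requested quantile falls between two data points.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])  # hypothetical values

median = s.quantile()                     # q defaults to 0.5
q1 = s.quantile(0.25)
several = s.quantile([0.25, 0.5, 0.75])   # list of q -> Series indexed by q

print(median)            # 3.0
print(q1)                # 2.0
print(several.tolist())  # [2.0, 3.0, 4.0]

# With four values, the 0.25 quantile falls between 1 and 2,
# and the default linear interpolation gives 1.75.
between = pd.Series([1, 2, 3, 4]).quantile(0.25)
print(between)           # 1.75
```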

 

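Common statistical methods:

sum(): sum

mean(): mean

median(): median

count(): number of non-null values

Note: these methods skip missing values (NaN) by default. Also note that from pandas 2.0 on, numeric_only defaults to False, so mean() and median() raise a TypeError on text columns unless numeric_only=True is passed. The outputs in this article come from an older pandas version that silently dropped such columns.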
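To make the treatment of missing data concrete, here is a small sketch on a hypothetical Series with one NaN. It assumes the default skipna=True behaviour of sum(), mean() and median(), and shows how skipna=False changes the result.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])  # hypothetical values with one gap

print(s.sum())     # 4.0  (NaN is skipped)
print(s.mean())    # 2.0  (mean of the two non-null values)
print(s.median())  # 2.0
print(s.count())   # 2    (number of non-null values)

# With skipna=False the missing value propagates into the result.
total_with_gap = s.sum(skipna=False)
print(total_with_gap)  # nan
```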

 

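max(): maximum

min(): minimum

idxmax(): index label of the maximum value

idxmin(): index label of the minimum value

Note: in older pandas versions, Series.argmax() and Series.argmin() were deprecated aliases of idxmax() and idxmin(). Since pandas 1.0 they are no longer deprecated and return the integer position of the extreme value rather than its index label.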
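The difference between a label and a position only becomes visible when the index is not the default 0, 1, 2, ... The sketch below uses a hypothetical Series with string labels. It assumes pandas 1.0 or later, where argmax() returns an integer position.

```python
import pandas as pd

# Hypothetical values with a non-default index.
s = pd.Series([5.3, 7.5, 2.9], index=["a", "b", "c"])

label_of_max = s.idxmax()   # index label of the largest value
label_of_min = s.idxmin()   # index label of the smallest value
pos_of_max = s.argmax()     # integer position (pandas >= 1.0)

print(label_of_max)  # b
print(label_of_min)  # c
print(pos_of_max)    # 1
```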

 

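mad(): mean absolute deviation, one of the measures of how widely the values of a variable are spread. DataFrame.mad() and Series.mad() were deprecated in pandas 1.5 and removed in pandas 2.0.

var(): variance (sample variance, ddof=1 by default)

std(): standard deviation (sample standard deviation, ddof=1 by default)

cumsum(): cumulative sum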
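A short worked example may help relate var(), std() and cumsum() to the underlying arithmetic. The Series below is hypothetical. The ddof parameter is part of the pandas API and defaults to 1, which gives the sample variance.

```python
import pandas as pd

s = pd.Series([2.0, 4.0, 4.0, 6.0])  # hypothetical values, mean = 4.0

# Squared deviations from the mean are 4, 0, 0, 4 (sum = 8).
sample_var = s.var()             # ddof=1: 8 / (4 - 1)
population_var = s.var(ddof=0)   # ddof=0: 8 / 4
sample_std = s.std()             # square root of the sample variance

running_total = s.cumsum()       # partial sums, same length as s

print(population_var)            # 2.0
print(running_total.tolist())    # [2.0, 6.0, 10.0, 16.0]
```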

 

Section 15: Common statistical methods (1): describe, quantile

In [1]:

 

 
import pandas as pd

In [2]:

 

 
data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-数据分析预习内容\第四章-数据分析预习内容\第一节-数据分析工具pandas基础\lesson_05\lesson_05\examples\datasets\2021_happiness.csv')
data.head()

Out[2]:

       Country          Region  Happiness Rank  Happiness Score  \
0      Denmark  Western Europe               1            7.526
1  Switzerland  Western Europe               2            7.509
2      Iceland  Western Europe               3            7.501
3       Norway  Western Europe               4            7.498
4      Finland  Western Europe               5            7.413

   Lower Confidence Interval  Upper Confidence Interval  Economy (GDP per Capita)  \
0                      7.460                      7.592                   1.44178
1                      7.428                      7.590                   1.52733
2                      7.333                      7.669                   1.42666
3                      7.421                      7.575                   1.57744
4                      7.351                      7.475                   1.40598

    Family  Health (Life Expectancy)  Freedom  Trust (Government Corruption)  \
0  1.16374                   0.79504  0.57941                        0.44453
1  1.14524                   0.86303  0.58557                        0.41203
2  1.18326                   0.86733  0.56624                        0.14975
3  1.12690                   0.79579  0.59609                        0.35776
4  1.13464                   0.81091  0.57104                        0.41004

   Generosity  Dystopia Residual
0     0.36171            2.73939
1     0.28083            2.69463
2     0.47678            2.83137
3     0.37895            2.66465
4     0.25492            2.82596

In [3]:

 

data.describe()

Out[3]:

       Happiness Rank  Happiness Score  Lower Confidence Interval  \
count      157.000000       157.000000                 157.000000
mean        78.980892         5.382185                   5.282395
std         45.466030         1.141674                   1.148043
min          1.000000         2.905000                   2.732000
25%         40.000000         4.404000                   4.327000
50%         79.000000         5.314000                   5.237000
75%        118.000000         6.269000                   6.154000
max        157.000000         7.526000                   7.460000

       Upper Confidence Interval  Economy (GDP per Capita)      Family  \
count                 157.000000                157.000000  157.000000
mean                    5.481975                  0.953880    0.793621
std                     1.136493                  0.412595    0.266706
min                     3.078000                  0.000000    0.000000
25%                     4.465000                  0.670240    0.641840
50%                     5.419000                  1.027800    0.841420
75%                     6.434000                  1.279640    1.021520
max                     7.669000                  1.824270    1.183260

       Health (Life Expectancy)     Freedom  Trust (Government Corruption)  \
count                157.000000  157.000000                     157.000000
mean                   0.557619    0.370994                       0.137624
std                    0.229349    0.145507                       0.111038
min                    0.000000    0.000000                       0.000000
25%                    0.382910    0.257480                       0.061260
50%                    0.596590    0.397470                       0.105470
75%                    0.729930    0.484530                       0.175540
max                    0.952770    0.608480                       0.505210

       Generosity  Dystopia Residual
count  157.000000         157.000000
mean     0.242635           2.325807
std      0.133756           0.542220
min      0.000000           0.817890
25%      0.154570           2.031710
50%      0.222450           2.290740
75%      0.311850           2.664650
max      0.819710           3.837720

In [4]:

 

 
data.quantile(q=0.5, numeric_only=True)  # median of each numeric column; numeric_only=True skips the text columns

Out[4]:

Happiness Rank                   79.00000
Happiness Score                   5.31400
Lower Confidence Interval         5.23700
Upper Confidence Interval         5.41900
Economy (GDP per Capita)          1.02780
Family                            0.84142
Health (Life Expectancy)          0.59659
Freedom                           0.39747
Trust (Government Corruption)     0.10547
Generosity                        0.22245
Dystopia Residual                 2.29074
Name: 0.5, dtype: float64

In [5]:

 

 
data.quantile(q=0.25, numeric_only=True)  # first quartile of each numeric column

Out[5]:

Happiness Rank                   40.00000
Happiness Score                   4.40400
Lower Confidence Interval         4.32700
Upper Confidence Interval         4.46500
Economy (GDP per Capita)          0.67024
Family                            0.64184
Health (Life Expectancy)          0.38291
Freedom                           0.25748
Trust (Government Corruption)     0.06126
Generosity                        0.15457
Dystopia Residual                 2.03171
Name: 0.25, dtype: float64

In [6]:

 

 
import pandas as pd

In [7]:

 

 
data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-数据分析预习内容\第四章-数据分析预习内容\第一节-数据分析工具pandas基础\lesson_05\lesson_05\examples\datasets\log.csv')
data.head()

Out[7]:

         time    user       video  playback position  paused  volume
0  1469974424  cheryl  intro.html                  5   False    10.0
1  1469974454  cheryl  intro.html                  6     NaN     NaN
2  1469974544  cheryl  intro.html                  9     NaN     NaN
3  1469974574  cheryl  intro.html                 10     NaN     NaN
4  1469977514     bob  intro.html                  1     NaN     NaN

In [8]:

 

 
data.sum()  # sum of each column (text columns are concatenated)

Out[8]:

time                                                       48509194942
user                 cherylcherylcherylcherylbobbobbobbobcherylcher...
video                intro.htmlintro.htmlintro.htmlintro.htmlintro....
playback position                                                  429
paused                                                               1
volume                                                              35
dtype: object

In [9]:

 

 
data.mean()  # mean of each column

Out[9]:

time                 1.469976e+09
playback position    1.300000e+01
paused               3.333333e-01
volume               8.750000e+00
dtype: float64

In [10]:

 

data.median()  # median of each column

Out[10]:

time                 1.469975e+09
playback position    1.000000e+01
paused               0.000000e+00
volume               1.000000e+01
dtype: float64

In [11]:

 

data.count()  # number of non-null values in each column

Out[11]:

time                 33
user                 33
video                33
playback position    33
paused                3
volume                4
dtype: int64

In [12]:

 

 
import pandas as pd

In [13]:

 

data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-数据分析预习内容\第四章-数据分析预习内容\第一节-数据分析工具pandas基础\lesson_05\lesson_05\examples\datasets\2021_happiness.csv')
data.head()

Out[13]:

       Country          Region  Happiness Rank  Happiness Score  \
0      Denmark  Western Europe               1            7.526
1  Switzerland  Western Europe               2            7.509
2      Iceland  Western Europe               3            7.501
3       Norway  Western Europe               4            7.498
4      Finland  Western Europe               5            7.413

   Lower Confidence Interval  Upper Confidence Interval  Economy (GDP per Capita)  \
0                      7.460                      7.592                   1.44178
1                      7.428                      7.590                   1.52733
2                      7.333                      7.669                   1.42666
3                      7.421                      7.575                   1.57744
4                      7.351                      7.475                   1.40598

    Family  Health (Life Expectancy)  Freedom  Trust (Government Corruption)  \
0  1.16374                   0.79504  0.57941                        0.44453
1  1.14524                   0.86303  0.58557                        0.41203
2  1.18326                   0.86733  0.56624                        0.14975
3  1.12690                   0.79579  0.59609                        0.35776
4  1.13464                   0.81091  0.57104                        0.41004

   Generosity  Dystopia Residual
0     0.36171            2.73939
1     0.28083            2.69463
2     0.47678            2.83137
3     0.37895            2.66465
4     0.25492            2.82596

In [14]:

 

 
data.max()

Out[14]:

Country                                Zimbabwe
Region                           Western Europe
Happiness Rank                              157
Happiness Score                           7.526
Lower Confidence Interval                  7.46
Upper Confidence Interval                 7.669
Economy (GDP per Capita)                1.82427
Family                                  1.18326
Health (Life Expectancy)                0.95277
Freedom                                 0.60848
Trust (Government Corruption)           0.50521
Generosity                              0.81971
Dystopia Residual                       3.83772
dtype: object

In [15]:

 

 
data.min()

Out[15]:

Country                                        Afghanistan
Region                           Australia and New Zealand
Happiness Rank                                           1
Happiness Score                                      2.905
Lower Confidence Interval                            2.732
Upper Confidence Interval                            3.078
Economy (GDP per Capita)                                 0
Family                                                   0
Health (Life Expectancy)                                 0
Freedom                                                  0
Trust (Government Corruption)                            0
Generosity                                               0
Dystopia Residual                                  0.81789
dtype: object

In [17]:

 

 
data['Happiness Score'].idxmax()

Out[17]:

0

In [18]:

 

 
data['Happiness Score'].idxmin()

Out[18]:

156
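The label returned by idxmax() or idxmin() is usually most useful as a key for .loc, to look up other fields of the same row. The sketch below uses a three-row stand-in table whose country names and scores are hypothetical, not the real happiness data.

```python
import pandas as pd

# Hypothetical stand-in for the happiness table.
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Happiness Score": [7.5, 6.7, 2.9],
})

top_label = df["Happiness Score"].idxmax()      # row label of the highest score
bottom_label = df["Happiness Score"].idxmin()   # row label of the lowest score

# Use the labels to fetch another column from the same rows.
happiest = df.loc[top_label, "Country"]
least_happy = df.loc[bottom_label, "Country"]

print(happiest)      # A
print(least_happy)   # C
```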

In [21]:

 

data.mad()  # mean absolute deviation (DataFrame.mad() was removed in pandas 2.0)

Out[21]:

Happiness Rank                   39.254899
Happiness Score                   0.955256
Lower Confidence Interval         0.957480
Upper Confidence Interval         0.953032
Economy (GDP per Capita)          0.342828
Family                            0.211727
Health (Life Expectancy)          0.188426
Freedom                           0.119887
Trust (Government Corruption)     0.084441
Generosity                        0.102143
Dystopia Residual                 0.413041
dtype: float64
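Since mad() is no longer available from pandas 2.0 onward, the same quantity can be computed by hand as the mean of the absolute deviations from the column mean. The helper name mean_abs_dev and the toy data below are illustrative assumptions, not part of pandas or of the original notebook.

```python
import pandas as pd

def mean_abs_dev(frame: pd.DataFrame) -> pd.Series:
    """Mean absolute deviation around the mean, per numeric column."""
    numeric = frame.select_dtypes(include="number")  # drop text columns
    return (numeric - numeric.mean()).abs().mean()

# Hypothetical data: one numeric column and one text column.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 6.0], "name": list("abcd")})

result = mean_abs_dev(df)
# x has mean 3.0; absolute deviations are 2, 1, 0, 3, whose mean is 1.5.
print(result["x"])   # 1.5
```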

In [22]:

 

data.var(numeric_only=True)  # variance of each numeric column

Out[22]:

Happiness Rank                   2067.159889
Happiness Score                     1.303418
Lower Confidence Interval           1.318002
Upper Confidence Interval           1.291617
Economy (GDP per Capita)            0.170235
Family                              0.071132
Health (Life Expectancy)            0.052601
Freedom                             0.021172
Trust (Government Corruption)       0.012329
Generosity                          0.017891
Dystopia Residual                   0.294003
dtype: float64

In [23]:

 

data.std(numeric_only=True)  # standard deviation of each numeric column

Out[23]:

Happiness Rank                   45.466030
Happiness Score                   1.141674
Lower Confidence Interval         1.148043
Upper Confidence Interval         1.136493
Economy (GDP per Capita)          0.412595
Family                            0.266706
Health (Life Expectancy)          0.229349
Freedom                           0.145507
Trust (Government Corruption)     0.111038
Generosity                        0.133756
Dystopia Residual                 0.542220
dtype: float64

In [24]:

 

data.cumsum()  # cumulative sum down each column (text columns are concatenated)

Out[24]:

                                               Country                                             Region  \
0                                              Denmark                                     Western Europe
1                                   DenmarkSwitzerland                       Western EuropeWestern Europe
2                            DenmarkSwitzerlandIceland         Western EuropeWestern EuropeWestern Europe
3                      DenmarkSwitzerlandIcelandNorway  Western EuropeWestern EuropeWestern EuropeWest...
4               DenmarkSwitzerlandIcelandNorwayFinland  Western EuropeWestern EuropeWestern EuropeWest...
..                                                 ...                                                ...
152  DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe...  Western EuropeWestern EuropeWestern EuropeWest...
153  DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe...  Western EuropeWestern EuropeWestern EuropeWest...
154  DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe...  Western EuropeWestern EuropeWestern EuropeWest...
155  DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe...  Western EuropeWestern EuropeWestern EuropeWest...
156  DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe...  Western EuropeWestern EuropeWestern EuropeWest...

     Happiness Rank  Happiness Score  Lower Confidence Interval  \
0                 1            7.526                      7.460
1                 3           15.035                     14.888
2                 6           22.536                     22.221
3                10           30.034                     29.642
4                15           37.447                     36.993
..              ...              ...                        ...
152           11778          832.366                    817.188
153           11932          835.726                    820.476
154           12087          839.029                    823.668
155           12243          842.098                    826.604
156           12400          845.003                    829.336

     Upper Confidence Interval  Economy (GDP per Capita)     Family  \
0                        7.592                   1.44178    1.16374
1                       15.182                   2.96911    2.30898
2                       22.851                   4.39577    3.49224
3                       30.426                   5.97321    4.61914
4                       37.901                   7.37919    5.75378
..                         ...                       ...        ...
152                    847.544                 148.28013  124.10506
153                    850.976                 148.66240  124.21543
154                    854.390                 148.94363  124.21543
155                    857.592                 149.69082  124.36409
156                    860.670                 149.75913  124.59851

     Health (Life Expectancy)   Freedom  Trust (Government Corruption)  \
0                     0.79504   0.57941                        0.44453
1                     1.65807   1.16498                        0.85656
2                     2.52540   1.73122                        1.00631
3                     3.32119   2.32731                        1.36407
4                     4.13210   2.89835                        1.77411
..                        ...       ...                            ...
152                  86.33722  57.62264                       21.15342
153                  86.51066  57.78694                       21.22454
154                  86.75877  58.13372                       21.34041
155                  87.38871  58.20284                       21.51274
156                  87.54618  58.24604                       21.60693

     Generosity  Dystopia Residual
0       0.36171            2.73939
1       0.64254            5.43402
2       1.11932            8.26539
3       1.49827           10.93004
4       1.75319           13.75600
..          ...                ...
152    36.91896          357.94872
153    37.23164          360.09430
154    37.40681          362.22970
155    37.89078          363.04759
156    38.09368          365.15163

157 rows × 13 columns
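Calling cumsum() on the whole table also concatenates the text columns, which is rarely what is wanted. One possible way to avoid this, sketched here on a hypothetical three-row table (names and scores invented), is to select the numeric columns first with select_dtypes().

```python
import pandas as pd

# Hypothetical stand-in for the happiness table.
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Happiness Score": [7.5, 7.0, 6.5],
})

# Keep only numeric columns, then accumulate down the rows.
running = df.select_dtypes(include="number").cumsum()

print(list(running.columns))                  # ['Happiness Score']
print(running["Happiness Score"].tolist())    # [7.5, 14.5, 21.0]
```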

 

 

 

 

 

 

 

 

 

 
