Pandas - 10.1 Aggregation with groupby: agg/aggregate

Article link: https://blog.csdn.net/Aaron_ChenShenyu/article/details/125896530

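This article covers the methods that can be used together with groupby in Pandas, such as the statistical functions count, size, mean, and std, and shows how to perform aggregation, including passing in custom functions and computing several functions at once. With these functions, data can be analyzed and summarized group by group.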


Methods and functions that can be used with groupby

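count / np.count_nonzero frequency count, excluding NaN values (np.count_nonzero counts non-zero entries, so the two agree only for data without zeros or NaN)
size frequency count, including NaN values
mean / np.mean mean
std / np.std sample standard deviation (pandas uses ddof=1; np.std defaults to ddof=0)
min / np.min minimum
quantile(q=0.25) / np.percentile(q=25) lower quartile
quantile(q=0.5) / np.percentile(q=50) median
quantile(q=0.75) / np.percentile(q=75) upper quartile
max / np.max maximum
sum / np.sum sum
var / np.var unbiased variance (pandas uses ddof=1; np.var defaults to ddof=0)
sem / scipy.stats.sem unbiased standard error of the mean
describe / scipy.stats.describe descriptive summary statistics
first returns the first row
last returns the last row
nth returns the n-th row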
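A few of the pairs in the list above differ in their defaults, which is easy to miss. The following is a minimal sketch on a small made-up DataFrame (the `toy` frame and its `group` and `value` columns are hypothetical, not part of the gapminder data). It compares count with size when NaN is present, pandas std with np.std, and quantile with np.percentile.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" contains one missing value.
toy = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "value": [1.0, np.nan, 3.0, 2.0, 4.0],
})
grouped = toy.groupby("group")["value"]

counts = grouped.count()  # non-NaN values per group
sizes = grouped.size()    # all rows per group, NaN included

# The values of group "b", used for the NumPy side of each comparison.
values_b = np.array([2.0, 4.0])

# pandas std divides by n - 1 (ddof=1); np.std divides by n by default (ddof=0).
sample_std = grouped.std()["b"]
population_std = np.std(values_b)

# quantile takes q in [0, 1]; np.percentile takes q in [0, 100].
median_pd = grouped.quantile(q=0.5)["b"]
median_np = np.percentile(values_b, q=50)

print(counts, sizes, sample_std, population_std, median_pd, median_np, sep="\n")
```

To make np.std or np.var match the pandas result, pass ddof=1 to the NumPy function.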

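import pandas as pd

# Load the tab-separated gapminder dataset
df = pd.read_csv('data/gapminder.tsv', sep='\t')

# Summary statistics of life expectancy for each continent
continent_describe = df.groupby('continent').lifeExp.describe()
print(continent_describe)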

'''
           count       mean        std     min       25%      50%       75%  \
continent                                                                     
Africa     624.0  48.865330   9.150210  23.599  42.37250  47.7920  54.41150   
Americas   300.0  64.658737   9.345088  37.579  58.41000  67.0480  71.69950   
Asia       396.0  60.064903  11.864532  28.801  51.42625  61.7915  69.50525   
Europe     360.0  71.903686   5.433178  43.585  69.57000  72.2410  75.45050   
Oceania     24.0  74.326208   3.795611  69.120  71.20500  73.6650  77.55250   

              max  
continent          
Africa     76.442  
Americas   80.653  
Asia       82.603  
Europe     81.757  
Oceania    81.235  
'''

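Aggregation functions

Besides the functions listed above, you can call the agg or aggregate method and pass in whichever aggregation function you want to use.

Passing in a function from another library
Passing in a custom function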
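As a rough illustration of both options, here is a minimal, self-contained sketch. It uses a small made-up DataFrame in place of the gapminder file, and the helper `value_range` is a hypothetical custom function introduced only for this example. It also shows a list of functions being passed to compute several statistics in one call.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the gapminder columns used above.
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe", "Europe"],
    "lifeExp": [60.0, 70.0, 72.0, 75.0, 78.0],
})
grouped = toy.groupby("continent")["lifeExp"]


def value_range(values):
    """Custom aggregation: difference between the largest and smallest value."""
    return values.max() - values.min()


# Option 1: a function from another library (NumPy here).
lib_result = grouped.agg(np.mean)

# Option 2: a custom function; it receives each group's values as a Series.
custom_result = grouped.agg(value_range)

# Several functions at once: strings name built-in methods, callables are used as given.
multi_result = grouped.agg(["count", "mean", value_range])

print(lib_result, custom_result, multi_result, sep="\n\n")
```

agg and aggregate are aliases, so `grouped.aggregate(value_range)` behaves the same way. When a list is passed, the result has one column per function, and a named function supplies its own name as the column label.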

Passing in a function from another library

import numpy as np

cont_le_agg = df.groupby('continent'