Inspecting the contents of each group after grouping:
# Original data
     regiment  company      name  preTestScore  postTestScore
0  Nighthawks      1st    Miller             4             25
1  Nighthawks      1st  Jacobson            24             94
2  Nighthawks      2nd       Ali            31             57
3  Nighthawks      2nd    Milner             2             62
4    Dragoons      1st     Cooze             3             70
5    Dragoons      1st     Jacon             4             25
6    Dragoons      2nd    Ryaner            24             94
7    Dragoons      2nd      Sone            31             57
# How to inspect the groups
# Iterating over a GroupBy object yields (group key, sub-DataFrame) pairs.
for name, group in df.groupby('name'):
    print(name)
    print(group)
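# Resulting groups (first few only; row 11 and the NaN in row 2 are not in the excerpt above, so this output apparently comes from a fuller version of the data)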
Ali
      regiment  company  name  preTestScore  postTestScore
2   Nighthawks      NaN   Ali            31             57
11      Scouts      2nd   Ali             3             70
Cooze
   regiment  company   name  preTestScore  postTestScore
4  Dragoons      1st  Cooze             3             70
Jacobson
     regiment  company      name  preTestScore  postTestScore
1  Nighthawks      1st  Jacobson            24             94
Jacon
   regiment  company   name  preTestScore  postTestScore
5  Dragoons      1st  Jacon             4             25
Miller
     regiment  company    name  preTestScore  postTestScore
0  Nighthawks      1st  Miller             4             25
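A self-contained sketch of the same loop, rebuilt from only the eight rows in the original-data excerpt. It groups by `regiment` rather than `name`, an assumption made here so that each group holds more than one row. The `group_sizes` dictionary is likewise an illustrative addition, not part of the original notes.

```python
import pandas as pd

# The eight rows from the original-data excerpt.
df = pd.DataFrame({
    "regiment": ["Nighthawks"] * 4 + ["Dragoons"] * 4,
    "company": ["1st", "1st", "2nd", "2nd", "1st", "1st", "2nd", "2nd"],
    "name": ["Miller", "Jacobson", "Ali", "Milner",
             "Cooze", "Jacon", "Ryaner", "Sone"],
    "preTestScore": [4, 24, 31, 2, 3, 4, 24, 31],
    "postTestScore": [25, 94, 57, 62, 70, 25, 94, 57],
})

# Iterating a GroupBy yields (key, sub-DataFrame) pairs;
# keys come out sorted by default (sort=True).
group_sizes = {}  # illustrative: number of rows per regiment
for regiment, group in df.groupby("regiment"):
    print(regiment)
    print(group)
    group_sizes[regiment] = len(group)
```

Each `group` is an ordinary DataFrame that keeps the original row index, which is why the index values in the printed groups are not renumbered.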
Original data
                 name      date  race   age  signs_of_mental_illness         flee
0          Tim Elliot  02/01/15     A  53.0                     True  Not fleeing
1    Lewis Lee Lembke  02/01/15     W  47.0                    False  Not fleeing
2  John Paul Quintero  03/01/15     H  23.0                    False  Not fleeing
3     Matthew Hoffman  04/01/15     W  32.0                     True  Not fleeing
4   Michael Rodriguez  04/01/15     H  39.0                    False  Not fleeing
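- Group by race: all rows for race A form one sub-table, all rows for race B form another, and so on. Each group can then be processed separately: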
data.groupby('race')
- Select a column from the grouped object and apply an operation to it; the result contains one value per group:
data.groupby('race')['age'].mean()
This computes the mean age for each race:
race
A 36.605263
B 31.635468
H 32.995157
N 30.451613
O 33.071429
W 40.046980
Name: age, dtype: float64
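The two steps above can be sketched end to end as follows. This is a minimal example that uses only the five rows shown in the excerpt, so its means differ from the full-dataset output above. The variable names `grouped`, `race_w`, and `mean_age` are illustrative.

```python
import pandas as pd

# The five rows shown in the excerpt (only the columns used here).
data = pd.DataFrame({
    "name": ["Tim Elliot", "Lewis Lee Lembke", "John Paul Quintero",
             "Matthew Hoffman", "Michael Rodriguez"],
    "race": ["A", "W", "H", "W", "H"],
    "age": [53.0, 47.0, 23.0, 32.0, 39.0],
})

# groupby returns a lazy GroupBy object; nothing is aggregated yet.
grouped = data.groupby("race")

# Pull out the sub-table for a single group.
race_w = grouped.get_group("W")
print(race_w)

# Select one column, then aggregate: one mean per race.
mean_age = grouped["age"].mean()
print(mean_age)
```

`mean_age` is a Series indexed by the group keys, so a single group's result can be read with `mean_age["W"]`.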