给groupby()传入一个列表,列表中的元素为分类字段,从左到右分类级别增大。(一级分类、二级分类…)
import pandas as pd
data = [[‘a’, ‘A’, ‘1等’, 109], [‘b’, ‘B’, ‘1等’, 112], [‘c’, ‘A’, ‘1等’, 125], [‘d’, ‘B’, ‘2等’, 120],
[‘e’, ‘B’, ‘1等’, 126], [‘f’, ‘B’, ‘2等’, 133], [‘g’, ‘A’, ‘2等’, 124], [‘h’, ‘B’, ‘1等’, 134],
[‘i’, ‘A’, ‘2等’, 117], [‘j’, ‘A’, ‘2等’, 128], [‘h’, ‘A’, ‘1等’, 130], [‘i’, ‘B’, ‘2等’, 122]]
index = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
columns = [‘name’, ‘class_1’, ‘class_2’, ‘num’]
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
print(“=================================================”)
df1 = df.groupby([‘class_1’, ‘class_2’]).sum() # 分组统计求和
print(df1)
1.3 对DataFrameGroupBy对象列名索引(对指定列统计计算)
其中,df.groupby(‘class_1’)得到一个DataFrameGroupBy对象,对该对象可以使用列名进行索引,以对指定的列进行统计。
如:df.groupby(‘class_1’)[‘num’].sum()
import pandas as pd
data = [[‘a’, ‘A’, ‘1等’, 109], [‘b’, ‘B’, ‘1等’, 112], [‘c’, ‘A’, ‘1等’, 125], [‘d’, ‘B’, ‘2等’, 120],
[‘e’, ‘B’, ‘1等’, 126], [‘f’, ‘B’, ‘2等’, 133], [‘g’, ‘A’, ‘2等’, 124], [‘h’, ‘B’, ‘1等’, 134],
[‘i’, ‘A’, ‘2等’, 117], [‘j’, ‘A’, ‘2等’, 128], [‘h’, ‘A’, ‘1等’, 130], [‘i’, ‘B’, ‘2等’, 122]]
index = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
columns = [‘name’, ‘class_1’, ‘class_2’, ‘num’]
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
print(“=================================================”)
df1 = df.groupby(‘class_1’)[‘num’].sum()
print(df1)
代码运行结果同上。
<