1. Read the data and print the first 5 rows of the dataset
import pandas as pd
# Raw string so the backslash in the Windows path is not treated as an escape character
data = pd.read_csv(r'xxxxx\wine quality red.csv')
data.head()
2. Print the quality levels that occur in the "quality" variable, then count and print the number of samples at each level
group1 = data['quality'].groupby(data['quality'])
group1.size()
quality
3 10
4 53
5 681
6 638
7 199
8 18
Name: quality, dtype: int64
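The same count can be cross-checked in more than one way. The sketch below is illustrative only: it uses a small made-up DataFrame named toy (an assumption, not the wine data) so that it runs without the CSV file, and compares groupby(...).size() with value_counts().

```python
import pandas as pd

# Hypothetical toy data standing in for the wine-quality CSV (not the real dataset).
toy = pd.DataFrame({'quality': [5, 6, 5, 7, 6, 5]})

# Distinct quality levels as a sorted list of plain Python ints.
levels = sorted(toy['quality'].unique().tolist())

# Number of samples per level, computed two ways.
counts_groupby = toy.groupby('quality').size()
counts_value = toy['quality'].value_counts().sort_index()

print(levels)
print(counts_groupby)
print(counts_value)
```

value_counts() orders by frequency by default, so sort_index() is added here to list the levels in ascending order, matching the groupby result.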
3. Print all samples in the subset where "quality" equals 3
quality_3 = data.groupby('quality').get_group(3)
quality_3
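A boolean mask is a common alternative for selecting one level. The following is a minimal sketch on a made-up DataFrame named toy (hypothetical values, not rows from the wine data) that selects the quality-3 rows both ways.

```python
import pandas as pd

# Hypothetical toy data; the values are made up for illustration.
toy = pd.DataFrame({
    'quality': [3, 5, 3, 6],
    'fixed acidity': [7.4, 7.8, 11.2, 7.9],
})

# Select the rows with quality == 3 via groupby ...
via_group = toy.groupby('quality').get_group(3)

# ... and via a boolean mask on the column.
via_mask = toy[toy['quality'] == 3]

print(via_group)
print(via_mask)
```

The mask avoids building groups for every level when only one level is needed, while get_group is convenient when the grouped object is reused for other levels.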
4. Compute and print the mean of the variable fixed acidity for each quality level
fixed_acidity_mean = data.groupby('quality')['fixed acidity'].agg(['mean'])
fixed_acidity_mean
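To make the shape of the result concrete, here is a small sketch on a made-up DataFrame named toy (an assumption for illustration, not the wine data). It contrasts agg(['mean']), which returns a DataFrame with a column named mean, with .mean(), which returns a Series.

```python
import pandas as pd

# Hypothetical toy data chosen so the group means are easy to verify by hand.
toy = pd.DataFrame({
    'quality': [5, 5, 6, 6],
    'fixed acidity': [7.0, 8.0, 9.0, 11.0],
})

grouped = toy.groupby('quality')['fixed acidity']

# .mean() returns a Series indexed by quality level.
mean_series = grouped.mean()

# .agg(['mean']) returns a DataFrame with a single column named 'mean'.
mean_frame = grouped.agg(['mean'])

print(mean_series)
print(mean_frame)
```

Passing a list to agg is handy when more statistics are wanted later, for example agg(['mean', 'std']), since each name becomes its own column.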