文章目录
pandas_任务4.4 使用分组聚合进行组内计算
!
!!
!!!可以点击下面连接
ipynb格式浏览
4.4.1 使用groupby方法拆分数据
代码 4-51 对菜品订单详情表依据订单编号分组
import pandas as pd
import numpy as np
from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://root:123456@localhost:3306/zuoye')
detail = pd.read_sql_table('meal_order_detail1',con = engine)
detailGroup = detail[['order_id','counts',
'amounts']].groupby(by = 'order_id')
print('分组后的订单详情表为:',detailGroup)
分组后的订单详情表为: <pandas.core.groupby.generic.DataFrameGroupBy object at 0x000002547B8C0DC8>
D:\Study\anaconda\lib\site-packages\pymysql\cursors.py:170: Warning: (1366, "Incorrect string value: '\\xD6\\xD0\\xB9\\xFA\\xB1\\xEA...' for column 'VARIABLE_VALUE' at row 1")
result = self._query(query)
代码 4-52 GroupBy 类求均值,标准差,中位数
print('订单详情表分组后前5组每组的均值为:\n',
detailGroup.mean().head())
print('订单详情表分组后前5组每组的标准差为:\n',
detailGroup.std().head())
print('订单详情表分组后前5组每组的大小为:','\n',
detailGroup.size().head())
订单详情表分组后前5组每组的均值为:
counts amounts
order_id
1002 1.0000 32.000
1003 1.2500 30.125
1004 1.0625 43.875
1008 1.0000 63.000
1011 1.0000 57.700
订单详情表分组后前5组每组的标准差为:
counts amounts
order_id
1002 0.00000 16.000000
1003 0.46291 21.383822
1004 0.25000 31.195886
1008 0.00000 64.880660
1011 0.00000 50.077828
订单详情表分组后前5组每组的大小为:
order_id
1002 7
1003 8
1004 16
1008 5
1011 10
dtype: int64
代码 4-53 agg和aggregate函数的参数及其说明
print('订单详情表的菜品销量与售价的和与均值为:\n',
detail[['counts','amounts']].agg([np.