pandas中的pd.pivot_table()透视表功能

和excel一样,pandas也有一个透视表的功能,具体demo如下:

import numpy as np
import pandas as pd
from pandas import Series, DataFrame


#显示所有列
pd.set_option('display.max_columns', None)
#显示所有行
pd.set_option('display.max_rows', None)
#设置value的显示长度为100,默认为50
pd.set_option('max_colwidth',100)


df = pd.read_excel('./sales-funnel.xlsx')
print(df.head())
'''
   Account                          Name            Rep       Manager  \
0   714466               Trantow-Barrows   Craig Booker  Debra Henley   
1   714466               Trantow-Barrows   Craig Booker  Debra Henley   
2   714466               Trantow-Barrows   Craig Booker  Debra Henley   
3   737550  Fritsch, Russel and Anderson   Craig Booker  Debra Henley   
4   146832                  Kiehn-Spinka  Daniel Hilton  Debra Henley   

       Product  Quantity  Price     Status  
0          CPU         1  30000  presented  
1     Software         1  10000  presented  
2  Maintenance         2   5000    pending  
3          CPU         1  35000   declined  
4          CPU         2  65000        won
'''

print(pd.pivot_table(df, index=['Name']))
'''
                              Account   Price  Quantity
Name                                                   
Barton LLC                     740150   35000  1.000000
Fritsch, Russel and Anderson   737550   35000  1.000000
Herman LLC                     141962   65000  2.000000
Jerde-Hilpert                  412290    5000  2.000000
Kassulke, Ondricka and Metz    307599    7000  3.000000
Keeling LLC                    688981  100000  5.000000
Kiehn-Spinka                   146832   65000  2.000000
Koepp Ltd                      729833   35000  2.000000
Kulas Inc                      218895   25000  1.500000
Purdy-Kunde                    163416   30000  1.000000
Stokes LLC                     239344    7500  1.000000
Trantow-Barrows                714466   15000  1.333333
对名字进行了去重,将每个人的销售记录取进行统计,上例是求了均值。
这是由aggfunc参数来决定的。
'''

print(pd.pivot_table(df, index=['Name'], aggfunc='sum'))
'''
                              Account   Price  Quantity
Name                                                   
Barton LLC                     740150   35000         1
Fritsch, Russel and Anderson   737550   35000         1
Herman LLC                     141962   65000         2
Jerde-Hilpert                  412290    5000         2
Kassulke, Ondricka and Metz    307599    7000         3
Keeling LLC                    688981  100000         5
Kiehn-Spinka                   146832   65000         2
Koepp Ltd                     1459666   70000         4
Kulas Inc                      437790   50000         3
Purdy-Kunde                    163416   30000         1
Stokes LLC                     478688   15000         2
Trantow-Barrows               2143398   45000         4
'''

print(pd.pivot_table(df, index=['Name', 'Rep', 'Manager']))
'''
                                                          Account    ...     Quantity
Name                         Rep           Manager                   ...             
Barton LLC                   John Smith    Debra Henley    740150    ...     1.000000
Fritsch, Russel and Anderson Craig Booker  Debra Henley    737550    ...     1.000000
Herman LLC                   Cedric Moss   Fred Anderson   141962    ...     2.000000
Jerde-Hilpert                John Smith    Debra Henley    412290    ...     2.000000
Kassulke, Ondricka and Metz  Wendy Yule    Fred Anderson   307599    ...     3.000000
Keeling LLC                  Wendy Yule    Fred Anderson   688981    ...     5.000000
Kiehn-Spinka                 Daniel Hilton Debra Henley    146832    ...     2.000000
Koepp Ltd                    Wendy Yule    Fred Anderson   729833    ...     2.000000
Kulas Inc                    Daniel Hilton Debra Henley    218895    ...     1.500000
Purdy-Kunde                  Cedric Moss   Fred Anderson   163416    ...     1.000000
Stokes LLC                   Cedric Moss   Fred Anderson   239344    ...     1.000000
Trantow-Barrows              Craig Booker  Debra Henley    714466    ...     1.333333
'''

print(pd.pivot_table(df, index=['Manager', 'Rep']))
# manager 和 rep 之间 存在 一对多的 关系
'''
                              Account         Price  Quantity
Manager       Rep                                            
Debra Henley  Craig Booker   720237.0  20000.000000  1.250000
              Daniel Hilton  194874.0  38333.333333  1.666667
              John Smith     576220.0  20000.000000  1.500000
Fred Anderson Cedric Moss    196016.5  27500.000000  1.250000
              Wendy Yule     614061.5  44250.000000  3.000000
'''

print(pd.pivot_table(df, index=['Manager', 'Rep'], values=['Price', 'Quantity']))
'''
                                    Price  Quantity
Manager       Rep                                  
Debra Henley  Craig Booker   20000.000000  1.250000
              Daniel Hilton  38333.333333  1.666667
              John Smith     20000.000000  1.500000
Fred Anderson Cedric Moss    27500.000000  1.250000
              Wendy Yule     44250.000000  3.000000
'''

print(pd.pivot_table(df, index=['Manager', 'Rep'], values=['Price', 'Quantity'], columns=['Product']))
'''
                               Price               ...    Quantity         
Product                          CPU Maintenance   ...     Monitor Software
Manager       Rep                                  ...                     
Debra Henley  Craig Booker   32500.0      5000.0   ...         NaN      1.0
              Daniel Hilton  52500.0         NaN   ...         NaN      1.0
              John Smith     35000.0      5000.0   ...         NaN      NaN
Fred Anderson Cedric Moss    47500.0      5000.0   ...         NaN      1.0
              Wendy Yule     82500.0      7000.0   ...         2.0      NaN
              
由以上输出可以看出,当column指定为product之后,price和quantity进行了细分,将每个product的详情列出。
另外还可以设置一个fill_value的参数,可以将nan填充为某个值。
'''

'''
总结:
    使用透视表之前,需要对原始数据有一个大概的了解,这样生成的透视表才能够有意义。
'''

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
©️2022 CSDN 皮肤主题:精致技术 设计师:CSDN官方博客 返回首页
评论 3

打赏作者

我是小蚂蚁

你的鼓励将是我创作的最大动力

¥2 ¥4 ¥6 ¥10 ¥20
输入1-500的整数
余额支付 (余额:-- )
扫码支付
扫码支付:¥2
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、C币套餐、付费专栏及课程。

余额充值