pd.pivot+table+html,pandas-10 pd.pivot_table()透视表功能

pandas-10 pd.pivot_table()透视表功能

和excel一样,pandas也有一个透视表的功能,具体demo如下:

import numpy as np

import pandas as pd

from pandas import Series, DataFrame

#显示所有列

pd.set_option('display.max_columns', None)

#显示所有行

pd.set_option('display.max_rows', None)

#设置value的显示长度为100,默认为50

pd.set_option('max_colwidth',100)

df = pd.read_excel('./sales-funnel.xlsx')

print(df.head())

'''

Account Name Rep Manager \

0 714466 Trantow-Barrows Craig Booker Debra Henley

1 714466 Trantow-Barrows Craig Booker Debra Henley

2 714466 Trantow-Barrows Craig Booker Debra Henley

3 737550 Fritsch, Russel and Anderson Craig Booker Debra Henley

4 146832 Kiehn-Spinka Daniel Hilton Debra Henley

Product Quantity Price Status

0 CPU 1 30000 presented

1 Software 1 10000 presented

2 Maintenance 2 5000 pending

3 CPU 1 35000 declined

4 CPU 2 65000 won

'''

print(pd.pivot_table(df, index=['Name']))

'''

Account Price Quantity

Name

Barton LLC 740150 35000 1.000000

Fritsch, Russel and Anderson 737550 35000 1.000000

Herman LLC 141962 65000 2.000000

Jerde-Hilpert 412290 5000 2.000000

Kassulke, Ondricka and Metz 307599 7000 3.000000

Keeling LLC 688981 100000 5.000000

Kiehn-Spinka 146832 65000 2.000000

Koepp Ltd 729833 35000 2.000000

Kulas Inc 218895 25000 1.500000

Purdy-Kunde 163416 30000 1.000000

Stokes LLC 239344 7500 1.000000

Trantow-Barrows 714466 15000 1.333333

对名字进行了去重,将每个人的销售记录取进行统计,上例是求了均值。

这是由aggfunc参数来决定的。

'''

print(pd.pivot_table(df, index=['Name'], aggfunc='sum'))

'''

Account Price Quantity

Name

Barton LLC 740150 35000 1

Fritsch, Russel and Anderson 737550 35000 1

Herman LLC 141962 65000 2

Jerde-Hilpert 412290 5000 2

Kassulke, Ondricka and Metz 307599 7000 3

Keeling LLC 688981 100000 5

Kiehn-Spinka 146832 65000 2

Koepp Ltd 1459666 70000 4

Kulas Inc 437790 50000 3

Purdy-Kunde 163416 30000 1

Stokes LLC 478688 15000 2

Trantow-Barrows 2143398 45000 4

'''

print(pd.pivot_table(df, index=['Name', 'Rep', 'Manager']))

'''

Account ... Quantity

Name Rep Manager ...

Barton LLC John Smith Debra Henley 740150 ... 1.000000

Fritsch, Russel and Anderson Craig Booker Debra Henley 737550 ... 1.000000

Herman LLC Cedric Moss Fred Anderson 141962 ... 2.000000

Jerde-Hilpert John Smith Debra Henley 412290 ... 2.000000

Kassulke, Ondricka and Metz Wendy Yule Fred Anderson 307599 ... 3.000000

Keeling LLC Wendy Yule Fred Anderson 688981 ... 5.000000

Kiehn-Spinka Daniel Hilton Debra Henley 146832 ... 2.000000

Koepp Ltd Wendy Yule Fred Anderson 729833 ... 2.000000

Kulas Inc Daniel Hilton Debra Henley 218895 ... 1.500000

Purdy-Kunde Cedric Moss Fred Anderson 163416 ... 1.000000

Stokes LLC Cedric Moss Fred Anderson 239344 ... 1.000000

Trantow-Barrows Craig Booker Debra Henley 714466 ... 1.333333

'''

print(pd.pivot_table(df, index=['Manager', 'Rep']))

# manager 和 rep 之间 存在 一对多的 关系

'''

Account Price Quantity

Manager Rep

Debra Henley Craig Booker 720237.0 20000.000000 1.250000

Daniel Hilton 194874.0 38333.333333 1.666667

John Smith 576220.0 20000.000000 1.500000

Fred Anderson Cedric Moss 196016.5 27500.000000 1.250000

Wendy Yule 614061.5 44250.000000 3.000000

'''

print(pd.pivot_table(df, index=['Manager', 'Rep'], values=['Price', 'Quantity']))

'''

Price Quantity

Manager Rep

Debra Henley Craig Booker 20000.000000 1.250000

Daniel Hilton 38333.333333 1.666667

John Smith 20000.000000 1.500000

Fred Anderson Cedric Moss 27500.000000 1.250000

Wendy Yule 44250.000000 3.000000

'''

print(pd.pivot_table(df, index=['Manager', 'Rep'], values=['Price', 'Quantity'], columns=['Product']))

'''

Price ... Quantity

Product CPU Maintenance ... Monitor Software

Manager Rep ...

Debra Henley Craig Booker 32500.0 5000.0 ... NaN 1.0

Daniel Hilton 52500.0 NaN ... NaN 1.0

John Smith 35000.0 5000.0 ... NaN NaN

Fred Anderson Cedric Moss 47500.0 5000.0 ... NaN 1.0

Wendy Yule 82500.0 7000.0 ... 2.0 NaN

由以上输出可以看出,当column指定为product之后,price和quantity进行了细分,将每个product的详情列出。

另外还可以设置一个fill_value的参数,可以将nan填充为某个值。

'''

'''

总结:

使用透视表之前,需要对原始数据有一个大概的了解,这样生成的透视表才能够有意义。

'''

标签:10,...,Henley,Anderson,pd,LLC,pivot,Debra

来源: https://www.cnblogs.com/wenqiangit/p/11252770.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值