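Ten Introductory pandas Exercises
Dataset download link: https://download.csdn.net/download/qq_41166909/30414605
Exercise 1. Getting to know the data: sales of stock items over a period of time (stock). For each step, enter the correct code and run it.
1 )- Import the required library: import pandas.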
In [12]:
import pandas as pd
2 )- Load the data from the given path (data.csv) and store it in a DataFrame named shopper.
In [13]:
shopper=pd.read_csv('data/data111257/data.csv')
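The path above is specific to the author's environment. As a minimal sketch, assuming a CSV with the same eight columns, the snippet below reads a two-row inline sample through io.StringIO instead of a file and converts InvoiceDate to a datetime column. The names csv_text and sample are hypothetical, and the month/day/year format is an assumption based on how the values look.

```python
import io
import pandas as pd

# Hypothetical inline sample standing in for data.csv (two rows from the dataset).
csv_text = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,12/1/2010 8:26,2.55,17850.0,United Kingdom
536365,71053,WHITE METAL LANTERN,6,12/1/2010 8:26,3.39,17850.0,United Kingdom
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are written as month/day/year hour:minute.
sample['InvoiceDate'] = pd.to_datetime(sample['InvoiceDate'], format='%m/%d/%Y %H:%M')

print(sample.dtypes)
```

Parsing the date column up front makes later time-based filtering or grouping possible without string manipulation.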
3 )- View the first 15 rows.
In [14]:
shopper.head(15)
Out[14]:
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
---|---|---|---|---|---|---|---|---|
0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6 | 12/1/2010 8:26 | 2.55 | 17850.0 | United Kingdom |
1 | 536365 | 71053 | WHITE METAL LANTERN | 6 | 12/1/2010 8:26 | 3.39 | 17850.0 | United Kingdom |
2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8 | 12/1/2010 8:26 | 2.75 | 17850.0 | United Kingdom |
3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6 | 12/1/2010 8:26 | 3.39 | 17850.0 | United Kingdom |
4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6 | 12/1/2010 8:26 | 3.39 | 17850.0 | United Kingdom |
5 | 536365 | 22752 | SET 7 BABUSHKA NESTING BOXES | 2 | 12/1/2010 8:26 | 7.65 | 17850.0 | United Kingdom |
6 | 536365 | 21730 | GLASS STAR FROSTED T-LIGHT HOLDER | 6 | 12/1/2010 8:26 | 4.25 | 17850.0 | United Kingdom |
7 | 536366 | 22633 | HAND WARMER UNION JACK | 6 | 12/1/2010 8:28 | 1.85 | 17850.0 | United Kingdom |
8 | 536366 | 22632 | HAND WARMER RED POLKA DOT | 6 | 12/1/2010 8:28 | 1.85 | 17850.0 | United Kingdom |
9 | 536367 | 84879 | ASSORTED COLOUR BIRD ORNAMENT | 32 | 12/1/2010 8:34 | 1.69 | 13047.0 | United Kingdom |
10 | 536367 | 22745 | POPPY'S PLAYHOUSE BEDROOM | 6 | 12/1/2010 8:34 | 2.10 | 13047.0 | United Kingdom |
11 | 536367 | 22748 | POPPY'S PLAYHOUSE KITCHEN | 6 | 12/1/2010 8:34 | 2.10 | 13047.0 | United Kingdom |
12 | 536367 | 22749 | FELTCRAFT PRINCESS CHARLOTTE DOLL | 8 | 12/1/2010 8:34 | 3.75 | 13047.0 | United Kingdom |
13 | 536367 | 22310 | IVORY KNITTED MUG COSY | 6 | 12/1/2010 8:34 | 1.65 | 13047.0 | United Kingdom |
14 | 536367 | 84969 | BOX OF 6 ASSORTED COLOUR TEASPOONS | 6 | 12/1/2010 8:34 | 4.25 | 13047.0 | United Kingdom |
4 )- How many columns does the dataset have?
In [15]:
number_of_columns = shopper.shape[1]
number_of_columns
Out[15]:
8
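As a small illustration of why index 1 is used here: shape is a (rows, columns) tuple, so shape[0] counts rows and shape[1] counts columns. The frame df below is a hypothetical three-row example, not the real dataset.

```python
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns.
df = pd.DataFrame({'StockCode': ['85123A', '71053', '84406B'],
                   'Quantity': [6, 6, 8]})

n_rows, n_cols = df.shape    # shape is a (rows, columns) tuple
print(n_rows, n_cols)        # 3 2
print(len(df.columns))       # 2, the same value as df.shape[1]
```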
5 )- Print all column names.
In [16]:
shopper.columns
Out[16]:
Index(['InvoiceNo', 'StockCode', 'Description', 'Quantity', 'InvoiceDate', 'UnitPrice', 'CustomerID', 'Country'], dtype='object')
6 )- What does the dataset's index look like?
In [17]:
shopper.index
Out[17]:
RangeIndex(start=0, stop=36997, step=1)
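7 )- Which StockCode was ordered the most?
In [18]:
# Group by stock code and sum the quantities
a = shopper[['StockCode', 'Quantity']].groupby(by='StockCode').agg('sum')
# Sort by total quantity, largest first
a = a.sort_values(['Quantity'], ascending=False)
a.head(1)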
Out[18]:
StockCode | Quantity |
---|---|
84077 | 4948 |
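When only the top entry is needed, sorting the whole result is not required. A minimal sketch on made-up data (orders is a hypothetical frame) that reads the answer with idxmax instead:

```python
import pandas as pd

# Hypothetical order lines; the codes 'A', 'B', 'C' are placeholders.
orders = pd.DataFrame({'StockCode': ['A', 'B', 'A', 'C'],
                       'Quantity': [5, 7, 4, 1]})

# Total quantity per stock code, as a Series indexed by StockCode.
totals = orders.groupby('StockCode')['Quantity'].sum()

top_code = totals.idxmax()   # index label of the largest total
top_qty = totals.max()       # the largest total itself
print(top_code, top_qty)     # A 9
```

Note that idxmax returns a single label, so ties are not shown; the sort-based approach above makes ties easier to inspect.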
8 )- How many distinct stock codes appear in the StockCode column?
In [19]:
pd.unique(shopper['StockCode']).size
Out[19]:
2743
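A related option is Series.nunique(). The two approaches can differ when the column contains missing values, as this sketch on a hypothetical Series shows:

```python
import pandas as pd

# Hypothetical codes with one missing value.
codes = pd.Series(['A', 'B', 'A', None])

n_unique_array = pd.unique(codes).size         # 3: 'A', 'B', and the missing value
n_unique_default = codes.nunique()             # 2: missing values are skipped
n_unique_keep_na = codes.nunique(dropna=False) # 3: missing value counted once

print(n_unique_array, n_unique_default, n_unique_keep_na)  # 3 2 3
```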
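9 )- In the Description column, which item was ordered most often? (The cell below counts rows by StockCode rather than Description; in the table from step 3, code 85123A corresponds to WHITE HANGING HEART T-LIGHT HOLDER.)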
In [20]:
shopper['StockCode'].value_counts().head(1)
Out[20]:
85123A    209
Name: StockCode, dtype: int64
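value_counts counts order lines, not units, so "ordered most often" and "most units ordered" can have different answers. A minimal sketch on made-up data (the descriptions and quantities are hypothetical) that contrasts the two:

```python
import pandas as pd

# Hypothetical order lines: HOLDER appears often, LANTERN once but in bulk.
orders = pd.DataFrame({
    'Description': ['LANTERN', 'HOLDER', 'HOLDER', 'HOLDER'],
    'Quantity': [50, 1, 2, 3],
})

# Item that appears on the most order lines.
most_lines = orders['Description'].value_counts().idxmax()

# Item with the largest total quantity.
most_units = orders.groupby('Description')['Quantity'].sum().idxmax()

print(most_lines, most_units)  # HOLDER LANTERN
```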
10 )- How many units were ordered in total?
In [21]:
shopper['Quantity'].sum()
Out[21]:
298210
11 )- Each invoice (InvoiceNo) corresponds to one transaction. How many transactions are there in total?
In [22]:
shopper['InvoiceNo'].unique().size
Out[22]:
1771
12 )- What is the average total price per transaction?
In [23]:
# Compute the total revenue
shopper['sub_total'] = (shopper['UnitPrice'] * shopper['Quantity']).round(2)
shopper['sub_total'].sum()
Out[23]:
648004.1099999999
In [24]:
# Compute the average total price per order
shopper[['InvoiceNo', 'sub_total']].groupby(by=['InvoiceNo']).agg({'sub_total': 'sum'})['sub_total'].mean()
Out[24]:
365.8972953133823
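The calculation can be checked by hand on a tiny example. The sketch below uses a hypothetical frame with two invoices; it also shows that the mean of per-invoice totals equals total revenue divided by the number of invoices.

```python
import pandas as pd

# Hypothetical order lines for two invoices.
orders = pd.DataFrame({
    'InvoiceNo': [1, 1, 2],
    'UnitPrice': [2.0, 3.0, 10.0],
    'Quantity': [1, 2, 4],
})

# Line totals: 2.0, 6.0, 40.0
orders['sub_total'] = orders['UnitPrice'] * orders['Quantity']

# Invoice totals: invoice 1 -> 8.0, invoice 2 -> 40.0
per_invoice = orders.groupby('InvoiceNo')['sub_total'].sum()

avg_per_invoice = per_invoice.mean()
same_value = orders['sub_total'].sum() / orders['InvoiceNo'].nunique()

print(avg_per_invoice, same_value)  # 24.0 24.0
```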
Exercise 2. Filtering and sorting data: NBA game statistics (NBA_stats.xlsx). For each step, enter the correct code and run it.
1 ) - Import the necessary library
In [25]:
import pandas as pd
2 ) - Load the dataset from the following path (NBA_stats.xlsx)
In [26]:
NBA2019=pd.read_excel('data/data111257/NBA_stats.xlsx')
3 )- Name the dataset NBA2019
In [27]:
NBA2019 = pd.read_excel('data/data111257/NBA_stats.xlsx')
NBA2019
Out[27]:
| | 排名 | 球队 | 投篮命中率 | 投篮命中 | 投篮出手 | 三分命中率 | 三分命中 | 三分出手 | 罚球命中率 | 罚球命中 | 罚球出手 | 总篮板 | 进攻 | 防守 | 助攻 | 失误 | 抢断 | 盖帽 | 犯规 | 得分 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 雄鹿 | 0.487 | 44.7 | 91.8 | 0.389 | 14.4 | 37.1 | 0.760 | 16.2 | 21.4 | 48.1 | 10.3 | 37.8 | 25.5 | 13.82 | 8.13 | 4.64 | 17.28 | 120.1 |
1 | 2 | 篮网 | 0.494 | 43.1 | 87.3 | 0.392 | 14.2 | 36.1 | 0.804 | 18.1 | 22.5 | 44.4 | 8.9 | 35.5 | 26.8 | 13.54 | 6.72 | 5.26 | 19.04 | 118.6 |
2 | 3 | 奇才 | 0.475 | 43.2 | 90.9 | 0.351 | 10.2 | 29.0 | 0.769 | 20.1 | 26.2 | 45.2 | 9.7 | 35.5 | 25.5 | 14.40 | 7.33 | 4.13 | 21.60 | 116.6 |
3 | 4 | 爵士 | 0.468 | 41.3 | 88.1 | 0.389 | 16.7 | 43.0 | 0.799 | 17.2 | 21.5 | 48.3 | 10.6 | 37.6 | 23.7 | 14.21 | 6.58 | 5.15 | 18.54 | 116.4 |
4 | 5 | 开拓者 | 0.453 | 41.3 | 91.1 | 0.385 | 15.7 | 40.8 | 0.823 | 17.8 | 21.6 | 44.5 | 10.6 | 33.9 | 21.3 | 11.10 | 6.89 | 5.04 | 18.92 | 116.1 |
5 | 6 | 太阳 | 0.490 | 43.3 | 88.3 | 0.378 | 13.1 | 34.6 | 0.834 | 15.6 | 18.7 | 42.9 | 8.8 | 34.2 | 26.9 | 12.53 | 7.18 | 4.33 | 19.08 | 115.3 |
6 | 7 | 步行者 | 0.474 | 43.3 | 91.2 | 0.364 | 12.3 | 34.0 | 0.792 | 16.4 | 20.7 | 42.7 | 9.0 | 33.7 | 27.4 | 13.54 | 8.49 | 6.39 | 20.18 | 115.3 |
7 | 8 | 掘金 | 0.485 | 43.2 | 89.2 | 0.377 | 12.9 | 34.2 | 0.803 | 15.7 | 19.5 | 44.4 | 10.5 | 33.9 | 26.8 | 13.50 | 8.08 | 4.49 | 19.08 | 115.1 |
8 | 9 | 鹈鹕 | 0.477 | 42.5 | 89.1 | 0.348 | 10.6 | 30.4 | 0.729 | 19.0 | 26.1 | 47.4 | 11.7 | 35.7 | 26.0 | 14.61 | 7.57 | 4.38 | 17.99 | 114.6 |
9 | 10 | 快船 | 0.482 | 41.8 | 86.7 | 0.411 | 14.3 | 34.7 | 0.839 | 16.2 | 19.3 | 44.2 | 9.4 | 34.7 | 24.4 | 13.19 | 7.07 | 4.10 | 19.21 | 114.0 |
10 | 11 | 勇士 | 0.468 | 41.3 | 88.2 | 0.376 | 14.6 | 38.7 | 0.785 | 16.6 | 21.1 | 43.0 | 8.0 | 35.1 | 27.7 | 15.00 | 8.15 | 4.75 | 21.19 | 113.7 |
11 | 12 | 老鹰 | 0.468 | 40.8 | 87.2 | 0.373 | 12.4 | 33.4 | 0.812 | 19.7 | 24.2 | 45.6 | 10.6 | 35.1 | 24.1 | 13.24 | 6.99 | 4.75 | 19.33 | 113.7 |
12 | 13 | 国王 | 0.481 | 42.6 | 88.6 | 0.364 | 12.1 | 33.3 | 0.745 | 16.4 | 22.0 | 41.4 | 9.4 | 32.0 | 25.5 | 13.38 | 7.54 | 4.97 | 19.44 | 113.7 |
13 | 14 | 76人 | 0.476 | 41.4 | 86.9 | 0.374 | 11.3 | 30.1 | 0.767 | 19.6 | 25.5 | 45.1 | 10.0 | 35.0 | 23.7 | 14.44 | 9.10 | 6.21 | 20.22 | 113.6 |
14 | 15 | 灰熊 | 0.467 | 42.8 | 91.8 | 0.356 | 11.2 | 31.4 | 0.771 | 16.4 | 21.3 | 46.5 | 11.2 | 35.3 | 26.9 | 13.29 | 9.10 | 5.06 | 18.74 | 113.3 |
15 | 16 | 凯尔特人 | 0.466 | 41.5 | 88.9 | 0.374 | 13.6 | 36.4 | 0.775 | 16.1 | 20.8 | 44.3 | 10.6 | 33.6 | 23.5 | 14.06 | 7.72 | 5.32 | 20.43 | 112.6 |
16 | 17 | 独行侠 | 0.470 | 41.1 | 87.3 | 0.362 | 13.8 | 38.1 | 0.778 | 16.5 | 21.2 | 43.3 | 9.1 | 34.2 | 22.9 | 12.07 | 6.25 | 4.32 | 19.39 | 112.4 |
17 | 18 | 森林狼 | 0.448 | 40.7 | 90.9 | 0.349 | 13.1 | 37.6 | 0.761 | 17.6 | 23.1 | 43.5 | 10.5 | 33.0 | 25.6 | 14.26 | 8.78 | 5.53 | 20.93 | 112.1 |
18 | 19 | 猛龙 | 0.448 | 39.7 | 88.7 | 0.368 | 14.5 | 39.3 | 0.815 | 17.4 | 21.3 | 41.6 | 9.4 | 32.1 | 24.1 | 13.22 | 8.58 | 5.40 | 21.19 | 111.3 |
19 | 20 | 马刺 | 0.462 | 41.9 | 90.5 | 0.350 | 9.9 | 28.4 | 0.792 | 17.4 | 22.0 | 43.9 | 9.3 | 34.6 | 24.4 | 11.40 | 7.01 | 5.08 | 17.96 | 111.1 |
20 | 21 | 公牛 | 0.476 | 42.2 | 88.6 | 0.370 | 12.6 | 34.0 | 0.791 | 13.8 | 17.5 | 45.0 | 9.6 | 35.3 | 26.8 | 15.13 | 6.69 | 4.22 | 18.92 | 110.7 |
21 | 22 | 湖人 | 0.472 | 40.6 | 86.1 | 0.354 | 11.1 | 31.2 | 0.739 | 17.2 | 23.3 | 44.2 | 9.7 | 34.6 | 24.7 | 15.21 | 7.81 | 5.36 | 19.13 | 109.5 |
22 | 23 | 黄蜂 | 0.455 | 39.9 | 87.8 | 0.369 | 13.7 | 37.0 | 0.761 | 15.9 | 20.9 | 43.8 | 10.6 | 33.2 | 26.8 | 14.85 | 7.85 | 4.78 | 18.03 | 109.5 |
23 | 24 | 火箭 | 0.444 | 39.2 | 88.5 | 0.339 | 13.8 | 40.6 | 0.740 | 16.5 | 22.3 | 42.6 | 9.3 | 33.3 | 23.6 | 14.72 | 7.58 | 5.01 | 19.54 | 108.8 |
24 | 25 | 热火 | 0.468 | 39.2 | 83.7 | 0.358 | 12.9 | 36.2 | 0.790 | 16.7 | 21.1 | 41.5 | 8.0 | 33.5 | 26.3 | 14.07 | 7.90 | 3.97 | 18.93 | 108.1 |
25 | 26 | 尼克斯 | 0.456 | 39.4 | 86.5 | 0.392 | 11.8 | 30.0 | 0.784 | 16.4 | 20.9 | 45.1 | 9.7 | 35.5 | 21.4 | 12.94 | 7.04 | 5.07 | 20.46 | 107.0 |
26 | 27 | 活塞 | 0.452 | 38.7 | 85.6 | 0.351 | 11.6 | 32.9 | 0.759 | 17.8 | 23.4 | 42.7 | 9.6 | 33.1 | 24.2 | 14.93 | 7.38 | 5.15 | 20.51 | 106.6 |
27 | 28 | 雷霆 | 0.441 | 38.8 | 88.0 | 0.339 | 11.9 | 35.1 | 0.725 | 15.5 | 21.3 | 45.6 | 9.9 | 35.7 | 22.1 | 16.14 | 7.00 | 4.39 | 18.13 | 105.0 |
28 | 29 | 魔术 | 0.429 | 38.2 | 89.2 | 0.343 | 10.9 | 31.8 | 0.775 | 16.6 | 21.4 | 45.4 | 10.4 | 35.1 | 21.8 | 12.83 | 6.89 | 4.42 | 17.18 | 104.0 |
29 | 30 | 骑士 | 0.450 | 38.6 | 85.8 | 0.336 | 10.0 | 29.7 | 0.743 | 16.7 | 22.4 | 42.8 | 10.4 | 32.3 | 23.8 | 15.47 | 7.76 | 4.51 | 18.17 | 103.8 |
4 )- Select only the “得分” (points) column
In [28]:
NBA2019['得分']
Out[28]:
0     120.1
1     118.6
2     116.6
3     116.4
4     116.1
5     115.3
6     115.3
7     115.1
8     114.6
9     114.0
10    113.7
11    113.7
12    113.7
13    113.6
14    113.3
15    112.6
16    112.4
17    112.1
18    111.3
19    111.1
20    110.7
21    109.5
22    109.5
23    108.8
24    108.1
25    107.0
26    106.6
27    105.0
28    104.0
29    103.8
Name: 得分, dtype: float64
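Selecting with a single label returns a Series, which is why the output above ends with a Name and dtype line. A minimal sketch on a hypothetical two-row frame (stats) showing the difference between single and double brackets:

```python
import pandas as pd

# Hypothetical two-row frame using the same column names as the dataset.
stats = pd.DataFrame({'球队': ['雄鹿', '篮网'], '得分': [120.1, 118.6]})

as_series = stats['得分']     # single label -> Series
as_frame = stats[['得分']]    # list of labels -> one-column DataFrame

print(type(as_series).__name__, type(as_frame).__name__)  # Series DataFrame
```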
5 )- How many teams took part?
In [29]:
NBA2019['球队'].nunique()
Out[29]:
30
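Parentheses matter here: nunique is a method, so referring to it without calling it yields the bound method object rather than a count. A small sketch on a hypothetical Series (teams):

```python
import pandas as pd

# Hypothetical Series with one repeated team name.
teams = pd.Series(['雄鹿', '篮网', '奇才', '篮网'], name='球队')

method_ref = teams.nunique    # no parentheses: the method object itself
count = teams.nunique()       # called: the number of distinct values

print(callable(method_ref), count)  # True 3
```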
6 )- How many columns does this dataset have?
In [30]:
NBA2019.shape[1]
Out[30]:
20
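7 )- Store the columns “投篮命中率” (field goal percentage), “投篮命中” (field goals made), and “投篮出手” (field goal attempts) in a separate DataFrame named shots (“投篮”)
In [31]:
shots = NBA2019[['投篮命中率', '投篮命中', '投篮出手']]
shots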
Out[31]:
| | 投篮命中率 | 投篮命中 | 投篮出手 |
---|---|---|---|
0 | 0.487 | 44.7 | 91.8 |
1 | 0.494 | 43.1 | 87.3 |
2 | 0.475 | 43.2 | 90.9 |
3 | 0.468 | 41.3 | 88.1 |
4 | 0.453 | 41.3 | 91.1 |
5 | 0.490 | 43.3 | 88.3 |
6 | 0.474 | 43.3 | 91.2 |
7 | 0.485 | 43.2 | 89.2 |
8 | 0.477 | 42.5 | 89.1 |
9 | 0.482 | 41.8 | 86.7 |
10 | 0.468 | 41.3 | 88.2 |
11 | 0.468 | 40.8 | 87.2 |
12 | 0.481 | 42.6 | 88.6 |
13 | 0.476 | 41.4 | 86.9 |
14 | 0.467 | 42.8 | 91.8 |
15 | 0.466 | 41.5 | 88.9 |
16 | 0.470 | 41.1 | 87.3 |
17 | 0.448 | 40.7 | 90.9 |
18 | 0.448 | 39.7 | 88.7 |
19 | 0.462 | 41.9 | 90.5 |
20 | 0.476 | 42.2 | 88.6 |
21 | 0.472 | 40.6 | 86.1 |
22 | 0.455 | 39.9 | 87.8 |
23 | 0.444 | 39.2 | 88.5 |
24 | 0.468 | 39.2 | 83.7 |
25 | 0.456 | 39.4 | 86.5 |
26 | 0.452 | 38.7 | 85.6 |
27 | 0.441 | 38.8 | 88.0 |
28 | 0.429 | 38.2 | 89.2 |
29 | 0.450 | 38.6 | 85.8 |
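If the extracted frame will be modified later, taking an explicit copy keeps the original untouched and, depending on the pandas version, avoids a SettingWithCopyWarning. A minimal sketch under those assumptions; stats, subset, and the column name “命中率” are hypothetical, and the two rows reuse values from the table above:

```python
import pandas as pd

# Hypothetical two-row frame with values taken from the first two teams.
stats = pd.DataFrame({'球队': ['雄鹿', '篮网'],
                      '投篮命中': [44.7, 43.1],
                      '投篮出手': [91.8, 87.3]})

# Explicit copy, so later changes never touch the original frame.
subset = stats[['投篮命中', '投篮出手']].copy()

# Hypothetical derived column: made shots divided by attempts.
subset['命中率'] = (subset['投篮命中'] / subset['投篮出手']).round(3)

print(subset['命中率'].tolist())     # [0.487, 0.494]
print('命中率' in stats.columns)     # False
```

The recomputed ratios agree with the “投篮命中率” values listed for these two teams.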
In [32]:
NBA2019.columns
Out[32]:
Index(['排名', '球队', '投篮命中率', '投篮命中', '投篮出手', '三分命中率', '三分命中', '三分出手', '罚球命中率', '罚球命中', '罚球出手', '总篮板', '进攻', '防守', '助攻', '失误', '抢断', '盖帽', '犯规', '得分'], dtype='object')
8 ) - Sort the shots DataFrame first by “投篮命中率”, then by “投篮出手” (both in descending order)
In [33]:
shots.sort_values(['投篮命中率','投篮出手'],ascending=[False,False])
Out[33]:
| | 投篮命中率 | 投篮命中 | 投篮出手 |
---|---|---|---|
1 | 0.494 | 43.1 | 87.3 |
5 | 0.490 | 43.3 | 88.3 |
0 | 0.487 | 44.7 | 91.8 |
7 | 0.485 | 43.2 | 89.2 |
9 | 0.482 | 41.8 | 86.7 |
12 | 0.481 | 42.6 | 88.6 |
8 | 0.477 | 42.5 | 89.1 |
20 | 0.476 | 42.2 | 88.6 |
13 | 0.476 | 41.4 | 86.9 |
2 | 0.475 | 43.2 | 90.9 |
6 | 0.474 | 43.3 | 91.2 |
21 | 0.472 | 40.6 | 86.1 |
16 | 0.470 | 41.1 | 87.3 |
10 | 0.468 | 41.3 | 88.2 |
3 | 0.468 | 41.3 | 88.1 |
11 | 0.468 | 40.8 | 87.2 |
24 | 0.468 | 39.2 | 83.7 |
14 | 0.467 | 42.8 | 91.8 |
15 | 0.466 | 41.5 | 88.9 |
19 | 0.462 | 41.9 | 90.5 |
25 | 0.456 | 39.4 | 86.5 |
22 | 0.455 | 39.9 | 87.8 |
4 | 0.453 | 41.3 | 91.1 |
26 | 0.452 | 38.7 | 85.6 |
29 | 0.450 | 38.6 | 85.8 |
17 | 0.448 | 40.7 | 90.9 |
18 | 0.448 | 39.7 | 88.7 |