【Case 1】
(1) Use matplotlib to plot the 10 countries with the most stores
(2) Use matplotlib to plot the number of stores in each Chinese city
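'''
(1) df.groupby(by='Country'): group the rows by country
(2) sort_values: sorts in ascending order by default (ascending=True); pass ascending=False for descending order
(3) [:10]: slice that keeps the first 10 rows
(4) data1.index: the index labels (here, the country codes)
    data1.values: the values that belong to those labels (here, the store counts)
'''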
import pandas as pd
from matplotlib import pyplot as plt
file_path = './code2/starbucks_store_worldwide.csv'
df = pd.read_csv(file_path)
data0 = df.groupby(by='Country')['Brand'].count().sort_values(ascending=False)
print(data0)
data1 = data0[:10]
print(data1)
_x = data1.index
_y = data1.values
plt.figure(figsize=(20,8), dpi=80)
plt.bar( range(len(_x)), _y )
plt.xticks( range(len(_x)), _x )
plt.show()
Country
US 13608
CN 2734
CA 1468
JP 1237
KR 993
...
SK 3
TT 3
LU 2
MC 2
AD 1
Name: Brand, Length: 73, dtype: int64
Country
US 13608
CN 2734
CA 1468
JP 1237
KR 993
GB 901
MX 579
TW 394
TR 326
PH 298
Name: Brand, dtype: int64
![Bar chart: the 10 countries with the most stores](https://i-blog.csdnimg.cn/blog_migrate/fcdfdf9ad43c45c985249741e0db62dc.png#pic_center)
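The chain used in the script (group, count, sort in descending order, slice) can be tried on a small in-memory frame. The sketch below uses made-up rows rather than the Starbucks CSV, and also shows `value_counts()`, which should give the same ranking in one call as long as the counted column has no missing values.

```python
import pandas as pd

# Toy stand-in for the store table (hypothetical rows, not the real CSV).
toy = pd.DataFrame({
    "Brand":   ["Starbucks"] * 6,
    "Country": ["US", "CN", "US", "JP", "US", "CN"],
})

# Same chain as in the script: one count per country, largest first, top 2.
by_group = toy.groupby(by="Country")["Brand"].count().sort_values(ascending=False)[:2]

# value_counts() counts and sorts in descending order in a single step.
by_value_counts = toy["Country"].value_counts().head(2)

print(by_group)
print(list(by_group.index), list(by_group.values))
```

The index of the result holds the group labels and `.values` holds the counts, which is why the script passes them to `plt.xticks` and `plt.bar` respectively.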
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import font_manager
my_font = font_manager.FontProperties(fname='C:\\Windows\\Fonts\\simhei.ttf')
df = pd.read_csv('./code2/starbucks_store_worldwide.csv')
print(df['Country'])
print('*'*30)
df = df[ df['Country'] == 'CN' ]
print(df.head(1))
data1 = df.groupby(by='City')['Brand'].count().sort_values(ascending=False)[:25]
print(data1)
_x = data1.index
_y = data1.values
plt.figure(figsize=(20,12), dpi=80)
plt.barh(range(len(_x)), _y, height=0.3, color='cyan')
plt.yticks(range(len(_x)), _x, fontproperties=my_font)
plt.show()
0 AD
1 AE
2 AE
3 AE
4 AE
..
25595 VN
25596 VN
25597 ZA
25598 ZA
25599 ZA
Name: Country, Length: 25600, dtype: object
******************************
Brand Store Number Store Name Ownership Type \
2091 Starbucks 22901-225145 北京西站第一咖啡店 Company Owned
Street Address City State/Province Country Postcode \
2091 丰台区, 北京西站通廊7-1号, 中关村南大街2号 北京市 11 CN 100073
Phone Number Timezone Longitude Latitude
2091 NaN GMT+08:00 Asia/Beijing 116.32 39.9
City
上海市 542
北京市 234
杭州市 117
深圳市 113
广州市 106
Hong Kong 104
成都市 98
苏州市 90
南京市 73
武汉市 67
宁波市 59
天津市 58
重庆市 41
西安市 40
无锡市 40
佛山市 33
东莞市 31
厦门市 31
青岛市 28
长沙市 26
常州市 26
大连市 25
沈阳市 24
福州市 23
昆明市 21
Name: Brand, dtype: int64
C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\matplotlib\backends\backend_agg.py:238: RuntimeWarning: Glyph 21271 missing from current font.
font.set_text(s, 0.0, flags=flags)
... (the same RuntimeWarning repeats for every other missing glyph, first from backend_agg.py:238 and then again from backend_agg.py:201)
![Horizontal bar chart: the 25 Chinese cities with the most stores](https://i-blog.csdnimg.cn/blog_migrate/11cf48ed83aec54c2cc4962b590ded8d.png#pic_center)
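The `Glyph ... missing from current font` warnings mean that some text was drawn with a font that has no glyphs for those characters. The script hardcodes a Windows font path, so on another machine it is worth checking that the file exists before building the `FontProperties` object. A minimal sketch, where `load_cjk_font` is a hypothetical helper name and the path is the one used in the script:

```python
from pathlib import Path
from matplotlib import font_manager


def load_cjk_font(path):
    """Return FontProperties for `path`, or None if the file is absent."""
    font_file = Path(path)
    if not font_file.is_file():
        return None
    return font_manager.FontProperties(fname=str(font_file))


my_font = load_cjk_font(r"C:\Windows\Fonts\simhei.ttf")
if my_font is None:
    print("SimHei not found; Chinese tick labels may not render.")
```

When the helper returns a font, it can be passed to `plt.yticks(..., fontproperties=my_font)` as in the script; when it returns `None`, the argument can simply be left out.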
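【Case 2】
We have data on 10,000 of the world's top-ranked books. Use it to answer the following questions:
(1) The number of books for each publication year
Note: handle missing values; use a bar chart
(2) The average rating of books for each publication year
Note: use a line chart to show how the average rating changes over time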
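Before the step-by-step exploration, here is a compact sketch of one way both questions could be answered. It runs on a toy frame with made-up values; only the two column names are taken from `books.csv`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for books.csv (hypothetical values).
books = pd.DataFrame({
    "original_publication_year": [2008.0, 1997.0, 2008.0, np.nan, 1997.0, 2008.0],
    "average_rating":            [4.0,    4.5,    3.0,    3.9,    4.0,    5.0],
})

# Drop rows whose year is missing, so they are not counted anywhere.
valid = books.dropna(subset=["original_publication_year"])

per_year = valid.groupby("original_publication_year")["average_rating"]
book_count = per_year.count()   # (1) number of books per year -> bar chart
mean_rating = per_year.mean()   # (2) mean rating per year -> line chart

print(book_count)
print(mean_rating)
```

Both results are Series indexed by year, so they could be drawn with `plt.bar(book_count.index, book_count.values)` and `plt.plot(mean_rating.index, mean_rating.values)`.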
import pandas as pd
import numpy as np
df = pd.read_csv('./code2/books.csv')
print('\n【df.head(4)】')
print(df.head(4))
print('\n【df.info()】')
print(df.info())
df1 = df['original_publication_year']
print('\n【df1 = df["original_publication_year"]】')
print(df1)
data1 = df1.dropna(axis=0)
print('\n【data1 = df1.dropna()】')
print(data1)
print('**(1)**'*10)
df1 = df[ pd.notnull(df['original_publication_year']) ]
print('\n【df1】')
print(df1)
data2 = df1['original_publication_year']
print('\n【data2 = df1["original_publication_year"]】')
print(data2)
print('**(2)**'*10)
print( data1 == data2 )
【df.head(4)】
id book_id best_book_id work_id books_count isbn isbn13 \
0 1 2767052 2767052 2792775 272 439023483 9.780439e+12
1 2 3 3 4640799 491 439554934 9.780440e+12
2 3 41865 41865 3212258 226 316015849 9.780316e+12
3 4 2657 2657 3275794 487 61120081 9.780061e+12
authors original_publication_year \
0 Suzanne Collins 2008.0
1 J.K. Rowling, Mary GrandPré 1997.0
2 Stephenie Meyer 2005.0
3 Harper Lee 1960.0
original_title ... ratings_count \
0 The Hunger Games ... 4780653
1 Harry Potter and the Philosopher's Stone ... 4602479
2 Twilight ... 3866839
3 To Kill a Mockingbird ... 3198671
work_ratings_count work_text_reviews_count ratings_1 ratings_2 \
0 4942365 155254 66715 127936
1 4800065 75867 75504 101676
2 3916824 95009 456191 436802
3 3340896 72586 60427 117415
ratings_3 ratings_4 ratings_5 \
0 560092 1481305 2706317
1 455024 1156318 3011543
2 793319 875073 1355439
3 446835 1001952 1714267
image_url \
0 https://images.gr-assets.com/books/1447303603m...
1 https://images.gr-assets.com/books/1474154022m...
2 https://images.gr-assets.com/books/1361039443m...
3 https://images.gr-assets.com/books/1361975680m...
small_image_url
0 https://images.gr-assets.com/books/1447303603s...
1 https://images.gr-assets.com/books/1474154022s...
2 https://images.gr-assets.com/books/1361039443s...
3 https://images.gr-assets.com/books/1361975680s...
[4 rows x 23 columns]
【df.info()】
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 23 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 10000 non-null int64
1 book_id 10000 non-null int64
2 best_book_id 10000 non-null int64
3 work_id 10000 non-null int64
4 books_count 10000 non-null int64
5 isbn 9300 non-null object
6 isbn13 9415 non-null float64
7 authors 10000 non-null object
8 original_publication_year 9979 non-null float64
9 original_title 9415 non-null object
10 title 10000 non-null object
11 language_code 8916 non-null object
12 average_rating 10000 non-null float64
13 ratings_count 10000 non-null int64
14 work_ratings_count 10000 non-null int64
15 work_text_reviews_count 10000 non-null int64
16 ratings_1 10000 non-null int64
17 ratings_2 10000 non-null int64
18 ratings_3 10000 non-null int64
19 ratings_4 10000 non-null int64
20 ratings_5 10000 non-null int64
21 image_url 10000 non-null object
22 small_image_url 10000 non-null object
dtypes: float64(3), int64(13), object(7)
memory usage: 1.8+ MB
None
【df1 = df["original_publication_year"]】
0 2008.0
1 1997.0
2 2005.0
3 1960.0
4 1925.0
...
9995 2010.0
9996 1990.0
9997 1977.0
9998 2011.0
9999 1998.0
Name: original_publication_year, Length: 10000, dtype: float64
【data1 = df1.dropna()】
0 2008.0
1 1997.0
2 2005.0
3 1960.0
4 1925.0
...
9995 2010.0
9996 1990.0
9997 1977.0
9998 2011.0
9999 1998.0
Name: original_publication_year, Length: 9979, dtype: float64
**(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)**
【df1】
id book_id best_book_id work_id books_count isbn \
0 1 2767052 2767052 2792775 272 439023483
1 2 3 3 4640799 491 439554934
2 3 41865 41865 3212258 226 316015849
3 4 2657 2657 3275794 487 61120081
4 5 4671 4671 245494 1356 743273567
... ... ... ... ... ... ...
9995 9996 7130616 7130616 7392860 19 441019455
9996 9997 208324 208324 1084709 19 067973371X
9997 9998 77431 77431 2393986 60 039330762X
9998 9999 8565083 8565083 13433613 7 61711527
9999 10000 8914 8914 11817 31 375700455
isbn13 authors original_publication_year \
0 9.780439e+12 Suzanne Collins 2008.0
1 9.780440e+12 J.K. Rowling, Mary GrandPré 1997.0
2 9.780316e+12 Stephenie Meyer 2005.0
3 9.780061e+12 Harper Lee 1960.0
4 9.780743e+12 F. Scott Fitzgerald 1925.0
... ... ... ...
9995 9.780441e+12 Ilona Andrews 2010.0
9996 9.780680e+12 Robert A. Caro 1990.0
9997 9.780393e+12 Patrick O'Brian 1977.0
9998 9.780062e+12 Peggy Orenstein 2011.0
9999 9.780376e+12 John Keegan 1998.0
original_title ... ratings_count \
0 The Hunger Games ... 4780653
1 Harry Potter and the Philosopher's Stone ... 4602479
2 Twilight ... 3866839
3 To Kill a Mockingbird ... 3198671
4 The Great Gatsby ... 2683664
... ... ... ...
9995 Bayou Moon ... 17204
9996 Means of Ascent ... 12582
9997 The Mauritius Command ... 9421
9998 Cinderella Ate My Daughter: Dispatches from th... ... 11279
9999 The First World War ... 9162
work_ratings_count work_text_reviews_count ratings_1 ratings_2 \
0 4942365 155254 66715 127936
1 4800065 75867 75504 101676
2 3916824 95009 456191 436802
3 3340896 72586 60427 117415
4 2773745 51992 86236 197621
... ... ... ... ...
9995 18856 1180 105 575
9996 12952 395 303 551
9997 10733 374 11 111
9998 11994 1988 275 1002
9999 9700 364 117 345
ratings_3 ratings_4 ratings_5 \
0 560092 1481305 2706317
1 455024 1156318 3011543
2 793319 875073 1355439
3 446835 1001952 1714267
4 606158 936012 947718
... ... ... ...
9995 3538 7860 6778
9996 1737 3389 6972
9997 1191 4240 5180
9998 3765 4577 2375
9999 2031 4138 3069
image_url \
0 https://images.gr-assets.com/books/1447303603m...
1 https://images.gr-assets.com/books/1474154022m...
2 https://images.gr-assets.com/books/1361039443m...
3 https://images.gr-assets.com/books/1361975680m...
4 https://images.gr-assets.com/books/1490528560m...
... ...
9995 https://images.gr-assets.com/books/1307445460m...
9996 https://s.gr-assets.com/assets/nophoto/book/11...
9997 https://images.gr-assets.com/books/1455373531m...
9998 https://images.gr-assets.com/books/1279214118m...
9999 https://images.gr-assets.com/books/1403194704m...
small_image_url
0 https://images.gr-assets.com/books/1447303603s...
1 https://images.gr-assets.com/books/1474154022s...
2 https://images.gr-assets.com/books/1361039443s...
3 https://images.gr-assets.com/books/1361975680s...
4 https://images.gr-assets.com/books/1490528560s...
... ...
9995 https://images.gr-assets.com/books/1307445460s...
9996 https://s.gr-assets.com/assets/nophoto/book/50...
9997 https://images.gr-assets.com/books/1455373531s...
9998 https://images.gr-assets.com/books/1279214118s...
9999 https://images.gr-assets.com/books/1403194704s...
[9979 rows x 23 columns]
【data2 = df1["original_publication_year"]】
0 2008.0
1 1997.0
2 2005.0
3 1960.0
4 1925.0
...
9995 2010.0
9996 1990.0
9997 1977.0
9998 2011.0
9999 1998.0
Name: original_publication_year, Length: 9979, dtype: float64
**(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)**
0 True
1 True
2 True
3 True
4 True
...
9995 True
9996 True
9997 True
9998 True
9999 True
Name: original_publication_year, Length: 9979, dtype: bool
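Printing `data1 == data2` yields one boolean per row, which still has to be scanned by eye. The sketch below uses a toy Series (made-up values) to show that both routes keep the same rows, and how `Series.equals` or `.all()` reduces the comparison to a single boolean.

```python
import numpy as np
import pandas as pd

years = pd.Series([2008.0, np.nan, 1997.0, np.nan, 1960.0],
                  name="original_publication_year")

via_dropna = years.dropna()              # route 1: drop missing entries
via_mask = years[pd.notnull(years)]      # route 2: boolean mask

print(via_dropna.equals(via_mask))       # one boolean for the whole Series
print((via_dropna == via_mask).all())    # element-wise comparison, then reduced
```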
import pandas as pd
df = pd.read_csv('./code2/books.csv')
# Keep only the rows that have a publication year
df1 = df[ pd.notnull(df['original_publication_year']) ]
print('\n【df1】')
print(df1)
grouped1 = df1.groupby( by=df1['original_publication_year'] )
print('\n【grouped1】')
print( grouped1 )
grouped2 = grouped1.count()
print('\n【Count: grouped2 = grouped1.count()】')
print(grouped2)
grouped3 = grouped2['average_rating']
print('\n【Count: grouped3 = grouped2["average_rating"] 】')
print(grouped3)
print('**(1)**'*10)
# numeric_only=True restricts the aggregation to numeric columns.
# On pandas >= 2.0 the default is numeric_only=False, so text columns are no longer dropped automatically.
grouped4 = grouped1.mean(numeric_only=True)
print('\n【Mean: grouped4 = grouped1.mean()】')
print(grouped4)
grouped5 = grouped4['average_rating']
print('\n【Mean: grouped5 = grouped4["average_rating"]】')
print(grouped5)
print('**(2)**'*10)
grouped6 = grouped1.sum(numeric_only=True)
print('\n【Sum: grouped6 = grouped1.sum()】')
print(grouped6)
grouped7 = grouped6['average_rating']
print('\n【Sum: grouped7 = grouped6["average_rating"]】')
print(grouped7)
【df1】
id book_id best_book_id work_id books_count isbn \
0 1 2767052 2767052 2792775 272 439023483
1 2 3 3 4640799 491 439554934
2 3 41865 41865 3212258 226 316015849
3 4 2657 2657 3275794 487 61120081
4 5 4671 4671 245494 1356 743273567
... ... ... ... ... ... ...
9995 9996 7130616 7130616 7392860 19 441019455
9996 9997 208324 208324 1084709 19 067973371X
9997 9998 77431 77431 2393986 60 039330762X
9998 9999 8565083 8565083 13433613 7 61711527
9999 10000 8914 8914 11817 31 375700455
isbn13 authors original_publication_year \
0 9.780439e+12 Suzanne Collins 2008.0
1 9.780440e+12 J.K. Rowling, Mary GrandPré 1997.0
2 9.780316e+12 Stephenie Meyer 2005.0
3 9.780061e+12 Harper Lee 1960.0
4 9.780743e+12 F. Scott Fitzgerald 1925.0
... ... ... ...
9995 9.780441e+12 Ilona Andrews 2010.0
9996 9.780680e+12 Robert A. Caro 1990.0
9997 9.780393e+12 Patrick O'Brian 1977.0
9998 9.780062e+12 Peggy Orenstein 2011.0
9999 9.780376e+12 John Keegan 1998.0
original_title ... ratings_count \
0 The Hunger Games ... 4780653
1 Harry Potter and the Philosopher's Stone ... 4602479
2 Twilight ... 3866839
3 To Kill a Mockingbird ... 3198671
4 The Great Gatsby ... 2683664
... ... ... ...
9995 Bayou Moon ... 17204
9996 Means of Ascent ... 12582
9997 The Mauritius Command ... 9421
9998 Cinderella Ate My Daughter: Dispatches from th... ... 11279
9999 The First World War ... 9162
work_ratings_count work_text_reviews_count ratings_1 ratings_2 \
0 4942365 155254 66715 127936
1 4800065 75867 75504 101676
2 3916824 95009 456191 436802
3 3340896 72586 60427 117415
4 2773745 51992 86236 197621
... ... ... ... ...
9995 18856 1180 105 575
9996 12952 395 303 551
9997 10733 374 11 111
9998 11994 1988 275 1002
9999 9700 364 117 345
ratings_3 ratings_4 ratings_5 \
0 560092 1481305 2706317
1 455024 1156318 3011543
2 793319 875073 1355439
3 446835 1001952 1714267
4 606158 936012 947718
... ... ... ...
9995 3538 7860 6778
9996 1737 3389 6972
9997 1191 4240 5180
9998 3765 4577 2375
9999 2031 4138 3069
image_url \
0 https://images.gr-assets.com/books/1447303603m...
1 https://images.gr-assets.com/books/1474154022m...
2 https://images.gr-assets.com/books/1361039443m...
3 https://images.gr-assets.com/books/1361975680m...
4 https://images.gr-assets.com/books/1490528560m...
... ...
9995 https://images.gr-assets.com/books/1307445460m...
9996 https://s.gr-assets.com/assets/nophoto/book/11...
9997 https://images.gr-assets.com/books/1455373531m...
9998 https://images.gr-assets.com/books/1279214118m...
9999 https://images.gr-assets.com/books/1403194704m...
small_image_url
0 https://images.gr-assets.com/books/1447303603s...
1 https://images.gr-assets.com/books/1474154022s...
2 https://images.gr-assets.com/books/1361039443s...
3 https://images.gr-assets.com/books/1361975680s...
4 https://images.gr-assets.com/books/1490528560s...
... ...
9995 https://images.gr-assets.com/books/1307445460s...
9996 https://s.gr-assets.com/assets/nophoto/book/50...
9997 https://images.gr-assets.com/books/1455373531s...
9998 https://images.gr-assets.com/books/1279214118s...
9999 https://images.gr-assets.com/books/1403194704s...
[9979 rows x 23 columns]
【grouped1】
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000002927AE79C70>
【Count: grouped2 = grouped1.count()】
id book_id best_book_id work_id books_count \
original_publication_year
-1750.0 1 1 1 1 1
-762.0 1 1 1 1 1
-750.0 2 2 2 2 2
-720.0 1 1 1 1 1
-560.0 1 1 1 1 1
... ... ... ... ... ...
2013.0 518 518 518 518 518
2014.0 437 437 437 437 437
2015.0 306 306 306 306 306
2016.0 198 198 198 198 198
2017.0 11 11 11 11 11
isbn isbn13 authors original_title title ... \
original_publication_year ...
-1750.0 1 1 1 1 1 ...
-762.0 1 1 1 1 1 ...
-750.0 2 2 2 2 2 ...
-720.0 1 1 1 1 1 ...
-560.0 1 1 1 1 1 ...
... ... ... ... ... ... ...
2013.0 386 408 518 434 518 ...
2014.0 347 358 437 391 437 ...
2015.0 241 241 306 259 306 ...
2016.0 146 147 198 173 198 ...
2017.0 9 9 11 9 11 ...
ratings_count work_ratings_count \
original_publication_year
-1750.0 1 1
-762.0 1 1
-750.0 2 2
-720.0 1 1
-560.0 1 1
... ... ...
2013.0 518 518
2014.0 437 437
2015.0 306 306
2016.0 198 198
2017.0 11 11
work_text_reviews_count ratings_1 ratings_2 \
original_publication_year
-1750.0 1 1 1
-762.0 1 1 1
-750.0 2 2 2
-720.0 1 1 1
-560.0 1 1 1
... ... ... ...
2013.0 518 518 518
2014.0 437 437 437
2015.0 306 306 306
2016.0 198 198 198
2017.0 11 11 11
ratings_3 ratings_4 ratings_5 image_url \
original_publication_year
-1750.0 1 1 1 1
-762.0 1 1 1 1
-750.0 2 2 2 2
-720.0 1 1 1 1
-560.0 1 1 1 1
... ... ... ... ...
2013.0 518 518 518 518
2014.0 437 437 437 437
2015.0 306 306 306 306
2016.0 198 198 198 198
2017.0 11 11 11 11
small_image_url
original_publication_year
-1750.0 1
-762.0 1
-750.0 2
-720.0 1
-560.0 1
... ...
2013.0 518
2014.0 437
2015.0 306
2016.0 198
2017.0 11
[293 rows x 22 columns]
【Count: grouped3 = grouped2["average_rating"] 】
original_publication_year
-1750.0 1
-762.0 1
-750.0 2
-720.0 1
-560.0 1
...
2013.0 518
2014.0 437
2015.0 306
2016.0 198
2017.0 11
Name: average_rating, Length: 293, dtype: int64
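In the count table above, columns such as `isbn` show smaller numbers than `id` for the same year (386 versus 518 for 2013), because `count()` skips missing values column by column. That is why a fully populated column such as `average_rating` is picked afterwards. A sketch on made-up rows contrasting `count()` with `size()`, which counts rows per group regardless of missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical rows; only the column names mirror books.csv.
toy = pd.DataFrame({
    "original_publication_year": [2013.0, 2013.0, 2013.0, 2014.0],
    "isbn":           ["a", np.nan, "c", "d"],
    "average_rating": [4.1, 3.9, 4.0, 3.8],
})
grouped = toy.groupby("original_publication_year")

print(grouped.count())   # per column, missing values excluded
print(grouped.size())    # rows per group, independent of any column
```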
**(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)**
【Mean: grouped4 = grouped1.mean()】
id book_id best_book_id \
original_publication_year
-1750.0 2076.000000 1.935100e+04 1.935100e+04
-762.0 2142.000000 1.375000e+03 1.375000e+03
-750.0 3253.500000 2.678300e+05 2.678300e+05
-720.0 79.000000 1.381000e+03 1.381000e+03
-560.0 1120.000000 2.134800e+04 2.134800e+04
... ... ... ...
2013.0 5319.747104 1.633865e+07 1.659865e+07
2014.0 5705.828375 1.919866e+07 1.941922e+07
2015.0 5755.503268 2.268179e+07 2.281951e+07
2016.0 5971.434343 2.639991e+07 2.656994e+07
2017.0 8043.636364 2.925244e+07 2.925244e+07
work_id books_count isbn13 \
original_publication_year
-1750.0 3.802528e+06 266.000000 9.780141e+12
-762.0 1.474309e+06 255.000000 9.780148e+12
-750.0 1.907469e+06 933.000000 9.780416e+12
-720.0 3.356006e+06 1703.000000 9.780143e+12
-560.0 8.682630e+05 942.000000 9.780193e+12
... ... ... ...
2013.0 2.206115e+07 31.816602 9.663512e+12
2014.0 2.927728e+07 31.299771 9.742640e+12
2015.0 3.989789e+07 29.830065 9.780866e+12
2016.0 4.410693e+07 26.964646 9.780837e+12
2017.0 4.663936e+07 28.727273 9.780937e+12
average_rating ratings_count work_ratings_count \
original_publication_year
-1750.0 3.630000 44345.000000 55856.000000
-762.0 4.030000 47825.000000 51098.000000
-750.0 4.005000 126934.500000 144132.500000
-720.0 3.730000 670326.000000 710757.000000
-560.0 4.050000 88508.000000 98962.000000
... ... ... ...
2013.0 4.012297 35561.127413 42097.054054
2014.0 3.985378 29191.851259 35508.691076
2015.0 3.954641 29524.398693 37084.562092
2016.0 4.027576 26676.030303 33942.565657
2017.0 4.100909 25181.090909 32611.545455
work_text_reviews_count ratings_1 \
original_publication_year
-1750.0 2247.000000 1551.000000
-762.0 537.000000 916.000000
-750.0 2519.000000 3939.500000
-720.0 8101.000000 29703.000000
-560.0 1441.000000 773.000000
... ... ...
2013.0 4063.857143 862.785714
2014.0 3914.050343 718.244851
2015.0 4358.232026 778.522876
2016.0 4365.611111 648.661616
2017.0 5774.909091 487.363636
ratings_2 ratings_3 ratings_4 \
original_publication_year
-1750.0 5850.000000 17627.000000 17485.000000
-762.0 2608.000000 10439.000000 17404.000000
-750.0 10722.000000 35746.500000 46807.000000
-720.0 65629.000000 183082.000000 224120.000000
-560.0 3717.000000 22587.000000 34885.000000
... ... ... ...
2013.0 2163.617761 7917.409266 14859.515444
2014.0 1782.915332 6655.775744 12622.457666
2015.0 1975.650327 7325.369281 13478.388889
2016.0 1758.944444 6606.020202 12147.893939
2017.0 1354.272727 5118.090909 11250.909091
ratings_5
original_publication_year
-1750.0 13343.000000
-762.0 19731.000000
-750.0 46917.500000
-720.0 208223.000000
-560.0 37000.000000
... ...
2013.0 16293.725869
2014.0 13729.297483
2015.0 13526.630719
2016.0 12781.045455
2017.0 14400.909091
[293 rows x 15 columns]
【Mean: grouped5 = grouped4["average_rating"]】
original_publication_year
-1750.0 3.630000
-762.0 4.030000
-750.0 4.005000
-720.0 3.730000
-560.0 4.050000
...
2013.0 4.012297
2014.0 3.985378
2015.0 3.954641
2016.0 4.027576
2017.0 4.100909
Name: average_rating, Length: 293, dtype: float64
**(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)**
【Sum: grouped6 = grouped1.sum()】
id book_id best_book_id work_id \
original_publication_year
-1750.0 2076 19351 19351 3802528
-762.0 2142 1375 1375 1474309
-750.0 6507 535660 535660 3814938
-720.0 79 1381 1381 3356006
-560.0 1120 21348 21348 868263
... ... ... ... ...
2013.0 2755629 8463418354 8598098696 11427677552
2014.0 2493447 8389815459 8486197677 12794170667
2015.0 1761184 6940628219 6982769611 12208753377
2016.0 1182344 5227181248 5260848297 8733172252
2017.0 88480 321776859 321776859 513032910
books_count isbn13 average_rating \
original_publication_year
-1750.0 266 9.780141e+12 3.63
-762.0 255 9.780148e+12 4.03
-750.0 1866 1.956083e+13 8.01
-720.0 1703 9.780143e+12 3.73
-560.0 942 9.780193e+12 4.05
... ... ... ...
2013.0 16481 3.942713e+15 2078.37
2014.0 13678 3.487865e+15 1741.61
2015.0 9128 2.357189e+15 1210.12
2016.0 5339 1.437783e+15 797.46
2017.0 316 8.802843e+13 45.11
ratings_count work_ratings_count \
original_publication_year
-1750.0 44345 55856
-762.0 47825 51098
-750.0 253869 288265
-720.0 670326 710757
-560.0 88508 98962
... ... ...
2013.0 18420664 21806274
2014.0 12756839 15517298
2015.0 9034466 11347876
2016.0 5281854 6720628
2017.0 276992 358727
work_text_reviews_count ratings_1 ratings_2 \
original_publication_year
-1750.0 2247 1551 5850
-762.0 537 916 2608
-750.0 5038 7879 21444
-720.0 8101 29703 65629
-560.0 1441 773 3717
... ... ... ...
2013.0 2105078 446923 1120754
2014.0 1710440 313873 779134
2015.0 1333619 238228 604549
2016.0 864391 128435 348271
2017.0 63524 5361 14897
ratings_3 ratings_4 ratings_5
original_publication_year
-1750.0 17627 17485 13343
-762.0 10439 17404 19731
-750.0 71493 93614 93835
-720.0 183082 224120 208223
-560.0 22587 34885 37000
... ... ... ...
2013.0 4101218 7697229 8440150
2014.0 2908574 5516014 5999703
2015.0 2241563 4124387 4139149
2016.0 1307992 2405283 2530647
2017.0 56299 123760 158410
[293 rows x 15 columns]
【Sum: grouped7 = grouped6["average_rating"]】
original_publication_year
-1750.0 3.63
-762.0 4.03
-750.0 8.01
-720.0 3.73
-560.0 4.05
...
2013.0 2078.37
2014.0 1741.61
2015.0 1210.12
2016.0 797.46
2017.0 45.11
Name: average_rating, Length: 293, dtype: float64
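The three aggregates printed above are tied together: for each year, the sum divided by the count gives the mean (for 2013, 2078.37 / 518 ≈ 4.0123). The sketch below checks this on a hypothetical toy frame. It also passes `numeric_only=True`, on the assumption that a recent pandas version is in use, where `mean()` on a group containing text columns such as `title` raises an error instead of silently dropping them.

```python
import pandas as pd

# Hypothetical toy data; the column names follow books.csv, the values do not.
toy = pd.DataFrame({
    "original_publication_year": [2013.0, 2013.0, 2014.0],
    "title": ["A", "B", "C"],
    "average_rating": [4.0, 3.0, 3.5],
    "ratings_count": [100, 300, 50],
})
grouped = toy.groupby("original_publication_year")

# Aggregate numeric columns only; the text column 'title' is left out.
means = grouped.mean(numeric_only=True)
sums = grouped.sum(numeric_only=True)
counts = grouped["average_rating"].count()

# Per group: sum / count reproduces the mean.
check = sums["average_rating"] / counts
print(means["average_rating"])
print(check)
```

The sum of `average_rating` has little meaning on its own; it is mainly useful as this kind of consistency check.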
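import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('./code2/books.csv')
# Drop rows whose publication year is missing
df1 = df[ pd.notnull(df['original_publication_year']) ]
# Count the non-null cells of every column for each publication year
grouped = df1.groupby( by='original_publication_year' ).count()
# Use the 'title' counts as the number of books per year
data1 = grouped['title']
# Line chart: books per year in chronological order
_x = data1.index
_y = data1.values
plt.figure(figsize=(20,8), dpi=80)
plt.plot(range(len(_x)), _y)
# Label only every 20th year so the x axis stays readable
plt.xticks(list(range(len(_x)))[::20], _x[::20].astype('int'))
plt.show()
print(data1)
# Sort the years by book count, largest first
data2 = data1.sort_values(ascending=False)
print(data2)
# The index holds float years, so this slice is label-based: it keeps every row
# up to and including the label 2000.0 (16 rows here), not the first 2000 rows.
# .loc makes that explicit and does not depend on the pandas version.
data3 = data2.loc[:2000]
print(data3)
# Bar chart: the years with the most books
_x = data3.index
_y = data3.values
plt.figure(figsize=(20,8), dpi=80)
plt.bar(range(len(_x)), _y, width=0.3)
plt.xticks(range(len(_x)), _x.astype('int'))
plt.show()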
![Plot of book counts by original publication year](https://i-blog.csdnimg.cn/blog_migrate/8605c46fa02bfd3726dbeab72e4a4bc8.png#pic_center)
original_publication_year
-1750.0 1
-762.0 1
-750.0 2
-720.0 1
-560.0 1
...
2013.0 518
2014.0 437
2015.0 306
2016.0 198
2017.0 11
Name: title, Length: 293, dtype: int64
original_publication_year
2012.0 568
2011.0 556
2013.0 518
2010.0 473
2014.0 437
...
1749.0 1
1759.0 1
1762.0 1
1764.0 1
-1750.0 1
Name: title, Length: 293, dtype: int64
original_publication_year
2012.0 568
2011.0 556
2013.0 518
2010.0 473
2014.0 437
2009.0 432
2008.0 383
2007.0 363
2006.0 362
2005.0 326
2004.0 307
2015.0 306
2003.0 288
2001.0 226
2002.0 225
2000.0 209
Name: title, dtype: int64
![Plot of the publication years with the most books](https://i-blog.csdnimg.cn/blog_migrate/9fe286d2c8c89e3f6d5580949fb512d6.png#pic_center)
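The third output above has 16 rows even though the slice asks for `[:2000]`. That fits label-based slicing on the float year index: the slice ends at the label 2000.0 rather than at position 2000. A minimal sketch, using five of the counts printed above as sample data, compares this with two explicit ways of taking a top N (the choice of N = 3 is only for illustration):

```python
import pandas as pd

# A small subset of the per-year counts printed above.
counts = pd.Series(
    [1, 568, 556, 209, 518],
    index=pd.Index([-1750.0, 2012.0, 2011.0, 2000.0, 2013.0],
                   name="original_publication_year"),
    name="title",
)
by_count = counts.sort_values(ascending=False)

# Positional: always the first 3 rows, whatever the index labels are.
top3_positional = by_count.iloc[:3]

# nlargest sorts and truncates in one step.
top3_nlargest = counts.nlargest(3)

# Label-based: everything up to and including the row labelled 2000.0.
up_to_2000 = by_count.loc[:2000.0]

print(top3_positional)
print(up_to_2000)
```

`.iloc` or `nlargest` are the clearer choice when a fixed number of rows is wanted; `.loc` fits when the cut-off is a specific year label.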
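'''
average_rating: the average rating
(1) average_rating is the mean of the ratings given to a single book by different raters
(2) After grouping by year, take the mean of these per-book ratings within each year
data1 = df1.groupby( by='original_publication_year' )['average_rating'].mean() groups by year first, then averages the ratings
(3) The preferred order is: groupby first, then select the 'average_rating' column, and call mean() last, so that only one column is aggregated
'''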
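import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('./code2/books.csv')
# Drop rows whose publication year is missing
df1 = df[ pd.notnull(df['original_publication_year']) ]
# Group by year, select the rating column, then average it.
# Grouping by the column name keeps the key tied to df1 rather than the unfiltered df.
data1 = df1.groupby( by='original_publication_year' )['average_rating'].mean()
print(data1)
_x = data1.index
_y = data1.values
plt.figure(figsize=(20,8), dpi=80)
plt.plot(range(len(_x)), _y)
# Label every 10th year and rotate the labels so they do not overlap
plt.xticks(list(range(len(_x)))[::10], _x[::10].astype('int'), rotation=45)
plt.show()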
original_publication_year
-1750.0 3.630000
-762.0 4.030000
-750.0 4.005000
-720.0 3.730000
-560.0 4.050000
...
2013.0 4.012297
2014.0 3.985378
2015.0 3.954641
2016.0 4.027576
2017.0 4.100909
Name: average_rating, Length: 293, dtype: float64
![Line chart of mean average_rating by original publication year](https://i-blog.csdnimg.cn/blog_migrate/17617d8359b6d2d4da0315cedc244763.png#pic_center)
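The note in the docstring about ordering (group, select the column, then aggregate) can be checked on a hypothetical toy frame. Both orders below should give the same series; the first aggregates a single column, while the second averages every numeric column and then discards all but one. `numeric_only=True` is passed on the assumption of a recent pandas version.

```python
import pandas as pd

# Hypothetical toy data; the column names follow books.csv, the values do not.
toy = pd.DataFrame({
    "original_publication_year": [2013.0, 2013.0, 2014.0],
    "title": ["A", "B", "C"],
    "average_rating": [4.0, 3.0, 3.5],
    "ratings_count": [100, 300, 50],
})

# Preferred order: groupby -> select column -> mean.
select_then_mean = (
    toy.groupby("original_publication_year")["average_rating"].mean()
)

# Alternative order: groupby -> mean of all numeric columns -> select column.
mean_then_select = (
    toy.groupby("original_publication_year").mean(numeric_only=True)["average_rating"]
)

print(select_then_mean)
print(select_then_mean.equals(mean_then_select))
```

On a frame with 23 columns like books.csv, selecting the column first also avoids computing aggregates that are never used.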