5 --- Merging Data and Group Aggregation: Case Study 1 (pandas)

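【Case 1】
(1) Use matplotlib to show the 10 countries with the most stores
(2) Use matplotlib to show the number of stores in each Chinese city
# (1) Use matplotlib to show the 10 countries with the most stores
'''
(1) df.groupby(by='Country')    group the rows by country
(2) sort_values: sorts in ascending order by default (ascending=True)
(3) [:10]  slice: take the first 10 rows
(4) df.index    the index labels
      df.values   the values that correspond to those labels
'''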
import pandas as pd
from matplotlib import pyplot as plt

file_path = './code2/starbucks_store_worldwide.csv'
df = pd.read_csv(file_path)




# step 1: prepare the data
data0 = df.groupby(by='Country')['Brand'].count().sort_values(ascending=False)   # sort_values() is ascending by default, so pass ascending=False
print(data0)
data1 = data0[:10]
print(data1)


# step 2: draw the bar chart
_x = data1.index
_y = data1.values

plt.figure(figsize=(20,8), dpi=80)

plt.bar( range(len(_x)), _y )

plt.xticks( range(len(_x)), _x )

plt.show()
Country
US    13608
CN     2734
CA     1468
JP     1237
KR      993
      ...  
SK        3
TT        3
LU        2
MC        2
AD        1
Name: Brand, Length: 73, dtype: int64
Country
US    13608
CN     2734
CA     1468
JP     1237
KR      993
GB      901
MX      579
TW      394
TR      326
PH      298
Name: Brand, dtype: int64
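
The group, count, sort, slice pipeline used above can be checked on a small table. The sketch below uses a made-up toy frame (not the real Starbucks file) and also shows `value_counts()` as a shorter route; the two agree here only because the toy `Brand` column has no missing values, which is an assumption about the data.

```python
import pandas as pd

# Toy stand-in for the Starbucks table; the rows are invented for illustration.
toy = pd.DataFrame({
    "Brand": ["Starbucks"] * 7,
    "Country": ["US", "US", "US", "CN", "CN", "CA", "JP"],
})

# Route used in the article: group, count one column, sort descending, slice.
by_groupby = (
    toy.groupby(by="Country")["Brand"]
    .count()
    .sort_values(ascending=False)[:2]
)

# Shorter route: value_counts() counts each label and sorts descending by default.
by_value_counts = toy["Country"].value_counts().head(2)

print(by_groupby)
print(by_value_counts)
```

Both results list US first with 3 rows and CN second with 2 rows.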


# (2) Use matplotlib to show the number of stores in each Chinese city
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import font_manager

my_font = font_manager.FontProperties(fname='C:\\Windows\\Fonts\\simhei.ttf')     # a font that contains Chinese glyphs (Windows path)

df = pd.read_csv('./code2/starbucks_store_worldwide.csv')
print(df['Country'])
print('*'*30)

df = df[ df['Country'] == 'CN' ]          # boolean indexing: keep only the rows for China
print(df.head(1))





# step 1: prepare the data
data1 = df.groupby(by='City')['Brand'].count().sort_values(ascending=False)[:25]    # sort descending (sort_values is ascending by default), then slice the first 25 rows
print(data1)



# step 2: draw the bar chart
_x = data1.index
_y = data1.values


plt.figure(figsize=(20,12), dpi=80)
#plt.bar(range(len(_x)), _y, width=0.3, color='cyan')
#plt.xticks(range(len(_x)), _x, fontproperties=my_font)         # use the Chinese font for the x-axis tick labels
plt.barh(range(len(_x)), _y, height=0.3, color='cyan')
plt.yticks(range(len(_x)), _x, fontproperties=my_font)          # use the Chinese font for the y-axis tick labels


plt.show()
0        AD
1        AE
2        AE
3        AE
4        AE
         ..
25595    VN
25596    VN
25597    ZA
25598    ZA
25599    ZA
Name: Country, Length: 25600, dtype: object
******************************
          Brand  Store Number Store Name Ownership Type  \
2091  Starbucks  22901-225145  北京西站第一咖啡店  Company Owned   

                 Street Address City State/Province Country Postcode  \
2091  丰台区, 北京西站通廊7-1号, 中关村南大街2号  北京市             11      CN   100073   

     Phone Number                Timezone  Longitude  Latitude  
2091          NaN  GMT+08:00 Asia/Beijing     116.32      39.9  
City
上海市          542
北京市          234
杭州市          117
深圳市          113
广州市          106
Hong Kong    104
成都市           98
苏州市           90
南京市           73
武汉市           67
宁波市           59
天津市           58
重庆市           41
西安市           40
无锡市           40
佛山市           33
东莞市           31
厦门市           31
青岛市           28
长沙市           26
常州市           26
大连市           25
沈阳市           24
福州市           23
昆明市           21
Name: Brand, dtype: int64


C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\matplotlib\backends\backend_agg.py:238: RuntimeWarning: Glyph 21271 missing from current font.
  font.set_text(s, 0.0, flags=flags)
... (the same RuntimeWarning repeats for every other Chinese character in the tick labels, raised from backend_agg.py:238 and again from backend_agg.py:201)
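
The `Glyph ... missing from current font` warnings mean that the font used for rendering had no glyphs for the Chinese labels. One possible way to handle this, as a rough sketch, is to pick a CJK-capable font from those matplotlib already knows about and set it globally through `rcParams`. The candidate font names below are assumptions about what might be installed, and `pick_cjk_font` is a hypothetical helper, not part of matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import font_manager, pyplot as plt


def pick_cjk_font(candidates=("SimHei", "Microsoft YaHei", "Noto Sans CJK SC", "PingFang SC")):
    """Return the first candidate font name that matplotlib has registered, else None."""
    installed = {entry.name for entry in font_manager.fontManager.ttflist}
    for name in candidates:
        if name in installed:
            return name
    return None


font_name = pick_cjk_font()
if font_name is not None:
    # Put the CJK font first in the sans-serif fallback list.
    plt.rcParams["font.sans-serif"] = [font_name] + list(plt.rcParams["font.sans-serif"])
    # Keep the minus sign renderable with fonts that lack the Unicode minus glyph.
    plt.rcParams["axes.unicode_minus"] = False

print("CJK font in use:", font_name)
```

If no candidate is installed, `font_name` is `None` and the defaults are left untouched; passing `fontproperties=my_font` per call, as the article does, remains an alternative.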


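【Case 2】

We have data on 10,000 of the top-ranked books worldwide. Answer the following questions:
(1) The number of books published in each year
     Note: handle missing values; use a bar chart
    
(2) The average rating of the books for each year
     Note: use a line chart to show how the average rating changes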
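
Both questions come down to dropping rows without a publication year and then aggregating per year. A minimal sketch on a made-up toy frame (the column names follow the books data, the values are invented):

```python
import numpy as np
import pandas as pd

# Toy stand-in for books.csv; one row has no publication year.
toy = pd.DataFrame({
    "original_publication_year": [2008.0, 1997.0, 2008.0, np.nan, 1997.0, 2008.0],
    "average_rating": [4.0, 4.5, 3.0, 3.9, 3.5, 3.5],
})

# Drop only the rows whose year is missing, then group by year.
clean = toy.dropna(subset=["original_publication_year"])
grouped = clean.groupby("original_publication_year")["average_rating"]

books_per_year = grouped.count()        # question (1): number of books per year
mean_rating_per_year = grouped.mean()   # question (2): average rating per year

print(books_per_year)
print(mean_rating_per_year)
```

Selecting the single column before aggregating keeps the result a Series indexed by year, which can be passed to a bar or line plot via its `.index` and `.values`.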
# (1) Preliminaries, part 1
import pandas as pd
import numpy as np


df = pd.read_csv('./code2/books.csv')
print('\n【df.head(4)】')
print(df.head(4))
print('\n【df.info()】')
print(df.info())            # df.info() prints its report itself and returns None, hence the trailing "None" in the output





# Method 1
df1 = df['original_publication_year']
print('\n【df1 = df["original_publication_year"]】')
print(df1)
data1 = df1.dropna(axis=0)            # axis=0 (the default) drops the entries (rows) that hold missing values
print('\n【data1 = df1.dropna()】')
print(data1)

print('**(1)**'*10)








# Method 2
df1 = df[ pd.notnull(df['original_publication_year']) ]       # pd.notnull() returns a boolean mask; df1 keeps only the rows whose year is not missing
print('\n【df1】')
print(df1)
data2 = df1['original_publication_year']
print('\n【data2 = df1["original_publication_year"]】')
print(data2)

print('**(2)**'*10)



# Check whether data1 and data2 are equal (element-wise comparison): they are
print( data1 == data2 )
【df.head(4)】
   id  book_id  best_book_id  work_id  books_count       isbn        isbn13  \
0   1  2767052       2767052  2792775          272  439023483  9.780439e+12   
1   2        3             3  4640799          491  439554934  9.780440e+12   
2   3    41865         41865  3212258          226  316015849  9.780316e+12   
3   4     2657          2657  3275794          487   61120081  9.780061e+12   

                       authors  original_publication_year  \
0              Suzanne Collins                     2008.0   
1  J.K. Rowling, Mary GrandPré                     1997.0   
2              Stephenie Meyer                     2005.0   
3                   Harper Lee                     1960.0   

                             original_title  ... ratings_count  \
0                          The Hunger Games  ...       4780653   
1  Harry Potter and the Philosopher's Stone  ...       4602479   
2                                  Twilight  ...       3866839   
3                     To Kill a Mockingbird  ...       3198671   

  work_ratings_count  work_text_reviews_count  ratings_1  ratings_2  \
0            4942365                   155254      66715     127936   
1            4800065                    75867      75504     101676   
2            3916824                    95009     456191     436802   
3            3340896                    72586      60427     117415   

   ratings_3  ratings_4  ratings_5  \
0     560092    1481305    2706317   
1     455024    1156318    3011543   
2     793319     875073    1355439   
3     446835    1001952    1714267   

                                           image_url  \
0  https://images.gr-assets.com/books/1447303603m...   
1  https://images.gr-assets.com/books/1474154022m...   
2  https://images.gr-assets.com/books/1361039443m...   
3  https://images.gr-assets.com/books/1361975680m...   

                                     small_image_url  
0  https://images.gr-assets.com/books/1447303603s...  
1  https://images.gr-assets.com/books/1474154022s...  
2  https://images.gr-assets.com/books/1361039443s...  
3  https://images.gr-assets.com/books/1361975680s...  

[4 rows x 23 columns]

【df.info()】
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 23 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   id                         10000 non-null  int64  
 1   book_id                    10000 non-null  int64  
 2   best_book_id               10000 non-null  int64  
 3   work_id                    10000 non-null  int64  
 4   books_count                10000 non-null  int64  
 5   isbn                       9300 non-null   object 
 6   isbn13                     9415 non-null   float64
 7   authors                    10000 non-null  object 
 8   original_publication_year  9979 non-null   float64
 9   original_title             9415 non-null   object 
 10  title                      10000 non-null  object 
 11  language_code              8916 non-null   object 
 12  average_rating             10000 non-null  float64
 13  ratings_count              10000 non-null  int64  
 14  work_ratings_count         10000 non-null  int64  
 15  work_text_reviews_count    10000 non-null  int64  
 16  ratings_1                  10000 non-null  int64  
 17  ratings_2                  10000 non-null  int64  
 18  ratings_3                  10000 non-null  int64  
 19  ratings_4                  10000 non-null  int64  
 20  ratings_5                  10000 non-null  int64  
 21  image_url                  10000 non-null  object 
 22  small_image_url            10000 non-null  object 
dtypes: float64(3), int64(13), object(7)
memory usage: 1.8+ MB
None

【df1 = df["original_publication_year"]】
0       2008.0
1       1997.0
2       2005.0
3       1960.0
4       1925.0
         ...  
9995    2010.0
9996    1990.0
9997    1977.0
9998    2011.0
9999    1998.0
Name: original_publication_year, Length: 10000, dtype: float64

【data1 = df1.dropna()】
0       2008.0
1       1997.0
2       2005.0
3       1960.0
4       1925.0
         ...  
9995    2010.0
9996    1990.0
9997    1977.0
9998    2011.0
9999    1998.0
Name: original_publication_year, Length: 9979, dtype: float64
**(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)**

【df1】
         id  book_id  best_book_id   work_id  books_count        isbn  \
0         1  2767052       2767052   2792775          272   439023483   
1         2        3             3   4640799          491   439554934   
2         3    41865         41865   3212258          226   316015849   
3         4     2657          2657   3275794          487    61120081   
4         5     4671          4671    245494         1356   743273567   
...     ...      ...           ...       ...          ...         ...   
9995   9996  7130616       7130616   7392860           19   441019455   
9996   9997   208324        208324   1084709           19  067973371X   
9997   9998    77431         77431   2393986           60  039330762X   
9998   9999  8565083       8565083  13433613            7    61711527   
9999  10000     8914          8914     11817           31   375700455   

            isbn13                      authors  original_publication_year  \
0     9.780439e+12              Suzanne Collins                     2008.0   
1     9.780440e+12  J.K. Rowling, Mary GrandPré                     1997.0   
2     9.780316e+12              Stephenie Meyer                     2005.0   
3     9.780061e+12                   Harper Lee                     1960.0   
4     9.780743e+12          F. Scott Fitzgerald                     1925.0   
...            ...                          ...                        ...   
9995  9.780441e+12                Ilona Andrews                     2010.0   
9996  9.780680e+12               Robert A. Caro                     1990.0   
9997  9.780393e+12              Patrick O'Brian                     1977.0   
9998  9.780062e+12              Peggy Orenstein                     2011.0   
9999  9.780376e+12                  John Keegan                     1998.0   

                                         original_title  ... ratings_count  \
0                                      The Hunger Games  ...       4780653   
1              Harry Potter and the Philosopher's Stone  ...       4602479   
2                                              Twilight  ...       3866839   
3                                 To Kill a Mockingbird  ...       3198671   
4                                      The Great Gatsby  ...       2683664   
...                                                 ...  ...           ...   
9995                                         Bayou Moon  ...         17204   
9996                                   Means of Ascent   ...         12582   
9997                              The Mauritius Command  ...          9421   
9998  Cinderella Ate My Daughter: Dispatches from th...  ...         11279   
9999                                The First World War  ...          9162   

     work_ratings_count  work_text_reviews_count  ratings_1  ratings_2  \
0               4942365                   155254      66715     127936   
1               4800065                    75867      75504     101676   
2               3916824                    95009     456191     436802   
3               3340896                    72586      60427     117415   
4               2773745                    51992      86236     197621   
...                 ...                      ...        ...        ...   
9995              18856                     1180        105        575   
9996              12952                      395        303        551   
9997              10733                      374         11        111   
9998              11994                     1988        275       1002   
9999               9700                      364        117        345   

      ratings_3  ratings_4  ratings_5  \
0        560092    1481305    2706317   
1        455024    1156318    3011543   
2        793319     875073    1355439   
3        446835    1001952    1714267   
4        606158     936012     947718   
...         ...        ...        ...   
9995       3538       7860       6778   
9996       1737       3389       6972   
9997       1191       4240       5180   
9998       3765       4577       2375   
9999       2031       4138       3069   

                                              image_url  \
0     https://images.gr-assets.com/books/1447303603m...   
1     https://images.gr-assets.com/books/1474154022m...   
2     https://images.gr-assets.com/books/1361039443m...   
3     https://images.gr-assets.com/books/1361975680m...   
4     https://images.gr-assets.com/books/1490528560m...   
...                                                 ...   
9995  https://images.gr-assets.com/books/1307445460m...   
9996  https://s.gr-assets.com/assets/nophoto/book/11...   
9997  https://images.gr-assets.com/books/1455373531m...   
9998  https://images.gr-assets.com/books/1279214118m...   
9999  https://images.gr-assets.com/books/1403194704m...   

                                        small_image_url  
0     https://images.gr-assets.com/books/1447303603s...  
1     https://images.gr-assets.com/books/1474154022s...  
2     https://images.gr-assets.com/books/1361039443s...  
3     https://images.gr-assets.com/books/1361975680s...  
4     https://images.gr-assets.com/books/1490528560s...  
...                                                 ...  
9995  https://images.gr-assets.com/books/1307445460s...  
9996  https://s.gr-assets.com/assets/nophoto/book/50...  
9997  https://images.gr-assets.com/books/1455373531s...  
9998  https://images.gr-assets.com/books/1279214118s...  
9999  https://images.gr-assets.com/books/1403194704s...  

[9979 rows x 23 columns]

【data2 = df1["original_publication_year"]】
0       2008.0
1       1997.0
2       2005.0
3       1960.0
4       1925.0
         ...  
9995    2010.0
9996    1990.0
9997    1977.0
9998    2011.0
9999    1998.0
Name: original_publication_year, Length: 9979, dtype: float64
**(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)**
0       True
1       True
2       True
3       True
4       True
        ... 
9995    True
9996    True
9997    True
9998    True
9999    True
Name: original_publication_year, Length: 9979, dtype: bool
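
Printing `data1 == data2` gives one boolean per row, so the truncated output does not show whether every row matched. As a small sketch on an invented toy Series, the comparison can be reduced to a single answer with `.all()` or `Series.equals`:

```python
import numpy as np
import pandas as pd

# Toy Series with one missing year; the values are invented for illustration.
years = pd.Series([2008.0, np.nan, 1997.0], name="original_publication_year")

data1 = years.dropna()               # method 1: drop missing entries
data2 = years[pd.notnull(years)]     # method 2: keep entries selected by a boolean mask

elementwise = data1 == data2         # boolean Series, one entry per remaining row
all_equal = bool(elementwise.all())  # True only if every entry matched
same = data1.equals(data2)           # compares values and index in a single call

print(all_equal, same)
```

Note that `==` requires both Series to carry identical index labels, whereas `equals` simply returns `False` when they differ.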
# (2) Preliminaries, part 2
import pandas as pd

df = pd.read_csv('./code2/books.csv')



df1 = df[ pd.notnull(df['original_publication_year']) ]         # drop the rows whose publication year is NaN
print('\n【df1】')
print(df1)



grouped1 = df1.groupby( by=df1['original_publication_year'] )    # group by year (df1 already has the NaN years removed)
print('\n【grouped1】')
print( grouped1 )








# case 1: count
grouped2 = grouped1.count()
print('\n【Count: grouped2 = grouped1.count()】')
print(grouped2)

grouped3 = grouped2['average_rating']            # the average_rating column has no missing values, so use it for the count
print('\n【Count: grouped3 = grouped2["average_rating"]】')
print(grouped3)
print('**(1)**'*10)







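# case 2: mean
grouped4 = grouped1.mean(numeric_only=True)      # numeric_only=True skips text columns, which recent pandas versions refuse to average
print('\n【Mean: grouped4 = grouped1.mean()】')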
print(grouped4)


grouped5 = grouped4['average_rating']
print('\n【Mean: grouped5 = grouped4["average_rating"]】')
print(grouped5)
print('**(2)**'*10)









# case3:求和
grouped6 = grouped1.sum()
print('\n【求和:grouped6 = grouped1.sum()】')
print(grouped6)

grouped7 = grouped6['average_rating']
print('\n【求和:grouped7 = grouped6["average_rating"]】')
print(grouped7)
【df1】
         id  book_id  best_book_id   work_id  books_count        isbn  \
0         1  2767052       2767052   2792775          272   439023483   
1         2        3             3   4640799          491   439554934   
2         3    41865         41865   3212258          226   316015849   
3         4     2657          2657   3275794          487    61120081   
4         5     4671          4671    245494         1356   743273567   
...     ...      ...           ...       ...          ...         ...   
9995   9996  7130616       7130616   7392860           19   441019455   
9996   9997   208324        208324   1084709           19  067973371X   
9997   9998    77431         77431   2393986           60  039330762X   
9998   9999  8565083       8565083  13433613            7    61711527   
9999  10000     8914          8914     11817           31   375700455   

            isbn13                      authors  original_publication_year  \
0     9.780439e+12              Suzanne Collins                     2008.0   
1     9.780440e+12  J.K. Rowling, Mary GrandPré                     1997.0   
2     9.780316e+12              Stephenie Meyer                     2005.0   
3     9.780061e+12                   Harper Lee                     1960.0   
4     9.780743e+12          F. Scott Fitzgerald                     1925.0   
...            ...                          ...                        ...   
9995  9.780441e+12                Ilona Andrews                     2010.0   
9996  9.780680e+12               Robert A. Caro                     1990.0   
9997  9.780393e+12              Patrick O'Brian                     1977.0   
9998  9.780062e+12              Peggy Orenstein                     2011.0   
9999  9.780376e+12                  John Keegan                     1998.0   

                                         original_title  ... ratings_count  \
0                                      The Hunger Games  ...       4780653   
1              Harry Potter and the Philosopher's Stone  ...       4602479   
2                                              Twilight  ...       3866839   
3                                 To Kill a Mockingbird  ...       3198671   
4                                      The Great Gatsby  ...       2683664   
...                                                 ...  ...           ...   
9995                                         Bayou Moon  ...         17204   
9996                                   Means of Ascent   ...         12582   
9997                              The Mauritius Command  ...          9421   
9998  Cinderella Ate My Daughter: Dispatches from th...  ...         11279   
9999                                The First World War  ...          9162   

     work_ratings_count  work_text_reviews_count  ratings_1  ratings_2  \
0               4942365                   155254      66715     127936   
1               4800065                    75867      75504     101676   
2               3916824                    95009     456191     436802   
3               3340896                    72586      60427     117415   
4               2773745                    51992      86236     197621   
...                 ...                      ...        ...        ...   
9995              18856                     1180        105        575   
9996              12952                      395        303        551   
9997              10733                      374         11        111   
9998              11994                     1988        275       1002   
9999               9700                      364        117        345   

      ratings_3  ratings_4  ratings_5  \
0        560092    1481305    2706317   
1        455024    1156318    3011543   
2        793319     875073    1355439   
3        446835    1001952    1714267   
4        606158     936012     947718   
...         ...        ...        ...   
9995       3538       7860       6778   
9996       1737       3389       6972   
9997       1191       4240       5180   
9998       3765       4577       2375   
9999       2031       4138       3069   

                                              image_url  \
0     https://images.gr-assets.com/books/1447303603m...   
1     https://images.gr-assets.com/books/1474154022m...   
2     https://images.gr-assets.com/books/1361039443m...   
3     https://images.gr-assets.com/books/1361975680m...   
4     https://images.gr-assets.com/books/1490528560m...   
...                                                 ...   
9995  https://images.gr-assets.com/books/1307445460m...   
9996  https://s.gr-assets.com/assets/nophoto/book/11...   
9997  https://images.gr-assets.com/books/1455373531m...   
9998  https://images.gr-assets.com/books/1279214118m...   
9999  https://images.gr-assets.com/books/1403194704m...   

                                        small_image_url  
0     https://images.gr-assets.com/books/1447303603s...  
1     https://images.gr-assets.com/books/1474154022s...  
2     https://images.gr-assets.com/books/1361039443s...  
3     https://images.gr-assets.com/books/1361975680s...  
4     https://images.gr-assets.com/books/1490528560s...  
...                                                 ...  
9995  https://images.gr-assets.com/books/1307445460s...  
9996  https://s.gr-assets.com/assets/nophoto/book/50...  
9997  https://images.gr-assets.com/books/1455373531s...  
9998  https://images.gr-assets.com/books/1279214118s...  
9999  https://images.gr-assets.com/books/1403194704s...  

[9979 rows x 23 columns]

【grouped1】
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000002927AE79C70>

【count: grouped2 = grouped1.count()】
                            id  book_id  best_book_id  work_id  books_count  \
original_publication_year                                                     
-1750.0                      1        1             1        1            1   
-762.0                       1        1             1        1            1   
-750.0                       2        2             2        2            2   
-720.0                       1        1             1        1            1   
-560.0                       1        1             1        1            1   
...                        ...      ...           ...      ...          ...   
 2013.0                    518      518           518      518          518   
 2014.0                    437      437           437      437          437   
 2015.0                    306      306           306      306          306   
 2016.0                    198      198           198      198          198   
 2017.0                     11       11            11       11           11   

                           isbn  isbn13  authors  original_title  title  ...  \
original_publication_year                                                ...   
-1750.0                       1       1        1               1      1  ...   
-762.0                        1       1        1               1      1  ...   
-750.0                        2       2        2               2      2  ...   
-720.0                        1       1        1               1      1  ...   
-560.0                        1       1        1               1      1  ...   
...                         ...     ...      ...             ...    ...  ...   
 2013.0                     386     408      518             434    518  ...   
 2014.0                     347     358      437             391    437  ...   
 2015.0                     241     241      306             259    306  ...   
 2016.0                     146     147      198             173    198  ...   
 2017.0                       9       9       11               9     11  ...   

                           ratings_count  work_ratings_count  \
original_publication_year                                      
-1750.0                                1                   1   
-762.0                                 1                   1   
-750.0                                 2                   2   
-720.0                                 1                   1   
-560.0                                 1                   1   
...                                  ...                 ...   
 2013.0                              518                 518   
 2014.0                              437                 437   
 2015.0                              306                 306   
 2016.0                              198                 198   
 2017.0                               11                  11   

                           work_text_reviews_count  ratings_1  ratings_2  \
original_publication_year                                                  
-1750.0                                          1          1          1   
-762.0                                           1          1          1   
-750.0                                           2          2          2   
-720.0                                           1          1          1   
-560.0                                           1          1          1   
...                                            ...        ...        ...   
 2013.0                                        518        518        518   
 2014.0                                        437        437        437   
 2015.0                                        306        306        306   
 2016.0                                        198        198        198   
 2017.0                                         11         11         11   

                           ratings_3  ratings_4  ratings_5  image_url  \
original_publication_year                                               
-1750.0                            1          1          1          1   
-762.0                             1          1          1          1   
-750.0                             2          2          2          2   
-720.0                             1          1          1          1   
-560.0                             1          1          1          1   
...                              ...        ...        ...        ...   
 2013.0                          518        518        518        518   
 2014.0                          437        437        437        437   
 2015.0                          306        306        306        306   
 2016.0                          198        198        198        198   
 2017.0                           11         11         11         11   

                           small_image_url  
original_publication_year                   
-1750.0                                  1  
-762.0                                   1  
-750.0                                   2  
-720.0                                   1  
-560.0                                   1  
...                                    ...  
 2013.0                                518  
 2014.0                                437  
 2015.0                                306  
 2016.0                                198  
 2017.0                                 11  

[293 rows x 22 columns]

【count: grouped3 = grouped2["average_rating"] 】
original_publication_year
-1750.0      1
-762.0       1
-750.0       2
-720.0       1
-560.0       1
          ... 
 2013.0    518
 2014.0    437
 2015.0    306
 2016.0    198
 2017.0     11
Name: average_rating, Length: 293, dtype: int64
**(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)****(1)**

【mean: grouped4 = grouped1.mean()】
                                    id       book_id  best_book_id  \
original_publication_year                                            
-1750.0                    2076.000000  1.935100e+04  1.935100e+04   
-762.0                     2142.000000  1.375000e+03  1.375000e+03   
-750.0                     3253.500000  2.678300e+05  2.678300e+05   
-720.0                       79.000000  1.381000e+03  1.381000e+03   
-560.0                     1120.000000  2.134800e+04  2.134800e+04   
...                                ...           ...           ...   
 2013.0                    5319.747104  1.633865e+07  1.659865e+07   
 2014.0                    5705.828375  1.919866e+07  1.941922e+07   
 2015.0                    5755.503268  2.268179e+07  2.281951e+07   
 2016.0                    5971.434343  2.639991e+07  2.656994e+07   
 2017.0                    8043.636364  2.925244e+07  2.925244e+07   

                                work_id  books_count        isbn13  \
original_publication_year                                            
-1750.0                    3.802528e+06   266.000000  9.780141e+12   
-762.0                     1.474309e+06   255.000000  9.780148e+12   
-750.0                     1.907469e+06   933.000000  9.780416e+12   
-720.0                     3.356006e+06  1703.000000  9.780143e+12   
-560.0                     8.682630e+05   942.000000  9.780193e+12   
...                                 ...          ...           ...   
 2013.0                    2.206115e+07    31.816602  9.663512e+12   
 2014.0                    2.927728e+07    31.299771  9.742640e+12   
 2015.0                    3.989789e+07    29.830065  9.780866e+12   
 2016.0                    4.410693e+07    26.964646  9.780837e+12   
 2017.0                    4.663936e+07    28.727273  9.780937e+12   

                           average_rating  ratings_count  work_ratings_count  \
original_publication_year                                                      
-1750.0                          3.630000   44345.000000        55856.000000   
-762.0                           4.030000   47825.000000        51098.000000   
-750.0                           4.005000  126934.500000       144132.500000   
-720.0                           3.730000  670326.000000       710757.000000   
-560.0                           4.050000   88508.000000        98962.000000   
...                                   ...            ...                 ...   
 2013.0                          4.012297   35561.127413        42097.054054   
 2014.0                          3.985378   29191.851259        35508.691076   
 2015.0                          3.954641   29524.398693        37084.562092   
 2016.0                          4.027576   26676.030303        33942.565657   
 2017.0                          4.100909   25181.090909        32611.545455   

                           work_text_reviews_count     ratings_1  \
original_publication_year                                          
-1750.0                                2247.000000   1551.000000   
-762.0                                  537.000000    916.000000   
-750.0                                 2519.000000   3939.500000   
-720.0                                 8101.000000  29703.000000   
-560.0                                 1441.000000    773.000000   
...                                            ...           ...   
 2013.0                                4063.857143    862.785714   
 2014.0                                3914.050343    718.244851   
 2015.0                                4358.232026    778.522876   
 2016.0                                4365.611111    648.661616   
 2017.0                                5774.909091    487.363636   

                              ratings_2      ratings_3      ratings_4  \
original_publication_year                                               
-1750.0                     5850.000000   17627.000000   17485.000000   
-762.0                      2608.000000   10439.000000   17404.000000   
-750.0                     10722.000000   35746.500000   46807.000000   
-720.0                     65629.000000  183082.000000  224120.000000   
-560.0                      3717.000000   22587.000000   34885.000000   
...                                 ...            ...            ...   
 2013.0                     2163.617761    7917.409266   14859.515444   
 2014.0                     1782.915332    6655.775744   12622.457666   
 2015.0                     1975.650327    7325.369281   13478.388889   
 2016.0                     1758.944444    6606.020202   12147.893939   
 2017.0                     1354.272727    5118.090909   11250.909091   

                               ratings_5  
original_publication_year                 
-1750.0                     13343.000000  
-762.0                      19731.000000  
-750.0                      46917.500000  
-720.0                     208223.000000  
-560.0                      37000.000000  
...                                  ...  
 2013.0                     16293.725869  
 2014.0                     13729.297483  
 2015.0                     13526.630719  
 2016.0                     12781.045455  
 2017.0                     14400.909091  

[293 rows x 15 columns]

【mean: grouped5 = grouped4["average_rating"]】
original_publication_year
-1750.0    3.630000
-762.0     4.030000
-750.0     4.005000
-720.0     3.730000
-560.0     4.050000
             ...   
 2013.0    4.012297
 2014.0    3.985378
 2015.0    3.954641
 2016.0    4.027576
 2017.0    4.100909
Name: average_rating, Length: 293, dtype: float64
**(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)****(2)**

【sum: grouped6 = grouped1.sum()】
                                id     book_id  best_book_id      work_id  \
original_publication_year                                                   
-1750.0                       2076       19351         19351      3802528   
-762.0                        2142        1375          1375      1474309   
-750.0                        6507      535660        535660      3814938   
-720.0                          79        1381          1381      3356006   
-560.0                        1120       21348         21348       868263   
...                            ...         ...           ...          ...   
 2013.0                    2755629  8463418354    8598098696  11427677552   
 2014.0                    2493447  8389815459    8486197677  12794170667   
 2015.0                    1761184  6940628219    6982769611  12208753377   
 2016.0                    1182344  5227181248    5260848297   8733172252   
 2017.0                      88480   321776859     321776859    513032910   

                           books_count        isbn13  average_rating  \
original_publication_year                                              
-1750.0                            266  9.780141e+12            3.63   
-762.0                             255  9.780148e+12            4.03   
-750.0                            1866  1.956083e+13            8.01   
-720.0                            1703  9.780143e+12            3.73   
-560.0                             942  9.780193e+12            4.05   
...                                ...           ...             ...   
 2013.0                          16481  3.942713e+15         2078.37   
 2014.0                          13678  3.487865e+15         1741.61   
 2015.0                           9128  2.357189e+15         1210.12   
 2016.0                           5339  1.437783e+15          797.46   
 2017.0                            316  8.802843e+13           45.11   

                           ratings_count  work_ratings_count  \
original_publication_year                                      
-1750.0                            44345               55856   
-762.0                             47825               51098   
-750.0                            253869              288265   
-720.0                            670326              710757   
-560.0                             88508               98962   
...                                  ...                 ...   
 2013.0                         18420664            21806274   
 2014.0                         12756839            15517298   
 2015.0                          9034466            11347876   
 2016.0                          5281854             6720628   
 2017.0                           276992              358727   

                           work_text_reviews_count  ratings_1  ratings_2  \
original_publication_year                                                  
-1750.0                                       2247       1551       5850   
-762.0                                         537        916       2608   
-750.0                                        5038       7879      21444   
-720.0                                        8101      29703      65629   
-560.0                                        1441        773       3717   
...                                            ...        ...        ...   
 2013.0                                    2105078     446923    1120754   
 2014.0                                    1710440     313873     779134   
 2015.0                                    1333619     238228     604549   
 2016.0                                     864391     128435     348271   
 2017.0                                      63524       5361      14897   

                           ratings_3  ratings_4  ratings_5  
original_publication_year                                   
-1750.0                        17627      17485      13343  
-762.0                         10439      17404      19731  
-750.0                         71493      93614      93835  
-720.0                        183082     224120     208223  
-560.0                         22587      34885      37000  
...                              ...        ...        ...  
 2013.0                      4101218    7697229    8440150  
 2014.0                      2908574    5516014    5999703  
 2015.0                      2241563    4124387    4139149  
 2016.0                      1307992    2405283    2530647  
 2017.0                        56299     123760     158410  

[293 rows x 15 columns]

【sum: grouped7 = grouped6["average_rating"]】
original_publication_year
-1750.0       3.63
-762.0        4.03
-750.0        8.01
-720.0        3.73
-560.0        4.05
            ...   
 2013.0    2078.37
 2014.0    1741.61
 2015.0    1210.12
 2016.0     797.46
 2017.0      45.11
Name: average_rating, Length: 293, dtype: float64
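
The three cases above aggregate every column and then pick `average_rating`. As a rough sketch of an alternative (the `toy` frame and its values are made up; only the column names come from books.csv), the column can be selected first and the three aggregates computed in one `agg` call:

```python
import pandas as pd

# Hypothetical toy data; column names mirror books.csv but the values are made up
toy = pd.DataFrame({
    "original_publication_year": [2000.0, 2000.0, 2001.0],
    "title": ["A", "B", "C"],
    "average_rating": [4.0, 3.0, 5.0],
})

# Select the column first, then compute several aggregates at once
summary = (
    toy.groupby("original_publication_year")["average_rating"]
       .agg(["count", "mean", "sum"])
)
print(summary)

# Same mean as aggregating the whole frame and picking the column afterwards
whole_frame = toy.groupby("original_publication_year").mean(numeric_only=True)["average_rating"]
print((summary["mean"] == whole_frame).all())
```

Selecting the column before aggregating avoids computing statistics for columns that are never used, and it sidesteps the question of how text columns should be averaged.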
# (3) Number of books per publication year: line chart and bar chart
import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv('./code2/books.csv')



# step 1: drop the rows whose publication year is NaN
df1 = df[ pd.notnull(df['original_publication_year']) ]          # pd.notnull() returns a boolean mask



# step 2: group by publication year and count
grouped = df1.groupby( by=df1['original_publication_year'] ).count()
data1 = grouped['title']                   # count() counts the non-null values of each column, so pick a column without missing values; 'title' is one



# step 3: plot

# Method 1: line chart
_x = data1.index          # data1.index: the years
_y = data1.values         # data1.values: the number of books per year

plt.figure(figsize=(20,8), dpi=80)
plt.plot(range(len(_x)), _y)
plt.xticks(list(range(len(_x)))[::20], _x[::20].astype('int'))   # [::20] keeps every 20th tick; astype('int') turns the float years into integers
plt.show()



# Method 2: bar chart
print(data1)
data2 = data1.sort_values(ascending=False)              # sort_values() sorts ascending by default (ascending=True), so request descending order
print(data2)
data3 = data2.loc[:2000]        # label-based slice: keep everything from the top row down to the row labelled 2000 (here the 16 years with the most books, 2000 to 2015)
print(data3)


_x = data3.index
_y = data3.values

plt.figure(figsize=(20,8), dpi=80)
plt.bar(range(len(_x)), _y, width=0.3)
plt.xticks(range(len(_x)), _x.astype('int')) 
plt.show()


original_publication_year
-1750.0      1
-762.0       1
-750.0       2
-720.0       1
-560.0       1
          ... 
 2013.0    518
 2014.0    437
 2015.0    306
 2016.0    198
 2017.0     11
Name: title, Length: 293, dtype: int64
original_publication_year
 2012.0    568
 2011.0    556
 2013.0    518
 2010.0    473
 2014.0    437
          ... 
 1749.0      1
 1759.0      1
 1762.0      1
 1764.0      1
-1750.0      1
Name: title, Length: 293, dtype: int64
original_publication_year
2012.0    568
2011.0    556
2013.0    518
2010.0    473
2014.0    437
2009.0    432
2008.0    383
2007.0    363
2006.0    362
2005.0    326
2004.0    307
2015.0    306
2003.0    288
2001.0    226
2002.0    225
2000.0    209
Name: title, dtype: int64
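
The slice up to the label 2000 depends on where that label happens to land after sorting by count. If the intent is simply "years from 2000 onwards", a boolean mask on the index may express it more directly. The sketch below uses made-up counts (the `counts` Series is hypothetical); note that, unlike the slice, it also keeps recent years with few books:

```python
import pandas as pd

# Hypothetical counts per year; the numbers are made up for illustration
counts = pd.Series(
    [1, 209, 568, 11],
    index=pd.Index([1764.0, 2000.0, 2012.0, 2017.0], name="original_publication_year"),
    name="title",
)

# Keep the years from 2000 onwards, then sort by count, largest first
recent = counts[counts.index >= 2000].sort_values(ascending=False)
print(recent)
```

Which variant is preferable depends on the chart: the mask selects by year, the slice in the script selects by rank.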


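# (4) Average rating per publication year: line chart (shows how the rating changes over time)
'''
average_rating: the average rating
(1) average_rating is the mean of the ratings a single book received
(2) after grouping by year, these per-book averages are averaged again within each year

data1 = df1.groupby( by=df1['original_publication_year'] )['average_rating'].mean()    group by year first, then average the ratings
(1) The preferred order is: groupby first, then select the rating column, then call mean()
'''
import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv('./code2/books.csv')



# step 1: drop the rows whose publication year is missing
df1 = df[ pd.notnull(df['original_publication_year']) ]       # pd.notnull() returns a boolean mask



# step 2: group by year, then take the mean rating of each group
data1 = df1.groupby( by=df1['original_publication_year'] )['average_rating'].mean()    # group by df1's own column, not the unfiltered df
print(data1)



# step 3: draw the line chart (the years are not contiguous, so the points are plotted by position)
_x = data1.index
_y = data1.values

plt.figure(figsize=(20,8), dpi=80)
plt.plot(range(len(_x)), _y)
plt.xticks(list(range(len(_x)))[::10], _x[::10].astype('int'), rotation=45)     # [::10] keeps every 10th tick; astype('int') turns the float years into integers
plt.show()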
original_publication_year
-1750.0    3.630000
-762.0     4.030000
-750.0     4.005000
-720.0     3.730000
-560.0     4.050000
             ...   
 2013.0    4.012297
 2014.0    3.985378
 2015.0    3.954641
 2016.0    4.027576
 2017.0    4.100909
Name: average_rating, Length: 293, dtype: float64
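
Because the points are plotted by position rather than by year, the tick positions and the tick labels passed to `plt.xticks` must be thinned with the same step. A small sketch with a made-up index (the `years` values and `step` are hypothetical) shows how the two sequences pair up:

```python
import pandas as pd

# Hypothetical year index; in the script this is data1.index
years = pd.Index([-750.0, 1925.0, 1960.0, 1997.0, 2005.0, 2008.0])
step = 2

positions = list(range(len(years)))[::step]   # where the ticks go on the x axis
labels = years[::step].astype('int')          # what each tick displays

print(positions)       # [0, 2, 4]
print(list(labels))    # [-750, 1960, 2005]
```

Using one step for both keeps each label aligned with the point it describes; different steps would still draw a chart, only with misleading labels.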
