Exploratory Data Analysis of Bestselling Books with Python

亚图跨际

Last modified 2022-04-24 16:47:50

First published 2022-03-06 09:46:51

Article link: https://blog.csdn.net/jiyotin/article/details/123305959

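Exploratory Data Analysis

Exploratory data analysis (EDA) is an approach to analyzing and investigating a dataset in order to understand the characteristics of its data.

Dataset

The sample dataset records the titles and authors of bestselling books sold from 2009 to 2019. Besides title and author, each record also holds the user rating, number of reviews, price, year, and genre. Summary statistics for the three numeric columns are: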

	Price	Reviews	User Rating
Count	550	550	550
Mean	13.1	11953.28	4.61
Std	10.84	11731.13	0.22
Min	0	37	3.3
25%	7	4058	4.5
50%	11	8580	4.7
75%	16	17253.25	4.8
Max	105	87841	4.9
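
The row labels in the table above (count, mean, std, min, quartiles, max) match what pandas' `DataFrame.describe()` reports, so a summary like this was probably produced that way. The sketch below shows the call on a tiny, made-up sample. The `sample` frame and its values are illustrative assumptions, not rows from the real dataset.

```python
import pandas as pd

# Hypothetical miniature sample; the values are invented for illustration
# and are not taken from the bestsellers dataset.
sample = pd.DataFrame({
    "Price": [8, 22, 15, 6, 11],
    "Reviews": [17350, 2052, 18979, 21424, 7665],
    "User Rating": [4.7, 4.6, 4.7, 4.7, 4.8],
})

# describe() returns count, mean, std, min, the 25%/50%/75% quartiles,
# and max for every numeric column; std is the sample standard deviation.
summary = sample.describe().round(2)
print(summary)
```

On the real data, the same call on the full DataFrame would summarize all 550 rows at once.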

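Analyzing the Dataset with Python

Behavior of the Numeric Data

Summary: use Python to explore the review counts and user ratings of the bestselling books, displayed as histograms.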
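
A minimal sketch of the histograms mentioned in the summary, assuming matplotlib is used for plotting. The `demo` frame is synthetic stand-in data: it only borrows the row count and the min/max bounds from the summary table and draws values uniformly, so its shape does not reflect the real distributions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real DataFrame (an assumption for illustration).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Reviews": rng.integers(37, 87842, size=550),
    "User Rating": rng.uniform(3.3, 4.9, size=550).round(1),
})

# One histogram per numeric column, side by side.
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
for ax, column in zip(axes, ["Reviews", "User Rating"]):
    ax.hist(demo[column], bins=20)
    ax.set(title=f"Distribution of {column}", xlabel=column, ylabel="Number of books")
fig.tight_layout()
```

With the real data, replacing `demo` with the loaded DataFrame should be enough; in an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()`.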

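# Distribution of bestselling books by genre (Fiction / Non Fiction) per year.
# Assumes df is the bestsellers DataFrame (columns Name, Genre, Year) loaded earlier.
import matplotlib.pyplot as plt
import seaborn as sns

# Count the distinct titles for each genre/year pair.
df_books = df.groupby(['Genre', 'Year']).agg({'Name': 'nunique'}).reset_index()

# Set the figure size before plotting so that it applies to this figure.
sns.set(rc={'figure.figsize': (15, 9)})
ax = sns.barplot(x="Year", y="Name", hue='Genre', data=df_books)
ax.set(xlabel='Year', ylabel='Total Books')

# autolabel is a helper that is not defined in this excerpt; it presumably writes each count above its bar.
autolabel(ax.patches, labels=df_books.Name, height_factor=1.02)
ax.legend(loc=1, bbox_to_anchor=(1.0, 1.1))

plt.title('Distribution of Total Books by Genre per Year')  # Set the title
plt.show()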
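
The `autolabel` helper called above is not shown in the article. The sketch below is one hypothetical reconstruction, inferred only from the call's arguments (`patches`, `labels`, `height_factor`); the author's version may differ. The extra `ax` parameter and the demo counts are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def autolabel(patches, labels, height_factor=1.02, ax=None):
    """Hypothetical reconstruction: write one label above each bar patch."""
    ax = ax or plt.gca()
    texts = []
    # Patches and labels are paired by position, so both must be in the same order.
    for patch, label in zip(patches, labels):
        x = patch.get_x() + patch.get_width() / 2  # horizontal center of the bar
        y = patch.get_height() * height_factor     # slightly above the bar top
        texts.append(ax.text(x, y, str(label), ha="center", va="bottom"))
    return texts


# Demo with made-up counts (not figures from the dataset).
fig, ax = plt.subplots()
bars = ax.bar(["2009", "2010", "2011"], [24, 20, 21])
labels = autolabel(bars.patches, labels=[24, 20, 21], ax=ax)
```

On matplotlib 3.4 or newer, calling `ax.bar_label(container)` for each entry in `ax.containers` is a built-in alternative that avoids having to keep labels and patches in matching order.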