Visualizing with Python: A Look at the Writing Patterns of Top Bloggers

Latest recommended article published on 2024-05-02 21:49:49

__HelloWorld__

Categories: General, Python, Front End. Tags: Python, visualization, Matplotlib, CSDN blog

Article link: https://blog.csdn.net/kangkanglou/article/details/78796123


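In the earlier article "Scraping CSDN Blog Articles with BeautifulSoup", we used BeautifulSoup to scrape a CSDN blog's visit statistics and its article list. With that data in hand we can do something more fun, such as looking for patterns in how the top-ranked bloggers write. We start with the time dimension: how many articles did these "blog gurus" publish each year?
We define a dict that maps each publication year to the number of articles published in that year, and modify walk_tree as follows: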

# Maps publication year -> number of articles published in that year.
article_dict = {}


def walk_tree_c(html, num):
    # CSDN_BLOG_URL is assumed to be defined as in the earlier scraping article.
    for li in html.find_all("li"):
        num = num + 1
        print("%s %s %s%s" % (num, li.h3.a.string, CSDN_BLOG_URL, li.h3.a["href"]))
        for d in li.find_all("div"):
            # Only the "unit-control" div carries the article metadata.
            if "class" in d.attrs and str.strip(d["class"][0]) == "unit-control":
                print(d.div.find_all("div")[0].string + ", published: " + d.div.find_all("div")[1].string + ", views: " +
                      d.div.find_all("div")[2].span.string + ", comments: " + d.div.find_all("div")[3].span.string)
                # The publication time starts with a four-digit year.
                t_value = d.div.find_all("div")[1].string
                year = int(str.strip(t_value)[0:4])
                if article_dict.get(year, 0) == 0:
                    article_dict[year] = 1
                else:
                    article_dict[year] = article_dict[year] + 1

    return num

Handling the article year:

t_value = d.div.find_all("div")[1].string
year = int(str.strip(t_value)[0:4])
if article_dict.get(year, 0) == 0:
    article_dict[year] = 1
else:
    article_dict[year] = article_dict[year] + 1
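
The same tally can be written more compactly with `collections.Counter`. The following is only a minimal sketch, not the code used in this post: `count_by_year` and the sample timestamps are hypothetical, and it assumes that each timestamp string starts with a four-digit year, just as the slicing above does.

```python
from collections import Counter


def count_by_year(timestamps):
    """Count how many timestamp strings fall in each year.

    Assumes every string starts with a four-digit year once surrounding
    whitespace is stripped, for example "2017-12-13 21:05".
    """
    return Counter(int(t.strip()[0:4]) for t in timestamps)


# Hypothetical sample data standing in for the scraped publication times.
sample_times = [" 2015-03-01 10:00 ", "2016-07-19 08:30",
                "2016-11-02 22:15", "2017-12-13 21:05"]
year_counts = count_by_year(sample_times)
print(dict(year_counts))
```

A `Counter` is a dict subclass, so it can be sorted and plotted in the same way as `article_dict`.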

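Once we have the number of articles for each publication year, we use matplotlib.pyplot to plot it. A dict does not keep its keys sorted by year (Python 3.7+ iterates in insertion order, older versions in arbitrary order), so before plotting we first turn article_dict into two lists ordered by year: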

year_list = []
num_list = []
for key in article_dict:
    year_list.append(key)
year_list.sort()
for y in year_list:
    num_list.append(article_dict[y])

print(article_dict)
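
The two loops above can also be collapsed into a single sort over the dict's items. This is a sketch under stated assumptions: the `article_dict` below is hand-made and its counts are invented purely for illustration.

```python
# Hypothetical counts, inserted out of order on purpose.
article_dict = {2016: 12, 2014: 3, 2017: 7, 2015: 9}

# sorted() orders the (year, count) pairs by year;
# zip(*...) then splits them into two parallel sequences.
year_list, num_list = (list(t) for t in zip(*sorted(article_dict.items())))

print(year_list)  # [2014, 2015, 2016, 2017]
print(num_list)   # [3, 9, 12, 7]
```

Note that the unpacking fails on an empty dict, so the explicit loops are the safer choice if the scrape might return nothing.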

Then we plot the result with matplotlib.pyplot:

import matplotlib.pyplot as plt

# Line chart of articles per year, with one x-axis tick per year.
plt.plot(year_list, num_list)
plt.xlabel("year")
plt.ylabel("number")
plt.xticks(year_list)
plt.title("CSDN Blog Statistics")
plt.show()
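
One thing to watch for: a year in which the blogger published nothing never appears in `article_dict`, so `plt.plot` simply draws a straight segment across the gap. If that matters, the series can be padded with zeros before plotting. A minimal sketch follows; `fill_missing_years` is a hypothetical helper and the counts are invented, neither is part of the original script.

```python
def fill_missing_years(counts):
    """Return (years, nums) for every year from the earliest to the latest key.

    Years that are absent from `counts` get a count of 0.
    """
    if not counts:
        return [], []
    years = list(range(min(counts), max(counts) + 1))
    nums = [counts.get(y, 0) for y in years]
    return years, nums


# Hypothetical counts with no articles in 2015.
years, nums = fill_missing_years({2014: 3, 2016: 12, 2017: 7})
print(years)  # [2014, 2015, 2016, 2017]
print(nums)   # [3, 0, 12, 7]
```

The returned lists can be passed to `plt.plot` and `plt.xticks` in place of `year_list` and `num_list`.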

Results:

For example, for http://blog.csdn.net/phphot:

[Figure: articles published per year]

And for http://blog.csdn.net/littletigerat:

[Figure: articles published per year]
