Structuring and Saving Data

Latest recommended article published 2023-08-08 10:50:25

weixin_34390105

Tags: data structures and algorithms, python

Original link: http://www.cnblogs.com/mimimi/p/8855098.html

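1. Save the body text of each news article to a text file.

```
def writeNewsDetail(content):
    # Append the article body to gzccNews.txt; the with-block closes the file even on error.
    with open('gzccNews.txt', 'a', encoding='utf-8') as f:
        f.write(content)
```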
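To see how append mode accumulates articles across calls, here is a minimal sketch. The `path` parameter, the newline separator, and the temporary demo file are assumptions added for illustration; the original function writes to a fixed file name and adds no separator.

```python
import os
import tempfile


def write_news_detail(content, path):
    """Append one article body to `path`, followed by a newline separator."""
    # Mode 'a' creates the file if it is missing and never truncates existing text.
    with open(path, 'a', encoding='utf-8') as f:
        f.write(content + '\n')


# Hypothetical demo file in a temporary directory, so nothing real is overwritten.
demo_path = os.path.join(tempfile.mkdtemp(), 'gzccNews_demo.txt')
write_news_detail('first article body', demo_path)
write_news_detail('second article body', demo_path)

with open(demo_path, encoding='utf-8') as f:
    saved_lines = f.read().splitlines()
print(saved_lines)
```

Without a separator, consecutive article bodies run together in the file, which makes them hard to split apart later.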

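2. Structure the news data as a list of dictionaries:

Details of a single news item --> dictionary news
```
news = {}
```
All news items from one list page --> list newsList, built with newsList.append(news)
```
newsList.append(getNewsDetail(newsUrl))
```
All news items from all list pages --> combined list newsTotal, built with newsTotal.extend(newsList)
```
newsTotal.extend(getListPage(firstPageUrl))
```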
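The three levels above (dictionary, per-page list, combined list) can be sketched end to end as follows. The post does not show the bodies of `getNewsDetail` and `getListPage`, so the versions here are hypothetical stand-ins that make no network requests; the `pages` mapping, the `newsUrls` parameter, and the dictionary keys are likewise assumptions.

```python
def getNewsDetail(newsUrl):
    # Hypothetical stand-in: a real version would download and parse newsUrl.
    news = {}
    news['url'] = newsUrl
    news['title'] = 'title for ' + newsUrl
    news['click'] = 0
    return news


def getListPage(listPageUrl, newsUrls):
    # Hypothetical stand-in: a real version would extract newsUrls from listPageUrl.
    newsList = []
    for newsUrl in newsUrls:
        newsList.append(getNewsDetail(newsUrl))  # one dictionary per news item
    return newsList


# Made-up list pages and the news links each one contains.
pages = {
    'page1': ['news/1', 'news/2'],
    'page2': ['news/3'],
}

newsTotal = []
for listPageUrl, newsUrls in pages.items():
    # extend() adds the dictionaries themselves, so newsTotal stays a flat list.
    newsTotal.extend(getListPage(listPageUrl, newsUrls))

print(len(newsTotal))
```

Using `append` in the last step instead of `extend` would produce a list of lists, which `pandas.DataFrame` would not turn into one row per news item.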

3. Install pandas and create a DataFrame object df with pandas.DataFrame(newsTotal).

```
import pandas

df = pandas.DataFrame(newsTotal)
```
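As a rough illustration of what the constructor does with a list of dictionaries, the sketch below uses made-up sample records (not scraped data). Each dictionary becomes a row, the keys become columns, and a key missing from one record shows up as a missing value.

```python
import pandas

# Made-up records standing in for newsTotal.
sample_news = [
    {'title': 'A', 'source': 'X', 'click': 10},
    {'title': 'B', 'source': 'Y', 'click': 20},
    {'title': 'C', 'click': 30},  # this record has no 'source' key
]

df_demo = pandas.DataFrame(sample_news)
print(df_demo.shape)
print(sorted(df_demo.columns))
# The missing 'source' of the third record is stored as a missing value (NaN).
print(df_demo['source'].isna().tolist())
```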

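4. Use df to save the extracted data to a CSV or Excel file.

```
# Writing .xlsx needs an Excel engine such as openpyxl to be installed.
df.to_excel("gzcc01.xlsx")
```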
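The step mentions CSV as well, so here is a minimal sketch of that route. The sample rows, the demo file name, and the choice of `utf-8-sig` (so spreadsheet programs detect the encoding of Chinese text) are assumptions, not taken from the original post.

```python
import os
import tempfile

import pandas

# Made-up rows standing in for the scraped news.
df_csv = pandas.DataFrame([
    {'title': '新闻一', 'source': '国际学院', 'click': 120},
    {'title': '新闻二', 'source': '学生工作处', 'click': 45},
])

# Hypothetical demo path in a temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), 'gzcc_demo.csv')

# index=False leaves out the row numbers; utf-8-sig writes a byte order mark.
df_csv.to_csv(csv_path, index=False, encoding='utf-8-sig')

# Read the file back to confirm the data survived the round trip.
df_back = pandas.read_csv(csv_path, encoding='utf-8-sig')
print(df_back)
```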

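5. Analyze the data with the functions and methods pandas provides:

Extract the first 6 rows of the click count, title, and source columns.

```
print(df[['click','title','source']][0:6])
```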
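The column-list-plus-slice pattern can also be written with `head` or `iloc`. The sketch below uses a made-up eight-row frame to show that the three forms select the same rows; the column values are placeholders.

```python
import pandas

# Made-up frame with eight rows and one extra column that is not selected.
df_head = pandas.DataFrame({
    'title': ['t%d' % i for i in range(8)],
    'source': ['s%d' % i for i in range(8)],
    'click': list(range(100, 900, 100)),
    'url': ['u%d' % i for i in range(8)],
})

columns = ['click', 'title', 'source']
by_slice = df_head[columns][0:6]    # the form used in the post
by_head = df_head[columns].head(6)  # same rows via head()
by_iloc = df_head[columns].iloc[:6] # same rows via positional indexing

print(by_slice)
```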

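Extract the news published by '学校综合办' (the school general office) whose click count exceeds 3000.
```
# Both conditions must hold; each comparison needs its own parentheses around the & operator.
print(df[(df['click'] > 3000) & (df['source'] == '学校综合办')])
```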
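Filtering on a publisher and a click threshold at once means combining two boolean masks. A minimal sketch with made-up rows follows; the titles and click values are placeholders.

```python
import pandas

# Made-up rows; only the first one satisfies both conditions.
df_filter = pandas.DataFrame({
    'title': ['n1', 'n2', 'n3', 'n4'],
    'source': ['学校综合办', '学校综合办', '国际学院', '学生工作处'],
    'click': [3500, 1200, 4000, 800],
})

by_office = df_filter['source'] == '学校综合办'  # True where the publisher matches
over_3000 = df_filter['click'] > 3000            # True where clicks exceed 3000

# & combines the masks element by element; Python's `and` does not work on Series.
result = df_filter[by_office & over_3000]
print(result)
```

If the scraper stored the click count as text, the comparison would raise a TypeError, so the column would first need converting, for example with `pandas.to_numeric`.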
Extract the news published by '国际学院' (International College) and '学生工作处' (Student Affairs Office).

```
dtSource = ['学生工作处','国际学院']
print(df[df['source'].isin(dtSource)])
```
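`isin` returns a boolean mask, so the same mask can be inverted with `~` to select everything else. The rows below are made up for illustration.

```python
import pandas

# Made-up rows from three different publishers.
df_src = pandas.DataFrame({
    'title': ['n1', 'n2', 'n3', 'n4'],
    'source': ['学生工作处', '学校综合办', '国际学院', '学校综合办'],
})

dt_source = ['学生工作处', '国际学院']
mask = df_src['source'].isin(dt_source)  # True where source is one of the listed values

selected = df_src[mask]   # news from the listed publishers
others = df_src[~mask]    # news from every other publisher

print(selected)
print(others)
```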

Reposted from: https://www.cnblogs.com/mimimi/p/8855098.html
