Learning Python Web Crawlers (6): The Crawler Outputer

梦想周游全国的孩子

Article link: https://blog.csdn.net/qq_37163479/article/details/79194048

The Crawler Outputer

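The outputer's main job is to take the values extracted from the filtered downloads and write them out in whatever file format you need. You can also lay the results out as an HTML table (the table tag), which usually looks tidier.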
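As a rough illustration of the table layout mentioned above, here is a minimal sketch. It assumes each record is a dict with the keys url, title and summary (the keys the implementation in this post uses); render_table is a hypothetical helper name, not part of the original code, and the escaping step is an addition of this sketch.

```python
from html import escape


def render_table(records):
    """Return an HTML <table> string with one row per record."""
    rows = []
    for record in records:
        # Escape every value so characters such as < or & in scraped
        # text cannot break the generated markup.
        cells = "".join(
            "<td>%s</td>" % escape(str(record[key]))
            for key in ("url", "title", "summary")
        )
        rows.append("<tr>%s</tr>" % cells)
    return "<table>%s</table>" % "".join(rows)


# Made-up sample record, for illustration only.
sample = [{"url": "https://example.com/a", "title": "A & B", "summary": "x < y"}]
print(render_table(sample))
```

Building the string first and writing it afterwards keeps the rendering easy to test separately from file handling.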

Implementation

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
__author__ = 'Gary'

# Outputer: collects parsed records and writes them to an HTML file

class HtmlOutputer(object):
    def __init__(self):
        self.datas = []


    def collect_data(self, data):
        if data is None:
            return
        self.datas.append(data)

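    def output_html(self):
        # Write every collected record to output.html. The with-block
        # closes the file even if one of the writes raises.
        with open('output.html', 'w', encoding='utf-8') as fout:
            fout.write("<html>")
            # Declare the encoding so browsers render non-ASCII text correctly.
            fout.write('<head><meta charset="utf-8"></head>')
            fout.write("<body>")
            # fout.write("<table>")

            for data in self.datas:
                # Alternative layout: one table row per record.
                # fout.write("<tr>")
                # fout.write("<td>%s</td>" % data['url'])
                # fout.write("<td>%s</td>" % data['title'])
                # fout.write("<td>%s</td>" % data['summary'])
                # fout.write("</tr>")
                # Current layout: a link to the page followed by its summary.
                fout.write('<a href="%s">%s</a>' % (data['url'], data['title']))
                fout.write('<p>%s</p>' % data['summary'])

            # fout.write("</table>")
            fout.write("</body>")
            fout.write("</html>")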
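
How the outputer might be used can be sketched as follows. To keep the example runnable on its own, it restates the class in compact form; the path parameter, the escaping of values, the temporary directory and the sample record are all assumptions of this sketch, not part of the original code.

```python
import os
import tempfile
from html import escape


class HtmlOutputer(object):
    """Compact stand-in for the class above, so this example runs alone."""

    def __init__(self):
        self.datas = []

    def collect_data(self, data):
        if data is None:
            return
        self.datas.append(data)

    def output_html(self, path='output.html'):
        # `path` is a hypothetical parameter added for this sketch.
        with open(path, 'w', encoding='utf-8') as fout:
            fout.write('<html><body>')
            for data in self.datas:
                fout.write('<a href="%s">%s</a>' % (
                    escape(data['url']), escape(data['title'])))
                fout.write('<p>%s</p>' % escape(data['summary']))
            fout.write('</body></html>')


outputer = HtmlOutputer()
outputer.collect_data({'url': 'https://example.com/page',
                       'title': 'Example page',
                       'summary': 'A short summary.'})
outputer.collect_data(None)  # ignored: None adds no record

# Write into a temporary directory so the working directory stays clean.
out_path = os.path.join(tempfile.mkdtemp(), 'output.html')
outputer.output_html(out_path)
with open(out_path, encoding='utf-8') as fin:
    html_text = fin.read()
print(html_text)
```

In a full crawler, collect_data would presumably be called once per parsed page and output_html once at the end of the crawl.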