scrapy_TypeError: Object of type 'QiubaiItem' is not JSON serializable

Urila

Categories: Python problems and solutions, crawler problems, error notes, scrapy, crawler

Article link: https://blog.csdn.net/jss19940414/article/details/85315967

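Problem description:

While crawling Qiushibaike (糗事百科) with scrapy, the spider returns instances of the item class defined in the items file. In the pipeline file, where the items are persisted to disk, I tried to convert the data passed from the spider into a string with json and then write it to a file. That raised the error in the title (TypeError: Object of type 'QiubaiItem' is not JSON serializable), with this traceback: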

Traceback (most recent call last):
  File "e:\anaconda3\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "E:\Scrapy\Qiubai\Qiubai\pipelines.py", line 19, in process_item
    self.f.write(json.dumps(item, ensure_ascii=False) + ",\n")
  File "e:\anaconda3\lib\json\__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "e:\anaconda3\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "e:\anaconda3\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "e:\anaconda3\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)

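The code is as follows:

import json


class QiubaiPipeline(object):
    def open_spider(self, spider):
        # Open the output file once, when the spider starts.
        self.f = open("糗百.json", "a", encoding="utf-8")

    def process_item(self, item, spider):
        # This line raises the TypeError: item is a QiubaiItem, not a dict.
        self.f.write(json.dumps(item, ensure_ascii=False) + ",\n")
        print("保存成功")  # "saved successfully"

        return item

    def close_spider(self, spider):
        # Close the file when the spider finishes.
        self.f.close()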

Cause analysis:

Since json cannot convert the object, print its data type to see what is actually being passed in:

import json


class QiubaiPipeline(object):
    def open_spider(self, spider):
        self.f = open("糗百.json", "a", encoding="utf-8")

    def process_item(self, item, spider):
        print("process_item")
        print(type(item))
        self.f.write(json.dumps(item, ensure_ascii=False) + ",\n")
        print("保存成功")

        return item

    def close_spider(self, spider):
        self.f.close()

process_item
<class 'Qiubai.items.QiubaiItem'>
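
The printed type explains the failure: json.dumps only encodes built-in types such as dict, list, str and numbers, and a Scrapy item behaves like a mapping without being a dict. The sketch below reproduces this without Scrapy installed. FakeItem is a hypothetical stand-in written for illustration, not Scrapy's real Item class, and try_dumps is a helper invented for this example.

```python
import json
from collections.abc import MutableMapping


class FakeItem(MutableMapping):
    """Hypothetical stand-in for a Scrapy item: a mapping that is not a dict."""

    def __init__(self, **fields):
        self._values = dict(fields)

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def try_dumps(obj):
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(obj, ensure_ascii=False)
    except TypeError as exc:
        return "TypeError: " + str(exc)


item = FakeItem(author="someone", content="hello")
print(isinstance(item, dict))   # False: json does not treat it as a dict
print(try_dumps(item))          # the encoder rejects the object
print(try_dumps(dict(item)))    # a plain dict copy encodes normally
```

The mapping interface is enough for dict() to copy the fields out, which is what the fix further down relies on.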

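A quick search on Baidu turned up this:

The suggested fix is to define your own class that converts the data straight to a string whenever it encounters the type that triggers the error.

The solution I use here:

First convert the data passed from the spider into a dictionary with dict(), then persist it to disk with the following code.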
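
One plausible reading of that suggestion is a json.JSONEncoder subclass whose default() method falls back to str(). This is a minimal sketch under that assumption, not the code from the search result; FallbackEncoder and Unsupported are hypothetical names used only for illustration.

```python
import json


class FallbackEncoder(json.JSONEncoder):
    """Encoder that stringifies anything json cannot handle natively."""

    def default(self, o):
        # default() is called only for objects the encoder does not support.
        return str(o)


class Unsupported:
    """Hypothetical object type that json.dumps rejects by default."""

    def __str__(self):
        return "unsupported-object"


encoded = json.dumps({"value": Unsupported()}, cls=FallbackEncoder)
print(encoded)  # {"value": "unsupported-object"}
```

With this approach the unsupported object ends up in the file as a single quoted string rather than as a JSON object with separate fields.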

self.f.write(json.dumps(dict(item), ensure_ascii=False) + ",\n")
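
Put together, the corrected pipeline might look like the sketch below. The only functional change from the original is dict(item). The path constructor argument is an assumption added here so the output location can be overridden; its default keeps the original file name.

```python
import json


class QiubaiPipeline(object):
    def __init__(self, path="糗百.json"):
        # path is a hypothetical parameter; the default matches the original code.
        self.path = path

    def open_spider(self, spider):
        # Open the output file once, in append mode, when the spider starts.
        self.f = open(self.path, "a", encoding="utf-8")

    def process_item(self, item, spider):
        # Copy the item's fields into a plain dict so json can encode it.
        line = json.dumps(dict(item), ensure_ascii=False)
        self.f.write(line + ",\n")
        return item

    def close_spider(self, spider):
        # Close the file when the spider finishes.
        self.f.close()
```

ensure_ascii=False keeps Chinese text readable in the output file instead of writing it as \u escapes.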
