This walkthrough crawls an e-commerce site, using the local-specialty (地方特产) category of Dangdang (当当网) as the example.
First, create the crawler project:
scrapy startproject autop
Next, write the items file:
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html
import scrapy

class AutopItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    name = scrapy.Field()    # product name
    price = scrapy.Field()   # product price
    link = scrapy.Field()    # product link
    comnum = scrapy.Field()  # product review count
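Because the spider below stores the result of extract() (a list of strings) into each field, every AutopItem ends up holding parallel lists, and the pipeline relies on indexing into them. A quick illustration with made-up values:

from autop.items import AutopItem

# each field holds a parallel list: index j describes the j-th product on the page
item = AutopItem()
item['name'] = ['Product A', 'Product B']
item['price'] = ['¥13.90', '¥25.00']
print(item['name'][1])  # items support dict-style access -> Product B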
Writing the pipeline:
# -*- coding: utf-8 -*-
import codecs
import json
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

class AutopPipeline(object):
    def __init__(self):  # open the output file once, when the pipeline is created
        self.f = codecs.open('D:/AuI18N/2.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # the number of products on the page sets the loop count
        for j in range(0, len(item['name'])):
            name = item['name'][j]  # j-th element under the 'name' key; the other fields likewise
            price = item['price'][j]
            comnum = item['comnum'][j]
            link = item['link'][j]
            # rebuild a per-product dict from the extracted elements
            g = {'name': name, 'price': price, 'comnum': comnum, 'link': link}
            i = json.dumps(g, ensure_ascii=False)  # keep non-ASCII text readable in the JSON
            line = i + '\n'  # one JSON object per line
            self.f.write(line)  # write to the file
        return item

    def close_spider(self, spider):  # Scrapy passes the spider argument; close the file here
        self.f.close()
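Note that the pipeline writes one JSON object per line (JSON Lines) rather than a single JSON array, so the file must be read back line by line. A minimal sketch, assuming the same D:/AuI18N/2.json path used above:

import codecs
import json

# read back the line-delimited JSON the pipeline produced
with codecs.open('D:/AuI18N/2.json', 'r', encoding='utf-8') as f:
    for line in f:
        product = json.loads(line)
        print(product['name'])  # each line is one product dict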
Then modify the settings file. Pipelines are disabled by default, so enable ours first (the 300 is the pipeline's priority; lower values run earlier):
# Configure item pipelines
# See http://scrapy.readthedocs.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
'autop.pipelines.AutopPipeline': 300,
}
Also disable cookies, so the site cannot use cookie information to block us:
# Disable cookies (enabled by default)
COOKIES_ENABLED = False
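Disabling cookies is often not enough on its own; slowing requests down and sending a browser-like User-Agent are common extra precautions. These two settings are my own optional additions, not part of the original steps, and the values are illustrative:

# Wait between requests to reduce the chance of being blocked (seconds)
DOWNLOAD_DELAY = 2
# Send a browser-like User-Agent instead of Scrapy's default
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'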
Next, analyze the URL and the page source.
As the URL shows, the page number is controlled by the "pg2" segment, e.g. http://category.dangdang.com/pg2-cid4002203.html is page 2.
Then inspect the page source to find the fields for name, price, link, and review count; you can examine the page source yourself:
ddclick="act=normalResult_picture&pos=60637121_0_2_m" class="pic"
相应的xpath表达式#"//a[@class='pic']/@title"
网页价格源码段# <span class="price_n">¥13.90</span>
相应的xpath表达式#"//span[@class='price_n']"
网页商品链接源码段# href="http://product.dangdang.com/60637121.html" target="_blank" >
相应的xpath表达式#"//a[@class='pic']/#href"
网页评论源码段# ddclick="act=click_review_count&pos=60637121_0_2_m">1569条评论</a></p>
相应的xpath表达式#"//a[@name='itemlist-review']/text()"
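Before writing the spider, you can check each expression interactively with scrapy shell; a quick session might look like this (the output will vary with the live page):

scrapy shell "http://category.dangdang.com/pg1-cid4002203.html"
>>> response.xpath("//a[@class='pic']/@title").extract()[:2]          # first two product names
>>> response.xpath("//span[@class='price_n']/text()").extract()[:2]   # first two prices
>>> exit()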
Now write the crawler:
scrapy genspider -t basic autos dangdang.com  # create the spider file
Spider source code:
# -*- coding: utf-8 -*-
import scrapy
from autop.items import AutopItem
from scrapy.http import Request

class AutosSpider(scrapy.Spider):
    name = 'autos'
    allowed_domains = ['dangdang.com']  # domains the spider is allowed to crawl
    start_urls = ['http://category.dangdang.com/pg1-cid4002203.html']  # starting URL

    def parse(self, response):
        item = AutopItem()  # instantiate the item class written earlier
        # match name, price, link, and review count with the XPath expressions
        item['name'] = response.xpath("//a[@class='pic']/@title").extract()
        item['price'] = response.xpath("//span[@class='price_n']/text()").extract()
        item['link'] = response.xpath("//a[@class='pic']/@href").extract()
        item['comnum'] = response.xpath("//a[@name='itemlist-review']/text()").extract()
        yield item  # hand the extracted item to the pipeline
        for i in range(1, 45):  # loop over pg1-pg44; the repeated pg1 request is filtered by Scrapy
            url = 'http://category.dangdang.com/pg' + str(i) + '-cid4002203.html'  # build the page URL
            yield Request(url, callback=self.parse)  # yield a Request with the URL and callback
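One thing to be aware of: the review count is extracted as text such as "1569条评论". If you need a plain integer, you can strip the suffix. A small sketch (the parse_comnum helper is my own addition, not part of the tutorial):

# -*- coding: utf-8 -*-
import re

def parse_comnum(text):
    # pull the leading digits out of a string like '1569条评论'
    m = re.match(r'\d+', text)
    return int(m.group()) if m else 0

print(parse_comnum('1569条评论'))  # -> 1569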
Running it now produces an error, because ROBOTSTXT_OBEY in the settings file also needs to be set to False:
ROBOTSTXT_OBEY = False
Run it again:
scrapy crawl autos --nolog
The data is crawled successfully and saved as a JSON file.