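This post walks through using Scrapy to scrape data and store it in a MongoDB database. First, the data we want to scrape:
All works listed on Qidian, at: https://www.qidian.com/all
For downloading and installing Scrapy, see the previous post, "A Simple Scrapy Example".
For downloading and installing MongoDB, see the post "Installing MongoDB".
The relevant code is given directly below: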
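Before the individual files, here is a rough sketch of the overall flow: a spider yields items, and an item pipeline writes each one to a MongoDB collection. This is an illustration under stated assumptions, not the code used in this post. `NovelMongoPipeline` and `FakeCollection` are hypothetical names. `FakeCollection` is an in-memory stand-in so the sketch runs without a MongoDB server, and the field values are placeholders. In a real project the collection would come from pymongo, whose `Collection.insert_one` likewise takes a single dict-like document.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo collection (assumption)."""

    def __init__(self):
        self.documents = []

    def insert_one(self, document):
        # Mirrors the shape of pymongo's Collection.insert_one(document).
        self.documents.append(document)


class NovelMongoPipeline:
    """Hypothetical item pipeline that stores every scraped item."""

    def __init__(self, collection):
        self.collection = collection

    def process_item(self, item, spider):
        # Scrapy calls process_item(item, spider) for each item a spider yields.
        # Converting to a plain dict gives a document that can be inserted.
        self.collection.insert_one(dict(item))
        return item  # pass the item on to any later pipeline


collection = FakeCollection()
pipeline = NovelMongoPipeline(collection)

# Placeholder values; the keys match fields of the item class defined in this post.
scraped = {"bookname": "example title", "author": "example author"}
pipeline.process_item(scraped, spider=None)
```

Returning the item from `process_item` matters because Scrapy passes the return value on to the next pipeline in the chain.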
(1) Data wrapper class: item.py
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/items.html
import scrapy


class NovelItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    link = scrapy.Field()  # URL
    category = scrapy.Field()
    bookname = scrapy.Field()
    author = scrapy.Field()
    content = sc