Saving data extracted by Scrapy into a MySQL database with pymysql

1. In the spider file we created, once the extraction code has filled in the local variables, build the item and yield it:

        # Inside the spider's parse() callback, after the page has been
        # parsed into local variables (title, author, ...):
        item = HongxiuItem()
        item['title'] = title
        item['author'] = author
        item['tags'] = tags
        item['total_word_num'] = total_word_num
        item['keep_num'] = keep_num
        item['click_num'] = click_num
        item['info'] = info
        yield item  # hand the item off to the item pipeline
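
For context, here is a minimal sketch of the spider around that snippet. The spider name, start URL, and every XPath below are illustrative assumptions, not taken from the original project:

from scrapy import Spider

from hongxiu.items import HongxiuItem


class HongxiuSpider(Spider):
    name = 'hongxiu'                               # assumed spider name
    start_urls = ['https://www.hongxiu.com/']      # placeholder URL

    def parse(self, response):
        # Placeholder XPaths: adapt them to the real page structure.
        item = HongxiuItem()
        item['title'] = response.xpath('//h1/em/text()').get()
        item['author'] = response.xpath('//h1/a/text()').get()
        item['tags'] = ','.join(response.xpath('//p[@class="tag"]/span/text()').getall())
        item['total_word_num'] = response.xpath('//p[@class="total"]/em[1]/text()').get()
        item['keep_num'] = response.xpath('//p[@class="total"]/em[2]/text()').get()
        item['click_num'] = response.xpath('//p[@class="total"]/em[3]/text()').get()
        item['info'] = response.xpath('//p[@class="intro"]/text()').get()
        yield item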

2. We also need to declare the matching Fields in items.py:

import scrapy

class HongxiuItem(scrapy.Item):
    title = scrapy.Field()
    author = scrapy.Field()
    tags = scrapy.Field()
    total_word_num = scrapy.Field()
    keep_num = scrapy.Field()
    click_num = scrapy.Field()
    info = scrapy.Field()
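
A declared Item behaves much like a dict, which is why the spider can assign with item['title'] = title; assigning a key that was not declared as a Field raises a KeyError. A quick illustration:

>>> item = HongxiuItem(title='demo')
>>> item['title']
'demo'
>>> item['publisher'] = 'x'   # KeyError: HongxiuItem does not support field: publisher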

3. Next comes the pipeline code in pipelines.py:

import pymysql

class HongxiuPipeline(object):
    # open_spider() and close_spider() each run exactly once, when the
    # spider is opened and closed respectively.
    def open_spider(self, spider):
        self.connect = pymysql.connect(
            host='localhost',
            user='root',
            port=3306,
            passwd='123456',
            db='hongxiu',
            charset='utf8'
        )
        self.cursor = self.connect.cursor()

    def process_item(self, item, spider):
        insert_sql = "INSERT INTO hx(title, author, tags, total_word_num, keep_num, click_num, info) VALUES (%s, %s, %s, %s, %s, %s, %s)"
        self.cursor.execute(insert_sql, (
            item['title'], item['author'], item['tags'], item['total_word_num'],
            item['keep_num'], item['click_num'], item['info']
        ))
        self.connect.commit()
        return item  # return the item so any later pipelines can still process it

    def close_spider(self, spider):
        self.cursor.close()
        self.connect.close()
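
Note that the pipeline assumes the hongxiu database and the hx table already exist. A one-off setup script along the following lines can create them; the column types are assumptions, since the schema is not given above:

import pymysql

# One-off setup: create the database and table the pipeline writes to.
# Column types and lengths are assumptions; adjust them to your data.
connect = pymysql.connect(host='localhost', user='root', port=3306,
                          passwd='123456', charset='utf8')
cursor = connect.cursor()
cursor.execute("CREATE DATABASE IF NOT EXISTS hongxiu DEFAULT CHARACTER SET utf8")
cursor.execute("""
    CREATE TABLE IF NOT EXISTS hongxiu.hx (
        id INT AUTO_INCREMENT PRIMARY KEY,
        title VARCHAR(255),
        author VARCHAR(255),
        tags VARCHAR(255),
        total_word_num VARCHAR(64),
        keep_num VARCHAR(64),
        click_num VARCHAR(64),
        info TEXT
    ) DEFAULT CHARACTER SET utf8
""")
connect.commit()
cursor.close()
connect.close()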

4. Finally, enable our pipeline by uncommenting ITEM_PIPELINES in settings.py:

ITEM_PIPELINES = {
   'hongxiu.pipelines.HongxiuPipeline': 300,
}
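
With all four pieces in place, run the spider from the project root. The crawl name below assumes the spider's name attribute is 'hongxiu'; use whatever name your spider defines:

scrapy crawl hongxiu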

