python< 解决OperationalError (2006, 'MySQL server has gone away')>

今天在将爬虫数据写入MySQL的时候,发现了一个很蛋疼的问题
在管道中写操作MySQL的逻辑的时候,有一部分数据能写入数据库,但是另外一部分数据写不进去,经过检查 发现并不是数据格式的问题。

 OperationalError (2006, 'MySQL server has gone away')

经过多方排查最终确定为由于爬虫的时候有些网页可能需要 很长时间才能打开,导致MySQL连接超时。

解决的办法:

在代码中加入:

self.db.ping(True)

仔细看这个方法:

    def ping(self): # real signature unknown; restored from __doc__
        """
        Checks whether or not the connection to the server is
        working. If it has gone down, an automatic reconnection is
        attempted.

        This function can be used by clients that remain idle for a
        long while, to check whether or not the server has closed the
        connection and reconnect if necessary.

        New in 1.2.2: Accepts an optional reconnect parameter. If True,
        then the client will attempt reconnection. Note that this setting
        is persistent. By default, this is on in MySQL<5.0.3, and off
        thereafter.

        Non-standard. You should assume that ping() performs an
        implicit rollback; use only when starting a new transaction.
        You have been warned.
        """
        pass

说的而是检测MySQL的连接是否断开,如果断开就自动获取新的连接。

管道的完整代码如下:

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html
import MySQLdb
import sys

reload(sys)
sys.setdefaultencoding('utf-8')


class LetvShowPipeline(object):
    def __init__(self):
        self.db = MySQLdb.connect("IP地址", "用户名", "密码", "vboxDB", charset='utf8', port=端口)
        self.cursor = self.db.cursor()

    def process_item(self, item, spider):
        # 提取字段
        area = item['area']
        host = item['host']
        live_channel = item['live_channel']
        name = item['name']
        programs = item['programs']
        selfproduced = item['selfproduced']
        type = item['type']
        number = item['number']
        vv = item['vv']
        year = item['year']
        print "==============================================="
        allhost = ''
        for h in host:
            allhost += h + '$'
        print area, allhost, live_channel, name, programs, selfproduced, type, vv, year
        # 写进数据库
        self.db.ping(True)
        # 先查询数据库中有没有数据
        sql = "SELECT * FROM letv_show WHERE name='%s'" % name
        self.cursor.execute(sql)
        result = self.cursor.fetchall()
        if len(result) != 0:
            print 'updating data....'
            # 如果数据库中已经存在该卡通,更新vv和number字段
            sql = "UPDATE letv_show  SET vv= '%s',number='%s'  WHERE  name='%s'" % (vv, number, name)
        else:
            print 'insert new data ....'
            # 如果数据库中没有该卡通将数据写入MySQL
            sql = "INSERT INTO letv_show(area, host, live_channel, name, programs, selfproduced, type, vv, year,number) \
                                                VALUES ('%s','%s','%s','%s','%s','%s','%s','%s','%s','%s')" % \
                  (area, allhost, live_channel, name, programs, selfproduced, type, vv, year, number)
        try:
            # 执行sql语句
            self.cursor.execute(sql)
            # 提交到数据库执行
            self.db.commit()
        except:
            # Rollback in case there is any error
            self.db.rollback()

        return item
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值