Scrapy troubleshooting notes

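I had been writing crawlers with Scrapy on a server, and everything worked fine. Then, two days ago, a classmate installed NLTK on the same machine, and after that Scrapy stopped working entirely, whether I ran scrapy shell or scrapy crawl. The error was: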

Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 9, in <module>
    load_entry_point('Scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 46, in run
    self.crawler_process.start_crawling()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 46, in configure
    self.extensions = ExtensionManager.from_crawler(self)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 50, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 29, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 42, in load_object
    raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.telnet.TelnetConsole': No module named conch 

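The message says that the conch module cannot be found, which suggested that sys.path might have been changed. Luckily I still had a Python interactive session open in tmux from before the change, so I could read the old sys.path from it. Comparing it with the current sys.path showed two extra entries: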

'/usr/local/lib/python2.7/dist-packages/jieba-0.36.1-py2.7.egg',
'/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg'

A module that cannot be found would normally mean that sys.path had lost an entry, not gained two, so this alone did not explain anything.
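A comparison that only looks for added or removed entries can miss a change in order, and the order of sys.path decides which copy of a package gets imported. The following is a minimal sketch of a fuller comparison. compare_paths is a hypothetical helper, and the two short lists are abbreviated stand-ins for the real sys.path values, not the full data.

```python
def compare_paths(old, new):
    """Return (added, removed, flipped) for two sys.path-like lists.

    flipped holds pairs (a, b) where a came before b in `old`
    but comes after b in `new`.
    """
    added = [p for p in new if p not in old]
    removed = [p for p in old if p not in new]
    common = [p for p in old if p in new]
    new_rank = {p: new.index(p) for p in common}
    flipped = []
    for i, a in enumerate(common):
        for b in common[i + 1:]:
            if new_rank[a] > new_rank[b]:
                flipped.append((a, b))
    return added, removed, flipped


# Abbreviated, illustrative inputs (assumption: only a few entries are kept).
old_path = [
    '',
    '/usr/lib/python2.7',
    '/usr/local/lib/python2.7/dist-packages',
    '/usr/lib/python2.7/dist-packages',
]
new_path = [
    '',
    '/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg',
    '/usr/lib/python2.7/dist-packages',
    '/usr/lib/python2.7',
    '/usr/local/lib/python2.7/dist-packages',
]

added, removed, flipped = compare_paths(old_path, new_path)
for a, b in flipped:
    print('%s used to come before %s' % (a, b))
```

To use it on a live system, paste the old list from the surviving session into old_path and pass list(sys.path) as the new one.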

So I followed the traceback and tried to reproduce the error in the simplest way possible.
The exception is raised in /usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py, by this code:

    try:
        mod = import_module(module)
    except ImportError as e:
        raise ImportError("Error loading object '%s': %s" % (path, e))

So the error can be reproduced like this:

>>> from importlib import import_module
>>> import_module('scrapy.telnet')

This produces the same No module named conch error. Right at the top of telnet.py in the scrapy package there is this line:

from twisted.conch import manhole, telnet

This is the line that fails, because the conch module cannot be found. Running import twisted.conch directly fails as well.
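When a dotted import fails, it helps to know at which level it breaks. The sketch below walks a dotted name prefix by prefix and reports the first one that cannot be imported. first_failing_import is a hypothetical helper, not part of Scrapy or Twisted.

```python
from importlib import import_module


def first_failing_import(dotted_name):
    """Import each prefix of dotted_name in turn.

    Returns (prefix, error message) for the first prefix that raises
    ImportError, or None if the whole name imports cleanly.
    """
    parts = dotted_name.split('.')
    for i in range(1, len(parts) + 1):
        prefix = '.'.join(parts[:i])
        try:
            import_module(prefix)
        except ImportError as exc:
            return prefix, str(exc)
    return None
```

In the situation described here, calling it with 'twisted.conch' would be expected to report that twisted itself imports and that twisted.conch is the level that fails. That points at the contents of the twisted package rather than at a missing Twisted installation.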

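On this system, third-party Python packages live in dist-packages directories. In /usr/local/lib/python2.7/dist-packages I found a twisted directory, and it does contain conch!
So I used locate to list every twisted directory on the system, since something newly installed might have taken the place of the copy that used to work.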

$ locate twisted
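locate lists every match on disk, but not the order in which Python would try them. As a complement, the sketch below scans a search path in order and collects every directory that looks like a copy of a given package. The first hit is the one a plain import would normally pick. find_package_copies is a hypothetical helper. It only recognises ordinary package directories with an __init__.py, so zipped eggs, single-file modules and namespace packages are ignored.

```python
import os
import sys


def find_package_copies(name, search_path=None):
    """Return every <entry>/<name> directory containing __init__.py, in search order."""
    if search_path is None:
        search_path = sys.path
    copies = []
    for entry in search_path:
        # An empty entry in sys.path means the current directory.
        candidate = os.path.join(entry or os.curdir, name)
        has_init = os.path.isfile(os.path.join(candidate, '__init__.py'))
        if has_init and candidate not in copies:
            copies.append(candidate)
    return copies
```

Calling find_package_copies('twisted') on the affected server would list the competing twisted directories in the order Python searches them.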

It turned out that /usr/lib/python2.7/dist-packages also contains a twisted directory, and that one really has no conch subdirectory. Its _version.py contains this line:

     version = versions.Version('twisted', 11, 1, 0)

In the copy we had been using, /usr/local/lib/python2.7/dist-packages/twisted, the same line in _version.py reads:

     version = versions.Version('twisted', 14, 0, 2)
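Instead of opening _version.py in each directory, one can also ask the interpreter which copy it actually loads. The sketch below returns the file location and, where the package defines one, its __version__ attribute. describe_module is a hypothetical helper. Twisted exposes twisted.__version__, but not every package does, so the version may come back as None.

```python
from importlib import import_module


def describe_module(name):
    """Return (file location, version or None) of the module that import resolves to."""
    mod = import_module(name)
    location = getattr(mod, '__file__', None)
    version = getattr(mod, '__version__', None)
    return location, version


if __name__ == '__main__':
    # 'json' is only a stand-in so that the sketch runs anywhere;
    # on the affected server the interesting name would be 'twisted'.
    print(describe_module('json'))
```

If the reported location is under /usr/lib/python2.7/dist-packages rather than /usr/local/lib/python2.7/dist-packages, the interpreter is picking up the older copy.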

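So the twisted now found along sys.path is an old version (perhaps installed by someone long ago). After sys.path was changed, lookups land on this old copy. Comparing the two sys.path values carefully shows that two entries have changed order.
This is the earlier, working sys.path: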

''
'/usr/local/lib/python2.7/dist-packages/requests-2.0.0-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/kafka_python-0.8.1_1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/tox-1.6.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/py-1.4.19-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/virtualenv-1.10.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/pymongo-2.6.3-py2.7-linux-x86_64.egg'
'/usr/lib/python2.7'
'/usr/lib/python2.7/plat-linux2'
'/usr/lib/python2.7/lib-tk'
'/usr/lib/python2.7/lib-old'
'/usr/lib/python2.7/lib-dynload'
'/usr/local/lib/python2.7/dist-packages'        ### note this entry
'/usr/lib/python2.7/dist-packages'              ### and this one
'/usr/lib/python2.7/dist-packages/PIL'
'/usr/lib/python2.7/dist-packages/gst-0.10'
'/usr/lib/python2.7/dist-packages/gtk-2.0'
'/usr/lib/pymodules/python2.7'
'/usr/lib/python2.7/dist-packages/ubuntu-sso-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-control-panel'
'/usr/lib/python2.7/dist-packages/ubuntuone-couch'
'/usr/lib/python2.7/dist-packages/ubuntuone-installer'
'/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol'

And this is the current sys.path, after the new packages were installed:

''
'/usr/local/lib/python2.7/dist-packages/requests-2.0.0-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/kafka_python-0.8.1_1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/tox-1.6.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/py-1.4.19-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/virtualenv-1.10.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/pymongo-2.6.3-py2.7-linux-x86_64.egg'
'/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg'
'/usr/lib/python2.7/dist-packages'          # this entry has moved up
'/usr/local/lib/python2.7/dist-packages/jieba-0.36.1-py2.7.egg'
'/usr/lib/python2.7'
'/usr/lib/python2.7/plat-linux2'
'/usr/lib/python2.7/lib-tk'
'/usr/lib/python2.7/lib-old'
'/usr/lib/python2.7/lib-dynload'
'/usr/local/lib/python2.7/dist-packages'    # so this entry now comes after it
'/usr/lib/python2.7/dist-packages/PIL'
'/usr/lib/python2.7/dist-packages/gst-0.10'
'/usr/lib/python2.7/dist-packages/gtk-2.0'
'/usr/lib/pymodules/python2.7'
'/usr/lib/python2.7/dist-packages/ubuntu-sso-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-control-panel'
'/usr/lib/python2.7/dist-packages/ubuntuone-couch'
'/usr/lib/python2.7/dist-packages/ubuntuone-installer'
'/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol'

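Look at the two annotated entries: in the new sys.path, /usr/lib/python2.7/dist-packages has moved ahead of /usr/local/lib/python2.7/dist-packages, so the old twisted is found first. At last I knew what had gone wrong, and I could breathe a sigh of relief.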
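The effect of entry order can be reproduced in isolation. The sketch below creates two throwaway directories that each hold a package with the same name but a different version, then starts a fresh interpreter for each ordering and reports which copy gets imported. The package name fakepkg, the VERSION attribute and both helper functions are made up for the demonstration. The two temporary directories merely stand in for the two dist-packages directories.

```python
import os
import subprocess
import sys
import tempfile


def write_package(directory, name, version):
    """Create <directory>/<name>/__init__.py that only defines VERSION."""
    pkg_dir = os.path.join(directory, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
        f.write('VERSION = %r\n' % version)


def imported_version(name, search_dirs):
    """Run a fresh interpreter with search_dirs prepended to sys.path; return name.VERSION."""
    code = ('import sys; sys.path[:0] = %r; import %s as m; '
            'sys.stdout.write(m.VERSION)' % (list(search_dirs), name))
    out = subprocess.check_output([sys.executable, '-c', code])
    return out.decode('ascii').strip()


old_dir = tempfile.mkdtemp()  # stands in for /usr/lib/python2.7/dist-packages
new_dir = tempfile.mkdtemp()  # stands in for /usr/local/lib/python2.7/dist-packages
write_package(old_dir, 'fakepkg', '11.1.0')
write_package(new_dir, 'fakepkg', '14.0.2')

print(imported_version('fakepkg', [new_dir, old_dir]))  # prints 14.0.2
print(imported_version('fakepkg', [old_dir, new_dir]))  # prints 11.1.0
```

A separate interpreter is used for each ordering because an already imported module stays cached in sys.modules, which would hide the effect within a single process.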
The fix is then simply to delete (or rename) the old twisted directory, together with its egg-info files. The names are:

twisted  Twisted_Core-11.1.0.egg-info  Twisted_Names-11.1.0.egg-info  Twisted_Web-11.1.0.egg-info

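Another option would be to change the default sys.path so that /usr/local/lib/python2.7/dist-packages comes first. But that might affect other users in the same way, so I just threw out the old version, which nobody needed anyway.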
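For completeness, here is a rough sketch of what reordering could look like if it were done per process rather than by changing the system default. This is not what was done here, and prefer is a hypothetical helper. It returns a copy of the list in which one entry is moved directly in front of another, and leaves the list alone if the order is already as wanted.

```python
import sys

LOCAL = '/usr/local/lib/python2.7/dist-packages'
SYSTEM = '/usr/lib/python2.7/dist-packages'


def prefer(path_list, preferred, over):
    """Return a copy of path_list with `preferred` placed directly before `over`.

    The list is returned unchanged (as a copy) if either entry is missing
    or if `preferred` already comes first.
    """
    if preferred not in path_list or over not in path_list:
        return list(path_list)
    if path_list.index(preferred) < path_list.index(over):
        return list(path_list)
    result = [p for p in path_list if p != preferred]
    result.insert(result.index(over), preferred)
    return result


# Would have to run before the first import of the affected package.
sys.path[:] = prefer(sys.path, LOCAL, SYSTEM)
```

This only affects the process that runs it, so every entry script would need the same lines. That is one more reason why removing the stale copy is the simpler fix.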

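The fix itself was simple, but tracking the problem down took quite a while. Troubleshooting a server is a test of patience!
This post may not be directly useful as a reference, because everyone's environment is different and so are the causes of their errors. But the line of reasoning might help someone who feels stuck. When I first hit this problem I was very frustrated too, and I could not find anything useful online. In the end it took calming down and deepening my understanding of Python. Above all, keep this in mind: the problem can always be solved!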
