It couldn't find a spider named baidu, so I followed the prompts step by step to get it running.
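The KeyError: 'baidu' in the log below is raised at `return self._spiders[spider_name]`: `scrapy crawl` looks a spider up by its `name` attribute, and `scrapy genspider example example.com` generates a spider named `example`, not `baidu`. Here is a minimal plain-Python sketch of that kind of lookup. The `SpiderRegistry` class and its methods are hypothetical stand-ins for illustration, not Scrapy's actual loader.

```python
class ExampleSpider:
    # Mirrors what `scrapy genspider example example.com` produces:
    # a spider class whose name attribute is "example".
    name = "example"


class SpiderRegistry:
    """Hypothetical, simplified name-to-class lookup (not Scrapy's real code)."""

    def __init__(self, spider_classes):
        # Spiders are keyed by their `name` attribute, not by file or class name.
        self._spiders = {cls.name: cls for cls in spider_classes}

    def load(self, spider_name):
        try:
            return self._spiders[spider_name]
        except KeyError:
            raise KeyError(f"Spider not found: {spider_name}")

    def names(self):
        return sorted(self._spiders)


registry = SpiderRegistry([ExampleSpider])
print(registry.names())
```

With only `example` registered, the matching command would be `scrapy crawl example`; `scrapy crawl baidu` needs a spider whose `name` is `baidu`.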
First I installed scrapy.
Then I created the scrapy01 project:
scrapy startproject scrapy01
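Then, following the prompt, I went into the scrapy01 directory and ran scrapy genspider example example.com
After that I modified example.py:
That produced the run log below, but it didn't contain the text "百度知道" (Baidu Zhidao)...
So, in settings.py, I changed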
ROBOTSTXT_OBEY = True
to ROBOTSTXT_OBEY = False
Here is the run log:
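For context on why this setting matters: with ROBOTSTXT_OBEY = True, Scrapy fetches the site's robots.txt and drops requests that it disallows, which is one likely reason the expected page text never showed up. Below is a minimal sketch of that kind of check. It uses Python's standard `urllib.robotparser` as a stand-in (Scrapy uses its own configurable parser). The robots.txt content, the `scrapy01` user agent string, and the `is_allowed` helper are all made up for illustration; the rules are not Baidu's actual file.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt that blocks every crawler from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Hypothetical helper: may `user_agent` fetch `url` under `robots_txt`?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# With the blocking rules, the request would be filtered out.
print(is_allowed(BLOCK_ALL, "scrapy01", "https://example.com/page"))
# With no rules at all, the request is allowed.
print(is_allowed("", "scrapy01", "https://example.com/page"))
```

Setting ROBOTSTXT_OBEY = False skips this check entirely, so whether that is appropriate depends on the site's terms.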
<!--
D:\pyFile>scrapy startproject scrapy01
New Scrapy project 'scrapy01', using template directory 'd:\python37-32\lib\site-packages\scrapy\templates\project', created in:
D:\pyFile\scrapy01
You can start your first spider with:
cd scrapy01
scrapy genspider example example.com
D:\pyFile>cd scrapy01
D:\pyFile\scrapy01>scrapy genspider example example.com
Created spider 'example' using template 'basic' in module:
scrapy01.spiders.example
D:\pyFile\scrapy01>cd ..
D:\pyFile>scrapy crawl baidu
Scrapy 2.5.0 - no active project
Unknown command: crawl
Use "scrapy" to see available commands
D:\pyFile>cd scrapy01
D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 19:14:38 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 19:14:38 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 19:14:38 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
Traceback (most recent call last):
File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 75, in load
return self._spiders[spider_name]
KeyError: 'baidu'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\python37-32\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "d:\python37-32\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "D:\Python37-32\Scripts\scrapy.exe\__main__.py", line 7, in <module>
File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 145, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
func(*a, **kw)
File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
cmd.run(args, opts)
File "d:\python37-32\lib\site-packages\scrapy\commands\crawl.py", line 22, in run
crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 191, in crawl
crawler = self.create_crawler(crawler_or_spidercls)
File "d:\py