LinkExtractor
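Every parameter of the LinkExtractor constructor has a default value. If you construct the object without passing any arguments, it extracts all links on the page by default.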
In [1]: from scrapy.linkextractors import LinkExtractor
In [2]: le = LinkExtractor()
In [3]: links = le.extract_links(response)
In [4]: [link.url for link in links]
Out[4]:
['http://books.toscrape.com/index.html',
'http://books.toscrape.com/catalogue/category/books_1/index.html',
'http://books.toscrape.com/catalogue/category/books/travel_2/index.html',
'http://books.toscrape.com/catalogue/category/books/mystery_3/index.html',