1. Goal
Scrape vehicle listings from multiple pages of Renrenche (人人车).
2. Analysis
2.1 Site analysis
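The data shown on the page can be found by searching the page's HTML source, so the page is statically loaded: the listings are already present in the HTML response and can be extracted from it directly, with no need to render JavaScript.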
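The check described above can be sketched in a few lines of Python. This is only an illustration, not part of the original spider: the helper names `appears_in_source` and `looks_static` and the sample HTML strings are hypothetical, and the idea is simply to test whether text visible in the browser also occurs in the raw HTML.

```python
from html import unescape


def appears_in_source(html, samples):
    """Map each sample string to whether it occurs in the raw HTML."""
    # Decode entities such as &amp; so visible text matches the source text.
    text = unescape(html)
    return {sample: sample in text for sample in samples}


def looks_static(html, samples):
    """Return True only if every sample string is present in the raw HTML."""
    if not samples:
        # Nothing to check, so no conclusion can be drawn.
        return False
    return all(appears_in_source(html, samples).values())


# Hypothetical page sources, used only to demonstrate the check.
static_html = '<ul><li><a><h3>Audi A4L 2017</h3></a></li></ul>'
dynamic_html = '<ul id="list"></ul><script src="app.js"></script>'

print(looks_static(static_html, ['Audi A4L 2017']))
print(looks_static(dynamic_html, ['Audi A4L 2017']))
```

In practice the `html` argument would be the body of the response, for example `response.text` inside `scrapy shell`, and the samples would be a few car names copied from the rendered page. If the samples are missing from the source, the data is probably loaded by a separate request and the spider would need a different approach.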
3. Complete code
renrenche.py
import scrapy

from car.items import RrcItem


class RenrencheSpider(scrapy.Spider):
    name = 'renrenche'
    allowed_domains = ['www.renrenche.com']
    start_urls = ['https://www.renrenche.com/bj/ershouche/?&plog_id=618ab1bbf616cab93022afa088592885']
    # Root URL of the site
    base_url = 'https://www.renrenche.com'

    def parse(self, response):
        # Select the <a> elements that have no rel attribute inside each <li> of the car list
        selector = response.xpath('//ul[contains(@class,"row-fluid list-row js-car-list")]/li/a[not(@rel)]')
        # Debugging aids: uncomment to check how many entries were matched
        # print(len(selector))
        # print(selector)
        for car in selector:
            car_name = car.xpath('./h3/text()'