Scraping Leyoujia Second-Hand Housing Listings with Scrapy and Analyzing the Data

Article link: https://blog.csdn.net/qq_42206477/article/details/84889119

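This post uses a Scrapy spider to collect second-hand housing listings for Changsha from the Leyoujia website. Analysis and visualization of the data show that Kaifu District accounts for the largest share of second-hand homes for sale, and that unit prices are concentrated between 10,000 and 14,500 yuan per square meter.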

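By scraping listings from the real-estate company Leyoujia, this post takes a look at housing prices in Changsha. The data is then analyzed with Pandas and Matplotlib.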

Page structure analysis

The Leyoujia listing page for second-hand homes in Changsha: https://changsha.leyoujia.com/esf/

Next, verify the XPath expressions for the listing fields in the Scrapy shell:

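# Note: the paths start with './', so they are relative to a single listing node.
# If they return nothing on the whole response, evaluate them on that node's selector instead.
# Title
response.xpath('./div[@class="text"]/p[@class="tit"]/a/text()').extract_first()
# Total price
response.xpath('./div[@class="price"]/p[@class="sup"]/span[@class="salePrice"]/text()').extract_first()
# Unit price (the text between "单价" and "元/㎡")
response.xpath('./div[@class="price"]/p[@class="sub"]/text()').re(r'单价(.*?)元/㎡')[0]
# Floor area (the text between "套内面积" and "㎡")
response.xpath('./div[@class="text"]/p[@class="attr"]/span/text()').re(r'套内面积(.*?)㎡')[0]
# District (Kaifu, Yuhua, Yuelu, Tianxin, Furong, Wangcheng or Xingsha)
response.xpath('./div[@class="text"]//a/text()').re(r'开福|雨花|岳麓|天心|芙蓉|望城|星沙')[0]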
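To see what these expressions and regular expressions pull out, here is a minimal offline sketch that uses only the Python standard library instead of Scrapy. The `SAMPLE_LISTING` markup, its values, and the helper names `first_match` and `parse_listing` are assumptions made for illustration; the markup is reconstructed only from the XPath expressions above, and the real page may well differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup for one listing, reconstructed only from the XPath
# expressions in this post; the real page may differ.
SAMPLE_LISTING = """
<li>
  <div class="text">
    <p class="tit"><a href="#">Sample listing title</a></p>
    <p class="attr"><span>套内面积89.5㎡</span></p>
    <p class="attr"><a href="#">开福</a></p>
  </div>
  <div class="price">
    <p class="sup"><span class="salePrice">105</span></p>
    <p class="sub">单价11732元/㎡</p>
  </div>
</li>
"""

# Same district alternation as in the shell session above.
DISTRICTS = re.compile(r'开福|雨花|岳麓|天心|芙蓉|望城|星沙')


def first_match(pattern, texts, group=0):
    """Return the first regex match found in an iterable of strings, else None."""
    for text in texts:
        m = re.search(pattern, text or "")
        if m:
            return m.group(group)
    return None


def parse_listing(node):
    """Extract the five fields from one listing node using relative paths."""
    text_div = node.find('./div[@class="text"]')
    price_div = node.find('./div[@class="price"]')
    return {
        "title": text_div.findtext('./p[@class="tit"]/a'),
        "total_price": price_div.findtext(
            './p[@class="sup"]/span[@class="salePrice"]'),
        "unit_price": first_match(
            r'单价(.*?)元/㎡', [price_div.findtext('./p[@class="sub"]')], group=1),
        "area": first_match(
            r'套内面积(.*?)㎡',
            (s.text for s in text_div.findall('./p[@class="attr"]/span')),
            group=1),
        "district": first_match(
            DISTRICTS, (a.text for a in text_div.iter("a"))),
    }


listing = parse_listing(ET.fromstring(SAMPLE_LISTING.strip()))
print(listing)
```

Unlike `.re(...)[0]`, which raises an `IndexError` when nothing matches, the helper returns `None`, so a listing that lacks a field does not stop the run. In a spider, the same relative expressions would presumably be applied to each listing node in turn.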

Second-hand housing spider

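The second-hand listings are a fairly small dataset, so plain Scrapy is enough.
Run the following command in the target folder to create the Scrapy project: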

scrapy startproject ershoufang_spider

Define the items in items.py:

# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/items.html

import scrapy
class LeyoujiaItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    title = scrapy