Beijing Second-Hand Housing Price Prediction

what colour

Article link: https://blog.csdn.net/qq_48201996/article/details/108420344

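Based on Lianjia data, this project analyzes how second-hand home prices in Beijing relate to region, orientation, renovation, layout, and other factors. Through data preprocessing, modeling, and simulated examples, it finds that floor area is the main factor affecting price and that a random forest model predicts better than a decision tree.

Abstract generated automatically by CSDN.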

Project Overview

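Starting from the Beijing second-hand housing listings on Lianjia, this project cleans the data further, analyzes the relationship between each feature and price, selects the features that affect price most clearly, explores price levels across the Beijing second-hand market, and builds a price prediction model.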

Data Preprocessing

Loading the Data

# Import libraries
import numpy as np
import pandas as pd
import random
from datetime import datetime
from matplotlib import pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestRegressor
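# Jupyter magic: render plots inline in the notebook
%matplotlib inline
# Use the SimHei font so Chinese labels render, and keep minus signs displayable
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus']=False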

df=pd.read_csv(r'D:\BI study download\lianjia\lianjia.csv')
df.head()

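The columns are:
'Direction': orientation of the home
'District': area in which the home is located
'Elevator': whether the building has an elevator
'Floor': floor on which the home is located
'Garden': residential complex the home belongs to
'Id': listing id
'Layout': room layout
'Price': price
'Region': administrative district
'Renovation': renovation status
'Size': floor area
'Year': year built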

df.info()

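Inspecting the data shows that Elevator has a large number of missing values. It is a categorical variable, and NaN here is taken to mean that the building has no elevator, so the missing values are filled with '无电梯' (no elevator).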

# Assign the result back; inplace=True on a column selection is deprecated in recent pandas
df['Elevator'] = df['Elevator'].fillna('无电梯')
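
As a small illustration of the fill step above, the sketch below uses a made-up three-row frame (the name `toy` and all of its values are hypothetical, not taken from the Lianjia data). It assumes, as the post does, that a missing Elevator value means the building has no elevator.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the Lianjia data (hypothetical values).
toy = pd.DataFrame({
    "Elevator": ["有电梯", np.nan, "无电梯"],
    "Price": [500.0, 320.0, 410.0],
})

# Count missing values before the fill.
missing_before = int(toy["Elevator"].isna().sum())

# Assumption: NaN means "no elevator", so fill with the existing category label.
toy["Elevator"] = toy["Elevator"].fillna("无电梯")

# Count again and tally the categories after the fill.
missing_after = int(toy["Elevator"].isna().sum())
elevator_counts = toy["Elevator"].value_counts().to_dict()
print(missing_before, missing_after, elevator_counts)
```

Checking the category counts after the fill is a quick way to confirm that no new label was introduced by accident, for example through a typo in the fill value.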

Removing Duplicates

df.drop_duplicates(inplace=True)
df.reset_index(drop=True, inplace=True)
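
A minimal sketch of what these two calls do, on a hypothetical four-row frame (`toy` and its values are invented for illustration). With no `subset` argument, `drop_duplicates` treats rows as duplicates only when every column matches, and keeps the first copy.

```python
import pandas as pd

# Toy frame with one fully duplicated row (hypothetical values).
toy = pd.DataFrame({
    "Id": [1, 2, 2, 3],
    "Price": [500, 320, 320, 410],
})

# Drop exact duplicate rows, then renumber the index from 0
# so no gaps are left where rows were removed.
deduped = toy.drop_duplicates().reset_index(drop=True)

n_removed = len(toy) - len(deduped)
print(n_removed)
print(deduped)
```

Resetting the index matters later if rows are accessed by position-like labels, since the dropped rows would otherwise leave holes in the index.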

Data Analysis

Relationship Between Region and Price

df.describe()


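# Box plots of Price and Size, with outliers marked in red.
# The original referenced `data0`, which is not defined above; df is used instead.
fig,ax = plt.subplots(1,2,figsize=(16,6))
df.boxplot(column=['Price'], flierprops={
   'markeredgecolor':'red', 'markersize':4}, ax=ax[0])
df.boxplot(column=['Size'], flierprops={
   'markeredgecolor':'red', 'markersize':4}, ax=ax[1])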

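The data show that most second-hand homes in Beijing are between 15 and 200 square meters and are priced between 600,000 and 12 million yuan (60 to 1200 in the dataset, where Price is in units of 10,000 yuan). The cheapest listing is 600,000 yuan and the most expensive is 60 million yuan; the smallest home is 15 square meters and the largest is 1,019 square meters. The average price of a second-hand home in Beijing is 6.07 million yuan, and the average size is 99 square meters.
To limit the influence of extremely expensive homes on the overall data, this analysis considers only prices that most buyers could plausibly afford. Listings priced above 12 million yuan (Price > 1200) are treated as outliers and removed.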

df.drop(index = df[df['Price'] > 1200].index, inplace=True)
df.info()
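
The post does not say how the 1200 cutoff was chosen. One common convention, which is also the default whisker rule in matplotlib box plots, is to flag values above Q3 + 1.5 × IQR. The sketch below applies that rule to made-up prices; the helper `upper_fence` and the sample values are hypothetical and are not the author's method.

```python
import pandas as pd


def upper_fence(values: pd.Series, k: float = 1.5) -> float:
    """Return Q3 + k * IQR, the usual upper box-plot fence."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    return float(q3 + k * (q3 - q1))


# Made-up prices in units of 10,000 yuan, with one extreme value.
prices = pd.Series([100, 200, 300, 400, 500, 600, 700, 800, 6000], dtype=float)

fence = upper_fence(prices)

# Keep only the values at or below the fence.
kept = prices[prices <= fence]
print(fence, len(kept))
```

On the real data the computed fence may differ from 1200, so a hand-picked threshold like the one in the post remains a judgment call.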

Visualizing home prices and listing counts by region

fig,ax = plt.subplots(2,1,figsize=(30,18
