Machine Learning [1. Predicting House Prices]

TwinkelStar

Category: Machine Learning. Tags: tensorflow

Article link: https://blog.csdn.net/qq_48715321/article/details/121484350


Predicting house prices:

Dataset overview

boston_dataset:

Boston house prices dataset

Data Set Characteristics:

:Number of Instances: 506 

:Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

:Attribute Information (in order):
    - CRIM     per capita crime rate by town
    - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
    - INDUS    proportion of non-retail business acres per town
    - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
    - NOX      nitric oxides concentration (parts per 10 million)
    - RM       average number of rooms per dwelling
    - AGE      proportion of owner-occupied units built prior to 1940
    - DIS      weighted distances to five Boston employment centres
    - RAD      index of accessibility to radial highways
    - TAX      full-value property-tax rate per $10,000
    - PTRATIO  pupil-teacher ratio by town
    - B        1000(Bk - 0.63)^2 where Bk is the proportion of black people by town
    - LSTAT    % lower status of the population
    - MEDV     Median value of owner-occupied homes in $1000's

:Missing Attribute Values: None

:Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
https://archive.ics.uci.edu/ml/machine-learning-databases/housing/

This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. ‘Hedonic
prices and the demand for clean air’, J. Environ. Economics & Management,
vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, ‘Regression diagnostics
…’, Wiley, 1980. N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.

.. topic:: References

   - Belsley, Kuh & Welsch, ‘Regression diagnostics: Identifying Influential Data and Sources of Collinearity’, Wiley, 1980. 244-261.
   - Quinlan, R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.

TensorFlow code

# -*- coding: utf-8 -*-
"""
Created on Sun Nov 21 17:48:53 2021

@MysteriousKnight: 23608
@Email: xingchenziyi@163.com   
 

"""

import numpy as np
import matplotlib.pyplot as plt
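# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires scikit-learn < 1.2.
from sklearn.datasets import load_boston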
from sklearn.model_selection import train_test_split
import tensorflow as tf

data = load_boston()

x_train,x_test,y_train,y_test = train_test_split(data["data"], data.target, test_size=0.5, random_state = 50)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_dim=x_train.shape[1], activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1)
])

model.compile(
    optimizer="adam",
    loss="mse"
    )


model.summary()

history = model.fit(
    x_train,
    y_train,
    epochs=1000,
    batch_size=512,
    validation_data=(x_test,y_test)
    
)

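# model.predict returns shape (n, 1); flatten to (n,) so that the subtraction
# below is element-wise instead of broadcasting against y_test into (n, n).
test_pre = model.predict(x_test).ravel()

print("y1 mse:%.4f" % np.mean(np.square(test_pre - y_test)))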

loss = history.history['loss']
val_loss = history.history['val_loss']

# Plot the training and validation loss curves
plt.plot(loss, lw=3, label='loss')
plt.plot(val_loss, lw=3, label='val_loss')
plt.legend()
plt.show()

Loss curve

[Figure: training loss (loss) and validation loss (val_loss) plotted against epoch]
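To make reading the loss curve less subjective, one option is to locate the epoch at which the validation loss bottoms out. The sketch below assumes `loss` and `val_loss` are lists shaped like `history.history['loss']` and `history.history['val_loss']`. The numbers are made-up placeholders, not results from the run above, and `best_epoch` is a hypothetical helper.

```python
import numpy as np


def best_epoch(val_loss):
    """Return (index, value) of the lowest validation loss (hypothetical helper)."""
    val_loss = np.asarray(val_loss, dtype=float)
    idx = int(np.argmin(val_loss))
    return idx, float(val_loss[idx])


# Placeholder curves for illustration only, NOT values from the run above.
loss = [90.0, 40.0, 20.0, 10.0, 5.0, 2.0]
val_loss = [95.0, 50.0, 30.0, 28.0, 33.0, 41.0]

epoch, value = best_epoch(val_loss)
# Gap between validation and training loss at the final epoch.
final_gap = val_loss[-1] - loss[-1]
print(f"lowest val_loss {value:.1f} at epoch index {epoch}; final gap {final_gap:.1f}")
```

A validation loss that rises after that index while the training loss keeps falling is the usual signature of overfitting.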

Error

y1 mse:590.4769
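This figure is worth a second look. For a single-output model, Keras `model.predict` returns an array of shape (n, 1), while `data.target` (and hence `y_test`) has shape (n,). Subtracting the two without flattening makes NumPy broadcast to (n, n), so the mean is taken over every prediction/target pair rather than over matching pairs. If that happened here, 590.4769 overstates the test MSE. A small sketch with synthetic numbers (not the Boston data, and not the author's model) shows the effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 253  # size of the test half when 506 rows are split 50/50

# Synthetic targets, shape (n,), standing in for y_test.
y_true = rng.normal(22.0, 9.0, size=n)
# Synthetic predictions, shape (n, 1), the shape model.predict would return.
y_pred = (y_true + rng.normal(0.0, 3.0, size=n)).reshape(-1, 1)

diff_broadcast = y_pred - y_true        # shape (n, n): every prediction vs every target
diff_aligned = y_pred.ravel() - y_true  # shape (n,): each prediction vs its own target

mse_broadcast = np.mean(np.square(diff_broadcast))
mse_aligned = np.mean(np.square(diff_aligned))
print(diff_broadcast.shape, diff_aligned.shape)
print(f"broadcast: {mse_broadcast:.2f}  aligned: {mse_aligned:.2f}")
```

The broadcast version mostly measures the spread of the targets themselves, which is why it comes out far larger than the aligned one.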

Test-set targets compared with predictions:

[Figure: test-set targets versus model predictions]

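Conclusion

The loss plot shows that as the number of epochs grows, the model overfits severely. I split the data 50/50 into training and test sets, but with so little data the model overfits every time. To reduce overfitting, the options are to enlarge the dataset, or to adjust the network architecture or its hyperparameters, which might still produce a decent model. With only about 500 rows, the dataset is simply too small.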
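As one way to try the "adjust the architecture or parameters" route, here is a minimal sketch that is not the model used above. It assumes a much smaller network, feature standardization, dropout, L2 weight decay, and early stopping. The data is a synthetic 506 x 13 stand-in, and the layer sizes, dropout rate, L2 factor, and patience are arbitrary illustrative choices.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for a 506 x 13 regression table (NOT the Boston data).
rng = np.random.default_rng(50)
x = rng.normal(size=(506, 13)) * rng.uniform(1.0, 100.0, size=13)
y = x @ rng.normal(size=13) * 0.1 + rng.normal(scale=2.0, size=506)

# 50/50 split, mirroring the split used in the post.
x_train, x_test = x[:253], x[253:]
y_train, y_test = y[:253], y[253:]

# Standardize features with statistics from the training half only.
mean, std = x_train.mean(axis=0), x_train.std(axis=0)
x_train_s = (x_train - mean) / std
x_test_s = (x_test - mean) / std

# A much smaller network with L2 weight decay and dropout.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(13,)),
    tf.keras.layers.Dense(32, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(1e-3)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop once val_loss has not improved for 20 epochs and keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=20, restore_best_weights=True)

history = model.fit(x_train_s, y_train, epochs=200, batch_size=32,
                    validation_data=(x_test_s, y_test),
                    callbacks=[early_stop], verbose=0)

# Flatten (n, 1) predictions to (n,) before comparing with the targets.
pred = model.predict(x_test_s, verbose=0).ravel()
mse = float(np.mean(np.square(pred - y_test)))
print("epochs run:", len(history.history["loss"]), "test mse: %.4f" % mse)
```

Like the code in the post, this sketch uses the test half as validation data, so early stopping sees the test set. Holding out a separate validation split from the training half would give a cleaner estimate.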
