目录
2.1.2.3K近邻(回归)
1、模型介绍
在回归任务中,K近邻(回归)模型同样只是借助周围K个最近训练样本的目标数值,对待测样本的回归值进行决策。自然,也衍生出衡量待测样本回归值的不同方式,即到底是对K个近邻目标数值使用普通的算术平均算法,还是同时考虑距离的差异进行加权平均。因此,也初始化不同配置的K近邻(回归)模型来比较回归性能的差异。
2、数据描述
(1)美国波士顿地区房价数据描述
# 代码34:美国波士顿地区房价数据描述
# 从sklearn.datasets导入波士顿房价数据读取器。
from sklearn.datasets import load_boston
# 从读取房价数据存储在变量boston中。
boston = load_boston()
# 输出数据描述。
print(boston.DESCR)
本地输出:
.. _boston_dataset:
Boston house prices dataset
---------------------------
**Data Set Characteristics:**
:Number of Instances: 506
:Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.
:Attribute Information (in order):
- CRIM per capita crime rate by town
- ZN proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS proportion of non-retail business acres per town
- CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX nitric oxides concentration (parts per 10 million)
- RM average number of rooms per dwelling
- AGE proportion of owner-occupied units built prior to 1940
- DIS weighted distances to five Boston employment centres
- RAD index of accessibility to radial highways
- TAX full-value property-tax rate per $10,000
- PTRATIO pupil-teacher ratio by town
- B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT % lower status of the population
- MEDV Median value of owner-occupied homes in $1000's
:Missing Attribute Values: None
:Creator: Harrison, D. and Rubinfeld, D.L.
This is a copy o