Machine Learning: Regression Prediction with Gradient Descent

This post shows how to build and train a linear regression model with Python's sklearn library to predict house prices in the Boston area. It first loads the Boston housing dataset and explores the correlations between features, then splits the data into training and test sets, trains a linear regression model, and evaluates its performance on the test set.

[1]:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression

[2]:

from sklearn.datasets import load_boston

[3]:

dir(load_boston())
E:\Anaconda\lib\site-packages\sklearn\utils\deprecation.py:87: FutureWarning: Function load_boston is deprecated; `load_boston` is deprecated in 1.0 and will be removed in 1.2.

    The Boston housing prices dataset has an ethical problem. You can refer to
    the documentation of this function for further details.

    The scikit-learn maintainers therefore strongly discourage the use of this
    dataset unless the purpose of the code is to study and educate about
    ethical issues in data science and machine learning.

    In this special case, you can fetch the dataset from the original
    source::

        import pandas as pd
        import numpy as np


        data_url = "http://lib.stat.cmu.edu/datasets/boston"
        raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
        data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
        target = raw_df.values[1::2, 2]

    Alternative datasets include the California housing dataset (i.e.
    :func:`~sklearn.datasets.fetch_california_housing`) and the Ames housing
    dataset. You can load the datasets as follows::

        from sklearn.datasets import fetch_california_housing
        housing = fetch_california_housing()

    for the California housing dataset and::

        from sklearn.datasets import fetch_openml
        housing = fetch_openml(name="house_prices", as_frame=True)

    for the Ames housing dataset.
    
  warnings.warn(msg, category=FutureWarning)

[3]:

['DESCR', 'data', 'data_module', 'feature_names', 'filename', 'target']
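
Note: as the warning above says, `load_boston` is removed in scikit-learn 1.2+. If the import fails, the warning's own recipe can be used to rebuild the same `X`/`y` arrays; this is a sketch of that fallback (the column names come from the DESCR printed below), reusing the `np` and `pd` imports from the first cell:

# Fallback for scikit-learn >= 1.2, following the recipe in the warning above.
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])  # 506 x 13 feature matrix
y = raw_df.values[1::2, 2]                                       # MEDV target, length 506
feature_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
                 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']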

[4]:

print(load_boston().DESCR)
.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the proportion of black people by town
        - LSTAT    % lower status of the population
        - MEDV     Median value of owner-occupied homes in $1000's

    :Missing Attribute Values: None

    :Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
https://archive.ics.uci.edu/ml/machine-learning-databases/housing/


This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978.   Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980.   N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.   
     
.. topic:: References

   - Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
   - Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.

[5]:

X = load_boston().data
y = load_boston().target
print(X.shape,y.shape)
df = pd.DataFrame(X, columns=load_boston().feature_names)
df.head()
(506, 13) (506,)

[5]:

      CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  PTRATIO       B  LSTAT
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0     15.3  396.90   4.98
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0     17.8  396.90   9.14
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0     17.8  392.83   4.03
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0     18.7  394.63   2.94
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0     18.7  396.90   5.33
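
A quick sanity check (an optional addition, not in the original notebook) confirms the frame has no missing values, consistent with the DESCR above, and shows the very different scales of the features:

# Sanity checks: missing values and feature scales.
print(df.isnull().sum().sum())            # 0 -- no missing values
print(df.describe().loc[['mean', 'std']])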

[6]:

plt.figure(figsize=(12,8))
sns.heatmap(df.corr(), annot=True)  # annotated correlation heatmap
plt.show()

[6]:

[Output: annotated Seaborn heatmap of the 13x13 feature correlation matrix]

[7]:

df.corr()

[7]:

             CRIM        ZN     INDUS      CHAS       NOX        RM       AGE       DIS       RAD       TAX   PTRATIO         B     LSTAT
CRIM     1.000000 -0.200469  0.406583 -0.055892  0.420972 -0.219247  0.352734 -0.379670  0.625505  0.582764  0.289946 -0.385064  0.455621
ZN      -0.200469  1.000000 -0.533828 -0.042697 -0.516604  0.311991 -0.569537  0.664408 -0.311948 -0.314563 -0.391679  0.175520 -0.412995
INDUS    0.406583 -0.533828  1.000000  0.062938  0.763651 -0.391676  0.644779 -0.708027  0.595129  0.720760  0.383248 -0.356977  0.603800
CHAS    -0.055892 -0.042697  0.062938  1.000000  0.091203  0.091251  0.086518 -0.099176 -0.007368 -0.035587 -0.121515  0.048788 -0.053929
NOX      0.420972 -0.516604  0.763651  0.091203  1.000000 -0.302188  0.731470 -0.769230  0.611441  0.668023  0.188933 -0.380051  0.590879
RM      -0.219247  0.311991 -0.391676  0.091251 -0.302188  1.000000 -0.240265  0.205246 -0.209847 -0.292048 -0.355501  0.128069 -0.613808
AGE      0.352734 -0.569537  0.644779  0.086518  0.731470 -0.240265  1.000000 -0.747881  0.456022  0.506456  0.261515 -0.273534  0.602339
DIS     -0.379670  0.664408 -0.708027 -0.099176 -0.769230  0.205246 -0.747881  1.000000 -0.494588 -0.534432 -0.232471  0.291512 -0.496996
RAD      0.625505 -0.311948  0.595129 -0.007368  0.611441 -0.209847  0.456022 -0.494588  1.000000  0.910228  0.464741 -0.444413  0.488676
TAX      0.582764 -0.314563  0.720760 -0.035587  0.668023 -0.292048  0.506456 -0.534432  0.910228  1.000000  0.460853 -0.441808  0.543993
PTRATIO  0.289946 -0.391679  0.383248 -0.121515  0.188933 -0.355501  0.261515 -0.232471  0.464741  0.460853  1.000000 -0.177383  0.374044
B       -0.385064  0.175520 -0.356977  0.048788 -0.380051  0.128069 -0.273534  0.291512 -0.444413 -0.441808 -0.177383  1.000000 -0.366087
LSTAT    0.455621 -0.412995  0.603800 -0.053929  0.590879 -0.613808  0.602339 -0.496996  0.488676  0.543993  0.374044 -0.366087  1.000000
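
The matrix above only relates features to each other. To see which features track the target most closely, one can append `y` (MEDV, per the DESCR) to a copy of the frame and sort by absolute correlation; this is an optional extra, not in the original notebook:

# Rank features by |correlation| with the target MEDV.
df_t = df.copy()
df_t['MEDV'] = y
corr_t = df_t.corr()['MEDV'].drop('MEDV')
print(corr_t.reindex(corr_t.abs().sort_values(ascending=False).index))

On this dataset, LSTAT and RM typically top the ranking, which matches their prominent coefficients later on.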

[8]:

from sklearn.model_selection import train_test_split

[9]:

# Split the data 80/20 with random seed (random_state) 888
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=888)

[10]:

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
(404, 13)
(102, 13)
(404,)
(102,)

[11]:

# Build and train the linear regression model
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)

[11]:

LinearRegression()

[12]:

coef = linear_model.coef_  # regression coefficients
print(coef)
[-1.19007229e-01  3.64055815e-02  1.68552680e-02  2.29397031e+00
 -1.60706448e+01  3.72371469e+00  9.22765437e-03 -1.30674803e+00
  3.43072685e-01 -1.45830386e-02 -9.73486692e-01  7.89797436e-03
 -5.72555056e-01]

[13]:

linear_model.score(X_test,y_test)

[13]:

0.7558932220633297
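
`score` here is the coefficient of determination R² on the held-out test set. For a fuller picture one can also compute error metrics explicitly with `sklearn.metrics` (an optional addition):

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_pred = linear_model.predict(X_test)
print('R2  :', r2_score(y_test, y_pred))                     # matches score() above
print('MSE :', mean_squared_error(y_test, y_pred))
print('RMSE:', np.sqrt(mean_squared_error(y_test, y_pred)))  # same units as MEDV ($1000s)
print('MAE :', mean_absolute_error(y_test, y_pred))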

[14]:

df_coef = pd.DataFrame()
df_coef['Title'] = df.columns
df_coef['Coef'] = coef
df_coef

[14]:

      Title       Coef
0      CRIM  -0.119007
1        ZN   0.036406
2     INDUS   0.016855
3      CHAS   2.293970
4       NOX -16.070645
5        RM   3.723715
6       AGE   0.009228
7       DIS  -1.306748
8       RAD   0.343073
9       TAX  -0.014583
10  PTRATIO  -0.973487
11        B   0.007898
12    LSTAT  -0.572555
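
These raw coefficients are not directly comparable across features because the inputs live on very different scales (NOX stays below 1 while TAX is in the hundreds), so the large NOX coefficient does not by itself mean NOX matters most. A sketch, not in the original notebook, of refitting on standardized features so the magnitudes become comparable:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

scaled_model = make_pipeline(StandardScaler(), LinearRegression())
scaled_model.fit(X_train, y_train)
# Each coefficient is now the change in MEDV per one standard deviation of a feature.
print(pd.Series(scaled_model[-1].coef_, index=df.columns).sort_values())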

[15]:

# Generate predictions for the test set X_test
line_pre = linear_model.predict(X_test)

[16]:

hos_pre = pd.DataFrame()
hos_pre['Predict'] = line_pre
hos_pre['Truth'] = y_test
hos_pre.plot(figsize=(18,8))

[16]:

<AxesSubplot:>
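
The index-ordered line plot overlays predictions and ground truth sample by sample. A predicted-vs-true scatter with a y = x reference line (an optional extra, not in the original) makes systematic over- and under-prediction easier to see:

plt.figure(figsize=(8, 8))
plt.scatter(y_test, line_pre, alpha=0.6)
lims = [min(y_test.min(), line_pre.min()), max(y_test.max(), line_pre.max())]
plt.plot(lims, lims, 'r--')  # perfect predictions would fall on this line
plt.xlabel('Truth (MEDV, $1000s)')
plt.ylabel('Predicted')
plt.show()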

Full code

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

dir(load_boston())
print(load_boston().DESCR)

X = load_boston().data
y = load_boston().target
print(X.shape, y.shape)
df = pd.DataFrame(X, columns=load_boston().feature_names)
df.head()

# Annotated correlation heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(df.corr(), annot=True)
plt.show()
df.corr()

# Split the data 80/20 with random seed (random_state) 888
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=888)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

# Build and train the linear regression model
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
coef = linear_model.coef_  # regression coefficients
print(coef)
linear_model.score(X_test, y_test)

df_coef = pd.DataFrame()
df_coef['Title'] = df.columns
df_coef['Coef'] = coef
df_coef

# Generate predictions for the test set X_test
line_pre = linear_model.predict(X_test)
hos_pre = pd.DataFrame()
hos_pre['Predict'] = line_pre
hos_pre['Truth'] = y_test
hos_pre.plot(figsize=(18, 8))
plt.show()

Gradient descent in brief

Gradient descent is a widely used optimization algorithm in machine learning. Its goal is to find, or converge toward, the minimum of an objective function by iteration. The idea is often introduced with a hill-descent analogy: update the parameters along the direction of the objective's gradient, step by step, until you reach the bottom.

In machine learning, gradient descent is typically used to minimize a loss function. For simple linear regression the loss can be minimized in closed form by least squares; in most other cases, however, the loss is nonlinear and complex, which is why gradient descent is applied so broadly. Many successful optimizers were inspired by it, such as AdaGrad, RMSProp, and Momentum.

The core idea is to use the gradient of the objective to determine the update direction. The gradient measures the rate of change of the function at a point, and the function decreases fastest along the negative gradient, so gradient descent updates the parameters in the direction opposite to the gradient, moving toward the minimum. The iteration continues until it converges or a stopping condition is met.

In practice the basic algorithm is usually refined for efficiency and accuracy: a learning rate controls the step size of each update, and (mini-)batch variants compute the gradient over several samples at once. These refinements help the algorithm converge faster and find better solutions.
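
Although the notebook above fits `LinearRegression`, which solves least squares in closed form, the same model can be fit by gradient descent on the MSE loss. Below is a minimal sketch under stated assumptions, none of which come from the original post: full-batch updates, a fixed learning rate, a fixed iteration count, and features standardized inside the function (plain gradient descent converges poorly on the raw Boston feature scales):

# Minimal batch gradient descent for linear regression on the MSE loss.
def gd_linear_regression(X, y, lr=0.1, n_iters=1000):
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma            # standardize so one learning rate suits all features
    n, d = Xs.shape
    w, b = np.zeros(d), 0.0          # weights and intercept
    for _ in range(n_iters):
        err = Xs @ w + b - y                 # residuals, shape (n,)
        w -= lr * (2.0 / n) * (Xs.T @ err)   # gradient of MSE w.r.t. w
        b -= lr * (2.0 / n) * err.sum()      # gradient of MSE w.r.t. b
    return w, b, mu, sigma

w, b, mu, sigma = gd_linear_regression(X_train, y_train)
y_pred_gd = ((X_test - mu) / sigma) @ w + b  # should closely match linear_model.predict(X_test)

scikit-learn's built-in stochastic variant of this idea is `sklearn.linear_model.SGDRegressor`.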