Machine Learning and Data Mining, Lab 2: Training a Multivariate Linear Regression Model with Gradient Descent

Topic: Training a multivariate linear regression model with gradient descent

Objective: Master the basic principles of linear regression and the gradient descent method

Environment (hardware and software): Anaconda / Jupyter Notebook / PyCharm

Tasks:

(1) Implement multivariate linear regression trained by gradient descent, including computation and verification of the gradient (a numerical check is sketched after gradient_descent below);

(2) Draw a scatter plot of the data and the fitted line (a plane here, since there are two features);

(3) Plot how the loss changes during gradient descent;

(4) With the trained parameters, feed in new sample data and output the predicted values.

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from sklearn.model_selection import train_test_split

# Read the housing data (adjust the path and the header argument to match your file).
train_data = pd.read_csv("enter your own file path here", header=1,
                         names=['Size', 'Bedrooms', 'Price'])
x = np.array(train_data, 'float32')
x = np.delete(x, [2], axis=1)                # features: Size and Bedrooms
y = np.array(train_data.Price, 'float32')    # target: Price

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)


def scaler(train, test):
    '''
    Min-max normalization: scale each column of the training set to [0, 1]
    and apply the same (training-set) minimum and range to the test set.
    :param train: training array, modified in place
    :param test: test array, modified in place
    :return: train, test
    '''
    col_min = train.min(axis=0)
    col_max = train.max(axis=0)
    gap = col_max - col_min
    train -= col_min
    train /= gap
    test -= col_min
    test /= gap
    return train, test
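
# A quick sanity check of scaler() on made-up numbers (illustrative only, not
# from the lab data): the test column is scaled with the *training* minimum and
# range, so scaled test values may fall outside [0, 1].
# demo_train = np.array([[100.0], [200.0], [300.0]])
# demo_test = np.array([[150.0], [400.0]])
# demo_train, demo_test = scaler(demo_train, demo_test)
# print(demo_train.ravel())   # -> [0, 0.5, 1]
# print(demo_test.ravel())    # -> [0.25, 1.5]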


def min_max_gap(train):
    '''
    Return the column-wise minimum, maximum, and range of the training set
    (used later to map normalized predictions back to the original scale).
    :param train: training array
    :return: min, max, gap
    '''
    col_min = train.min(axis=0)
    col_max = train.max(axis=0)
    gap = col_max - col_min
    return col_min, col_max, gap


# Keep the target's statistics and an unscaled copy of the features,
# so that later predictions can be mapped back to the original scale.
y_min, y_max, y_gap = min_max_gap(y_train)
x_train_origin = x_train.copy()



def loss_function(x, y, W):
    '''
    Mean squared error cost: J(W) = sum((x·Wᵀ - y)²) / (2m)
    :param x: feature matrix, shape (m, 2)
    :param y: target vector, shape (m,)
    :param W: weight row vector, shape (1, 2)
    :return: scalar cost
    '''
    y_hat = x.dot(W.T)
    loss = y_hat - y.reshape((len(y_hat), 1))
    cost = np.sum(loss ** 2) / (2 * len(x))
    return cost
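
# A quick hand check of loss_function on a tiny made-up example (values are
# assumed for illustration): x = [[1, 2]], W = [[1, 1]] gives y_hat = 3, and
# with y = [1] the cost is (3 - 1)^2 / (2 * 1) = 2.0.
# print(loss_function(np.array([[1.0, 2.0]]), np.array([1.0]), np.array([[1.0, 1.0]])))  # 2.0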


# Normalize the features and the target using training-set statistics.
x_train, x_test = scaler(x_train, x_test)
y_train, y_test = scaler(y_train, y_test)

iterations = 1000
alpha = 0.1                          # learning rate
weight = np.array([[1.0, 1.0]])      # initial weights, one per feature (no bias term)
# print(loss_function(x_train, y_train, weight))


def gradient_descent(x, y, w, lr, iterations):
    '''
    Batch gradient descent: every iteration computes the gradient over all
    samples, takes one step, and records the loss and weight history.
    :param x: feature matrix, shape (m, 2)
    :param y: target vector, shape (m,)
    :param w: weight row vector, shape (1, 2)
    :param lr: learning rate
    :param iterations: number of iterations
    :return: l_history, w_history
    '''
    l_history = np.zeros(iterations)
    w_history = np.zeros((iterations, 2))
    for it in range(iterations):
        y_hat = x.dot(w.T)
        loss = y_hat - y.reshape((len(y_hat), 1))
        derivative_w = x.T.dot(loss) / len(x)   # gradient of the cost w.r.t. w
        derivative_w = derivative_w.T
        w = w - lr * derivative_w
        l_history[it] = loss_function(x, y, w)
        w_history[it] = w
    return l_history, w_history
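
# Requirement (1) also asks for gradient verification.  A minimal numerical
# check (an illustrative sketch, not part of the original assignment code):
# compare the analytic gradient x.T.dot(loss) / m, as used in gradient_descent,
# against central finite differences of loss_function.
def check_gradient(x, y, w, eps=1e-5):
    y_hat = x.dot(w.T)
    loss = y_hat - y.reshape((len(y_hat), 1))
    analytic = (x.T.dot(loss) / len(x)).T          # same formula as gradient_descent
    numeric = np.zeros_like(w, dtype=float)
    for j in range(w.shape[1]):
        w_plus, w_minus = w.astype(float), w.astype(float)
        w_plus[0, j] += eps
        w_minus[0, j] -= eps
        numeric[0, j] = (loss_function(x, y, w_plus) - loss_function(x, y, w_minus)) / (2 * eps)
    return np.max(np.abs(analytic - numeric))      # should be close to zero

# print(check_gradient(x_train, y_train, weight))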


def linear_regression(x, y, weight, alpha, iterations):
    loss_history, weight_history = gradient_descent(x, y, weight, alpha, iterations)
    print("Final training loss:", loss_history[-1])
    return loss_history, weight_history


loss_history, weight_history = linear_regression(x_train, y_train, weight, alpha, iterations)
print(weight_history)
print("#" * 20)
print(loss_history)

plt.title("Loss over gradient descent iterations")
plt.plot(np.arange(iterations), loss_history, 'r')
plt.show()

theta = weight_history[-1]   # final weights after training
print(theta)

# Fitted plane over the (normalized) feature space, with the training points.
x1 = np.linspace(x_train[:, 0].min(), x_train[:, 0].max(), 100)
x2 = np.linspace(x_train[:, 1].min(), x_train[:, 1].max(), 100)
x1, x2 = np.meshgrid(x1, x2)
f = theta[0] * x1 + theta[1] * x2

fig = plt.figure()
Ax = fig.add_subplot(projection='3d')
Ax.plot_surface(x1, x2, f,
                rstride=1,
                cstride=1,
                cmap=plt.get_cmap('winter'))

plt.title("3D scatter plot with the fitted plane")
Ax.scatter(x_train[:, 0], x_train[:, 1], y_train, c="r")
plt.show()


def costs_fun(theta1, theta2):
    '''
    Cost as a function of the two weights; used to draw the loss surface.
    '''
    global x_train, y_train
    theta = np.array([theta1, theta2])
    y_hat = x_train.dot(theta)
    loss = y_hat - y_train
    cost = np.sum(loss ** 2) / (2 * len(x_train))
    return cost


theta1 = np.arange(0.0, 1.0, 0.005)
theta2 = np.arange(0.0, 1.0, 0.005)
theta1, theta2 = np.meshgrid(theta1, theta2)
f = np.array(list(map(lambda t: costs_fun(t[0], t[1]), zip(theta1.flatten(), theta2.flatten()))))
f = f.reshape(theta1.shape[0], -1)
fig = plt.figure()
Ax = fig.add_subplot(projection='3d')
Ax.plot_surface(theta1, theta2, f, rstride=1, cstride=1, cmap=plt.get_cmap('winter'))
plt.title('Loss surface over the two parameters theta1 and theta2')
plt.show()


# New "samples": 1650 random points (standard normal), scaled with the
# training-set statistics.  Note that scaler() works in place, so
# x_train_origin itself is rescaled here as well.
x_plan = np.random.randn(1650, 2)
x_train, x_plan = scaler(x_train_origin, x_plan)

t = weight_history[-1].reshape(2, -1)    # final weights as a column vector
y_plan = np.dot(x_plan, t)               # predictions on the normalized scale

y_value = y_plan * y_gap + y_min         # map predictions back to the original price scale


print(x_plan)
print(y_value)
print("Predicted values:", y_value.astype(int))

Multivariate linear regression trained by gradient descent is an algorithm for solving linear regression problems with several feature variables. We build a cost function and look for the set of parameters that minimizes it; gradient descent adjusts the parameter values iteratively so that the cost keeps decreasing until the optimal parameters are found.

Concretely, gradient descent uses the partial derivatives of the cost function with respect to the parameters to decide the update direction. At every iteration the parameters are updated from their current values and the current derivatives, driving the cost down, and the process repeats until a predefined stopping condition is met.

For multivariate linear regression we can use batch gradient descent to minimize the cost: each update is computed from the gradient over all samples, so every iteration passes over the whole training set. Because the cost of linear regression is convex, this reaches the global optimum, at the price of more computation per step.

In summary, gradient-descent-based multivariate linear regression solves regression problems with several features by iteratively adjusting the parameters: the partial derivatives of the cost function give the update direction, and repeated updates gradually reduce the cost until the optimal parameters are found.
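In vector form (the same update implemented by gradient_descent above), with the m training rows stacked in a matrix X, the targets in a vector y, and learning rate alpha, each batch step evaluates the cost and updates the weights as

    J(w) = (1 / (2m)) * ||X·w - y||^2
    w   := w - (alpha / m) * Xᵀ·(X·w - y)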
