week2_lab2 Gradient Descent for Linear Regression (Boston Housing dataset) (Study Notes)
In this exercise, you will learn the following:
- implement the gradient descent method
- implement the minibatch gradient descent method
We will use the Boston Housing data, similar to Week 1. We can import the dataset and preprocess it as follows. Note we add a feature of 1 to x_input to get an n x (d+1) matrix x_in.
import pandas as pd
import numpy as np
from sklearn import preprocessing

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
boston_data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

data = boston_data
x_input = data  # a data matrix
y_target = target  # a vector for all outputs
# add a feature 1 to the dataset, so that we do not need to treat the bias and the weights separately
x_in = np.concatenate([np.ones([np.shape(x_input)[0], 1]), x_input], axis=1)
# normalize each row of the data to unit L2 norm
x_in = preprocessing.normalize(x_in)
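A quick sanity check (a sketch, assuming the cell above ran successfully): the Boston data has 506 examples and 13 features, so after prepending the constant column, x_in should be 506 x 14.

print(x_in.shape)  # expected: (506, 14), i.e. 13 features plus the constant column
print(y_target.shape)  # expected: (506,)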
x_in = np.concatenate([np.ones([np.shape(x_input)[0], 1]), x_input], axis=1): this line prepends an extra feature (the constant 1) to the input feature matrix x_input, so that the bias and the weights do not need to be handled separately. In detail (a toy sketch follows this list):
- np.ones([np.shape(x_input)[0], 1]) creates a matrix of shape (n, 1), where n is the number of examples; the entry in this column is 1 for every example.
- x_input is the original input feature matrix of shape (n, d), where n is the number of examples and d is the number of features.
- np.concatenate(…, axis=1) joins the two matrices column-wise (axis=1) into a new feature matrix x_in, whose first column is all ones (the constant term), followed by the original features.
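A minimal sketch of the same operation on a toy matrix (the array below is made up purely for illustration):

toy = np.array([[2.0, 3.0],
                [4.0, 5.0],
                [6.0, 7.0]])  # shape (3, 2): n = 3 examples, d = 2 features
toy_in = np.concatenate([np.ones([toy.shape[0], 1]), toy], axis=1)
print(toy_in)  # first column is all ones
print(toy_in.shape)  # (3, 3), i.e. n x (d + 1)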
Linear Model & Cost Function
def linearmat_2(w, X):
    '''
    A vectorization of linearmat_1 in the Week 1 lab.
    Input: w is a weight parameter (including the bias), and X is a data matrix (n x (d+1)) (including the feature 1)
    Output: a vector containing the predictions of linear models
    '''
    return np.dot(X, w)

def cost(w, X, y):
    '''
    Evaluate the cost function in a vectorized manner for
    inputs `X` and outputs `y`, at weights `w`.
    '''
    residual = y - linearmat_2(w, X)  # get the residual
    err = np.dot(residual, residual) / (2 * len(y))  # compute the halved mean squared error
    return err
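To convince ourselves the vectorized cost is right, we can compare it against a naive loop on small random data (a sketch; all names below are made up for the check):

rng = np.random.default_rng(0)
X_chk = rng.standard_normal((5, 3))
y_chk = rng.standard_normal(5)
w_chk = rng.standard_normal(3)
# cost computed example by example, then halved and averaged
loop_err = sum((y_chk[i] - np.dot(X_chk[i], w_chk)) ** 2 for i in range(5)) / (2 * 5)
print(np.isclose(cost(w_chk, X_chk, y_chk), loop_err))  # expected: True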
Gradient Computation
We now compute the gradient of the cost function.
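With the cost $J(w) = \frac{1}{2n}\|Xw - y\|^2$, the gradient is $\nabla J(w) = \frac{1}{n} X^\top (Xw - y)$, which is exactly what the vectorized function below returns.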
# Vectorized gradient function
def gradfn(weights, X, y):
    '''
    Given `weights` - a current "guess" of what our weights should be
          `X` - matrix of shape (N, d+1) of input features including the feature $1$
          `y` - target y values
    Return the gradient with respect to each weight, evaluated at the current weights
    '''
    y_pred = np.dot(X, weights)  # predictions of the linear model
    error = y_pred - y           # residual vector
    return np.dot(X.T, error) / len(y)
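A standard way to validate gradfn is a central finite-difference check against cost (a sketch on random data; the names are illustrative):

rng = np.random.default_rng(1)
X_chk = rng.standard_normal((10, 4))
y_chk = rng.standard_normal(10)
w_chk = rng.standard_normal(4)
eps = 1e-6
num_grad = np.zeros_like(w_chk)
for j in range(len(w_chk)):
    e = np.zeros_like(w_chk)
    e[j] = eps
    # central difference approximation of the j-th partial derivative
    num_grad[j] = (cost(w_chk + e, X_chk, y_chk) - cost(w_chk - e, X_chk, y_chk)) / (2 * eps)
print(np.allclose(gradfn(w_chk, X_chk, y_chk), num_grad))  # expected: True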
Gradient Descent
We now use the computed gradient to run gradient descent.
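Each iteration applies the standard update $w_{k+1} = w_k - \eta \nabla J(w_k)$, where $\eta$ is the learning rate eta below.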
def solve_via_gradient_descent(X, y, print_every=100,
                               niter=5000, eta=1):
    '''
    Given `X` - matrix of shape (N, D) of input features
          `y` - target y values
          `print_every` - we report performance every `print_every` iterations
          `niter` - the number of iterations allowed
          `eta` - learning rate
    Solves for linear regression weights with gradient descent.
    Return
        `w` - weights after `niter` iterations
        `idx_res` - the indices of iterations where we compute the cost
        `err_res` - the cost at the iterations indicated by idx_res
    '''
    N, D = np.shape(X)
    # initialize all the weights to zeros
    w = np.zeros([D])
    idx_res = []
    err_res = []
    for k in range(niter):
        # compute the gradient
        dw = gradfn(w, X, y)
        # gradient descent update
        w = w - eta * dw
        # we report the progress every print_every iterations
        if k % print_every == print_every - 1:
            t_cost = cost(w, X, y)
            print('error after iteration %d: %s' % (k, t_cost))
            idx_res.append(k)
            err_res.append(t_cost)
    return w, idx_res, err_res
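A typical call on the preprocessed data (assuming x_in and y_target from the cells above; the hyperparameters are simply the function's defaults, not tuned values):

w_gd, idx_gd, err_gd = solve_via_gradient_descent(x_in, y_target)
print(w_gd[:3])  # first few learned weights (the bias comes first)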
Minibatch Gradient Descent
import random

def solve_via_minibatch(X, y, print_every=100,
                        niter=5000, eta=1, batch_size=50):
    '''
    Solves for linear regression weights with minibatch gradient descent.
    Given `X` - matrix of shape (N, D) of input features
          `y` - target y values
          `print_every` - we report performance every `print_every` iterations
          `niter` - the number of iterations allowed
          `eta` - learning rate
          `batch_size` - the size of the minibatch
    Return
        `w` - weights after `niter` iterations
        `idx_res` - the indices of iterations where we compute the cost
        `err_res` - the cost at the iterations indicated by idx_res
    '''
    N, D = np.shape(X)
    # initialize all the weights to zeros
    w = np.zeros([D])
    idx_res = []
    err_res = []
    tset = list(range(N))
    for k in range(niter):
        # update w by minibatch gradient descent:
        # sample a batch of data without replacement
        idx = random.sample(tset, batch_size)
        sample_X = X[idx, :]
        sample_y = y[idx]
        # gradient on the minibatch only
        dw = gradfn(w, sample_X, sample_y)
        w = w - eta * dw
        if k % print_every == print_every - 1:
            t_cost = cost(w, X, y)
            print('error after iteration %d: %s' % (k, t_cost))
            idx_res.append(k)
            err_res.append(t_cost)
    return w, idx_res, err_res
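As with the full-batch version, we can run the minibatch solver on the same data and compare the two cost curves (a sketch; matplotlib is assumed to be available, and idx_gd/err_gd come from the gradient descent call above):

import matplotlib.pyplot as plt

w_mb, idx_mb, err_mb = solve_via_minibatch(x_in, y_target, batch_size=50)
plt.plot(idx_gd, err_gd, label='gradient descent')
plt.plot(idx_mb, err_mb, label='minibatch gradient descent')
plt.xlabel('iteration')
plt.ylabel('cost')
plt.legend()
plt.show()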