week2_lab2 Gradient Descent for Linear Regression (Boston Housing dataset) (study notes)

In this exercise, you will learn to:

  1. implement the gradient descent method
  2. implement the minibatch gradient descent method

We will use the Boston Housing data, similar to Week 1. We can import the dataset and preprocess it as follows. Note that we add a constant feature 1 to x_input to get an n x (d+1) matrix x_in.

import pandas as pd
import numpy as np
from sklearn import preprocessing

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
boston_data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

data = boston_data
x_input = data  # a data matrix
y_target = target  # a vector for all outputs
# add a feature 1 to the dataset, so we do not need to consider the bias and weights separately
x_in = np.concatenate([np.ones([np.shape(x_input)[0], 1]), x_input], axis=1)
# normalize the data so that each sample has unit norm
x_in = preprocessing.normalize(x_in)

x_in = np.concatenate([np.ones([np.shape(x_input)[0], 1]), x_input], axis=1): this line prepends an extra feature (the constant 1) to the input matrix x_input, so that the bias does not have to be handled separately from the weights. In detail (see the quick shape check below):

  1. np.ones([np.shape(x_input)[0], 1]) creates a matrix of shape (n, 1), where n is the number of examples; this column equals 1 for every example.
  2. x_input is the original input feature matrix of shape (n, d), where n is the number of examples and d is the number of features.
  3. np.concatenate(…, axis=1) joins the two matrices column-wise (axis=1), producing a new feature matrix x_in whose first column is the constant 1, followed by the original features.
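
As a quick sanity check, the concatenation should add exactly one column. A minimal sketch, assuming the loading code above has run (the Boston data has n = 506 examples and d = 13 features):

print(x_input.shape)  # (506, 13): n examples, d features
print(x_in.shape)     # (506, 14): one extra constant-1 column prepended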

Linear Model & Cost Function

def linearmat_2(w, X):
    '''
    A vectorization of linearmat_1 in the Week 1 lab.
    Input: w is a weight vector (including the bias), and X is a data matrix (n x (d+1)) (including the constant feature 1)
    Output: a vector containing the predictions of the linear model
    '''
    return np.dot(X, w)

def cost(w, X, y):
    '''
    Evaluate the cost function in a vectorized manner for
    inputs `X` and outputs `y`, at weights `w`.
    '''
    residual = y - linearmat_2(w, X)  # get the residual
    err = np.dot(residual, residual) / (2 * len(y)) # compute the error

    return err
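
In vector form, cost computes $J(w) = \frac{1}{2n}\|Xw - y\|^2$, where $n$ is the number of examples. A minimal usage sketch, assuming x_in and y_target from the preprocessing step above:

w0 = np.zeros(x_in.shape[1])     # all-zeros initial weights
print(cost(w0, x_in, y_target))  # at w = 0 this equals ||y||^2 / (2n)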

Gradient Computation

Compute the gradient of the cost function. For the cost $J(w) = \frac{1}{2n}\|Xw - y\|^2$ defined above, the gradient is

$$\nabla J(w) = \frac{1}{n} X^\top (Xw - y).$$

# Vectorized gradient function
def gradfn(weights, X, y):
    '''
    Given `weights` - a current "Guess" of what our weights should be
          `X` - matrix of shape (N,d+1) of input features including the feature $1$
          `y` - target y values
    Return gradient of each weight evaluated at the current value
    '''

    y_pred = np.dot(X, weights)
    error = y_pred - y
    return np.dot(X.T, error) / len(y)
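
A standard way to validate gradfn is a finite-difference check: perturb one coordinate of the weights by a small eps and compare the change in cost against the analytic gradient. A minimal sketch, assuming x_in and y_target from above (eps and the tolerance 1e-4 are arbitrary choices):

eps = 1e-6
w_test = np.random.randn(x_in.shape[1])
analytic = gradfn(w_test, x_in, y_target)
for j in range(3):  # check the first few coordinates
    e_j = np.zeros_like(w_test)
    e_j[j] = eps
    # central finite difference of the cost along coordinate j
    numeric = (cost(w_test + e_j, x_in, y_target)
               - cost(w_test - e_j, x_in, y_target)) / (2 * eps)
    assert abs(numeric - analytic[j]) < 1e-4, (j, numeric, analytic[j])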

Gradient Descent

Use the computed gradient to run gradient descent: at each iteration, the weights are updated by

$$w \leftarrow w - \eta \, \nabla J(w),$$

where $\eta$ is the learning rate.

def solve_via_gradient_descent(X, y, print_every=100,
                               niter=5000, eta=1):
    '''
    Given `X` - matrix of shape (N,D) of input features
          `y` - target y values
          `print_every` - we report performance every 'print_every' iterations
          `niter` - the number of iterates allowed
          `eta` - learning rate
    Solves for linear regression weights with gradient descent.

    Return
        `w` - weights after `niter` iterations
        `idx_res` - the indices of iterations where we compute the cost
        `err_res` - the cost at iterations indicated by idx_res
    '''
    N, D = np.shape(X)
    # initialize all the weights to zeros
    w = np.zeros([D])
    idx_res = []
    err_res = []
    for k in range(niter):
        # compute the gradient
        dw = gradfn(w, X, y)
        # gradient descent
        w = w - eta * dw
        # we report the progress every print_every iterations
        if k % print_every == print_every - 1:
            t_cost = cost(w, X, y)
            print('error after %d iterations: %s' % (k + 1, t_cost))
            idx_res.append(k)
            err_res.append(t_cost)
    return w, idx_res, err_res
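
A minimal usage sketch, assuming x_in and y_target from the preprocessing step above; with the normalized features, the default learning rate eta=1 is a reasonable starting point (decrease eta if the reported cost grows):

w_gd, idx_gd, err_gd = solve_via_gradient_descent(x_in, y_target)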

Minibatch Gradient Descent

At each iteration, instead of the full gradient, minibatch gradient descent uses the gradient computed on a random minibatch $B$ of $b$ examples:

$$w \leftarrow w - \frac{\eta}{b} X_B^\top (X_B w - y_B),$$

where $X_B$ and $y_B$ are the rows of $X$ and entries of $y$ indexed by $B$.

import random

def solve_via_minibatch(X, y, print_every=100,
                        niter=5000, eta=1, batch_size=50):
    '''
    Solves for linear regression weights with minibatch gradient descent.
    Given `X` - matrix of shape (N,D) of input features
          `y` - target y values
          `print_every` - we report performance every 'print_every' iterations
          `niter` - the number of iterates allowed
          `eta` - learning rate
          `batch_size` - the size of minibatch
    Return
        `w` - weights after `niter` iterations
        `idx_res` - the indices of iterations where we compute the cost
        `err_res` - the cost at iterations
    '''
    N, D = np.shape(X)
    # initialize all the weights to zeros
    w = np.zeros([D])
    idx_res = []
    err_res = []
    tset = list(range(N))
    for k in range(niter):
        # sample a random minibatch of `batch_size` examples (without replacement)
        idx = random.sample(tset, batch_size)
        sample_X = X[idx, :]
        sample_y = y[idx]
        # gradient step using only the sampled minibatch
        dw = gradfn(w, sample_X, sample_y)
        w = w - eta * dw
        if k % print_every == print_every - 1:
            t_cost = cost(w, X, y)
            print('error after %d iterations: %s' % (k + 1, t_cost))
            idx_res.append(k)
            err_res.append(t_cost)
    return w, idx_res, err_res
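
Analogously, a minimal call for the minibatch version, again assuming x_in and y_target from above (a smaller batch_size gives cheaper but noisier updates):

w_mb, idx_mb, err_mb = solve_via_minibatch(x_in, y_target, batch_size=50)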

Comparison between Minibatch Gradient Descent and Gradient Descent

[Figure: training cost versus iteration for full-batch gradient descent and minibatch gradient descent]
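
A sketch to reproduce such a comparison with matplotlib, assuming the two runs above produced (idx_gd, err_gd) and (idx_mb, err_mb):

import matplotlib.pyplot as plt

plt.plot(idx_gd, err_gd, label='gradient descent')
plt.plot(idx_mb, err_mb, label='minibatch gradient descent')
plt.xlabel('iteration')
plt.ylabel('cost')
plt.legend()
plt.show()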
