1. Univariate Linear Regression
1.1 Assignment Overview
In this exercise, you will implement linear regression with one variable to predict the profits of a food truck. Suppose you are the CEO of a restaurant franchise and are considering opening a new outlet in a different city. The chain already has trucks in various cities, and you have data on each city's profit and population.
1.2 Importing the Modules
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm
from mpl_toolkits.mplot3d import axes3d, Axes3D
from computeCost import *      # cost function module
from gradientDescent import *  # gradient descent module
from plotData import *         # plotting module
1.3 Loading the Data
The dataset (ex1data1.txt) is not included here; please download it yourself.
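If the file is not at hand, a synthetic stand-in with the same two-column CSV layout (population, profit) lets the rest of the code run; the coefficients and noise level below are made up purely for illustration:
import numpy as np

# Generate a hypothetical stand-in for ex1data1.txt: 97 rows of
# comma-separated (population, profit) values.
rng = np.random.default_rng(0)
pop = rng.uniform(5, 25, size=97)
profit = -4.0 + 1.2 * pop + rng.normal(0, 3, size=97)
np.savetxt('ex1data1.txt', np.c_[pop, profit], delimiter=',', fmt='%.4f')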
# ===================== Part 1: Plotting =====================
data = np.loadtxt('ex1data1.txt', delimiter=',', usecols=(0, 1))
X = data[:, 0]  # population of each city (in 10,000s)
y = data[:, 1]  # profit of each city (in $10,000s)
m = y.size      # number of training examples
1.4 Displaying the Dimensions of X and y
X holds 97 single-variable population values; y holds the 97 corresponding profit values.
print('X.shape: ', X.shape, '\ny.shape: ', y.shape)
X.shape: (97,)
y.shape: (97,)
1.5 The Plotting Function (plotData.py)
Draws a scatter plot of the training data.
import matplotlib.pyplot as plt


def plot_data(x, y):
    # ==========================================================
    # Instructions : Plot the training data into a figure using the matplotlib.pyplot
    # using the "plt.scatter" function. Set the axis labels using
    # "plt.xlabel" and "plt.ylabel". Assume the population and revenue data
    # have been passed in as the x and y.
    # 'c': marker color
    # 'marker': marker shape
    plt.scatter(x, y, c='r', marker='x')
    plt.ylabel('Profit in $10,000s')
    plt.xlabel('Population of City in 10,000s')
    # ===========================================================
    plt.show()
1.6 Calling plot_data to Draw the Figure
plt.ion()
plt.figure(figsize=(12, 8))
plot_data(X, y)
1.7 The Linear Regression Cost Function (computeCost.py)
The cost function sums, over all training samples, the squared difference between the hypothesis's prediction and the true value, and divides by twice the number of samples:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

Minimizing this cost over the training set fits the linear regression hypothesis:

h_\theta(x) = \theta_0 + \theta_1 x_1
import numpy as np


def compute_cost(X, y, theta):
    # Initialize some useful values
    m = y.size  # number of training examples
    hypothesis = np.dot(X, theta)  # predictions of the hypothesis, X @ theta
    # Squared differences between predictions and true values,
    # divided by twice the number of samples
    cost = np.sum(np.power(np.subtract(hypothesis, y), 2)) / (2 * m)
    return cost
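A quick hand-checkable test of compute_cost on a made-up two-sample dataset:
import numpy as np

# Two toy samples with the bias column already added.
X_toy = np.array([[1.0, 1.0],
                  [1.0, 2.0]])
y_toy = np.array([1.0, 2.0])
# With theta = [0, 0] every prediction is 0, so the cost is
# (1^2 + 2^2) / (2 * 2) = 1.25
print(compute_cost(X_toy, y_toy, np.zeros(2)))  # -> 1.25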
1.8 Initializing theta (All Zeros) and Computing the Initial Cost
A 1 is prepended to every training sample x^{(i)} as the bias (intercept) unit.
# ===================== Part 2: Gradient descent =====================
X = np.c_[np.ones(m), X]  # prepend a column of ones (bias unit) to every sample
theta = np.zeros(2)       # initialize theta to all zeros
# Some gradient descent settings
iterations = 1500
alpha = 0.01
# Compute and display initial cost
print('Initial cost : %.2f' % compute_cost(X, y, theta), '(This value should be about 32.07)')
Initial cost : 32.07 (This value should be about 32.07)
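The original exercise also validates the cost at a second, non-zero theta; reproducing that check here (the expected value of about 54.24 comes from the official assignment):
# Extra check from the original exercise: cost at theta = [-1, 2]
print('Cost at theta = [-1, 2]: %.2f' % compute_cost(X, y, np.array([-1.0, 2.0])),
      '(this value should be about 54.24)')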
1.9 The Batch Gradient Descent Function (gradientDescent.py)
The batch gradient descent update rule:

\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

Here \alpha is the learning rate, which determines how large a step we take in the direction in which the cost function decreases fastest. In batch gradient descent, every iteration simultaneously updates all parameters by subtracting the learning rate times the corresponding partial derivative of the cost function.
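For completeness, the update rule follows from differentiating the cost function; the factor 2 from the square cancels the 1/2 in J(\theta):

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

since \partial h_\theta(x^{(i)}) / \partial \theta_j = x_j^{(i)}.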
The descent loop below iteratively updates theta toward the cost-minimizing parameters.
import numpy as np
from computeCost import *


def gradient_descent(X, y, theta, alpha, num_iters):
    # Initialize some useful values
    m = y.size                       # number of training examples
    J_history = np.zeros(num_iters)  # cost recorded at every iteration
    for i in range(0, num_iters):
        # Error between the hypothesis's predictions and the true values
        error = np.dot(X, theta).flatten() - y
        # Simultaneously update all parameters (batch update)
        theta -= (alpha / m) * np.sum(X * error[:, np.newaxis], 0)
        # Record the cost for this iteration
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history
# Multivariate batch gradient descent uses the same update formula as the univariate case
def gradient_descent_multi(X, y, theta, alpha, num_iters):
    # Initialize some useful values
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(0, num_iters):
        error = np.dot(X, theta).flatten() - y
        theta -= (alpha / m) * np.sum(X * error[:, np.newaxis], 0)
        # Save the cost every iteration
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history
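As an aside, the summed update used above is equivalent to a single matrix-vector product; a small self-contained check on arbitrary random values:
import numpy as np

# np.sum(X * error[:, np.newaxis], 0) and X.T @ error compute the
# same per-parameter gradient sum.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 2))
err = rng.normal(size=5)
g_sum = np.sum(X_demo * err[:, np.newaxis], 0)
g_dot = X_demo.T @ err
assert np.allclose(g_sum, g_dot)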
1.10 Computing the Cost and Updating theta via Batch Gradient Descent
theta, J_history = gradient_descent(X, y, theta, alpha, iterations)
print('Theta found by gradient descent: ' + str(theta.reshape(2)))
Theta found by gradient descent: [-3.63029144 1.16636235]
1.11 Plotting the Scatter Data and the Fitted Line with the Updated theta
# Plot the linear fit
plt.ion()
plt.figure(figsize=(12, 8))
# plot_data(X[:, 1], y)
plt.scatter(X[:, 1], y, c='r', marker='X')
plt.plot(X[:, 1], np.dot(X, theta), label='Linear Regression')
plt.legend()
plt.show()
1.12 Predicting Profits for Cities of Different Populations with the Trained theta
# Predict values for population sizes of 35,000 and 70,000
predict1 = np.dot(np.array([1, 3.5]), theta)
print('For population = 35,000, we predict a profit of {:0.3f} (This value should be about 4519.77)'.format(predict1*10000))
predict2 = np.dot(np.array([1, 7]), theta)
print('For population = 70,000, we predict a profit of {:0.3f} (This value should be about 45342.45)'.format(predict2*10000))
For population = 35,000, we predict a profit of 4519.768 (This value should be about 4519.77)
For population = 70,000, we predict a profit of 45342.450 (This value should be about 45342.45)
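Several populations can also be scored at once by stacking them into a design matrix; the populations array below is illustrative:
# Vectorized prediction for several populations (units of 10,000s).
populations = np.array([3.5, 7.0, 10.0])  # illustrative values
X_new = np.c_[np.ones(populations.size), populations]  # add bias column
profits = X_new @ theta
for p, pr in zip(populations, profits):
    print('Population {:,.0f}: predicted profit ${:,.2f}'.format(p * 10000, pr * 10000))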
1.13 Plotting the Cost over the Training Iterations
plt.plot(np.arange(1, iterations + 1), J_history, c='r', marker='x')
plt.show()
The plot shows that the cost value stabilizes after roughly 200 iterations.
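The same observation can be confirmed numerically from J_history:
# The cost barely changes over the final iterations, confirming convergence.
print('Final cost: %.4f' % J_history[-1])
print('Decrease over the last 100 iterations: %.6f' % (J_history[-100] - J_history[-1]))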
1.14 Surface and Contour Plots of the Cost over theta0 and theta1
The figures show that J(\theta_0, \theta_1) is a convex, bowl-shaped surface with a single global minimum, and that the theta found by gradient descent (the red x on the contour plot) sits at that minimum.
# ===================== Part 3: Visualizing J(theta0, theta1) =====================
theta0_vals = np.linspace(-10, 10, 100)
theta1_vals = np.linspace(-1, 4, 100)
xs, ys = np.meshgrid(theta0_vals, theta1_vals)
J_vals = np.zeros(xs.shape)
# Fill out J_vals
for i in range(0, theta0_vals.size):
    for j in range(0, theta1_vals.size):
        t = np.array([theta0_vals[i], theta1_vals[j]])
        J_vals[i][j] = compute_cost(X, y, t)
# Transpose so rows follow theta1 and columns follow theta0,
# matching the 'xy' indexing of np.meshgrid
J_vals = np.transpose(J_vals)
# 3-D surface plot
fig1 = plt.figure(1)
ax = fig1.add_subplot(projection='3d')  # figure.gca(projection='3d') is removed in recent matplotlib
ax.plot_surface(xs, ys, J_vals)
plt.xlabel(r'$\theta_0$')
plt.ylabel(r'$\theta_1$')
# contour plot
plt.figure(2)
lvls = np.logspace(-2, 3, 20)
plt.contour(xs, ys, J_vals, levels=lvls, norm=LogNorm())
plt.plot(theta[0], theta[1], c='r', marker="x")
plt.show()
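As a cross-check, the lowest-cost point on the evaluation grid should land near the theta found by gradient descent (up to the grid resolution):
# Locate the grid point with the lowest cost and compare it with theta.
min_idx = np.unravel_index(np.argmin(J_vals), J_vals.shape)
print('Grid minimum near theta0 = %.2f, theta1 = %.2f' % (xs[min_idx], ys[min_idx]))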
Next post: EX1 Part 2, Multivariate Linear Regression