吴恩达机器学习作业一Python实现

理智点

已于 2023-05-16 22:51:06 修改

阅读量1.9k

点赞数 8

分类专栏：机器学习文章标签： python 机器学习开发语言

于 2023-03-21 20:24:36 首次发布

本文链接：https://blog.csdn.net/leezed525/article/details/129695417

版权

机器学习专栏收录该内容

2 篇文章 1 订阅

订阅专栏

文章介绍了作者在学习机器学习过程中对线性回归的理解和实践，包括使用Python进行数据预处理，通过梯度下降法和正规方程法求解线性回归模型，以及绘制损失函数随迭代次数和参数变化的图形，展示了学习过程中的思考和实验结果。

摘要由CSDN通过智能技术生成

作业一

文章目录

作业一

前言

最近开始学习机器学习，看了吴恩达的网课，做了作业发现有点意思，但是当时关于作业有一点疑惑不能解决，所以自己解决后决定发一篇博文，算是留个档（ps:疑问是通过问newbing解决的）

题面

生成一个5*5的单位矩阵（这个问题我觉得是拿来练手的我就没写）
单变量线性回归
多变量线性回归

第二题解析（嫌啰嗦可以直接看源代码）

第一步准备数据

一步一步来
先看题面

在这里插入图片描述
翻译一下就是说我们把数据读取并生成坐标图
这部分没什么好说的
我用的是pycharm 运行Jupyter notebook,看下我的目录结构

先看数据格式在这里插入图片描述
直接上代码

# 导入绘图包
import matplotlib.pyplot as plt
# 导入numpy包
import numpy as np

#导入数据
datafile = open('ex1data1.txt')
data = datafile.readlines()
dataCount = len(data)
tx = []
ty = []
for i in range(dataCount):
    tx.append(float(data[i].split(',')[0]))
    ty.append(float(data[i].split(',')[1]))
plt.scatter(tx, ty)
plt.show()

绘制出来的图片
在这里插入图片描述

第二步梯度下降

准备好数据之后我们就应该准备开始准备梯度下降了，希望你是看过了吴恩达的课再来的做的

我会粗略实现的原理(一下是个人粗浅的见解，可能有错误，如果有错误欢迎指正)：

为什么现在主流都是拿python实现各种AI的算法，我认为

其一是因为他的各种三方库功能强大，很多功能如果拿别的语言去写可能需要各种依赖和遇到限制。例如矩阵的各种操作，例如画图(这玩意你让我用c++写我一点头绪都没有，我是菜鸟)
其二是因为他的语法简单，让人不用拘泥于各种语法的限制，我要输出一个hello world 我就只要简简单单一句print(“hello world”)就行了，不需要各种头文件main函数乱七八糟的

而运行梯度下降需要的就是这种简单暴力
下面是预测函数和损失函数的式子
在这里插入图片描述
对于本体的数据我们应该拟合一条一元一次函数
一元一次函数的方程式为 y = b + a * x;
其中可以把b 看做 b * 1
所以我们可以吧数据抽离出来，自变量x为一个n*2的矩阵(n为训练数据量)，其中第一列为n维元素全为1的向量

# 创建自变量矩阵
x = np.matrix(tx).T
# 插入1
x = np.insert(x,0,np.ones(x.shape[0]),1)
# print(x)
y = np.matrix(ty).T

再根据式子定义损失函数和预测函数

# 损失函数
def J_theta(theta,x,y):
    return np.sum(np.power(x * theta - y,2 )) / (2 * len(x))

# 预测函数
def h_theta(x,theta):
    return x * theta

其中的theta 为一个21的矩阵，这样x * theta 的话就可以轻松地获得一个n1的预测函数值矩阵，更方便我们计算损失函数，如图

在这里插入图片描述
题目里让我们先计算theta0和theta1都为0的时候的损失值

# 初始化
theta  = [0.0,0.0] # theta0 = 0,theta1 = 0
theta = np.matrix(theta).T
# 输出初始化时候的损失数
print(J_theta(theta,x,y))
alpha = 0.01  # 定义学习速度
iterations = 15000 #迭代次数

得到结果
在这里插入图片描述

接下来开始梯度下降

# 梯度下降
lastJ = 35 #定义上一次的损失函数值，偏大一点方便运算
minx1 = 100 #定义theta 0 的最小值
maxx1 = -100 #定义theta 1 的最大值
minx2 = 100 #定义theta 0 的最小值
maxx2 = -100 #定义theta 1 的最大值
j = [] #记录每次迭代产生的损失函数值
for time in range(iterations):
    tempSum = x * theta - y # 计算预测函数的值
    minx1 = min(minx1,theta[0,0])  #更新
    maxx1 = max(maxx1,theta[0,0])  #更新
    minx2 = min(minx2,theta[1,0])  #更新
    maxx2 = max(maxx2,theta[1,0])  #更新
    J = J_theta(theta,x,y)
    j.append(J)
    if J > lastJ: # 如果j 反倒变大了，说明学习速率太快了
        alpha /= 2
    elif J == lastJ: # 如果几乎没变化，说明差不多已经优化到极限了，输出迭代次数
        print(time)
        break
    lastJ = J
    for i in range(2): #更新theta
        theta[i, 0] -= alpha * np.sum(np.multiply(tempSum, x[:, i])) / len(x) # np.multipy 是矩阵对应位置相乘，而不是使用矩阵乘法

print(minx1,maxx1,minx2,maxx2)

输出结果
在这里插入图片描述

绘图

预测的函数图

# 绘图
lx = range(5,25)
ly = [theta[0,0] + theta[1,0] * i for i in lx]

print(lastJ)

plt.scatter(tx, ty)
plt.plot(lx,ly)
plt.show()

在这里插入图片描述

损失函数随着迭代次数的变化图

plt.show()
#%%
#绘制损失函数随着次数的曲线
plt.plot([i for i in range(len(j))],j)
plt.show()

在这里插入图片描述

损失函数随着theta 1 和 theta2 的参数变化的3d曲面图

# 绘制损失函数随着参数的曲线
imaged3x = np.linspace(minx1,maxx1, num=100)
image3dy = np.linspace(minx2,maxx2,num=100)
imaged3x,image3dy = np.meshgrid(imaged3x,image3dy)

loss_grid = np.zeros((100,100))
for i in range(100):
    for j in range(100):
        tmp_theta = np.array([[imaged3x[i,j]],[image3dy[i,j]]])
        loss_grid[i,j] = J_theta(tmp_theta,x,y)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(imaged3x, image3dy, loss_grid,cmap='rainbow')
ax.set_xlabel('theta1')
ax.set_ylabel('theta2')
ax.set_zlabel('loss')
ax.view_init( azim=45)
plt.show()

在这里插入图片描述
跟题面给的也差不了太多

在这里插入图片描述

源代码

#!/usr/bin/env python
# coding: utf-8

# In[9]:


# 导入绘图包
import matplotlib.pyplot as plt
# 导入numpy包
import numpy as np

#导入数据
datafile = open('ex1data1.txt')
data = datafile.readlines()
dataCount = len(data)
tx = []
ty = []
for i in range(dataCount):
    tx.append(float(data[i].split(',')[0]))
    ty.append(float(data[i].split(',')[1]))
plt.scatter(tx, ty)
plt.show()


# In[10]:


# 创建自变量矩阵
x = np.matrix(tx).T
# 插入1
x = np.insert(x,0,np.ones(x.shape[0]),1)
# print(x)
y = np.matrix(ty).T


# In[11]:


# 损失函数
def J_theta(theta,x,y):
    return np.sum(np.power(x * theta - y,2 )) / (2 * len(x))

# 预测函数
def h_theta(x,theta):
    return x * theta


# In[12]:


# 初始化
theta  = [0.0,0.0] # theta0 = 0,theta1 = 0
theta = np.matrix(theta).T
# 输出初始化时候的损失数
print(J_theta(theta,x,y))
alpha = 0.01  # 定义学习速度
iterations = 15000 #迭代次数


# In[13]:


# 梯度下降
lastJ = 35 #定义上一次的损失函数值，偏大一点方便运算
minx1 = 100 #定义theta 0 的最小值
maxx1 = -100 #定义theta 1 的最大值
minx2 = 100 #定义theta 0 的最小值
maxx2 = -100 #定义theta 1 的最大值
j = [] #记录每次迭代产生的损失函数值
for time in range(iterations):
    tempSum = x * theta - y # 计算预测函数的值
    minx1 = min(minx1,theta[0,0])  #更新
    maxx1 = max(maxx1,theta[0,0])  #更新
    minx2 = min(minx2,theta[1,0])  #更新
    maxx2 = max(maxx2,theta[1,0])  #更新
    J = J_theta(theta,x,y)
    j.append(J)
    if J > lastJ: # 如果j 反倒变大了，说明学习速率太快了
        alpha /= 2
    elif J == lastJ: # 如果几乎没变化，说明差不多已经优化到极限了，输出迭代次数
        print(time)
        break
    lastJ = J
    for i in range(2): #更新theta
        theta[i, 0] -= alpha * np.sum(np.multiply(tempSum, x[:, i])) / len(x) # np.multipy 是矩阵对应位置相乘，而不是使用矩阵乘法

print(minx1,maxx1,minx2,maxx2)


# In[14]:


# 绘图
lx = range(5,25)
ly = [theta[0,0] + theta[1,0] * i for i in lx]

print(lastJ)

plt.scatter(tx, ty)
plt.plot(lx,ly)
plt.show()


# In[15]:


#绘制损失函数随着次数的曲线
plt.plot([i for i in range(len(j))],j)
plt.show()


# In[16]:


# 绘制损失函数随着参数的曲线
imaged3x = np.linspace(minx1,maxx1, num=100)
image3dy = np.linspace(minx2,maxx2,num=100)
imaged3x,image3dy = np.meshgrid(imaged3x,image3dy)

loss_grid = np.zeros((100,100))
for i in range(100):
    for j in range(100):
        tmp_theta = np.array([[imaged3x[i,j]],[image3dy[i,j]]])
        loss_grid[i,j] = J_theta(tmp_theta,x,y)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(imaged3x, image3dy, loss_grid,cmap='rainbow')
ax.set_xlabel('theta1')
ax.set_ylabel('theta2')
ax.set_zlabel('loss')
ax.view_init( azim=45)
plt.show()

第三题解析（嫌啰嗦可以直接看源代码）

题面

在这里插入图片描述
首先就是跟之前一样的读入数据，但是与之前不同的是，需要先将数据进行特征值缩放，即归一化。

步骤一：抽取每个特征的平均值
步骤二：将每个数据减去对应的平均值，然后将结果除以他们的标准差

第一步，处理数据

import numpy as np
import matplotlib.pyplot as plt

file = open("./ex1data2.txt")
datas = file.readlines()
data_count = len(datas)
tx = []
ty = []
# 获取数据
for i in range(data_count):
    tx.append([float(datas[i].split(',')[0]), float(datas[i].split(',')[1])])
    ty.append(float(datas[i].split(',')[2]))
print(tx[0:10])
print(ty[0:10])
# 绘图
ax = plt.subplot(projection='3d')
ax.scatter([tx[i][0] for i in range(len(tx))], [tx[i][1] for i in range(len(tx))], [ty[i] for i in range(len(tx))])
plt.show()

得到结果

在这里插入图片描述
接着就是进行特征值缩放，归一化

其中在numpy中，可以通过np.std()来求标准差

# 任务一 : 特征值缩放 归一化
# 先将数组矩阵化
x = np.matrix(tx, dtype=float)
# 特征值缩放 : (x[i] - 平均值 ) / x 的 标准差
x[:, 0] = (x[:, 0] - np.mean(x[:, 0])) / np.std(x[:, 0])
x[:, 1] = (x[:, 1] - np.mean(x[:, 1])) / np.std(x[:, 1])
y = np.matrix(ty, dtype=float).T
# 查看图像
ax = plt.subplot(projection='3d')
ax.scatter(x[:, 0], x[:, 1], y[:, 0])
plt.show()
# 插入常数 1
x = np.insert(x, 0, np.ones(x.shape[0]), 1)

得到结果
在这里插入图片描述
接着进行预测函数 和 损失函数 的定义了``

#定义预测函数
def h_theta(x, theta):
    return x * theta


#定义损失函数
def j_theta(x, theta, y):
    return np.sum(np.power(y - h_theta(x, theta), 2)) / (2 * len(x))

第二步梯度下降

#开始梯度下降
iterations = 50  #学习次数
rate = 0.1  # 学习速度
#初始学习参数
theta = np.matrix([0, 0, 0], dtype=float).T
print(j_theta(x, theta, y))
last_j = 0
j_his = []
for i in range(iterations):
    tmpSum = h_theta(x, theta) - y
    # print(tmpSum)
    last_j = j_theta(x, theta, y)
    j_his.append(last_j)
    for j in range(3):
        theta[j, 0] -= rate * np.sum(np.multiply(tmpSum, x[:, j])) / len(x)
        # print(np.sum(np.multiply(tmpSum, x[:, j])))
    # print(theta)

print(last_j)
# print(j_his)

得到结果
在这里插入图片描述

绘图

#画出随着迭代次数变化的损失值
fig2, ax = plt.subplots(figsize=(6, 4))
ax.plot(np.arange(iterations), j_his, 'r')
ax.set_xlabel('Iteration')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()

在这里插入图片描述

最后题目里还说了让我们尝试用正规方程法

#正规方程法
normal_equations_theta = np.linalg.inv(x.T * x) * x.T * y
print(normal_equations_theta)
print(j_theta(x,theta,y))

在这里插入图片描述

最后得到的结果也跟我们梯度下降得到的结果差不多

源代码

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import numpy as np
import matplotlib.pyplot as plt

file = open("./ex1data2.txt")
datas = file.readlines()
data_count = len(datas)
tx = []
ty = []
# 获取数据
for i in range(data_count):
    tx.append([float(datas[i].split(',')[0]), float(datas[i].split(',')[1])])
    ty.append(float(datas[i].split(',')[2]))
print(tx[0:10])
print(ty[0:10])
# 绘图
ax = plt.subplot(projection='3d')
ax.scatter([tx[i][0] for i in range(len(tx))], [tx[i][1] for i in range(len(tx))], [ty[i] for i in range(len(tx))])
plt.show()


# In[2]:


# 任务一 : 特征值缩放 归一化
# 先将数组矩阵化
x = np.matrix(tx, dtype=float)
# 特征值缩放 : (x[i] - 平均值 ) / x 的 标准差
x[:, 0] = (x[:, 0] - np.mean(x[:, 0])) / np.std(x[:, 0])
x[:, 1] = (x[:, 1] - np.mean(x[:, 1])) / np.std(x[:, 1])
y = np.matrix(ty, dtype=float).T
# 查看图像
ax = plt.subplot(projection='3d')
ax.scatter(x[:, 0], x[:, 1], y[:, 0])
plt.show()
# 插入常数 1
x = np.insert(x, 0, np.ones(x.shape[0]), 1)


# In[3]:


#定义预测函数
def h_theta(x, theta):
    return x * theta


#定义损失函数
def j_theta(x, theta, y):
    return np.sum(np.power(y - h_theta(x, theta), 2)) / (2 * len(x))


# In[4]:


#开始梯度下降
iterations = 50  #学习次数
rate = 0.1  # 学习速度
#初始学习参数
theta = np.matrix([0, 0, 0], dtype=float).T
print(j_theta(x, theta, y))
last_j = 0
j_his = []
for i in range(iterations):
    tmpSum = h_theta(x, theta) - y
    # print(tmpSum)
    last_j = j_theta(x, theta, y)
    j_his.append(last_j)
    for j in range(3):
        theta[j, 0] -= rate * np.sum(np.multiply(tmpSum, x[:, j])) / len(x)
        # print(np.sum(np.multiply(tmpSum, x[:, j])))
    # print(theta)

print(last_j)
# print(j_his)


# In[5]:


#画出随着迭代次数变化的损失值
fig2, ax = plt.subplots(figsize=(6, 4))
ax.plot(np.arange(iterations), j_his, 'r')
ax.set_xlabel('Iteration')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()


# In[7]:


#正规方程法
normal_equations_theta = np.linalg.inv(x.T * x) * x.T * y
print(normal_equations_theta)
print(j_theta(x,theta,y))