Andrew Ng Machine Learning, Exercise 3: Python Implementation

理智点

Tags: machine learning, python, artificial intelligence

Article link: https://blog.csdn.net/leezed525/article/details/131444808

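Exercise 3

Contents (if this is too long-winded, skip straight to the source code)

Part 1: Multi-class Classification

Let's implement multi-class classification ourselves.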

1.1 Loading the dataset

Load the dataset.

Code

import numpy as np
import matplotlib.pyplot as plt
# This dataset is stored as a .mat file and cannot be read as plain text, so we load it with loadmat
from scipy.io import loadmat

data = loadmat('./ex3data1.mat')
x = data['X']
m = len(x)
y = data['y']
data

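Result

[Image: notebook output showing the loaded data]

Explanation

The dataset in this exercise differs from the earlier ones: it is stored in a .mat file, so it cannot be read as plain text. We therefore use loadmat from the scipy.io package to read it. The result above shows what the loaded data looks like.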
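To make the structure of the loaded object concrete, here is a minimal sketch. It does not use ex3data1.mat; instead it writes a tiny stand-in file (the name fake_ex3data.mat and its contents are made up for illustration) and reads it back, which shows that loadmat returns a dictionary whose metadata keys are wrapped in double underscores.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Hypothetical stand-in for ex3data1.mat: 5 "images" of 4 pixels each.
fake_X = np.arange(20, dtype=float).reshape(5, 4)
fake_y = np.array([[10], [1], [2], [3], [4]], dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fake_ex3data.mat")
    savemat(path, {"X": fake_X, "y": fake_y})
    loaded = loadmat(path)

# loadmat returns a dict; keys wrapped in double underscores are file metadata.
variable_names = sorted(k for k in loaded if not k.startswith("__"))
print(variable_names)
print(loaded["X"].shape, loaded["y"].shape)
```

Filtering out the double-underscore keys is a quick way to see which variables a .mat file actually contains before indexing into it.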

1.2 Visualizing the data

First, show a single digit

Let's first look at what the data for a single image looks like.

Code

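need_show_img_count = 100
# The pixels were unrolled column by column (MATLAB/Octave order), so transpose after reshaping
show_img = x[need_show_img_count, :].reshape((20, 20)).T
print("The label of this sample is", 0 if y[need_show_img_count, 0] == 10 else y[need_show_img_count, 0])
# Plot the image
plt.imshow(show_img, cmap='gray')
plt.show()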

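Result

[Image: the plotted digit]

Explanation

The data is a 5000 × 400 matrix holding the grayscale values of 5000 images. Each image is 20 × 20 pixels, flattened into 400 values. So we only need to reshape a 400-element row back into a 20 × 20 matrix and pass it to plt.imshow to display the image.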
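One detail worth checking when reshaping is the pixel order. The sketch below assumes, as is usual for data exported from MATLAB/Octave, that each image was unrolled column by column; under that assumption a plain NumPy reshape yields the transposed image. The 3 × 3 array is a made-up example, not taken from the dataset.

```python
import numpy as np

side = 3
# A small "image" whose pixel value encodes its (row, col) position.
image = np.array([[11, 12, 13],
                  [21, 22, 23],
                  [31, 32, 33]])

# MATLAB/Octave unrolls column by column (column-major order).
unrolled = image.flatten(order="F")

naive = unrolled.reshape((side, side))                # row-major reshape: transposed image
fixed_t = unrolled.reshape((side, side)).T            # transpose afterwards
fixed_f = unrolled.reshape((side, side), order="F")   # or reshape in column-major order

print(np.array_equal(naive, image.T))
print(np.array_equal(fixed_t, image), np.array_equal(fixed_f, image))
```

Either the trailing .T or order="F" recovers the original orientation; which one reads better is a matter of taste.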

Showing 100 random digits, as the exercise asks

Code

# Pick 100 distinct random samples
random_selected_index = np.random.choice(len(x), 100, replace=False)
need_show_data = x[random_selected_index]
need_show_img = None

for i in range(10):
    # Build one row of 10 images; .T undoes the column-major pixel order
    tmp_array = need_show_data[i * 10].reshape(20, 20).T
    for j in range(1, 10):
        tmp_array = np.concatenate((tmp_array, need_show_data[i * 10 + j].reshape(20, 20).T), axis=1)
    # Stack the finished row below the rows built so far
    need_show_img = tmp_array if need_show_img is None else np.concatenate((need_show_img, tmp_array), axis=0)
print(need_show_img.shape)
plt.imshow(need_show_img, cmap='gray')
plt.show()

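Result

[Image: the 10 × 10 grid of digits]

Approach

The images are displayed the same way as the single image above. The only difference is that they have to be stitched together first: 10 images are concatenated side by side into a row, and the 10 rows are then stacked vertically. See the code for the details.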
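The same stitching can also be done without loops. The following is only a sketch of an alternative, not the approach used in this post: tile_images is a hypothetical helper, and the random 4 × 4 "images" are stand-in data. It checks itself against the loop-and-concatenate approach.

```python
import numpy as np


def tile_images(flat_images, rows, cols, side):
    """Arrange rows*cols flattened side x side images into one 2-D grid."""
    grid = flat_images[: rows * cols].reshape(rows, cols, side, side)
    # Axes are (grid_row, grid_col, pixel_row, pixel_col); move each grid row
    # next to its pixel rows, then merge the paired axes.
    return grid.transpose(0, 2, 1, 3).reshape(rows * side, cols * side)


# Hypothetical stand-in data: 6 random "images" of 4 x 4 pixels.
rng = np.random.default_rng(0)
demo = rng.random((6, 16))
mosaic = tile_images(demo, rows=2, cols=3, side=4)

# Reference result built with the loop-and-concatenate approach.
expected = np.concatenate(
    [np.concatenate([demo[r * 3 + c].reshape(4, 4) for c in range(3)], axis=1)
     for r in range(2)],
    axis=0,
)
print(mosaic.shape, np.array_equal(mosaic, expected))
```

If each tile also needs transposing, swapping the last two axes of grid with swapaxes(2, 3) before the transpose step would do it.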

1.3 Vectorizing Logistic Regression

This part mainly explains how to compute logistic regression in vectorized form. It has no concrete task, so I won't go into detail.
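For readers who want to see what vectorizing buys, here is a small sketch comparing a per-example loop with the matrix form of the unregularized cost and gradient. The function names and the toy data are made up for illustration; the two versions should agree up to floating-point error.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_and_grad_loop(theta, X, y):
    """Unregularized cost and gradient, one training example at a time."""
    m, n = X.shape
    cost, grad = 0.0, np.zeros(n)
    for i in range(m):
        h_i = sigmoid(X[i] @ theta)
        cost += -y[i] * np.log(h_i) - (1 - y[i]) * np.log(1 - h_i)
        grad += (h_i - y[i]) * X[i]
    return cost / m, grad / m


def cost_and_grad_vectorized(theta, X, y):
    """Same quantities computed with matrix operations only."""
    m = X.shape[0]
    h = sigmoid(X @ theta)
    cost = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    grad = X.T @ (h - y) / m
    return cost, grad


# Hypothetical toy data: 50 examples, a bias column plus 3 features.
rng = np.random.default_rng(1)
X_demo = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 3))])
y_demo = (rng.random(50) > 0.5).astype(float)
theta_demo = rng.normal(scale=0.1, size=4)

loop_cost, loop_grad = cost_and_grad_loop(theta_demo, X_demo, y_demo)
vec_cost, vec_grad = cost_and_grad_vectorized(theta_demo, X_demo, y_demo)
print(np.isclose(loop_cost, vec_cost), np.allclose(loop_grad, vec_grad))
```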

1.4 One-vs-all Classification

This is the problem we actually have to solve.

First, the code for a few helper functions

# Helper functions
# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


# Hypothesis (prediction) function
def h_fun(theta_param, x_param):
    return sigmoid(x_param @ theta_param)

# Regularized cost function
def cost_fun(theta_param, x_param, y_param, l):
    '''
    get cost Value
    :param theta_param: theta
    :param x_param: x
    :param y_param: y
    :param l: lambda
    :return: cost value
    '''
    n_samples = len(x_param)
    theta_param = theta_param.reshape((x_param.shape[1], 1))
    tmp = h_fun(theta_param, x_param)
    first = np.multiply(y_param, np.log(tmp))
    second = np.multiply((1 - y_param), np.log(1 - tmp))
    # The bias term theta_0 is not regularized, matching gradient_fun below
    reg = l / (2 * n_samples) * np.sum(np.power(theta_param[1:, :], 2))
    return -1 / n_samples * np.sum(first + second) + reg


def gradient_fun(theta_param, x_param, y_param, l):
    '''
    get Gradient Value
    :param theta_param: theta
    :param x_param: x
    :param y_param: y
    :param l: lambda
    :return: gradient
    '''
    theta_param = theta_param.reshape((x_param.shape[1], 1))
    tmp = h_fun(theta_param, x_param)
    grad = (x_param.T @ (tmp - y_param)) / len(x_param)
    # Regularize every parameter except the bias term theta_0
    grad[1:, :] += l / len(x_param) * theta_param[1:, :]
    # Return a flat vector, the shape scipy.optimize expects for the gradient
    return grad.ravel()

One-vs-all classification code

import scipy.optimize as opt

# Prepare X: prepend a column of ones for the bias term
x = data['X']
x = np.insert(x, 0, values=np.ones(m), axis=1)


def one_vs_all(x_param, y_param, all_count, l):
    '''
    :param x_param: x
    :param y_param: y
    :param all_count: number of classes
    :param l: lambda
    :return: all_theta
    '''
    all_theta = np.zeros((all_count, x_param.shape[1]))
    for i in range(all_count):
        theta = np.zeros(x_param.shape[1])
        # Label 10 stands for digit 0, hence the modulo
        y_i = np.array([1 if label % 10 == i else 0 for label in y_param])
        y_i = y_i.reshape((len(y_param), 1))
        # result = opt.fmin_tnc(func=cost_fun, x0=theta, fprime=gradient_fun, args=(x_param, y_i, l))
        # all_theta[i, :] = result[0]
        result = opt.minimize(fun=cost_fun, x0=theta, jac=gradient_fun, args=(x_param, y_i, l), method='TNC')
        all_theta[i, :] = result.x
    return all_theta


theta_result = one_vs_all(x, y, 10, 1)

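Explanation

One-vs-all classification splits the problem into 10 sub-problems. To recognize the handwritten digit 0, set the entries of y in the training set that represent 0 to 1 and all other entries to 0. This gives a new y, which we use to train on the whole dataset, yielding a set of theta values that can recognize the digit 0. Repeating this for each digit gives theta for all the digits.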
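As a small illustration of the relabeling step, the 0/1 target vectors for all ten classifiers can be built at once with broadcasting. This is only a sketch with made-up labels; it follows the dataset's convention that the label 10 stands for the digit 0.

```python
import numpy as np

# Hypothetical labels in the dataset's convention: 1..9, with 10 meaning digit 0.
labels = np.array([[10], [1], [2], [10], [9]])

# Compare the (5, 1) column against the row 0..9; broadcasting gives (5, 10).
# Column i is the 0/1 target vector for the "digit i vs the rest" classifier.
targets = (labels % 10 == np.arange(10)).astype(int)

print(targets.shape)
print(targets[:, 0])  # which examples are the digit 0
```

Each row of targets contains exactly one 1, because every example belongs to exactly one digit.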

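The exercise also says that when there are many features, fmincg is a better fit than fminunc, which is why the code contains two variants. The two commented-out lines

# result = opt.fmin_tnc(func=cost_fun, x0=theta, fprime=gradient_fun, args=(x_param, y_i, l))
# all_theta[i, :] = result[0]

were meant as the stand-in for fminunc, and the active lines

result = opt.minimize(fun=cost_fun, x0=theta, jac=gradient_fun, args=(x_param, y_i, l), method='TNC')
all_theta[i, :] = result.x

as the stand-in for fmincg.

Note, however, that fmincg and fminunc are Octave functions, and since we are using Python I did not find exact equivalents. Both calls above actually run the same truncated Newton (TNC) algorithm in SciPy, which would explain why they barely differed on my machine: the gap was at most about 50 ms out of roughly 400 ms in total, and there is no guarantee that the active code runs faster than the commented-out code.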
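For anyone who wants to try a conjugate-gradient optimizer, which is the family fmincg belongs to, SciPy's minimize accepts method='CG'. The sketch below is an assumption-laden comparison on made-up toy data, not a port of fmincg and not the code used in this post; the cost and grad functions are simplified 1-D versions written for the example.

```python
import numpy as np
import scipy.optimize as opt


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost(theta, X, y, lam):
    """Regularized logistic-regression cost (bias term not regularized)."""
    m = len(y)
    h = sigmoid(X @ theta)
    reg = lam / (2 * m) * np.sum(theta[1:] ** 2)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m + reg


def grad(theta, X, y, lam):
    """Gradient of the cost above, returned as a flat vector."""
    m = len(y)
    g = X.T @ (sigmoid(X @ theta) - y) / m
    g[1:] += lam / m * theta[1:]
    return g


# Hypothetical toy problem: 200 noisy examples, a bias column plus 2 features.
rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
noise = rng.normal(scale=0.5, size=200)
y_demo = (X_demo[:, 1] + 0.5 * X_demo[:, 2] + noise > 0).astype(float)

results = {}
for method in ("TNC", "CG"):
    results[method] = opt.minimize(cost, np.zeros(3), jac=grad,
                                   args=(X_demo, y_demo, 1.0), method=method)
    print(method, results[method].fun, results[method].x)
```

On a small convex problem like this both methods should land on essentially the same minimum, so any difference worth measuring is in run time, and that is best timed on the real 401-parameter problem.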

1.4.1 One-vs-all prediction

Once we have the trained theta, we can check its accuracy.

Code

from sklearn.metrics import classification_report


def predict_one_vs_all(all_theta_param, x_param):
    '''
    :param all_theta_param: all theta
    :param x_param: x
    :return: y_pred
    '''
    h = h_fun(all_theta_param.T, x_param)
    # Q: what does np.argmax do?
    # A: it returns the index of the largest value
    h_argmax = np.argmax(h, axis=1)
    return h_argmax


y_pred = predict_one_vs_all(theta_result, x)
y_answer = np.array([0 if label == 10 else label[0] for label in y])
print(classification_report(y_answer, y_pred))
accuracy = np.mean(y_pred == y_answer)
print('accuracy = {0}%'.format(accuracy * 100))
# y_answer

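Result

[Image: classification report and accuracy]

Explanation

There isn't much to explain, since the code logic is fairly clear. I used classification_report from the sklearn.metrics package to check the accuracy, and also computed the accuracy by hand.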
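To spell out the prediction rule on something small enough to check by hand, here is a sketch with three made-up classifiers and three made-up examples. Each row of all_theta is one classifier, and the predicted class is the index of the classifier with the highest output.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


# Hypothetical parameters for 3 classes and 2 features plus a bias column;
# row k holds the classifier for class k.
all_theta = np.array([[ 0.0,  2.0, -1.0],
                      [ 0.5, -1.0,  2.0],
                      [-0.5, -1.0, -1.0]])
X_demo = np.array([[1.0,  1.5,  0.0],
                   [1.0,  0.0,  1.5],
                   [1.0, -1.0, -1.0]])
y_true = np.array([0, 1, 2])

probs = sigmoid(X_demo @ all_theta.T)   # shape (3 examples, 3 classes)
y_hat = np.argmax(probs, axis=1)        # most confident classifier per row
accuracy = np.mean(y_hat == y_true)

# Sigmoid is monotonic, so the raw scores give the same argmax.
same_as_scores = np.array_equal(y_hat, np.argmax(X_demo @ all_theta.T, axis=1))
print(y_hat, accuracy, same_as_scores)
```

Because the sigmoid preserves ordering, applying it before argmax does not change the prediction; it only matters if the probabilities themselves are needed.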

Source code

#!/usr/bin/env python
# coding: utf-8

# In[8]:


import numpy as np
import matplotlib.pyplot as plt
# This dataset is stored as a .mat file and cannot be read as plain text, so we load it with loadmat
from scipy.io import loadmat

data = loadmat('./ex3data1.mat')
x = data['X']
m = len(x)
y = data['y']
data


# In[2]:


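need_show_img_count = 100
# The pixels were unrolled column by column (MATLAB/Octave order), so transpose after reshaping
show_img = x[need_show_img_count, :].reshape((20, 20)).T
print("The label of this sample is", 0 if y[need_show_img_count, 0] == 10 else y[need_show_img_count, 0])
# Plot the image
plt.imshow(show_img, cmap='gray')
plt.show()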


# In[3]:


# Pick 100 distinct random samples
random_selected_index = np.random.choice(len(x), 100, replace=False)
need_show_data = x[random_selected_index]
need_show_img = None

for i in range(10):
    # Build one row of 10 images; .T undoes the column-major pixel order
    tmp_array = need_show_data[i * 10].reshape(20, 20).T
    for j in range(1, 10):
        tmp_array = np.concatenate((tmp_array, need_show_data[i * 10 + j].reshape(20, 20).T), axis=1)
    # Stack the finished row below the rows built so far
    need_show_img = tmp_array if need_show_img is None else np.concatenate((need_show_img, tmp_array), axis=0)
print(need_show_img.shape)
plt.imshow(need_show_img, cmap='gray')
plt.show()


# In[4]:


# Helper functions
# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


# Hypothesis (prediction) function
def h_fun(theta_param, x_param):
    return sigmoid(x_param @ theta_param)



# In[5]:


# Regularized cost function
def cost_fun(theta_param, x_param, y_param, l):
    '''
    get cost Value
    :param theta_param: theta
    :param x_param: x
    :param y_param: y
    :param l: lambda
    :return: cost value
    '''
    n_samples = len(x_param)
    theta_param = theta_param.reshape((x_param.shape[1], 1))
    tmp = h_fun(theta_param, x_param)
    first = np.multiply(y_param, np.log(tmp))
    second = np.multiply((1 - y_param), np.log(1 - tmp))
    # The bias term theta_0 is not regularized, matching gradient_fun below
    reg = l / (2 * n_samples) * np.sum(np.power(theta_param[1:, :], 2))
    return -1 / n_samples * np.sum(first + second) + reg


def gradient_fun(theta_param, x_param, y_param, l):
    '''
    get Gradient Value
    :param theta_param: theta
    :param x_param: x
    :param y_param: y
    :param l: lambda
    :return: gradient
    '''
    theta_param = theta_param.reshape((x_param.shape[1], 1))
    tmp = h_fun(theta_param, x_param)
    grad = (x_param.T @ (tmp - y_param)) / len(x_param)
    # Regularize every parameter except the bias term theta_0
    grad[1:, :] += l / len(x_param) * theta_param[1:, :]
    # Return a flat vector, the shape scipy.optimize expects for the gradient
    return grad.ravel()


# In[6]:


import scipy.optimize as opt

# Prepare X: prepend a column of ones for the bias term
x = data['X']
x = np.insert(x, 0, values=np.ones(m), axis=1)


def one_vs_all(x_param, y_param, all_count, l):
    '''
    :param x_param: x
    :param y_param: y
    :param all_count: number of classes
    :param l: lambda
    :return: all_theta
    '''
    all_theta = np.zeros((all_count, x_param.shape[1]))
    for i in range(all_count):
        theta = np.zeros(x_param.shape[1])
        # Label 10 stands for digit 0, hence the modulo
        y_i = np.array([1 if label % 10 == i else 0 for label in y_param])
        y_i = y_i.reshape((len(y_param), 1))
        # result = opt.fmin_tnc(func=cost_fun, x0=theta, fprime=gradient_fun, args=(x_param, y_i, l))
        # all_theta[i, :] = result[0]
        result = opt.minimize(fun=cost_fun, x0=theta, jac=gradient_fun, args=(x_param, y_i, l), method='TNC')
        all_theta[i, :] = result.x
    return all_theta


theta_result = one_vs_all(x, y, 10, 1)


# In[7]:


from sklearn.metrics import classification_report


def predict_one_vs_all(all_theta_param, x_param):
    '''
    :param all_theta_param: all theta
    :param x_param: x
    :return: y_pred
    '''
    h = h_fun(all_theta_param.T, x_param)
    # Q: what does np.argmax do?
    # A: it returns the index of the largest value
    h_argmax = np.argmax(h, axis=1)
    return h_argmax


y_pred = predict_one_vs_all(theta_result, x)
y_answer = np.array([0 if label == 10 else label[0] for label in y])
print(classification_report(y_answer, y_pred))
accuracy = np.mean(y_pred == y_answer)
print('accuracy = {0}%'.format(accuracy * 100))
# y_answer

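Part 2: Neural Networks

This part is really just an appetizer for neural networks. We are not asked to train a network; instead we are given theta values that a neural network has already learned, and we only need to verify them.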
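Before loading the real weights, it can help to see the shapes involved in one forward pass. The sketch below uses random stand-in weights rather than ex3weights.mat, and the layer sizes (400 inputs, 25 hidden units, 10 outputs) are an assumption about this exercise's network; feedforward is a name chosen for the example.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def feedforward(theta1, theta2, X):
    """Return the output-layer activations of a 3-layer network."""
    n_samples = X.shape[0]
    a1 = np.hstack([np.ones((n_samples, 1)), X])                        # add bias unit
    a2 = np.hstack([np.ones((n_samples, 1)), sigmoid(a1 @ theta1.T)])   # hidden layer + bias
    return sigmoid(a2 @ theta2.T)


# Random stand-ins for the weights; the shapes are assumed, not read from
# ex3weights.mat: 400 inputs -> 25 hidden units -> 10 outputs.
rng = np.random.default_rng(0)
theta1_demo = rng.normal(scale=0.1, size=(25, 401))
theta2_demo = rng.normal(scale=0.1, size=(10, 26))
X_demo = rng.random((7, 400))

a3 = feedforward(theta1_demo, theta2_demo, X_demo)
labels = np.argmax(a3, axis=1) + 1   # shift the 0-based index to labels 1..10
print(a3.shape, labels)
```

Each weight matrix has one more column than the layer feeding it, because a bias unit is prepended before every matrix product.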

Loading the data

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat

# Load the data
data = loadmat('ex3data1.mat')
x = data['X']
y = data['y']
m = len(x)

Function definitions

# Sigmoid function
def sigmoid(z_param):
    return 1 / (1 + np.exp(-z_param))


def h_fun(theta_param, x_param):
    return sigmoid(x_param @ theta_param.T)

Loading theta

# Load theta
theta_data = loadmat('ex3weights.mat')
theta1 = theta_data['Theta1']
theta2 = theta_data['Theta2']
theta1.shape, theta2.shape

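Prediction code

# Forward-propagation prediction
def predict(theta1_param, theta2_param, x_param):
    n_samples = x_param.shape[0]
    # Prepend the bias unit to the input layer
    a1 = np.insert(x_param, 0, values=np.ones(n_samples), axis=1)
    z2 = a1 @ theta1_param.T
    # Prepend the bias unit to the hidden layer
    a2 = np.insert(sigmoid(z2), 0, values=np.ones(n_samples), axis=1)
    z3 = a2 @ theta2_param.T
    a3 = sigmoid(z3)
    # Labels run from 1 to 10 (10 stands for digit 0), so shift the 0-based index by 1
    return np.argmax(a3, axis=1) + 1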


y_pred = predict(theta1, theta2, x)

Accuracy check

from sklearn.metrics import classification_report

print(classification_report(y, y_pred))

accuracy = np.mean(y_pred.reshape(y.shape[0], 1) == y)
print('accuracy = {0}%'.format(accuracy * 100))

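Result

[Image: classification report and accuracy]
The accuracy is indeed somewhat higher than with the parameters we fitted ourselves.

Source code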

#!/usr/bin/env python
# coding: utf-8

# In[74]:


import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat

# Load the data
data = loadmat('ex3data1.mat')
x = data['X']
y = data['y']
m = len(x)


# In[75]:


# Plotting
random_selected_index = np.random.choice(len(x), 100, replace=False)
need_show_data = x[random_selected_index]
need_show_img = None

for i in range(10):
    # .T undoes the column-major pixel order of each image
    tmp_array = need_show_data[i * 10].reshape(20, 20).T
    for j in range(1, 10):
        tmp_array = np.concatenate((tmp_array, need_show_data[i * 10 + j].reshape(20, 20).T), axis=1)
    need_show_img = tmp_array if need_show_img is None else np.concatenate((need_show_img, tmp_array), axis=0)
print(need_show_img.shape)
plt.imshow(need_show_img, cmap='gray')
plt.show()


# In[76]:


# Sigmoid function
def sigmoid(z_param):
    return 1 / (1 + np.exp(-z_param))


def h_fun(theta_param, x_param):
    return sigmoid(x_param @ theta_param.T)


# In[77]:


# Load theta
theta_data = loadmat('ex3weights.mat')
theta1 = theta_data['Theta1']
theta2 = theta_data['Theta2']


# In[78]:


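# Forward-propagation prediction
def predict(theta1_param, theta2_param, x_param):
    n_samples = x_param.shape[0]
    # Prepend the bias unit to the input layer
    a1 = np.insert(x_param, 0, values=np.ones(n_samples), axis=1)
    z2 = a1 @ theta1_param.T
    # Prepend the bias unit to the hidden layer
    a2 = np.insert(sigmoid(z2), 0, values=np.ones(n_samples), axis=1)
    z3 = a2 @ theta2_param.T
    a3 = sigmoid(z3)
    # Labels run from 1 to 10 (10 stands for digit 0), so shift the 0-based index by 1
    return np.argmax(a3, axis=1) + 1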


y_pred = predict(theta1, theta2, x)


# In[79]:


from sklearn.metrics import classification_report

print(classification_report(y, y_pred))

accuracy = np.mean(y_pred.reshape(y.shape[0], 1) == y)
print('accuracy = {0}%'.format(accuracy * 100))