Building a Single-Layer Perceptron Linear Regression Network (Neural Network) by Hand

超级帅的陈星宇

Last modified 2022-10-13 09:37:08

Category: Deep Learning Notes; tags: deep learning, linear regression, machine learning

First published 2022-09-16 18:23:54

Article link: https://blog.csdn.net/qq_59524897/article/details/126895391

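1. NumPy prerequisites

1.1 Element-wise multiplication (*) and matrix multiplication (np.dot)

* multiplies two arrays element by element; the two shapes must match (or be broadcastable to a common shape)
np.dot(matrix_1, matrix_2) is the matrix product from linear algebra; the number of columns of the first matrix must equal the number of rows of the second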

Y = np.array([[1],[2],[3]])
Y*Y  # array([[1],[4],[9]])
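
To make the difference concrete, here is a small sketch that applies both operations to the same column vector. The variable names (elementwise, inner, outer) are only illustrative and do not appear elsewhere in this post.

```python
import numpy as np

Y = np.array([[1], [2], [3]])    # shape (3, 1)

elementwise = Y * Y              # element-wise product, shape (3, 1)
inner = np.dot(Y.T, Y)           # (1, 3) times (3, 1) gives shape (1, 1)
outer = np.dot(Y, Y.T)           # (3, 1) times (1, 3) gives shape (3, 3)

print(elementwise.shape, inner.shape, outer.shape)

# np.dot(Y, Y) is not defined: the first matrix has 1 column, the second has 3 rows
try:
    np.dot(Y, Y)
except ValueError as err:
    print("np.dot(Y, Y) fails:", err)
```

The same shape rule explains why the code further down uses w.T rather than w in np.dot(w.T, X).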

1.2 reshape()

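Suppose the training images are stored in matrix with matrix.shape = (209, 64, 64, 3), where 209 is the number of training examples, 64 is the image height and width in pixels, and 3 is the number of RGB channels. To turn it into a 2-D matrix of shape (64*64*3, 209), with one image per column:

matrix.reshape(209,-1).T  # reshape to 209 rows (one per image, columns inferred), then transpose
matrix.reshape(-1,209)   # wrong: the 209 values in the first row all come from the first image!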
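
The two reshapes produce the same shape, which is why the mistake is easy to miss. The sketch below uses a tiny stand-in array (4 hypothetical "images" of size 2x2x3 instead of 209 images of size 64x64x3) so that the columns can be inspected directly.

```python
import numpy as np

m = 4  # hypothetical number of images, standing in for 209
images = np.arange(m * 2 * 2 * 3).reshape(m, 2, 2, 3)

good = images.reshape(m, -1).T   # shape (12, 4): column j is image j, flattened
bad = images.reshape(-1, m)      # shape (12, 4) as well, but pixels are scrambled

print(good.shape, bad.shape)

# Is column 0 exactly the first image?
print(np.array_equal(good[:, 0], images[0].ravel()))  # True
print(np.array_equal(bad[:, 0], images[0].ravel()))   # False
```

Checking one column against the original image like this is a cheap way to confirm the layout before training.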

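2. Algorithm overview

Compute z = w₁x¹+w₂x²+w₃x³+…+w_nxⁿ+b (the superscript on x is a feature index, not a power), turn z into a prediction with the sigmoid function (which makes the model a logistic regression classifier), and repeat the forward and backward passes to adjust the parameters

Notes:

n is the dimension of one training example, i.e. each example can be viewed as an n-dimensional vector
m is the number of examples in the training or test set, as in "this training set has m examples"
z is the result for a single example, while Z holds the results for the whole training or test set and has shape (1, m)
In the formulas below, element-wise multiplication is written explicitly as '*'; every other product is an ordinary matrix product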

2.1 Parameter initialization

Set w = [0, 0, 0, … 0]^T and b = 0, where w is a matrix (column vector) of shape (n, 1)

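2.2 One forward pass (propagate)

Goal: compute the cost (denoted L) for the current values of w and b, together with the gradients $\frac{dL}{dw}$ and $\frac{dL}{db}$ (denoted dw and db), which the backward pass then uses to update the parameters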
$\textbf{(1) Computing the cost } L\\ \text{Matrix form (used in the code): } L=-\frac{1}{m}\sum_{i=1}^{m}[Y*ln(A)+(1-Y)*ln(1-A)]$
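Notes:

Y: true labels, shape (m, 1)
A: predictions, also shape (m, 1); A = sigmoid(Z)^T = sigmoid(w^TX + b)^T, where X has shape (n, m) and the transpose turns the (1, m) row Z into a column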
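
As a worked example of the cost formula, the sketch below evaluates it on three made-up predictions. The helper name cross_entropy_cost and the numbers in A are assumptions for illustration only. It also evaluates the case A = 0.5 everywhere, which is what the zero initialization from 2.1 produces on the first pass, since sigmoid(0) = 0.5.

```python
import numpy as np

def cross_entropy_cost(A, Y):
    """Cost L = -(1/m) * sum(Y*ln(A) + (1-Y)*ln(1-A)) for (m, 1) arrays."""
    m = Y.shape[0]
    return -(1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

Y = np.array([[1.0], [0.0], [1.0]])   # true labels, shape (m, 1)
A = np.array([[0.9], [0.2], [0.6]])   # hypothetical predictions, shape (m, 1)

cost = cross_entropy_cost(A, Y)
print(cost)                            # about 0.2798

# With w = 0 and b = 0 every prediction is 0.5, so the first cost is ln(2)
A0 = np.full_like(Y, 0.5)
cost0 = cross_entropy_cost(A0, Y)
print(cost0, np.log(2))
```

A first printed cost near 0.693 is therefore a quick sign that initialization and the cost computation are wired up as intended.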

$\textbf{(2) Computing the gradients } \frac{dL}{dw},\ \frac{dL}{db} \textbf{ (dw, db)}\\ \text{Matrix form (used in the code): } dw=\frac{1}{m}X(A-Y)\\ \text{Matrix form (used in the code): } db=\frac{1}{m}\sum_{i=1}^{m}(A-Y)$

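Derivation of dw (using the chain rule):
$\text{For one sample: } L=-[y*ln(a)+(1-y)*ln(1-a)] \\ a=sigmoid(z)=\frac{1}{1+e^{-z}} \\ z=w_1x^1+w_2x^2+...+b \\ \frac{dL}{da}=\frac{a-y}{a(1-a)},\quad \frac{da}{dz}=a(1-a),\quad \frac{dz}{dw_1}=x^1 \\ \text{so } dw_1=\frac{dL}{dw_1}=\frac{dL}{da}\frac{da}{dz}\frac{dz}{dw_1}=(a-y)x^1\\ dw_2=\frac{dL}{dw_2}=\frac{dL}{da}\frac{da}{dz}\frac{dz}{dw_2}=(a-y)x^2\\ ......\\ \text{The per-sample values are summed and divided by the number of samples } m \text{ to give the final value:}\\ dw_1=\frac{1}{m}\sum_{i=1}^{m}dw_1^{(i)},\quad dw_2=\frac{1}{m}\sum_{i=1}^{m}dw_2^{(i)},\ ...\\ \text{In matrix form: } dw=\frac{1}{m}X(A-Y)$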
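
One way to gain confidence in the matrix formulas for dw and db is a numerical gradient check. The sketch below compares them with central finite differences on small random data. The data, the seed, the step size eps and the helper name cost are all assumptions made for this check and are not part of the training code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, Y):
    """Cross-entropy cost for weights w (n, 1), bias b, X (n, m), Y (m, 1)."""
    A = sigmoid(np.dot(w.T, X) + b).T
    return -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

rng = np.random.default_rng(0)
n, m = 3, 5                                   # tiny hypothetical problem
X = rng.normal(size=(n, m))
Y = rng.integers(0, 2, size=(m, 1)).astype(float)
w = rng.normal(size=(n, 1))
b = 0.1

# Analytic gradients from the matrix formulas
A = sigmoid(np.dot(w.T, X) + b).T
dw = np.dot(X, A - Y) / m
db = np.sum(A - Y) / m

# Numerical gradients by central differences
eps = 1e-6
dw_numeric = np.zeros_like(w)
for j in range(n):
    step = np.zeros_like(w)
    step[j, 0] = eps
    dw_numeric[j, 0] = (cost(w + step, b, X, Y) - cost(w - step, b, X, Y)) / (2 * eps)
db_numeric = (cost(w, b + eps, X, Y) - cost(w, b - eps, X, Y)) / (2 * eps)

print(np.max(np.abs(dw - dw_numeric)))   # should be tiny
print(abs(db - db_numeric))              # should be tiny
```

If the two sets of values disagree noticeably, the usual suspects are a missing transpose or a missing 1/m factor.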

2.3 One backward pass (optimize)

Goal: use the dw and db computed in the previous step to update w and b by gradient descent
$w=w-\alpha*dw\\ b=b-\alpha*db\\ \alpha:\ \text{learning rate}$
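
The update rule can be traced by hand on a toy problem. The sketch below applies a single step to four made-up one-dimensional samples, starting from w = 0 and b = 0. The data, the learning rate 0.1 and the helper name cost_and_grads are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Return the cost and the gradients dw, db from the formulas above."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b).T
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    return cost, np.dot(X, A - Y) / m, np.sum(A - Y) / m

X = np.array([[0.0, 1.0, 2.0, 3.0]])          # n = 1 feature, m = 4 samples
Y = np.array([[0.0], [0.0], [1.0], [1.0]])    # larger x means label 1
w, b = np.zeros((1, 1)), 0.0
alpha = 0.1                                   # learning rate

cost_before, dw, db = cost_and_grads(w, b, X, Y)
w = w - alpha * dw                            # dw = -0.5, so w becomes 0.05
b = b - alpha * db                            # db = 0, so b stays 0
cost_after, _, _ = cost_and_grads(w, b, X, Y)

print(cost_before, cost_after)                # the cost goes down after one step
```

Here every initial prediction is 0.5, so dw = (0.5*0 + 0.5*1 - 0.5*2 - 0.5*3) / 4 = -0.5, and the step moves w in the positive direction, as expected for data where larger x means label 1.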

2.4 Repeat the forward and backward passes

e.g. a figure from Andrew Ng's course:

[Figure: image not preserved]

=====================>

Code

import numpy as np
import matplotlib.pyplot as plt
import h5py  # reads the .h5 dataset files

# Jupyter notebook magic: render plots inline
%matplotlib inline

1. Loading and preparing the data

1.1 Reading the data

train_file = h5py.File('./datasets/train_catvnoncat.h5', 'r')  # training set
print(train_file.keys())  # list the keys stored in the file
train_org_x = np.array(train_file["train_set_x"][:])  # read train_set_x into an array
train_org_y = np.array(train_file["train_set_y"][:])
test_file = h5py.File('./datasets/test_catvnoncat.h5', 'r')  # test set
print(test_file.keys())
test_org_x = np.array(test_file["test_set_x"][:])  # read test_set_x into an array
test_org_y = np.array(test_file["test_set_y"][:])

print(train_org_x.shape)
print(train_org_y.shape)
print(test_org_x.shape)

(209, 64, 64, 3)
(209,)
(50, 64, 64, 3)

The shape (209,) is awkward to work with, so we reshape it to (209, 1)

train_y = train_org_y.reshape(-1,1)  # rows inferred, exactly one column
test_y = test_org_y.reshape(-1,1)
train_y.shape

(209, 1)

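1.2 Reshaping into the required layout

train_x = train_org_x.reshape(train_org_x.shape[0], -1).T  # one flattened image per column
test_x = test_org_x.reshape(test_org_x.shape[0], -1).T
train_x.shape

(12288, 209)

# Scale pixel values from [0, 255] to [0, 1]
train_x = train_x / 255
test_x = test_x / 255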

2. Parameter initialization

def init(dim):
    w = np.zeros((dim,1))
    b = 0
    return w,b

3. One forward pass

The sigmoid function

def sigmoid(Z):
    A = 1.0 / (1.0 + np.exp(-1.0 * Z))
    return A

Forward pass

def propagate(w, b, X, Y):  # called with train_x, train_y and the current w, b
    m = X.shape[1]  # number of examples
    A = sigmoid(np.dot(w.T,X) + b).T  # predictions, shape (m, 1)

    cost = -(1.0 / m) * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))

    dw = (1 / m) * np.dot(X, (A - Y))   # gradient dw; A and Y are both (m, 1), i.e. (209, 1) for the training set
    db = (1 / m) * np.sum(A - Y)   # gradient db

    assert(dw.shape == w.shape)

    return cost, dw, db   # cost and gradients (dw, db)
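
One practical caveat: if a prediction saturates to exactly 0 or 1, np.log produces -inf and the cost can become nan. A common guard, shown here only as an optional sketch, is to clip the predictions before taking logarithms. The helper name safe_cost and the value eps = 1e-12 are assumptions and are not used by the code in this post.

```python
import numpy as np

def safe_cost(A, Y, eps=1e-12):
    """Cross-entropy cost with predictions clipped away from 0 and 1."""
    m = Y.shape[0]
    A = np.clip(A, eps, 1.0 - eps)   # keep log() arguments strictly positive
    return -(1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

Y = np.array([[1.0], [0.0]])
A = np.array([[1.0], [0.0]])   # saturated predictions that happen to be correct

value = safe_cost(A, Y)
print(value)                    # finite and very close to 0
```

Clipping changes the reported cost only when predictions are within eps of 0 or 1, so it leaves ordinary training runs unaffected.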

4. Backward propagation (gradient descent)

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):   
    costs = []
    
    for i in range(num_iterations): 
        cost, dw, db = propagate(w, b, X, Y)
        
        w = w - learning_rate * dw
        b = b - learning_rate * db
        
        if i % 100 == 0:
            costs.append(cost)
        
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
    
    params = {"w": w,
              "b": b}
    
    grads = {"dw": dw,
             "db": db}
    
    return params, grads, costs

5. Model training

w, b = init(train_x.shape[0])
params, grads, costs = optimize(w, b, train_x, train_y, num_iterations=2000, learning_rate=0.005)
w = params['w']
b = params['b']

# Predict 0/1 labels for every column (example) of X
def predict(w, b, X):
  
    m = X.shape[1]
    Y_prediction = np.zeros((1,m))
    w = w.reshape(X.shape[0], 1)
   
    A = sigmoid(np.dot(w.T,X) + b)
    
    for i in range(A.shape[1]):
        if A[0, i] >0.5:
            Y_prediction[0,i] = 1
        else:
            Y_prediction[0,i] = 0        
    
    assert(Y_prediction.shape == (1, m))
    return Y_prediction
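
The thresholding loop in predict can also be written as one array comparison. The sketch below is a possible vectorized alternative under the same 0.5 threshold. The name predict_vectorized and the toy weights are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_vectorized(w, b, X):
    """Return a (1, m) array of 0/1 labels: 1 where sigmoid(w^T X + b) > 0.5."""
    A = sigmoid(np.dot(w.T, X) + b)
    return (A > 0.5).astype(float)

w_toy = np.array([[1.0], [-1.0]])          # hypothetical weights, n = 2
X_toy = np.array([[2.0, 0.0, 1.0],
                  [0.0, 2.0, 1.0]])        # three samples as columns
labels = predict_vectorized(w_toy, 0.0, X_toy)
print(labels)                              # [[1. 0. 0.]]
```

The third sample has z = 0, so its prediction is exactly 0.5 and it is labeled 0, matching the strict > 0.5 comparison in the loop version.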

Y_prediction_train = predict(w, b, train_x)
Y_prediction_test = predict(w,b, test_x)
print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - train_y.T)) * 100))
print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - test_y.T)) * 100))

train accuracy: 99.04306220095694 %
test accuracy: 70.0 %

plt.plot(costs)    # cost on the training set, recorded every 100 iterations

[Figure: training cost curve]

The dataset is available here: ===>
GitHub dataset
