Writing a Neural Network from Scratch (2)

艾勒姆

Last modified 2024-08-12 17:06:26


Tags: neural networks, artificial intelligence, deep learning

First published 2024-08-10 16:53:58

Article link: https://blog.csdn.net/2303_76402248/article/details/141093040


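Multi-layer neural networks

A multi-layer neural network is built by adding hidden layers. In such a network, the output of one layer is the input of the next. The outputs of hidden layers are not part of the network's final output, but we can still access them, which lets us use them to diagnose problems and improve the model. Here we take the network from the previous article and add one more layer to it.

Here is the code: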

import numpy as np
inputs = [[1, 2, 3, 2.5],
            [2, 5, -1, 2],
            [-1.5, 2.7, 3.3, -0.8]]
inputs = np.array(inputs)

# Weights and biases of the first layer
weights1 = [[0.2, 0.8, -0.5, 1],
            [0.5, -0.91, 0.26, -0.5],
            [-0.26, -0.27, 0.17, 0.87]]
weights1 = np.array(weights1)

biases1 = [2, 3, 0.5]
biases1 = np.array(biases1)

# Weights and biases of the second layer
weights2 = [[0.1, -0.14, 0.5],
            [-0.5, 0.12, -0.33],
            [-0.44, 0.73, -0.13]]
weights2 = np.array(weights2)

biases2 = [-1, 2, -0.5]
biases2 = np.array(biases2)

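As the code shows, the added layer also has 3 neurons (weights2 has 3 rows, one per neuron).

For a multi-layer neural network, the output is computed as follows.

Let the first layer have weights $W_1$ and biases $b_1$, the second layer have weights $W_2$ and biases $b_2$, and let $Y_1$ be the output of the first layer. Because each row of a weight matrix here holds the weights of one neuron, the weight matrices are transposed before multiplying. The output of the network is then:
$Y_1 = inputs \times W_1^T + b_1 \\ Y = Y_1 \times W_2^T + b_2$

Activation functions will be introduced later; at that point they are simply applied to each layer's output.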

Here is the computation and its result:

import numpy as np

# Each row of a weight matrix belongs to one neuron, hence the transpose
layer1_output = np.dot(inputs, weights1.T) + biases1
layer_output = np.dot(layer1_output, weights2.T) + biases2

print('Network output:\n', layer_output)

Network output:
 [[ 0.50310004 -1.04184985 -2.03874993]
 [ 0.24339998 -2.73320007 -5.76329994]
 [-0.99314     1.41254002 -0.35655001]]
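
The remark that an activation function is simply applied to each layer's output can be sketched as follows. This is only an illustration: the helper `forward` and its `activation` parameter are hypothetical names, and ReLU is an assumed choice of activation that the article has not introduced yet. Without an activation the helper reproduces the output shown above.

```python
import numpy as np

def forward(inputs, layers, activation=None):
    """Pass inputs through a list of (weights, biases) pairs.

    Each weight matrix has one row per neuron, so it is transposed
    before the dot product. `activation` is an optional function applied
    to every layer's output (hypothetical parameter, not from the article).
    """
    out = np.asarray(inputs, dtype=float)
    for weights, biases in layers:
        out = np.dot(out, np.asarray(weights).T) + np.asarray(biases)
        if activation is not None:
            out = activation(out)
    return out

inputs = [[1, 2, 3, 2.5], [2, 5, -1, 2], [-1.5, 2.7, 3.3, -0.8]]
weights1 = [[0.2, 0.8, -0.5, 1], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]]
biases1 = [2, 3, 0.5]
weights2 = [[0.1, -0.14, 0.5], [-0.5, 0.12, -0.33], [-0.44, 0.73, -0.13]]
biases2 = [-1, 2, -0.5]
layers = [(weights1, biases1), (weights2, biases2)]

# No activation: the purely linear two-layer pass from the article
linear_output = forward(inputs, layers)
# Assumed ReLU activation applied after every layer, including the last
relu_output = forward(inputs, layers, activation=lambda z: np.maximum(0, z))

print(linear_output)
print(relu_output)
```

With ReLU every negative value is clipped to zero, so the two results differ wherever a layer produced a negative number.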

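Training data

Training data is the data used for training: the data the network receives and uses to adjust its parameters.

We can create training data ourselves, and it can be linear or non-linear.

One way to create training data is with the nnfs library.

nnfs: a small package aimed at beginners. It supports building neural networks from scratch so that learners understand the details of every step. As used in this series, the package itself supplies a dataset generator and an initialization helper; layers, activation functions, losses, and optimizers are written by hand.

Topics covered in this from-scratch approach:

Data preparation: generating and handling datasets.
Neuron layers: implementing basic fully connected layers.
Activation functions: common ones such as Softmax.
Loss functions: implementing cross-entropy loss.
Optimizers: implementing basic optimization algorithms such as SGD.

Below is simple code that generates data: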

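# Use NumPy functions to generate a spiral-shaped dataset
# with three classes
import numpy as np
# matplotlib is used for plotting
import matplotlib.pyplot as plt
# Initialize some parameters
N = 100 # number of points per class
D = 2 # dimensionality of the data
K = 3 # number of classes

X = np.zeros((N*K, D))
y = np.zeros(N*K, dtype='uint8')

print('First rows of X before filling:\n', X[:10])
print('First entries of y before filling:\n', y[:10])

First rows of X before filling:
 [[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]]
First entries of y before filling:
 [0 0 0 0 0 0 0 0 0 0]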

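The code and its output show that X is a two-dimensional array, that is, a matrix with N*K rows and D columns. X is created by specifying a shape, while y is created by specifying a length and a data type.
Here dtype='uint8' means the array elements are unsigned 8-bit integers.

At this point the data still holds its initial values (all zeros). The next step assigns real values to it.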

# Generate 3 groups: the 300 points are split into 3 parts of 100 points each
for j in range(K):
    ix = range(N*j, N*(j+1))
    r = np.linspace(0.0, 1, N)
    t = np.linspace(j*4, (j+1)*4, N) + np.random.randn(N)*0.2
    X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
    y[ix] = j

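Consider what is generated when j is 0:

ix is the index range 0 to 99.
r is 100 evenly spaced values from 0 to 1, used as the radial coordinate.
t is 100 evenly spaced values from 0 to 4 plus some random noise, used as the angular coordinate.
X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]: builds the spiral points from polar coordinates and converts them to Cartesian coordinates.
Finally, y[ix] = j assigns the class label to every point of that class.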
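
The steps above can be wrapped in a single function, which makes it easy to check the shapes and the polar-to-Cartesian conversion. This is a sketch under assumptions: `make_spiral` and its parameters are hypothetical names, and it uses NumPy's `default_rng` with a fixed seed instead of the global `np.random.randn` used in the article.

```python
import numpy as np

def make_spiral(points_per_class=100, classes=3, noise=0.2, seed=0):
    """Hypothetical helper that wraps the spiral-generation loop."""
    rng = np.random.default_rng(seed)
    X = np.zeros((points_per_class * classes, 2))
    y = np.zeros(points_per_class * classes, dtype='uint8')
    for j in range(classes):
        ix = range(points_per_class * j, points_per_class * (j + 1))
        # Radial coordinate: grows from 0 to 1 along the arm
        r = np.linspace(0.0, 1, points_per_class)
        # Angular coordinate: each class covers its own range, plus noise
        t = (np.linspace(j * 4, (j + 1) * 4, points_per_class)
             + rng.standard_normal(points_per_class) * noise)
        # Polar -> Cartesian, stacked as two columns
        X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]
        y[ix] = j
    return X, y

X, y = make_spiral()
# Distance of every point from the origin; it should equal r
radius = np.sqrt((X ** 2).sum(axis=1))

print(X.shape, y.shape)
print(np.bincount(y))
print(radius.max())
```

Because sin² + cos² = 1, the noise only moves points along the arm; the distance from the origin stays equal to r.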

np.c_ stacks two one-dimensional arrays as columns (along the second axis), producing a two-dimensional array.

Code example:

list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list2 = np.linspace(1, 100, 10)
list3 = np.c_[list1, list2]
print('list1:\n', list1)
print('Array generated with linspace:\n', list2)
print('Array generated with c_:\n', list3)
# A list plus an ndarray is element-wise addition, not concatenation
list4 = list1 + list2
print('Array generated with +:\n', list4)
print('First element of list3:', list3[0, 0])
print('First column of list3:\n', list3[:, 0])
print('Second column of list3:\n', list3[:, 1])

list1:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Array generated with linspace:
 [  1.  12.  23.  34.  45.  56.  67.  78.  89. 100.]
Array generated with c_:
 [[  1.   1.]
 [  2.  12.]
 [  3.  23.]
 [  4.  34.]
 [  5.  45.]
 [  6.  56.]
 [  7.  67.]
 [  8.  78.]
 [  9.  89.]
 [ 10. 100.]]
Array generated with +:
 [  2.  14.  26.  38.  50.  62.  74.  86.  98. 110.]
First element of list3: 1.0
First column of list3:
 [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
Second column of list3:
 [  1.  12.  23.  34.  45.  56.  67.  78.  89. 100.]
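
One detail in the example is easy to miss: the `+` operator behaves differently depending on the operand types. The short sketch below (variable names are illustrative only) contrasts list concatenation, element-wise addition, and column stacking with np.c_.

```python
import numpy as np

a = [1, 2, 3]
b = [10, 20, 30]

# list + list: the second list is appended to the first
concatenated = a + b
# list + ndarray: NumPy converts the list and adds element by element
elementwise = a + np.array(b)
# np.c_: the two sequences become the two columns of a 2D array
columns = np.c_[a, b]

print(concatenated)
print(elementwise)
print(columns.shape)
```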

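For two-dimensional arrays it is worth learning slicing.

The syntax is:

array[start:stop:step]
array[row_start:row_end:row_step, col_start:col_end:col_step]

The second form is the main one to master:

row_start is the starting row index (inclusive); it defaults to 0.
row_end is the ending row index (exclusive); it defaults to the number of rows.
row_step is the row step; it defaults to 1.
col_start is the starting column index (inclusive); it defaults to 0.
col_end is the ending column index (exclusive); it defaults to the number of columns.
col_step is the column step; it defaults to 1.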
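
A few concrete slices make the rules above easier to remember. The array `grid` and the variable names below are made up for illustration.

```python
import numpy as np

# A 3x4 array holding 0..11, so each value reveals its position
grid = np.arange(12).reshape(3, 4)

first_two_rows = grid[:2, :]      # rows 0 and 1, all columns
last_column = grid[:, -1]         # every row, last column only
every_other_col = grid[:, ::2]    # columns 0 and 2 (column step of 2)
inner_block = grid[1:3, 1:3]      # rows 1-2 and columns 1-2

print(first_two_rows)
print(last_column)
print(every_other_col)
print(inner_block)
```

The form X[:, 0] used for plotting is the same idea: all rows, first column.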

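# Plot the spiral data generated above
# Create the figure and axes
fig, ax = plt.subplots()
# Plot the data, colored by class label
scatter = ax.scatter(X[:, 0], X[:, 1], c=y, s=20, cmap='viridis')
# Set the background of the plotting area to light grey
ax.set_facecolor('lightgrey')
# Set the background of the whole figure to white
fig.patch.set_facecolor('white')
# Show the figure
plt.show()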

[Figure: scatter plot of the generated spiral data, colored by class]

Here cmap='viridis' maps the values of y to colors using the viridis colormap.

from nnfs.datasets import spiral_data
import numpy as np
import nnfs
nnfs.init()
import matplotlib.pyplot as plt
# Generate spiral data: the function returns the point coordinates and the class labels.
# 3 classes with 100 samples each, giving a 2D array X and a 1D array y.
X, y = spiral_data(samples=100, classes=3)
print(X.shape)
# Draw the scatter plot.
plt.scatter(X[:, 0], X[:, 1])
plt.show()

(300, 2)

[Figure: scatter plot of the spiral data in a single color]

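First line: imports the spiral_data function, which generates spiral-shaped data, from the nnfs library.
nnfs.init(): initializes the nnfs library, setting the random seed and other defaults so that results are reproducible.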
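
Why a fixed seed gives reproducible results can be shown with NumPy alone. The sketch below assumes, as described above, that nnfs.init() fixes the seed of NumPy's global random generator; the seed values 0 and 1 are chosen only for illustration.

```python
import numpy as np

np.random.seed(0)
first = np.random.randn(2, 3)

# Resetting to the same seed restores the same generator state
np.random.seed(0)
second = np.random.randn(2, 3)

# A different seed produces a different sequence
np.random.seed(1)
third = np.random.randn(2, 3)

print(np.array_equal(first, second))
print(np.array_equal(first, third))
```

Any code that draws random weights after the same seed call therefore starts from identical values on every run.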

plt.scatter(X[:, 0], X[:, 1], c=y, cmap='brg')
plt.show()

[Figure: the same spiral data colored by class with the brg colormap]

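The colors are only for the reader. Each color corresponds to a class that we number for the model, for example 0, 1, and 2.

The example above first generates the data matrix X, whose values are random but follow a regular pattern. Note that this matrix has 300 rows and 2 columns. X is the dataset we want, and y holds the labels used for classification, so points with the same y value get the same color (that is, they belong to the same class).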

Now consider a simpler example:

import numpy as np
# Dataset
data = np.array([
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6]
])
# Labels
labels = np.array([0, 1, 0, 1])
# Split into training data and test data
train_data = data[:3, :] # first 3 samples
train_labels = labels[:3]
test_data = data[3:, :] # remaining sample
test_labels = labels[3:]

print('Training data:\n', train_data)
print('Training labels:\n', train_labels)
print('Test data:\n', test_data)
print('Test labels:\n', test_labels)

Training data:
 [[0.1 0.2 0.3 0.4]
 [0.5 0.6 0.7 0.8]
 [0.9 1.  1.1 1.2]]
Training labels:
 [0 1 0]
Test data:
 [[1.3 1.4 1.5 1.6]]
Test labels:
 [1]
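
The split above always takes the first three rows for training. When the rows are ordered (for example, sorted by class), it is common to shuffle before splitting. The following is a minimal sketch of that idea; `shuffled_split` and its parameters are hypothetical names, not part of the article or of any library.

```python
import numpy as np

def shuffled_split(data, labels, n_train, seed=0):
    """Shuffle row indices, then split data and labels consistently."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))       # random ordering of row indices
    train_idx, test_idx = order[:n_train], order[n_train:]
    return data[train_idx], labels[train_idx], data[test_idx], labels[test_idx]

data = np.array([
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6],
])
labels = np.array([0, 1, 0, 1])

train_data, train_labels, test_data, test_labels = shuffled_split(data, labels, n_train=3)
print(train_data.shape, test_data.shape)
print(train_labels, test_labels)
```

Using the same index arrays for data and labels keeps every sample paired with its own label.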

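Dense layer class

A dense layer is a basic layer type in neural networks, also called a fully connected layer.

In a dense layer, every input node is connected to every output node.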
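
Full connectivity determines how many parameters a dense layer has: one weight per input-output pair plus one bias per output neuron. A small sketch, with `dense_param_count` as a made-up helper name:

```python
def dense_param_count(n_inputs, n_neurons):
    """Number of trainable values in a fully connected layer."""
    weights = n_inputs * n_neurons   # one weight per (input, neuron) connection
    biases = n_neurons               # one bias per neuron
    return weights + biases

# The first layer of the earlier example: 4 inputs, 3 neurons
print(dense_param_count(4, 3))
# A layer with 2 inputs and 3 neurons
print(dense_param_count(2, 3))
```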

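Below we define a dense layer class:

class layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        # Initialize the weights and biases
        pass

    def forward(self, inputs):
        # Compute the output from the inputs and weights
        pass

We first use the forward pass to make predictions, but keep in mind that this is not the only method.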

class layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

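The weights are initialized randomly and the biases are set to 0.

np.random.randn draws samples from a Gaussian distribution with mean 0 and variance 1, so it produces positive and negative random numbers centered on 0. Individual samples can fall outside the range -1 to 1; multiplying by 0.01 keeps the initial weights small. In general, neural networks work best with values between -1 and 1.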

import numpy as np 
import nnfs

nnfs.init()

print(np.random.randn(2, 5))

[[ 1.7640524   0.4001572   0.978738    2.2408931   1.867558  ]
 [-0.9772779   0.95008844 -0.1513572  -0.10321885  0.41059852]]

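The code above generates a random matrix with 2 rows and 5 columns whose elements are drawn from a distribution with mean 0. np.zeros can be used to create a matrix of zeros: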

print(np.zeros((2, 5)))

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
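
The statement that randn produces values with mean 0 and variance 1 can be checked empirically on a large sample. This sketch uses NumPy's `default_rng` generator (an assumption; the article uses the global `np.random.randn`) and also shows what the 0.01 factor used for the weights does to the value range.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)   # standard normal draws
scaled = 0.01 * samples                  # same scaling as the layer weights

# Sample mean and standard deviation should be close to 0 and 1
print(samples.mean(), samples.std())
# Fraction of raw samples that fall outside [-1, 1]
outside = np.mean(np.abs(samples) > 1)
print(outside)
# Largest magnitude after scaling
print(np.abs(scaled).max())
```

Roughly a third of the raw samples lie outside [-1, 1], while the scaled values stay very close to zero.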

# Complete code
import numpy as np
import nnfs 

nnfs.init()

n_inputs = 2
n_neurons = 4

weights = 0.01 * np.random.randn(n_inputs, n_neurons)
biases = np.zeros((1, n_neurons))

print(weights)
print(biases)

[[ 0.01764052  0.00400157  0.00978738  0.02240893]
 [ 0.01867558 -0.00977278  0.00950088 -0.00151357]]
[[0. 0. 0. 0.]]

class layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    
    def forward(self, inputs):
        self.outputs = np.dot(inputs, self.weights) + self.biases

Complete code:

import numpy as np
import nnfs
from nnfs.datasets import spiral_data

nnfs.init()

# Dense layer
class Layer_Dense:
    # Layer initialization
    def __init__(self, n_inputs, n_neurons):
        # Initialize the weights and biases
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    # Forward pass
    def forward(self, inputs):
        # Compute the output values
        self.output = np.dot(inputs, self.weights) + self.biases

x, y = spiral_data(samples=100, classes=3)
# Create a dense layer with 2 input features and 3 output values
dense1 = Layer_Dense(2, 3)
# Perform a forward pass of the training data through this layer
dense1.forward(x)
# Look at the output of the first few samples:
print(dense1.output[:5])

[[ 0.0000000e+00  0.0000000e+00  0.0000000e+00]
 [-1.0475188e-04  1.1395361e-04 -4.7983500e-05]
 [-2.7414842e-04  3.1729150e-04 -8.6921798e-05]
 [-4.2188365e-04  5.2666257e-04 -5.5912682e-05]
 [-5.7707680e-04  7.1401405e-04 -8.9430439e-05]]
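
Since the output of one layer is the input of the next, two such layers can be chained as long as the second layer's input count equals the first layer's neuron count. The sketch below is self-contained and makes two assumptions: random points stand in for the spiral data so that nnfs is not required, and the constructor takes an extra `rng` argument that the article's class does not have.

```python
import numpy as np

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons, rng):
        # Small random weights, zero biases (rng is an added, hypothetical argument)
        self.weights = 0.01 * rng.standard_normal((n_inputs, n_neurons))
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))    # stand-in for 300 two-feature samples

dense1 = Layer_Dense(2, 3, rng)      # 2 features in, 3 neurons out
dense2 = Layer_Dense(3, 4, rng)      # must accept dense1's 3 outputs

dense1.forward(X)
dense2.forward(dense1.output)        # chain: output of layer 1 feeds layer 2

print(dense1.output.shape, dense2.output.shape)
```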

print(np.exp(-1))
print(np.exp(0))
print(np.e)

0.36787944117144233
1.0
2.718281828459045

import numpy as np

np.random.seed(0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Assumes x is already a sigmoid output, so the derivative is x * (1 - x)
    return x * (1 - x)

# Input data: a set of four input points
inputs = np.array([[0, 0],
                  [0, 1],
                  [1, 0],
                  [1, 1]])

# Expected outputs (the XOR of the two inputs)
outputs = np.array([0, 1, 1, 0])

# Randomly initialize the weights; the biases start at zero
weights_input_hidden = 2 * np.random.random((2, 4)) - 1
weights_hidden_output = 2 * np.random.random((4, 1)) - 1
bias_hidden = np.zeros((1, 4))
bias_output = np.zeros((1, 1))

# Print them to have a look
print('Hidden layer weights:\n', weights_input_hidden)
print('Output layer weights:\n', weights_hidden_output)
print('Hidden layer biases:\n', bias_hidden)
print('Output layer biases:\n', bias_output)

Hidden layer weights:
 [[ 0.09762701  0.43037873  0.20552675  0.08976637]
 [-0.1526904   0.29178823 -0.12482558  0.783546  ]]
Output layer weights:
 [[ 0.92732552]
 [-0.23311696]
 [ 0.58345008]
 [ 0.05778984]]
Hidden layer biases:
 [[0. 0. 0. 0.]]
Output layer biases:
 [[0.]]

# Forward pass
hidden_layer_input = np.dot(inputs, weights_input_hidden) + bias_hidden
hidden_layer_output = sigmoid(hidden_layer_input)

output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + bias_output
predicted_output = sigmoid(output_layer_input)

# Print the predicted output
print("Predicted Output:", predicted_output)

Predicted Output: [[0.66099339]
 [0.64750719]
 [0.66747943]
 [0.65435779]]
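
The forward pass does not use the sigmoid_derivative helper; it becomes relevant once the network is trained. A common convention, assumed here, is that the helper receives a value that has already passed through the sigmoid, so the derivative is s * (1 - s). The sketch below checks that convention against a numerical derivative; the test points and step size h are arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # s is assumed to be a sigmoid output: d(sigmoid)/dx = s * (1 - s)
    return s * (1 - s)

x = np.array([-2.0, 0.0, 0.5, 3.0])

# Analytic derivative, computed from the sigmoid outputs
analytic = sigmoid_derivative(sigmoid(x))

# Numerical derivative via a central difference
h = 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(analytic)
print(np.max(np.abs(analytic - numeric)))
```

At x = 0 the sigmoid output is 0.5, so the derivative is 0.5 * 0.5 = 0.25, the steepest point of the curve.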

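Detailed walk-through of the computation in the code above

The network has three layers: an input layer with 2 nodes, a hidden layer with 4 neurons, and an output layer with 1 neuron.

[Figure: diagram of the three-layer neural network]

Forward pass

The input data point is (0, 0).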


How the network executes

Computation for the input data point (0, 0)

Input data point

$\text{inputs} = \begin{bmatrix} 0 & 0 \end{bmatrix}$

Initial weights and biases

Weights from the input layer to the hidden layer:
$\text{weights\_input\_hidden} = \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix}$

Weights from the hidden layer to the output layer:
$\text{weights\_hidden\_output} = \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix}$

Hidden layer biases:
$\text{bias\_hidden} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$

Output layer bias:
$\text{bias\_output} = \begin{bmatrix} 0 \end{bmatrix}$

Forward pass

Compute the input to the hidden layer:
$\text{hidden\_layer\_input} = \text{inputs} \cdot \text{weights\_input\_hidden} + \text{bias\_hidden}$

$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$

$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$

Compute the output of the hidden layer (using the sigmoid activation function):
$\text{hidden\_layer\_output} = \sigma(\text{hidden\_layer\_input})$
$\sigma(x) = \frac{1}{1 + e^{-x}}$

$\text{hidden\_layer\_output} = \begin{bmatrix} \sigma(0) & \sigma(0) & \sigma(0) & \sigma(0) \end{bmatrix}$

$\text{hidden\_layer\_output} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \end{bmatrix}$

Compute the input to the output layer:
$\text{output\_layer\_input} = \text{hidden\_layer\_output} \cdot \text{weights\_hidden\_output} + \text{bias\_output}$

$\text{output\_layer\_input} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \end{bmatrix} \cdot \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix} + \begin{bmatrix} 0 \end{bmatrix}$

$\text{output\_layer\_input} = 0.5 \cdot 0.92732552 + 0.5 \cdot (-0.23311696) + 0.5 \cdot 0.58345008 + 0.5 \cdot 0.05778984$

$\text{output\_layer\_input} = 0.66772424$

Compute the final output (using the sigmoid activation function):
$\text{predicted\_output} = \sigma(\text{output\_layer\_input})$

$\text{predicted\_output} = \sigma(0.66772424)$

$\text{predicted\_output} = \frac{1}{1 + e^{-0.66772424}}$

$\text{predicted\_output} \approx 0.66099339$

Predicted output

Predicted Output for input (0,0):
[[0.66099339]]
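
The hand calculation for the point (0, 0) can be checked in a few lines of plain Python. This sketch copies the output-layer weights printed earlier and relies on the fact that, with zero inputs and zero biases, every hidden pre-activation is 0.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Output-layer weights as printed by the article's code
weights_hidden_output = [0.92732552, -0.23311696, 0.58345008, 0.05778984]

# Input (0, 0) with zero biases: each hidden neuron outputs sigmoid(0) = 0.5
hidden_layer_output = [sigmoid(0.0)] * 4

# Weighted sum of the hidden outputs (the output bias is 0)
output_layer_input = sum(h * w for h, w in zip(hidden_layer_output, weights_hidden_output))
predicted_output = sigmoid(output_layer_input)

print(output_layer_input)
print(predicted_output)
```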

Computation for all four input data points

Input data points

$\text{inputs} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$

Expected outputs

$\text{outputs} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$

Initial weights and biases

Weights from the input layer to the hidden layer:
$\text{weights\_input\_hidden} = \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix}$

Weights from the hidden layer to the output layer:
$\text{weights\_hidden\_output} = \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix}$

Hidden layer biases:
$\text{bias\_hidden} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$

Output layer bias:
$\text{bias\_output} = \begin{bmatrix} 0 \end{bmatrix}$

Forward pass

Compute the input to the hidden layer:
$\text{hidden\_layer\_input} = \text{inputs} \cdot \text{weights\_input\_hidden} + \text{bias\_hidden}$

$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$

$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \\ 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.05506339 & 0.72216696 & 0.08070117 & 0.87331237 \end{bmatrix}$

Compute the output of the hidden layer (using the sigmoid activation function):
$\text{hidden\_layer\_output} = \sigma(\text{hidden\_layer\_input}) \\ \sigma(x) = \frac{1}{1 + e^{-x}}$

$\text{hidden\_layer\_output} = \begin{bmatrix} \sigma(0) & \sigma(0) & \sigma(0) & \sigma(0) \\ \sigma(-0.1526904) & \sigma(0.29178823) & \sigma(-0.12482558) & \sigma(0.783546) \\ \sigma(0.09762701) & \sigma(0.43037873) & \sigma(0.20552675) & \sigma(0.08976637) \\ \sigma(-0.05506339) & \sigma(0.72216696) & \sigma(0.08070117) & \sigma(0.87331237) \end{bmatrix}$

$\text{hidden\_layer\_output} \approx \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.4619 & 0.5724 & 0.4688 & 0.6864 \\ 0.5244 & 0.6060 & 0.5512 & 0.5224 \\ 0.4862 & 0.6731 & 0.5202 & 0.7054 \end{bmatrix}$

Compute the input to the output layer:
$\text{output\_layer\_input} = \text{hidden\_layer\_output} \cdot \text{weights\_hidden\_output} + \text{bias\_output}$

$\text{output\_layer\_input} \approx \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.4619 & 0.5724 & 0.4688 & 0.6864 \\ 0.5244 & 0.6060 & 0.5512 & 0.5224 \\ 0.4862 & 0.6731 & 0.5202 & 0.7054 \end{bmatrix} \cdot \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix} + \begin{bmatrix} 0 \end{bmatrix}$

$\text{output\_layer\_input} \approx \begin{bmatrix} 0.66772 \\ 0.60810 \\ 0.69681 \\ 0.63825 \end{bmatrix}$

Compute the final output (using the sigmoid activation function):
$\text{predicted\_output} = \sigma(\text{output\_layer\_input})$

$\text{predicted\_output} \approx \sigma \begin{bmatrix} 0.66772 \\ 0.60810 \\ 0.69681 \\ 0.63825 \end{bmatrix}$

$\text{predicted\_output} \approx \begin{bmatrix} 0.66099339 \\ 0.64750719 \\ 0.66747943 \\ 0.65435779 \end{bmatrix}$

Predicted output

Predicted Output:
[[0.66099339]
 [0.64750719]
 [0.66747943]
 [0.65435779]]

These values agree with the output printed by the code. If intermediate results are rounded during a hand calculation, the last digits can differ slightly, but the procedure is the same.
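
A quick way to validate a hand calculation like this one is to recompute it from the printed weights and compare against the values the forward-pass code printed earlier. The sketch below does that; the array `printed_by_code` simply copies those four printed predictions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
weights_input_hidden = np.array([
    [0.09762701, 0.43037873, 0.20552675, 0.08976637],
    [-0.1526904, 0.29178823, -0.12482558, 0.783546],
])
weights_hidden_output = np.array([[0.92732552], [-0.23311696], [0.58345008], [0.05778984]])

# Both bias vectors are zero, so they are omitted here
hidden_layer_output = sigmoid(inputs @ weights_input_hidden)
output_layer_input = hidden_layer_output @ weights_hidden_output
predicted_output = sigmoid(output_layer_input)

# Predictions printed by the article's forward-pass code
printed_by_code = np.array([[0.66099339], [0.64750719], [0.66747943], [0.65435779]])

print(np.round(output_layer_input, 5))
print(np.abs(predicted_output - printed_by_code).max())
```

The largest difference is far below the precision of the printed weights, so any larger gap in a hand calculation points to an arithmetic slip rather than to the method.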
