Neural Network Notes
Code
```python
import numpy as np

# Set the random seed for reproducible results
np.random.seed(0)

# Define the activation function and its derivative
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

# Input data points
inputs = np.array([[0, 0],
                   [0, 1],
                   [1, 0],
                   [1, 1]])

# Target outputs (XOR)
outputs = np.array([[0], [1], [1], [0]])

# Initialize weights and biases; 2 * random - 1 gives values in [-1, 1)
weights_input_hidden = 2 * np.random.random((2, 4)) - 1
weights_hidden_output = 2 * np.random.random((4, 1)) - 1
bias_hidden = np.zeros((1, 4))
bias_output = np.zeros((1, 1))

# Forward propagation
hidden_layer_input = np.dot(inputs, weights_input_hidden) + bias_hidden
hidden_layer_output = sigmoid(hidden_layer_input)
output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + bias_output
predicted_output = sigmoid(output_layer_input)

# Print the predicted output
print("Predicted Output:")
print(predicted_output)
```
(Figure: diagram of the three-layer neural network, with 2 input units, 4 hidden units, and 1 output unit.)
We will trace this three-layer network step by step below.
Forward Propagation
The input data point is (0, 0).
Hidden-layer weights:
[[ 0.09762701  0.43037873  0.20552675  0.08976637]
 [-0.1526904   0.29178823 -0.12482558  0.783546  ]]
Output-layer weights:
[[ 0.92732552]
 [-0.23311696]
 [ 0.58345008]
 [ 0.05778984]]
Hidden-layer bias:
[[0. 0. 0. 0.]]
Output-layer bias:
[[0.]]
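These values can be reproduced directly from the seed. A minimal sketch (the short names `w_ih` and `w_ho` are introduced here for brevity, they are not from the original code):

```python
import numpy as np

# Same seed as the note, so the draws match the printed matrices
np.random.seed(0)

# np.random.random samples uniformly from [0, 1); 2*x - 1 rescales to [-1, 1)
w_ih = 2 * np.random.random((2, 4)) - 1  # input -> hidden, shape (2, 4)
w_ho = 2 * np.random.random((4, 1)) - 1  # hidden -> output, shape (4, 1)

print(w_ih)
print(w_ho)
```

The draw order matters: `w_ih` consumes the first 8 random numbers and `w_ho` the next 4, which is why it must be created second to match the values above.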
How the network executes
Let's start with a single data point.
Computation for input data point (0, 0)
Input data point:
$$\text{inputs} = \begin{bmatrix} 0 & 0 \end{bmatrix}$$
Initialize weights and biases
Input-to-hidden weights:
$$\text{weights\_input\_hidden} = \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix}$$
Hidden-to-output weights:
$$\text{weights\_hidden\_output} = \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix}$$
Hidden-layer bias:
$$\text{bias\_hidden} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$$
Output-layer bias:
$$\text{bias\_output} = \begin{bmatrix} 0 \end{bmatrix}$$
Forward propagation
- Compute the hidden-layer input:
$$\text{hidden\_layer\_input} = \text{inputs} \cdot \text{weights\_input\_hidden} + \text{bias\_hidden}$$
$$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$$
$$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$$
- Compute the hidden-layer output (apply the sigmoid activation):
$$\text{hidden\_layer\_output} = \sigma(\text{hidden\_layer\_input}), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$
$$\text{hidden\_layer\_output} = \begin{bmatrix} \sigma(0) & \sigma(0) & \sigma(0) & \sigma(0) \end{bmatrix} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \end{bmatrix}$$
- Compute the output-layer input:
$$\text{output\_layer\_input} = \text{hidden\_layer\_output} \cdot \text{weights\_hidden\_output} + \text{bias\_output}$$
$$\text{output\_layer\_input} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \end{bmatrix} \cdot \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix} + \begin{bmatrix} 0 \end{bmatrix}$$
$$\text{output\_layer\_input} = 0.5 \cdot 0.92732552 + 0.5 \cdot (-0.23311696) + 0.5 \cdot 0.58345008 + 0.5 \cdot 0.05778984 = 0.66772424$$
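As a quick sanity check on this arithmetic, in plain Python with no dependencies:

```python
# The hidden activations are all 0.5 (sigmoid of 0), so the output-layer
# input is 0.5 times the sum of the hidden-to-output weights.
weights = [0.92732552, -0.23311696, 0.58345008, 0.05778984]
output_layer_input = sum(0.5 * w for w in weights)
print(output_layer_input)  # ≈ 0.66772424
```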
- Compute the final output (apply the sigmoid activation):
$$\text{predicted\_output} = \sigma(\text{output\_layer\_input}) = \sigma(0.66772424)$$
$$\text{predicted\_output} = \frac{1}{1 + e^{-0.66772424}} \approx 0.66099582$$
Print the predicted output:
Predicted Output for input (0,0):
[[0.66099582]]
Now the computation for all data points
Input data points:
$$\text{inputs} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$$
Target outputs:
$$\text{outputs} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$$
Initialize weights and biases
Input-to-hidden weights:
$$\text{weights\_input\_hidden} = \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix}$$
Hidden-to-output weights:
$$\text{weights\_hidden\_output} = \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix}$$
Hidden-layer bias:
$$\text{bias\_hidden} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$$
Output-layer bias:
$$\text{bias\_output} = \begin{bmatrix} 0 \end{bmatrix}$$
Forward propagation
- Compute the hidden-layer input:
$$\text{hidden\_layer\_input} = \text{inputs} \cdot \text{weights\_input\_hidden} + \text{bias\_hidden}$$
$$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}$$
$$\text{hidden\_layer\_input} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ -0.1526904 & 0.29178823 & -0.12482558 & 0.783546 \\ 0.09762701 & 0.43037873 & 0.20552675 & 0.08976637 \\ -0.05506339 & 0.72216696 & 0.08070117 & 0.87331237 \end{bmatrix}$$
- Compute the hidden-layer output (apply the sigmoid activation):
$$\text{hidden\_layer\_output} = \sigma(\text{hidden\_layer\_input}), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$
$$\text{hidden\_layer\_output} = \begin{bmatrix} \sigma(0) & \sigma(0) & \sigma(0) & \sigma(0) \\ \sigma(-0.1526904) & \sigma(0.29178823) & \sigma(-0.12482558) & \sigma(0.783546) \\ \sigma(0.09762701) & \sigma(0.43037873) & \sigma(0.20552675) & \sigma(0.08976637) \\ \sigma(-0.05506339) & \sigma(0.72216696) & \sigma(0.08070117) & \sigma(0.87331237) \end{bmatrix}$$
$$\text{hidden\_layer\_output} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.461902 & 0.572429 & 0.468833 & 0.686454 \\ 0.524387 & 0.605967 & 0.551210 & 0.522426 \\ 0.486237 & 0.673110 & 0.520163 & 0.705455 \end{bmatrix}$$
- Compute the output-layer input:
$$\text{output\_layer\_input} = \text{hidden\_layer\_output} \cdot \text{weights\_hidden\_output} + \text{bias\_output}$$
$$\text{output\_layer\_input} = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.461902 & 0.572429 & 0.468833 & 0.686454 \\ 0.524387 & 0.605967 & 0.551210 & 0.522426 \\ 0.486237 & 0.673110 & 0.520163 & 0.705455 \end{bmatrix} \cdot \begin{bmatrix} 0.92732552 \\ -0.23311696 \\ 0.58345008 \\ 0.05778984 \end{bmatrix} + \begin{bmatrix} 0 \end{bmatrix}$$
$$\text{output\_layer\_input} = \begin{bmatrix} 0.66772424 \\ 0.52997619 \\ 0.66047078 \\ 0.55106483 \end{bmatrix}$$
- Compute the final output (apply the sigmoid activation):
$$\text{predicted\_output} = \sigma(\text{output\_layer\_input})$$
$$\text{predicted\_output} = \sigma\begin{bmatrix} 0.66772424 \\ 0.52997619 \\ 0.66047078 \\ 0.55106483 \end{bmatrix} = \begin{bmatrix} 0.66099582 \\ 0.62941358 \\ 0.65933295 \\ 0.63440526 \end{bmatrix}$$
Print the predicted output:
Predicted Output:
[[0.66099582]
 [0.62941358]
 [0.65933295]
 [0.63440526]]