1. Single-Layer Model
1.1 Recap
- out = f(X @ W + b)
For example: out = relu(X @ W + b)
Note: f(x) is called the activation function.
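Concretely, the relu used throughout this section is the element-wise function relu(x) = max(0, x): negative pre-activations are clipped to zero, positive ones pass through unchanged.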
2.1 X @ W+b(逐渐降维,将高阶原始样本降为低阶分类样本)
-
out = relu ( X @ W + b ) =\operatorname{relu}(X @ W+b) =relu(X@W+b)
-
[ h 0 0 h 1 0 h 0 1 h 1 1 ] = relu ( [ x 0 0 x 1 0 x 2 0 x 0 1 x 1 1 x 2 1 ] @ [ w 00 w 01 w 10 w 11 w 20 w 21 ] + [ b 0 b 1 ] ) \left[\begin{array}{cc}h_{0}^{0} & h_{1}^{0} \\ h_{0}^{1} & h_{1}^{1}\end{array}\right]=\operatorname{relu}\left(\left[\begin{array}{ccc}x_{0}^{0} & x_{1}^{0} & x_{2}^{0} \\ x_{0}^{1} & x_{1}^{1} & x_{2}^{1}\end{array}\right] @\left[\begin{array}{cc}w_{00} & w_{01} \\ w_{10} & w_{11} \\ w_{20} & w_{21}\end{array}\right]+\left[\begin{array}{ll}b_{0} & b_{1}\end{array}\right]\right) [h00h01h10h11]=relu⎝⎛[x00x01x10x11x20x21]@⎣⎡w00w10w20w01w11w21⎦⎤+[b0b1]⎠⎞
h代表输出,x代表变量,b代表偏置(y = kx + b线性模型)
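To make the shapes in this equation concrete, here is a minimal sketch (the values are random; only the shapes matter):

import tensorflow as tf
x = tf.random.normal([2, 3])  # 2 samples, 3 features each
w = tf.random.normal([3, 2])  # maps 3 input features to 2 outputs
b = tf.zeros([2])             # one bias per output dimension
h = tf.nn.relu(x @ w + b)     # b is broadcast across the rows of x @ w
print(h.shape)                # (2, 2), matching the matrix equation above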
import tensorflow as tf
x = tf.random.normal([4, 784])
net = tf.keras.layers.Dense(512)  # specifies the output dimension
# Placed here, these lines would raise an error, because the kernel (weights)
# and bias have not been created yet:
# print("kernel shape:", net.kernel.shape)
# print("bias shape:", net.bias.shape)
net.build(input_shape=(None, 784))  # creates the kernel and bias; if omitted, net(x) creates them automatically.
# The number of samples (None) does not affect the shapes of w and b; only the number of input features does.
out = net(x)  # if the kernel and bias were not built yet, build() is called automatically with random default values
print("kernel shape:", net.kernel.shape)
print("bias shape:", net.bias.shape)
print("output:", out)
Output:
kernel shape: (784, 512)
bias shape: (512,)
output: tf.Tensor(
[[ 0.7432524 0.39359367 -2.2896104 ... 0.05215064 0.0592235
-0.63905406]
[-1.9503739 0.11022064 1.0802195 ... 0.632308 -1.4672192
-0.87029684]
[ 1.5104785 1.1086452 0.11695671 ... 0.6099293 -0.45776743
0.32787204]
[-0.45765805 -1.6942897 -0.18677613 ... -0.21831259 0.78881395
0.65277463]], shape=(4, 512), dtype=float32)
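As a sanity check, a minimal sketch continuing the code above: since Dense(512) was created without an activation, it computes a plain affine map, so out can be recomputed by hand from kernel and bias.

manual = x @ net.kernel + net.bias          # what Dense computes when activation=None
print(tf.reduce_max(tf.abs(out - manual)))  # ~0.0: the two results agree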
2. Multi-Layer Networks
A network is divided into:
- Input layer
- Hidden layers (the black box)
- Output layer
import tensorflow as tf
# keras.Sequential([layer1, layer2, layer3]) chains the Dense layers in a list;
# a single call runs one forward pass through all of them.
x = tf.random.normal([2, 3])
model = tf.keras.Sequential([
    tf.keras.layers.Dense(2, activation='relu'),  # project to 2 dimensions
    tf.keras.layers.Dense(2, activation='relu'),  # project to 2 dimensions
    tf.keras.layers.Dense(2)                      # project to 2 dimensions
])
model.build(input_shape=[None, 3])  # the batch size does not affect the shapes of w and b; only the feature dimension does
model.summary()  # prints the network structure, layer by layer
# trainable_variables is the list [w1, b1, w2, b2, w3, b3] of all trainable parameters
for p in model.trainable_variables:
    print(p.name, p.shape)
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 2) 8
_________________________________________________________________
dense_1 (Dense) (None, 2) 6
_________________________________________________________________
dense_2 (Dense) (None, 2) 6
=================================================================
Total params: 20
Trainable params: 20
Non-trainable params: 0
_________________________________________________________________
dense/kernel:0 (3, 2)
dense/bias:0 (2,)
dense_1/kernel:0 (2, 2)
dense_1/bias:0 (2,)
dense_2/kernel:0 (2, 2)
dense_2/bias:0 (2,)
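The Param # column can be reproduced by hand from the shapes above: a Dense layer mapping n inputs to m outputs holds an n × m kernel plus m biases, so dense has 3·2 + 2 = 8 parameters, while dense_1 and dense_2 each have 2·2 + 2 = 6, giving the total of 20.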