ResNet (Residual Network)
H(x) = F(x) + x
F(x) = H(x) - x
When x is already optimal (or close to optimal), keeping the next layer optimal only requires driving F(x) to 0, in which case H(x) = x.
F is the mapping computed by the network before the summation, and H is the mapping from the input to the output after the summation. For example, suppose we want to map 5 to 5.1. Without the residual connection the mapping is F'(5) = 5.1; with it, H(5) = 5.1 where H(5) = F(5) + 5, so F(5) = 0.1. Here both F' and F denote mappings implemented by the network's parameters, and the residual mapping is far more sensitive to changes in the output. If the target output moves from 5.1 to 5.2, the plain mapping F' changes by 0.1/5.1 ≈ 2%, whereas in the residual structure the same move takes F from 0.1 to 0.2, a 100% change. Clearly the latter change exerts a much stronger pull on the weight updates, which is why it works better. The core idea of the residual is to strip away the shared bulk of the signal so that the small variations stand out; the first thing a residual network brings to mind is a differential amplifier.
More explanation:
https://www.cnblogs.com/alanma/p/6877166.html
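To make the 2% versus 100% comparison concrete, here is a tiny pure-Python sanity check using the same toy numbers (5, 5.1, 5.2) from the text above:

x = 5.0
h_old, h_new = 5.1, 5.2                        # desired output before and after a small change
plain_change = (h_new - h_old) / h_old         # plain mapping F' must move 5.1 -> 5.2: ~0.0196, about 2%
f_old, f_new = h_old - x, h_new - x            # residual mapping F(x) = H(x) - x must move 0.1 -> 0.2
residual_change = (f_new - f_old) / f_old      # ~1.0, i.e. 100%
print(f"{plain_change:.1%}  {residual_change:.1%}")   # 2.0%  100.0%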
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Sequential

class BasicBlock(layers.Layer):
    def __init__(self, filter_num, stride=1):
        super(BasicBlock, self).__init__()
        # First 3x3 conv; may downsample spatially when stride != 1
        self.conv1 = layers.Conv2D(filter_num, (3, 3), strides=stride, padding='same')
        self.bn1 = layers.BatchNormalization()
        self.relu = layers.Activation('relu')
        # Second 3x3 conv always keeps the spatial size
        self.conv2 = layers.Conv2D(filter_num, (3, 3), strides=1, padding='same')
        self.bn2 = layers.BatchNormalization()
        if stride != 1:
            # 1x1 conv on the shortcut so the identity branch matches the main branch's shape
            self.downsample = Sequential()
            self.downsample.add(layers.Conv2D(filter_num, (1, 1), strides=stride))
            self.downsample.add(layers.BatchNormalization())
        else:
            # Plain identity shortcut when no downsampling is needed
            self.downsample = lambda x: x
        self.stride = stride

    def call(self, inputs, training=None):
        residual = self.downsample(inputs)      # shortcut branch: x (possibly downsampled)
        conv1 = self.conv1(inputs)              # main branch: conv -> BN -> ReLU
        bn1 = self.bn1(conv1)
        relu1 = self.relu(bn1)
        conv2 = self.conv2(relu1)               # conv -> BN
        bn2 = self.bn2(conv2)
        add = layers.add([bn2, residual])       # H(x) = F(x) + x
        out = self.relu(add)
        return out
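A quick usage sketch for the block above (the input shape and the stride choice here are illustrative, not from the notes):

block = BasicBlock(filter_num=64, stride=2)   # stride=2 activates the 1x1 downsample shortcut
x = tf.random.normal([4, 32, 32, 64])         # dummy batch: 4 feature maps of 32x32 with 64 channels
y = block(x, training=False)
print(y.shape)                                # (4, 16, 16, 64): spatial size halved, channels = filter_num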
def _build_resblock(self, block, filter_num, blocks, stride=1):
    # Helper meant to live on the ResNet model class: stacks `blocks` BasicBlocks.
    # Only the first block may downsample (stride != 1); the rest keep stride=1.
    res_blocks = keras.Sequential()
    res_blocks.add(block(filter_num, stride))
    for _ in range(1, blocks):
        res_blocks.add(block(filter_num, stride=1))
    return res_blocks
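For context, here is a minimal sketch of how such a helper is typically wired into a full ResNet-style model. The stem layers, the channel widths (64/128/256/512), and the ResNet18-like layer counts [2, 2, 2, 2] are assumptions for illustration, not taken from the notes above:

class ResNet(keras.Model):
    def __init__(self, layer_dims, num_classes=10):   # e.g. layer_dims=[2, 2, 2, 2] (assumed, ResNet18-like)
        super(ResNet, self).__init__()
        # Stem: one conv + BN + ReLU + pooling before the residual stages (illustrative sizes)
        self.stem = Sequential([
            layers.Conv2D(64, (3, 3), strides=(1, 1), padding='same'),
            layers.BatchNormalization(),
            layers.Activation('relu'),
            layers.MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='same')
        ])
        # Four stages of BasicBlocks; every stage after the first downsamples with stride=2
        self.layer1 = self._build_resblock(BasicBlock, 64, layer_dims[0])
        self.layer2 = self._build_resblock(BasicBlock, 128, layer_dims[1], stride=2)
        self.layer3 = self._build_resblock(BasicBlock, 256, layer_dims[2], stride=2)
        self.layer4 = self._build_resblock(BasicBlock, 512, layer_dims[3], stride=2)
        self.avgpool = layers.GlobalAveragePooling2D()
        self.fc = layers.Dense(num_classes)

    # _build_resblock from above goes here as a method of this class

    def call(self, inputs, training=None):
        x = self.stem(inputs, training=training)
        x = self.layer1(x, training=training)
        x = self.layer2(x, training=training)
        x = self.layer3(x, training=training)
        x = self.layer4(x, training=training)
        x = self.avgpool(x)
        return self.fc(x)

With these assumed values, ResNet([2, 2, 2, 2], num_classes=10) builds an 18-layer-style network, and calling it on a dummy batch such as tf.random.normal([4, 32, 32, 3]) yields logits of shape (4, 10).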
DenseNet