GoogLeNet
5.1 Origin of the name GoogLeNet
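The model was introduced in the paper "Going Deeper with Convolutions" [2]. Because most of the paper's authors were at Google, the proposed model is often called GoogLeNet (sometimes written GoogleNet). The paper has been cited more than 8,000 times.
Like VGG, GoogLeNet pushes the architecture toward greater depth, but it is also far more parameter-efficient than AlexNet: the authors used 12 times fewer parameters than AlexNet and still achieved better results with a deeper architecture.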
5.2 Key innovations of the paper
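GoogLeNet's main innovation lies in the network architecture. Unlike the designers of AlexNet and VGG, the authors argue that a better architecture is not obtained by simply stacking identical convolutional layers and then stacking identical fully connected layers. A deeper architecture requires rethinking the network design from first principles. In their view, the network structure has to move toward sparsity in order to become both deeper and more efficient.
Can a network then be built directly from sparse structures? Past experience suggests this is not straightforward. First, networks expressed directly with sparse structures have not performed well. Second, doing so gives up the acceleration offered by modern hardware, GPUs in particular. GPUs handle vision and similar workloads efficiently mainly because they are fast at dense matrix operations. Using sparse structures directly is therefore challenging.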
5.3 Core idea
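The core idea of the paper is to approximate the ideal, optimal sparse structure with a set of local dense structures. This keeps computation efficient while, conceptually, still exploiting sparsity to reach a deeper architecture.
The authors call this local building block the Inception module. Traditionally, convolutional layers are stacked directly on top of one another. The Inception module instead arranges several convolutional layers side by side (horizontally), and whole modules are then stacked vertically. How many convolutional layers to place side by side, and how many Inception modules to stack, were both determined empirically by trial and error.
The GoogLeNet finally proposed in the paper is 22 layers deep. If all the parallel structures are counted, the network has more than 100 layers. To train such a deep model, the authors also inserted classifiers at some intermediate layers, whereas the networks discussed earlier have a single classifier at the very end. A classifier layer is where the label information is used, that is, the information about which object the image actually contains. Adding classifiers at intermediate layers lets the labels guide what those layers learn and keeps the gradient flowing effectively through the network.
In experiments, GoogLeNet achieved very good results. In the 2014 ImageNet competition, GoogLeNet and VGG took first and second place respectively, separated by less than one percentage point.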
5.4 Combining width and depth: Inception
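Generally speaking, increasing a network's depth and width improves its performance, but it also greatly increases the number of parameters, and a deeper network needs more data or it tends to overfit. In addition, increasing depth makes vanishing gradients more likely. Inception v1 (also known as GoogLeNet), which won the ImageNet competition in 2014, addressed these problems well.
Inception v1 is a carefully designed 22-layer convolutional network. It introduced the Inception module, which has a good local feature structure: several convolutions of different sizes and a pooling operation are applied to the features in parallel, and the results are concatenated. Because 1×1, 3×3, and 5×5 convolutions cover regions of different sizes on the feature map, combining them yields a richer image representation.
As shown in Figure 3.13, the Inception module applies convolutions with three different kernel sizes together with a max pooling operation, then concatenates the four results along the channel dimension and passes them to the next layer.
To further reduce the number of parameters, Inception adds several 1×1 convolutions on top of this module. As shown in Figure 3.14, a 1×1 convolution first reduces the channel dimension of the feature map before it is passed to the 3×3 and 5×5 kernels; with fewer channels, the parameter count drops substantially. Using 1×1 kernels for dimensionality reduction is an idea that reappears in many later lightweight networks.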
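To make the saving from 1×1 reduction concrete, the sketch below counts convolution weights (biases ignored) for a 5×5 branch with and without a 1×1 reduction layer. The channel sizes (192 input channels, reduction to 16, 32 output channels) are illustrative assumptions, and `conv_weights` is a hypothetical helper, not part of any library.

```python
def conv_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a square convolution, ignoring biases."""
    return kernel_size * kernel_size * in_channels * out_channels

# Assumed, illustrative channel sizes
in_ch, reduced_ch, out_ch = 192, 16, 32

# 5x5 convolution applied directly to the input
direct = conv_weights(in_ch, out_ch, 5)

# 1x1 reduction followed by the 5x5 convolution
reduced = conv_weights(in_ch, reduced_ch, 1) + conv_weights(reduced_ch, out_ch, 5)

print(direct, reduced, round(direct / reduced, 2))
```

Under these assumptions the reduced branch needs 15,872 weights instead of 153,600, roughly a tenfold reduction.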
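Inception v1 stacks nine of these modules for a total of 22 layers, and applies global average pooling after the last Inception module. To mitigate vanishing gradients when training such a deep network, the authors also introduced two auxiliary classifiers: Softmax is applied and a loss is computed at the outputs of the third and sixth Inception modules, and during training these losses are backpropagated together with the final loss.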
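The way the auxiliary losses enter training can be sketched as a weighted sum. The original paper weights the auxiliary losses by 0.3; the function name `combined_loss` and the scalar loss values below are illustrative assumptions.

```python
def combined_loss(main_loss, aux_losses, aux_weight=0.3):
    """Add down-weighted auxiliary losses to the main loss (training only)."""
    return main_loss + aux_weight * sum(aux_losses)

# Hypothetical per-batch loss values for the main and the two auxiliary classifiers
total = combined_loss(1.0, [2.0, 3.0])
print(total)
```

At inference time the auxiliary classifiers are discarded and only the main output is used.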
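Inception v1 has roughly 1/12 of the parameters of AlexNet and far fewer than VGGNet, which makes it suitable for large-scale data, especially on platforms with limited compute. Next we build a single Inception module in PyTorch. Create a file named inceptionv1.py with the following code: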
import torch
from torch import nn
import torch.nn.functional as F

# First define a basic convolution class consisting of a conv layer and ReLU
class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding)

    def forward(self, x):
        x = self.conv(x)
        return F.relu(x, inplace=True)

# The Inceptionv1 class; the channel counts of each branch are given at construction
class Inceptionv1(nn.Module):
    def __init__(self, in_dim, hid_1_1, hid_2_1, hid_2_3, hid_3_1, out_3_5, out_4_1):
        super(Inceptionv1, self).__init__()
        # Definitions of the four branches
        self.branch1x1 = BasicConv2d(in_dim, hid_1_1, 1)
        self.branch3x3 = nn.Sequential(
            BasicConv2d(in_dim, hid_2_1, 1),
            BasicConv2d(hid_2_1, hid_2_3, 3, padding=1)
        )
        self.branch5x5 = nn.Sequential(
            BasicConv2d(in_dim, hid_3_1, 1),
            BasicConv2d(hid_3_1, out_3_5, 5, padding=2)
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            BasicConv2d(in_dim, out_4_1, 1)
        )

    def forward(self, x):
        b1 = self.branch1x1(x)
        b2 = self.branch3x3(x)
        b3 = self.branch5x5(x)
        b4 = self.branch_pool(x)
        # Concatenate the four branch outputs along the channel dimension
        output = torch.cat((b1, b2, b3, b4), dim=1)
        return output
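Each branch in the module above pairs a k×k kernel with padding (k − 1) / 2 at stride 1, so all four branches keep the input's spatial size and can be concatenated along the channel axis. A small sketch using the standard output-size formula; the helper name `conv_out_size` and the 256-pixel input are assumptions for illustration:

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Kernel/padding pairs used by the branches: 1x1, 3x3 (also the 3x3 max pool), 5x5
sizes = [conv_out_size(256, kernel, padding=padding)
         for kernel, padding in [(1, 0), (3, 1), (5, 2)]]
print(sizes)
```

Because every entry equals the input size, only the channel dimension grows when the branch outputs are concatenated.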
In a terminal, change to the directory containing inceptionv1.py, run python3 to enter the interactive interpreter, and call the module with the following code.
>>> import torch
>>> from inceptionv1 import Inceptionv1
# Instantiate the network with the channel counts of the module and move it to the GPU
>>> net_inceptionv1 = Inceptionv1(3, 64, 32, 64, 64, 96, 32).cuda()
>>> net_inceptionv1
Inceptionv1(
  # Branch 1: 1×1 convolution, 64 output channels
  (branch1x1): BasicConv2d(
    (conv): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
  )
  # Branch 2: 1×1 convolution followed by a 3×3 convolution, 64 output channels
  (branch3x3): Sequential(
    (0): BasicConv2d(
      (conv): Conv2d(3, 32, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): BasicConv2d(
      (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
  )
  # Branch 3: 1×1 convolution followed by a 5×5 convolution, 96 output channels
  (branch5x5): Sequential(
    (0): BasicConv2d(
      (conv): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): BasicConv2d(
      (conv): Conv2d(64, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    )
  )
  # Branch 4: max pooling followed by a 1×1 convolution, 32 output channels
  (branch_pool): Sequential(
    (0): MaxPool2d(kernel_size=3, stride=1, padding=1, dilation=1, ceil_mode=False)
    (1): BasicConv2d(
      (conv): Conv2d(3, 32, kernel_size=(1, 1), stride=(1, 1))
    )
  )
)
>>> input = torch.randn(1, 3, 256, 256).cuda()
>>> input.shape
torch.Size([1, 3, 256, 256])
>>> output = net_inceptionv1(input)
# The number of output channels is the sum of the branch output channels: 256 = 64 + 64 + 96 + 32
>>> output.shape
torch.Size([1, 256, 256, 256])
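Several further Inception versions followed Inception v1. Inception v2 makes computation more efficient through convolution factorization and regularization. It adds BN (batch normalization) layers and replaces the 5×5 convolution of Inception v1 with two cascaded 3×3 convolutions, as shown in Figure 3.15. This both reduces the number of convolution parameters and increases the network's nonlinearity.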
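The parameter saving from replacing one 5×5 convolution with two stacked 3×3 convolutions can be checked with a short calculation. The sketch assumes the same number of channels C in and out of every layer and ignores biases; `stacked_receptive_field` is a hypothetical helper.

```python
def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions with the same kernel."""
    return layers * (kernel - 1) + 1

C = 64  # assumed channel count, identical for every layer

one_5x5 = 5 * 5 * C * C        # weights of a single 5x5 convolution
two_3x3 = 2 * (3 * 3 * C * C)  # weights of two stacked 3x3 convolutions

print(stacked_receptive_field(3, 2))  # receptive field of the two stacked 3x3 layers
print(two_3x3 / one_5x5)              # fraction of the original weights
```

The two 3×3 layers cover the same 5×5 region with 18/25 = 72% of the weights, and the extra activation between them adds nonlinearity.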
We now build a single Inception v2 module in PyTorch, with the number of input channels fixed at 192. Create a file named inceptionv2.py with the following code:
import torch
from torch import nn
import torch.nn.functional as F

# Basic convolution module; compared with the Inception v1 version, a BN layer is added
class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x, inplace=True)

class Inceptionv2(nn.Module):
    def __init__(self):
        super(Inceptionv2, self).__init__()
        # 1x1 convolution branch
        self.branch1 = BasicConv2d(192, 96, 1, 0)
        # 1x1 convolution followed by 3x3 convolution
        self.branch2 = nn.Sequential(
            BasicConv2d(192, 48, 1, 0),
            BasicConv2d(48, 64, 3, 1)
        )
        # 1x1 convolution followed by two 3x3 convolutions
        self.branch3 = nn.Sequential(
            BasicConv2d(192, 64, 1, 0),
            BasicConv2d(64, 96, 3, 1),
            BasicConv2d(96, 96, 3, 1)
        )
        # 3x3 average pooling followed by 1x1 convolution
        self.branch4 = nn.Sequential(
            nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False),
            BasicConv2d(192, 64, 1, 0)
        )

    # Forward pass: concatenate the four branches with torch.cat()
    def forward(self, x):
        x0 = self.branch1(x)
        x1 = self.branch2(x)
        x2 = self.branch3(x)
        x3 = self.branch4(x)
        out = torch.cat((x0, x1, x2, x3), 1)
        return out
In a terminal, change to the directory containing inceptionv2.py, run python3 to enter the interactive interpreter, and call the module with the following code.
>>> import torch
>>> from inceptionv2 import Inceptionv2
>>> net_inceptionv2 = Inceptionv2().cuda()
>>> net_inceptionv2
Inceptionv2(
  # Branch 1: 1×1 convolution, 96 output channels
  (branch1): BasicConv2d(
    (conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1))
    (bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  # Branch 2: 1×1 convolution followed by a 3×3 convolution, 64 output channels
  (branch2): Sequential(
    (0): BasicConv2d(
      (conv): Conv2d(192, 48, kernel_size=(1, 1), stride=(1, 1))
      (bn): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicConv2d(
      (conv): Conv2d(48, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  # Branch 3: 1×1 convolution followed by two consecutive 3×3 convolutions, 96 output channels
  (branch3): Sequential(
    (0): BasicConv2d(
      (conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1))
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicConv2d(
      (conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (2): BasicConv2d(
      (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  # Branch 4: average pooling followed by a 1×1 convolution, 64 output channels
  (branch4): Sequential(
    (0): AvgPool2d(kernel_size=3, stride=1, padding=1)
    (1): BasicConv2d(
      (conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1))
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
)
>>> input = torch.randn(1, 192, 32, 32).cuda()
>>> input.shape
torch.Size([1, 192, 32, 32])
# Pass the input through the instantiated network
>>> output = net_inceptionv2(input)
# The number of output channels is 96 + 64 + 96 + 64 = 320
>>> output.shape
torch.Size([1, 320, 32, 32])
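Going one step further, Inception v2 factorizes an n×n convolution into a 1×n convolution followed by an n×1 convolution, as shown in Figure 3.16. For n = 3, this factorization reduces the computational cost by 33%.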
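The saving follows from counting multiplications per output position: an n×n kernel needs n² of them per input/output channel pair, while a 1×n kernel followed by an n×1 kernel needs 2n. A minimal sketch, assuming equal channel counts in every layer (the function name is illustrative):

```python
def factorization_saving(n):
    """Fraction of cost saved by replacing an n x n conv with 1 x n followed by n x 1."""
    return 1 - (2 * n) / (n * n)

for n in (3, 5, 7):
    print(n, round(factorization_saving(n), 3))
```

For n = 3 this gives the 33% figure, and the relative saving grows with n.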
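In addition, Inception v2 makes the filter banks in the module wider rather than deeper, forming a third module type that alleviates the representational bottleneck. The Inception v2 network is composed of these three module types and is more computationally efficient.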
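Inception v3 builds on Inception v2 by training with the RMSProp optimizer, factorizing the 7×7 convolution, applying batch normalization in the auxiliary classifier, and using label smoothing.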
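Label smoothing replaces the one-hot target with a mixture of the one-hot vector and a uniform distribution over the K classes. The sketch below is plain Python for illustration; the smoothing factor 0.1 and the four-class example are assumptions chosen for readability.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Mix a one-hot target with the uniform distribution over its classes."""
    num_classes = len(one_hot)
    return [(1 - epsilon) * p + epsilon / num_classes for p in one_hot]

# Hypothetical 4-class target whose true class is index 1
smoothed = smooth_labels([0, 1, 0, 0])
print(smoothed)
```

The smoothed target still sums to 1, but the true class no longer carries probability exactly 1, which discourages over-confident predictions.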
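Inception v4 combines the Inception idea with residual networks, which markedly improves training speed and model accuracy; the module details are not covered here. The residual network itself, a milestone architecture, was introduced by ResNet, the subject of the next section.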
TensorFlow 2 implementation: Inception10 on CIFAR-10
import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D,BatchNormalization,Activation,MaxPool2D,Dropout,Flatten,Dense,GlobalAveragePooling2D
from tensorflow.keras import Model
np.set_printoptions(threshold=np.inf)
cifar10 = tf.keras.datasets.cifar10
(x,y),(x_test,y_test) = cifar10.load_data()
x,x_test = x / 255., x_test / 255.
class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding="same"):
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x):
        # With training=False, BN normalizes with the moving mean and variance accumulated
        # over the training set; with training=True, it uses the mean and variance of the
        # current batch. training=False gives better results at inference time.
        x = self.model(x, training=False)
        return x
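class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        # Branch 1: 1x1 convolution
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        # Branch 2: 1x1 convolution followed by 3x3 convolution
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        # Branch 3: 1x1 convolution followed by 5x5 convolution.
        # The 5x5 layer uses stride 1 so that the branch downsamples only once
        # and matches the spatial size of the other branches.
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        # Branch 4: 3x3 max pooling followed by 1x1 convolution
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # Concatenate the four branches along the channel axis
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x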
class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch
        self.c1 = ConvBNRelu(init_ch)
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                # The first layer of each block downsamples with stride 2
                if layer_id == 0:
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # Double out_channels after each block
            self.out_channels *= 2
        self.p1 = GlobalAveragePooling2D()
        self.f1 = Dense(num_classes, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y
model = Inception10(num_blocks=2, num_classes=10)

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an existing checkpoint if one is present
checkpoint_save_path = "./checkpoint/Inception10.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x, y, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# Write the name, shape and values of every trainable variable to a text file
with open('./inceptionweights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')
############################################### show ###############################################
# Plot the accuracy and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
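To see what Inception10 does to a CIFAR-10 image, the feature-map shapes can be traced by hand. The sketch below is pure Python and assumes the configuration used above (num_blocks=2, init_ch=16, 'same' padding) and that all four branches of an InceptionBlk produce the same spatial size with ch channels each; `trace_inception10` is a hypothetical helper, not part of the model code.

```python
import math

def trace_inception10(size=32, init_ch=16, num_blocks=2):
    """Return (height/width, channels) after the stem and after each InceptionBlk."""
    shapes = [(size, init_ch)]               # after the initial ConvBNRelu
    ch = init_ch
    for _ in range(num_blocks):
        for stride in (2, 1):                # first layer of each block downsamples
            size = math.ceil(size / stride)  # 'same' padding
            shapes.append((size, 4 * ch))    # four branches concatenated
        ch *= 2                              # channels double after each block
    return shapes

print(trace_inception10())
```

Global average pooling then reduces the final 8×8×128 feature map to a 128-dimensional vector before the 10-way softmax layer.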
PyTorch implementation
import torch.nn as nn
import torch
import torch.nn.functional as F

class GoogLeNet(nn.Module):
    def __init__(self, num_classes=1000, aux_logits=True, init_weights=False):
        super(GoogLeNet, self).__init__()
        self.aux_logits = aux_logits  # whether to use the auxiliary classifiers

        self.conv1 = BasicConv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.maxpool1 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.conv2 = BasicConv2d(64, 64, kernel_size=1)
        self.conv3 = BasicConv2d(64, 192, kernel_size=3, padding=1)
        self.maxpool2 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)

        if self.aux_logits:
            self.aux1 = InceptionAux(512, num_classes)
            self.aux2 = InceptionAux(528, num_classes)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(0.4)
        self.fc = nn.Linear(1024, num_classes)
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.conv1(x)
        # N x 64 x 112 x 112
        x = self.maxpool1(x)
        # N x 64 x 56 x 56
        x = self.conv2(x)
        # N x 64 x 56 x 56
        x = self.conv3(x)
        # N x 192 x 56 x 56
        x = self.maxpool2(x)

        # N x 192 x 28 x 28
        x = self.inception3a(x)
        # N x 256 x 28 x 28
        x = self.inception3b(x)
        # N x 480 x 28 x 28
        x = self.maxpool3(x)
        # N x 480 x 14 x 14
        x = self.inception4a(x)
        # N x 512 x 14 x 14
        if self.training and self.aux_logits:
            aux1 = self.aux1(x)

        x = self.inception4b(x)
        # N x 512 x 14 x 14
        x = self.inception4c(x)
        # N x 512 x 14 x 14
        x = self.inception4d(x)
        # N x 528 x 14 x 14
        if self.training and self.aux_logits:
            aux2 = self.aux2(x)

        x = self.inception4e(x)
        # N x 832 x 14 x 14
        x = self.maxpool4(x)
        # N x 832 x 7 x 7
        x = self.inception5a(x)
        # N x 832 x 7 x 7
        x = self.inception5b(x)
        # N x 1024 x 7 x 7

        x = self.avgpool(x)
        # N x 1024 x 1 x 1
        x = torch.flatten(x, 1)
        # N x 1024
        x = self.dropout(x)
        x = self.fc(x)
        # N x 1000 (num_classes)
        if self.training and self.aux_logits:
            return x, aux2, aux1
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
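class Inception(nn.Module):
    def __init__(self, in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj):
        '''
        :param in_channels: number of input channels
        :param ch1x1: output channels of the 1x1 convolution (branch 1)
        :param ch3x3red: output channels of the 1x1 reduction before the 3x3 convolution (branch 2)
        :param ch3x3: output channels of the 3x3 convolution (branch 2)
        :param ch5x5red: output channels of the 1x1 reduction before the 5x5 convolution (branch 3)
        :param ch5x5: output channels of the 5x5 convolution (branch 3)
        :param pool_proj: output channels of the 1x1 convolution after max pooling (branch 4)
        '''
        super(Inception, self).__init__()
        # Branch 1: 1x1 convolution
        self.branch1 = BasicConv2d(in_channels, ch1x1, kernel_size=1)

        # Branch 2: 1x1 reduction, then 3x3 convolution (padding=1 keeps the spatial size)
        self.branch2 = nn.Sequential(
            BasicConv2d(in_channels, ch3x3red, kernel_size=1),
            BasicConv2d(ch3x3red, ch3x3, kernel_size=3, padding=1)
        )

        # Branch 3: 1x1 reduction, then 5x5 convolution (padding=2 keeps output size equal to input size)
        self.branch3 = nn.Sequential(
            BasicConv2d(in_channels, ch5x5red, kernel_size=1),
            BasicConv2d(ch5x5red, ch5x5, kernel_size=5, padding=2)
        )

        # Branch 4: 3x3 max pooling (stride 1, padding 1), then 1x1 convolution
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            BasicConv2d(in_channels, pool_proj, kernel_size=1)
        )

    def forward(self, x):
        branch1 = self.branch1(x)
        branch2 = self.branch2(x)
        branch3 = self.branch3(x)
        branch4 = self.branch4(x)

        # Concatenate along the channel dimension
        output = [branch1, branch2, branch3, branch4]
        return torch.cat(output, 1)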
class InceptionAux(nn.Module):
    def __init__(self, in_channel, num_classes):
        super(InceptionAux, self).__init__()
        self.averagePool = nn.AvgPool2d(kernel_size=5, stride=3)
        self.conv = BasicConv2d(in_channel, 128, kernel_size=1)

        self.fc1 = nn.Linear(2048, 1024)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        # aux1: N x 512 x 14 x 14, aux2: N x 528 x 14 x 14
        x = self.averagePool(x)
        # aux1: N x 512 x 4 x 4, aux2: N x 528 x 4 x 4
        x = self.conv(x)
        # N x 128 x 4 x 4
        x = torch.flatten(x, 1)
        x = F.dropout(x, 0.5, training=self.training)
        # N x 2048
        x = F.relu(self.fc1(x), inplace=True)
        x = F.dropout(x, 0.5, training=self.training)
        # N x 1024
        x = self.fc2(x)
        # N x num_classes
        return x
class BasicConv2d(nn.Module):
    # Convolution followed by ReLU; extra keyword arguments are passed to nn.Conv2d
    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **kwargs)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.conv(x)
        x = self.relu(x)
        return x
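Each Inception module outputs ch1x1 + ch3x3 + ch5x5 + pool_proj channels, and this sum has to match the in_channels of the next module. The following pure-Python check runs over the configuration listed in the GoogLeNet constructor above; the list `configs` simply restates those arguments, and `out_channels` is a hypothetical helper.

```python
# (in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj), copied from the constructor
configs = [
    (192, 64, 96, 128, 16, 32, 32),      # inception3a
    (256, 128, 128, 192, 32, 96, 64),    # inception3b
    (480, 192, 96, 208, 16, 48, 64),     # inception4a
    (512, 160, 112, 224, 24, 64, 64),    # inception4b
    (512, 128, 128, 256, 24, 64, 64),    # inception4c
    (512, 112, 144, 288, 32, 64, 64),    # inception4d
    (528, 256, 160, 320, 32, 128, 128),  # inception4e
    (832, 256, 160, 320, 32, 128, 128),  # inception5a
    (832, 384, 192, 384, 48, 128, 128),  # inception5b
]

def out_channels(cfg):
    """Channels after concatenating the four branches."""
    _, ch1x1, _, ch3x3, _, ch5x5, pool_proj = cfg
    return ch1x1 + ch3x3 + ch5x5 + pool_proj

outputs = [out_channels(cfg) for cfg in configs]

# Every module's input must equal the previous module's output (max pooling keeps channels)
for prev_out, cfg in zip(outputs, configs[1:]):
    assert prev_out == cfg[0]

print(outputs)
```

The last value, 1024, matches the input size of the final fully connected layer, and the outputs of inception4a (512) and inception4d (528) match the input channels of the two auxiliary classifiers.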
Simplified PyTorch version
import torch
from torch import nn

def conv1xn(in_channel, out_channel, kernel_size):
    # Conv + BN + ReLU with a (kernel_size x 1) kernel; the padding keeps the spatial size
    return nn.Sequential(
        nn.Conv2d(in_channel, out_channel, kernel_size=(kernel_size, 1), padding=(kernel_size // 2, 0), bias=False),
        nn.BatchNorm2d(out_channel),
        nn.ReLU(inplace=True)
    )

def conv1x1(in_channels, out_channels):
    # 1x1 Conv + BN + ReLU
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True)
    )
class Inception(nn.Module):
    def __init__(self, in_channel, out_c1, out_c2, out_c3, out_c4):
        super(Inception, self).__init__()
        # Route 1: 1x1 convolution
        self.route1 = conv1x1(in_channel, out_c1)
        # Route 2: 1x1 reduction to half the output channels, then a kernel of size 3
        self.route2 = nn.Sequential(
            conv1x1(in_channel, out_c2 // 2),
            conv1xn(out_c2 // 2, out_c2, 3)
        )
        # Route 3: 1x1 reduction to half the output channels, then a kernel of size 5
        self.route3 = nn.Sequential(
            conv1x1(in_channel, out_c3 // 2),
            conv1xn(out_c3 // 2, out_c3, 5)
        )
        # Route 4: 3x3 max pooling, then 1x1 convolution
        self.route4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            conv1x1(in_channel, out_c4)
        )

    def forward(self, input):
        x1 = self.route1(input)
        x2 = self.route2(input)
        x3 = self.route3(input)
        x4 = self.route4(input)
        # Concatenate the four routes along the channel dimension
        out = torch.cat([x1, x2, x3, x4], dim=1)
        return out
class InceptionV4(nn.Module):
    # Inception module with a residual shortcut; a 1x1 convolution matches the channel count
    def __init__(self, in_channel, out_c1, out_c2, out_c3, out_c4):
        super(InceptionV4, self).__init__()
        self.fx = Inception(in_channel, out_c1, out_c2, out_c3, out_c4)
        self.downsample = conv1x1(in_channel, out_c1 + out_c2 + out_c3 + out_c4)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, input):
        x = self.fx(input)
        identity = self.downsample(input)
        out = self.relu(x + identity)
        return out

class GoogLeNet(nn.Module):
    def __init__(self, in_channel, num_classes):
        super(GoogLeNet, self).__init__()
        self.input_layers = nn.Sequential(
            nn.Conv2d(in_channel, 64, kernel_size=5, stride=2, padding=2, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        self.inception_layers = nn.Sequential(
            InceptionV4(64, 32, 96, 64, 32),
            nn.BatchNorm2d(224),
            nn.ReLU(inplace=True),
            InceptionV4(224, 128, 128, 96, 64),
            nn.BatchNorm2d(416),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            InceptionV4(416, 128, 256, 128, 128),
            nn.BatchNorm2d(640),
            nn.ReLU(inplace=True),
            InceptionV4(640, 128, 512, 256, 128),
            nn.BatchNorm2d(1024),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            InceptionV4(1024, 256, 640, 256, 128),
            nn.BatchNorm2d(1280),
            nn.ReLU(inplace=True),
            InceptionV4(1280, 256, 1024, 512, 256),
            nn.BatchNorm2d(2048),
            nn.ReLU(inplace=True),
        )
        self.output_layers = nn.Sequential(
            # For a 32x32 input the feature map is 4x4 here, so this 4x4 convolution reduces it to 1x1
            nn.Conv2d(2048, 1024, 4, bias=False),
            nn.BatchNorm2d(1024),
            nn.ReLU(inplace=True),
            nn.Flatten(),
            # Use num_classes instead of a hard-coded 10
            nn.Linear(1024, num_classes),
            nn.ReLU(inplace=True)
        )

    def forward(self, input):
        x = self.input_layers(input)
        # print('input_layers', x.shape)
        x = self.inception_layers(x)
        # print('inception_layer', x.shape)
        out = self.output_layers(x)
        # print('out_layer', out.shape)
        return out
if __name__ == '__main__':
    # Random CIFAR-sized batch: 10 images, 3 channels, 32 x 32 pixels
    fake_img = torch.randint(0, 255, [10, 3, 32, 32]).type(torch.FloatTensor)
    model = GoogLeNet(3, 10)
    out = model(fake_img)
    print(out.shape)
Adding fully connected classification after the auxiliary classifier
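During training, updating the parameters of one layer changes the distribution of the inputs to every layer after it, so small changes in lower layers are accumulated and amplified by higher layers. When the parameters of a layer change, the distribution of that layer's output changes, which means the input distribution of the next layer changes and the next layer is forced to learn and adapt to the new distribution. If this keeps happening, convergence slows down.
Batch normalization was proposed to solve this problem. A mini-batch of data is sampled, and the inputs to each neuron are normalized over that mini-batch so that they have zero mean and unit variance. Equation 3.15 gives the computation, which is applied to every neuron of every layer. For a given neuron in a given layer, x(k) denotes the k-th sample passing through that neuron, μ denotes the neuron's mean output, and σ denotes the standard deviation of its output values. Equation 3.16 shows how σ is computed, where n is the number of sampled data points and ϵ is a small constant that prevents σ = 0.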
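Equations 3.15 and 3.16 are referenced but not reproduced in this text. Based on the definitions given above, they would plausibly take the following form. This is a reconstruction under that assumption; the symbol \hat{x}^{(k)} for the normalized value and the explicit formula for μ are added here for completeness.

```latex
\hat{x}^{(k)} = \frac{x^{(k)} - \mu}{\sigma},
\qquad
\mu = \frac{1}{n}\sum_{k=1}^{n} x^{(k)},
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x^{(k)} - \mu\right)^{2} + \epsilon}
```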
Inception v4
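This work mainly proposed a new Inception structure and, by combining it with ResNet, also proposed Inception-ResNet-v1 and Inception-ResNet-v2.
The motivating question was whether Inception networks remain just as efficient when they become deeper and wider. Based on this idea, Inception and ResNet were merged to improve the network further. With the arrival of TensorFlow, training became much simpler and the model no longer had to be partitioned. Google therefore adopted a bolder design and proposed Inception v4, which has a more uniform Inception structure.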