GoogLeNet
Blocks that share the same structure can be wrapped in a function (module) to reduce code duplication.
Inception: named after the movie Inception; the same block is reused throughout the network.
Concatenate: join the outputs of the branches together.
Average Pooling: set padding and stride so that the input and output sizes stay the same.
$1\times 1$ Conv: fuses information from different channels at the same spatial position: $C \times W \times H \Rightarrow 1 \times W \times H$
Purpose: reduce the amount of computation
- Using a $5\times 5$ convolution directly: $192@28 \times 28 \Rightarrow 32@28 \times 28$
  Computation: $5^2 \times 28^2 \times 192 \times 32 = 120422400$
- Using a $1\times 1$ convolution first: $192@28 \times 28 \Rightarrow 16@28 \times 28 \Rightarrow 32@28 \times 28$
  Computation: $1^2 \times 28^2 \times 192 \times 16 + 5^2 \times 28^2 \times 16 \times 32 = 12443648$

The second approach costs almost one tenth of the first.
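These counts can be sanity-checked in a few lines (a throwaway sketch; conv_macs is a made-up helper counting multiply-accumulate operations, not part of the original notes):

# Multiply-accumulates of one conv layer: kernel^2 x output area x C_in x C_out
def conv_macs(k, hw, c_in, c_out):
    return k ** 2 * hw ** 2 * c_in * c_out

direct = conv_macs(5, 28, 192, 32)                              # 5x5 directly
reduced = conv_macs(1, 28, 192, 16) + conv_macs(5, 28, 16, 32)  # 1x1, then 5x5
print(direct, reduced, direct / reduced)                        # 120422400 12443648 ~9.7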
Module structure
Code implementation
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        # Branch 1: a single 1x1 convolution -> 16 channels
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        # Branch 2: 1x1 convolution, then 5x5 (padding=2 keeps the size) -> 24 channels
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
        # Branch 3: 1x1 convolution, then two 3x3 (padding=1 keeps the size) -> 24 channels
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
        # Branch 4: average pooling, then a 1x1 convolution -> 24 channels
        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # (batch_size, C, H, W): concatenate along the channel dim
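As a quick usage check (illustrative; the 12x12 size matches where incep1 sits in the Net below), each branch preserves the spatial size, so the concatenated output has 16 + 24 + 24 + 24 = 88 channels:

block = InceptionA(in_channels=10)
x = torch.randn(1, 10, 12, 12)  # (batch, channels, H, W)
print(block(x).shape)           # torch.Size([1, 88, 12, 12])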
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)
        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)
        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(1408, 10)

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)                  # 10 channels in, 88 = 16 + 24 + 24 + 24 out
        x = F.relu(self.mp(self.conv2(x)))  # 88 channels in, 20 out
        x = self.incep2(x)                  # 20 channels in, 88 out
        x = x.view(in_size, -1)             # flatten to (batch_size, 88 * 4 * 4 = 1408)
        x = self.fc(x)                      # class scores (logits)
        return x
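The log below was presumably produced by a standard MNIST training loop; here is a minimal sketch, assuming batch_size=64, SGD with lr=0.01 and momentum=0.5, CrossEntropyLoss, and a printout every 300 batches (all of these settings are assumptions, not stated in the notes):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import torch.optim as optim

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_loader = DataLoader(datasets.MNIST('./data', train=True, download=True, transform=transform),
                          batch_size=64, shuffle=True)
test_loader = DataLoader(datasets.MNIST('./data', train=False, transform=transform),
                         batch_size=64, shuffle=False)

model = Net()
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

def train(epoch):
    running_loss = 0.0
    for batch_idx, (inputs, target) in enumerate(train_loader, 0):
        optimizer.zero_grad()
        loss = criterion(model(inputs), target)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if batch_idx % 300 == 299:  # average loss over the last 300 batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0

def test():
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, target in test_loader:
            pred = model(inputs).argmax(dim=1)
            correct += (pred == target).sum().item()
            total += target.size(0)
    print('Accuracy on test set: %d %%' % (100 * correct / total))

for epoch in range(10):
    train(epoch)
    test()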
Output
[1, 300] loss: 0.963
[1, 600] loss: 0.187
[1, 900] loss: 0.134
Accuracy on test set: 97 %
[2, 300] loss: 0.105
[2, 600] loss: 0.103
[2, 900] loss: 0.085
Accuracy on test set: 97 %
[3, 300] loss: 0.079
[3, 600] loss: 0.070
[3, 900] loss: 0.073
Accuracy on test set: 98 %
[4, 300] loss: 0.067
[4, 600] loss: 0.062
[4, 900] loss: 0.063
Accuracy on test set: 98 %
[5, 300] loss: 0.056
[5, 600] loss: 0.058
[5, 900] loss: 0.053
Accuracy on test set: 97 %
[6, 300] loss: 0.049
[6, 600] loss: 0.050
[6, 900] loss: 0.053
Accuracy on test set: 98 %
[7, 300] loss: 0.044
[7, 600] loss: 0.046
[7, 900] loss: 0.047
Accuracy on test set: 98 %
[8, 300] loss: 0.042
[8, 600] loss: 0.040
[8, 900] loss: 0.042
Accuracy on test set: 98 %
[9, 300] loss: 0.038
[9, 600] loss: 0.038
[9, 900] loss: 0.038
Accuracy on test set: 98 %
[10, 300] loss: 0.034
[10, 600] loss: 0.037
[10, 900] loss: 0.037
Accuracy on test set: 98 %
A bit puzzled that this differs from the figure on the instructor's slides :(
Plain nets: stacking 3x3 conv layers
The 56-layer network performs worse than the 20-layer one, possibly due to vanishing gradients or overfitting.
When many gradients smaller than 1 are multiplied together, the product tends to zero, so the weight update ($w = w - \alpha \times \cfrac{\partial loss}{\partial w}$) barely moves the weights.
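To get a feel for how fast such a product collapses, a one-line check (illustrative only):

print(0.9 ** 100)  # ~2.66e-05: a chain of 100 gradients of 0.9 is already nearly zero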
Residual net
Skip connection: $H(x) = F(x) + x$
Differentiating gives $\cfrac{\partial H}{\partial x} = \cfrac{\partial F}{\partial x} + 1$: even when $\cfrac{\partial F}{\partial x}$ is close to zero, the overall gradient stays near 1, which solves the vanishing-gradient problem.
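A tiny autograd check of that identity (an illustrative sketch, not from the notes; the 0.001 factor stands in for a branch whose own gradient has almost vanished):

import torch

x = torch.tensor(2.0, requires_grad=True)
F_x = 0.001 * x   # stand-in branch with tiny gradient dF/dx = 0.001
H = F_x + x       # skip connection: H(x) = F(x) + x
H.backward()
print(x.grad)     # tensor(1.0010) = dF/dx + 1, far from zero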
Code implementation
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.channels = channels
        # Both convolutions keep the channel count and spatial size unchanged,
        # so the output can be added to the input x
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)  # skip connection: H(x) = F(x) + x, then ReLU

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)
        self.mp = nn.MaxPool2d(2)
        self.rblock1 = ResidualBlock(16)
        self.rblock2 = ResidualBlock(32)
        self.fc = nn.Linear(512, 10)

    def forward(self, x):
        in_size = x.size(0)
        x = self.mp(F.relu(self.conv1(x)))
        x = self.rblock1(x)
        x = self.mp(F.relu(self.conv2(x)))
        x = self.rblock2(x)
        x = x.view(in_size, -1)  # flatten to (batch_size, 512)
        x = self.fc(x)
        return x
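A quick shape check clarifying where nn.Linear(512, 10) comes from (assuming 28x28 MNIST input): conv1 (5x5, no padding) gives 24x24, pooling gives 12x12, conv2 gives 8x8, pooling gives 4x4, and the residual blocks keep sizes unchanged, so the flattened features are 32 * 4 * 4 = 512.

model = Net()
x = torch.randn(1, 1, 28, 28)
print(model(x).shape)  # torch.Size([1, 10])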