Explanation.
To some extent, a 1×1 convolution does not make the network deeper but wider; the "width" here really means increasing the amount of data, i.e., the number of feature maps per layer.

Through a 1×1 convolution we can transform the original feature map into a new one, which can improve generalization and reduce overfitting. Depending on the number of 1×1 filters chosen, this enables cross-channel interaction and information integration, and it can also change the channel dimension of the feature map. Because of this control over dimensionality, even though the network gains layers, its parameter count can drop substantially, saving computation. In practice, a convolution is usually followed by a nonlinear layer such as ReLU, turning the linear convolution into a nonlinear mapping. This adds substantial nonlinearity while keeping the feature-map size unchanged (i.e., no loss of resolution), which makes it possible to build very deep networks.
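As a minimal sketch of these points (the channel sizes 192 and 32 are made up for illustration, not taken from this note's code), a 1×1 convolution followed by ReLU changes only the channel dimension:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(8, 192, 28, 28)      # N, C, H, W -- channel sizes are illustrative

# The 1x1 convolution mixes information across all 192 input channels at
# every pixel position, changing only the channel dimension.
conv1x1 = nn.Conv2d(192, 32, kernel_size=1)
y = F.relu(conv1x1(x))               # ReLU supplies the nonlinearity

print(y.shape)                       # torch.Size([8, 32, 28, 28]) -- H, W unchanged
```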
- H and W stay the same; only C changes: cross-channel interaction and information fusion, lower computational cost, and extra nonlinearity while the resolution is preserved.
- A 1×1×F convolution is mathematically equivalent to a multilayer perceptron applied at every pixel, where F is the number of filters; each filter performs one convolution over the feature map (see the sketch below).
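This equivalence can be checked numerically: a 1×1 convolution gives the same result as an `nn.Linear` layer applied independently at every pixel with the same weights. A minimal sketch, with shapes chosen arbitrarily:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)                    # N, C, H, W

conv = nn.Conv2d(3, 5, kernel_size=1)          # 1x1 conv: 3 -> 5 channels
fc = nn.Linear(3, 5)                           # the "per-pixel MLP" view
fc.weight.data = conv.weight.data.view(5, 3)   # share the same parameters
fc.bias.data = conv.bias.data

y_conv = conv(x)
# Apply the Linear layer independently at every (h, w) position.
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

print(torch.allclose(y_conv, y_fc, atol=1e-6))  # True
```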
The number of channels of a convolution kernel always matches the number of channels of its input.

The figure above explains why cross-channel interaction and information integration are possible, and why the channel dimension can be changed.
- Using 1×1 convolutions reduces the number of parameters; compare the naive Inception module with the version that adds dimensionality reduction (a rough count follows below).
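A rough parameter count illustrates the saving. The channel sizes below (192 → 32, with a 16-channel bottleneck) are illustrative, not taken from the code in this note:

```python
import torch.nn as nn

def n_params(module):
    return sum(p.numel() for p in module.parameters())

# Naive branch: one 5x5 convolution straight from 192 to 32 channels.
naive = nn.Conv2d(192, 32, kernel_size=5, padding=2)

# Dimension-reduced branch: 1x1 bottleneck down to 16 channels, then 5x5.
reduced = nn.Sequential(
    nn.Conv2d(192, 16, kernel_size=1),
    nn.Conv2d(16, 32, kernel_size=5, padding=2),
)

print(n_params(naive))    # 153632 (weights + biases)
print(n_params(reduced))  # 15920 -- roughly a 10x saving
```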
PyTorch code
The outputs of the different branches are concatenated along the channel dimension with `torch.cat(outputs, dim=1)`, so the total channel count changes while the resolution does not: H and W stay the same. Note that every convolution below uses padding = kernel_size // 2 precisely so that H and W are preserved.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        # Branch 1: a single 1x1 convolution
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        # Branch 2: 1x1 bottleneck followed by a 5x5 convolution
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
        # Branch 3: 1x1 bottleneck followed by two 3x3 convolutions
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
        # Branch 4: average pooling followed by a 1x1 convolution
        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        # Note: the local variable branch_pool and the module
        # self.branch_pool are two different things.
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # concatenate along the channel dimension
```
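A quick shape check of the module (the input size is chosen to match where incep1 sits in the network below):

```python
block = InceptionA(in_channels=10)
x = torch.randn(1, 10, 12, 12)
# Channels: 16 + 24 + 24 + 24 = 88; H and W are preserved.
print(block(x).shape)    # torch.Size([1, 88, 12, 12])
```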
Net model
```python
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)  # 88 = 16 + 24 + 24 + 24 channels out of InceptionA
        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)
        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(1408, 10)
        # conv -> Inception -> conv -> Inception -> flatten -> Linear(1408, 10)
        # No softmax at the output: the loss function (CrossEntropyLoss) applies it internally.

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.incep2(x)
        x = x.view(in_size, -1)   # flatten to (N, 1408)
        x = self.fc(x)
        return x
```
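Tracing an input through the network confirms the 1408 in nn.Linear. This sketch assumes a 1×28×28 MNIST-sized input, which the channel sizes above suggest:

```python
# conv1 (k=5): 10x24x24 -> maxpool: 10x12x12 -> incep1: 88x12x12
# conv2 (k=5): 20x8x8   -> maxpool: 20x4x4   -> incep2: 88x4x4
# flatten: 88 * 4 * 4 = 1408
model = Net()
x = torch.randn(1, 1, 28, 28)
print(model(x).shape)    # torch.Size([1, 10])
```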