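VGG's improvement over AlexNet is to chain several identical convolutional layers into a block, and then chain these blocks together to build the whole network.
Applied to MNIST handwritten-digit recognition, the VGG network reaches acc_g=0.65 on the generalization test, an improvement over MLP and LeNet.
For the complete training code, see the GitHub link.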
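The source does not show how acc_g was measured. The sketch below assumes it is plain classification accuracy on held-out test data, that is, the fraction of samples whose highest-scoring class matches the label. The `accuracy` helper, the toy model, and the random batches are all hypothetical stand-ins, not part of the training code.

```python
import torch
from torch import nn

def accuracy(model, batches):
    """Fraction of samples whose arg-max prediction matches the label."""
    model.eval()  # switch off dropout while evaluating
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in batches:
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total

# Hypothetical stand-in model and data, only to exercise the function.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(4, 3))
toy_batches = [(torch.randn(5, 4), torch.randint(0, 3, (5,))) for _ in range(2)]
acc = accuracy(toy_model, toy_batches)
print(0.0 <= acc <= 1.0)  # True
```

In practice `batches` would be a `DataLoader` over the MNIST test split and `model` the trained network.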
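VGG block
A block is specified by the number of convolutional layers it contains, the number of input channels of its first convolution, and the number of output channels of its final convolution.
As the code shows, only the first convolutional layer has a different number of input channels; every later layer in the block takes out_channels as input and produces out_channels as output.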
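from torch import nn

def vgg_block(num_convs, in_channels, out_channels):
    layers = []
    for _ in range(num_convs):
        # 3x3 convolution with padding 1 keeps height and width unchanged
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        # every convolution after the first takes out_channels as input
        in_channels = out_channels
    # 2x2 max pooling with stride 2 halves height and width
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)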
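As a quick sanity check of the block, the sketch below repeats the definition so that it runs on its own, then pushes a dummy batch through it and lists the channel counts of each convolution. The batch size, the 28×28 input, and the channel numbers are illustrative assumptions, not values taken from the training code.

```python
import torch
from torch import nn

def vgg_block(num_convs, in_channels, out_channels):
    # Same definition as in the text, repeated so this snippet is self-contained.
    layers = []
    for _ in range(num_convs):
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

block = vgg_block(num_convs=2, in_channels=1, out_channels=64)
x = torch.randn(4, 1, 28, 28)  # assumed dummy batch: 4 grayscale 28x28 images
y = block(x)
print(y.shape)  # torch.Size([4, 64, 14, 14])

# (in_channels, out_channels) of each convolution in the block
conv_channels = [(m.in_channels, m.out_channels) for m in block if isinstance(m, nn.Conv2d)]
print(conv_channels)  # [(1, 64), (64, 64)]
```

The convolutions leave the spatial size at 28×28 and the single pooling layer halves it to 14×14, so each block changes the channel count once and halves the resolution once.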
VGG network implementation
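class VGG(nn.Module):
    def __init__(self, conv_arch) -> None:
        super().__init__()
        conv_blks = []
        in_channels = 1  # single-channel (grayscale) input
        # conv_arch is a sequence of (num_convs, out_channels) pairs, one per block
        for (num_convs, out_channels) in conv_arch:
            conv_blks.append(vgg_block(num_convs, in_channels, out_channels))
            in_channels = out_channels
        self.sequential = nn.Sequential(
            *conv_blks, nn.Flatten(),
            # out_channels*7*7 assumes the last block outputs a 7x7 feature map
            nn.Linear(out_channels * 7 * 7, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 10))  # 10 output classes, one per digit

    def forward(self, x):
        return self.sequential(x)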
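The first linear layer expects out_channels*7*7 features, so the convolutional part must end with a 7×7 feature map. With five blocks, each halving the resolution, a 224×224 input gives 224 / 2^5 = 7, whereas a raw 28×28 MNIST image would shrink to 14, 7, 3, 1 and then fail at the fifth pooling layer. The source does not say how the input is prepared, so the sketch below simply assumes the images are upsampled to 224×224. The narrow `conv_arch` is a hypothetical configuration chosen only to keep the example light.

```python
import torch
import torch.nn.functional as F
from torch import nn

def vgg_block(num_convs, in_channels, out_channels):
    layers = []
    for _ in range(num_convs):
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

class VGG(nn.Module):
    def __init__(self, conv_arch) -> None:
        super().__init__()
        conv_blks = []
        in_channels = 1
        for (num_convs, out_channels) in conv_arch:
            conv_blks.append(vgg_block(num_convs, in_channels, out_channels))
            in_channels = out_channels
        self.sequential = nn.Sequential(
            *conv_blks, nn.Flatten(),
            nn.Linear(out_channels * 7 * 7, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 10))

    def forward(self, x):
        return self.sequential(x)

# Hypothetical narrow architecture: five blocks of (num_convs, out_channels).
conv_arch = ((1, 8), (1, 16), (2, 32), (2, 64), (2, 64))
model = VGG(conv_arch).eval()  # eval() disables dropout for the check

x28 = torch.randn(2, 1, 28, 28)            # assumed MNIST-sized dummy batch
x = F.interpolate(x28, size=(224, 224))    # assumed upsampling to 224x224
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([2, 10])
```

If a different input size or number of blocks is used, the `7 * 7` factor in the first linear layer has to change to match the final feature-map size.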