PyTorch Study Notes: VGGNet


Author: 爱喝汽水的喵


Article link: https://blog.csdn.net/qq_42309130/article/details/117604982


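Summary: This post walks through building a VGGNet model in PyTorch, including the use of *args and **kwargs in the model file. VGGNet supports several variants by configuring different stacks of convolutional layers. The code shows how the network is constructed and lists the URLs of the official pretrained weights. It also explains that *args accepts a variable number of positional arguments, while **kwargs accepts a variable number of keyword (key-value) arguments. Finally, the training and test files are used in the same way as for AlexNet.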


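These notes mainly follow the videos by Bilibili uploader 霹雳吧啦Wz. I do not repeat the details of the network here and focus on the code implementation instead. Many thanks to the uploader for such thorough, beginner-friendly material.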

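VGGNet concepts video
Code implementation video
The code comes from the uploader's open-source GitHub repository; it will be removed on request if it infringes any rights.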

1. Model file

import torch.nn as nn
import torch

# official pretrain weights
model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth'
}


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes)
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)
        # N x 512*7*7
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


def make_features(cfg: list):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)


cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def vgg(model_name="vgg16", **kwargs):
    assert model_name in cfgs, "Warning: model number {} not in cfgs dict!".format(model_name)
    cfg = cfgs[model_name]

    model = VGG(make_features(cfg), **kwargs)
    return model
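
As a quick sanity check on the configuration lists above, the sketch below counts the layers in each entry of cfgs without building any PyTorch modules. The helper count_weight_layers and the constant NUM_FC_LAYERS are illustrative names introduced here, not part of the original code; the assumption is simply that every number in a list is one convolutional layer and that the classifier always contributes three nn.Linear layers.

```python
# Configurations copied from the model file above.
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

# The classifier in the VGG class above has three nn.Linear layers.
NUM_FC_LAYERS = 3


def count_weight_layers(cfg):
    """Return (conv_layers, pool_layers, total_weight_layers) for one config list."""
    convs = sum(1 for v in cfg if v != 'M')
    pools = sum(1 for v in cfg if v == 'M')
    return convs, pools, convs + NUM_FC_LAYERS


summary = {name: count_weight_layers(cfg) for name, cfg in cfgs.items()}
for name, (convs, pools, total) in summary.items():
    print(f"{name}: {convs} conv + {NUM_FC_LAYERS} fc = {total} weight layers, {pools} max-pool layers")
```

The totals come out as 11, 13, 16 and 19, which is where the variant names come from; every variant has exactly five max-pooling layers.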

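A few notes:
1. VGGNet comes in several variants, so each one is built by passing in a different configuration list.
2. Analysis of the network-building function: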

def make_features(cfg: list):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)

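The cfg argument is a list, and the input is an RGB image, so in_channels starts at 3. When the loop reaches 'M', it appends a max-pooling layer; every pooling layer uses a kernel size and stride of 2. Any other entry adds a convolutional layer: its input channel count is taken from in_channels, and the entry itself is the number of kernels, which sets that layer's out_channels and therefore the in_channels of the next convolutional layer, so in_channels must be updated afterwards. Each convolutional layer is also followed by a ReLU activation. The function returns nn.Sequential(*layers), where *layers unpacks the list into separate positional arguments. For more on * and ** in Python 3, see the blog post "关于Python3函数参数中的* 与 **说明" (notes on * and ** in Python 3 function parameters).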
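
The loop described above can also be followed on paper. The sketch below is a pure-Python approximation, not the original code: trace_features is a hypothetical helper that tracks only the (channels, height, width) of the feature map, assuming a 3 x 224 x 224 input, 3x3 convolutions with padding 1 and stride 1 (which keep the spatial size), and 2x2 max pooling with stride 2 (which halves it).

```python
def trace_features(cfg, in_channels=3, height=224, width=224):
    """Follow (channels, height, width) through a cfg list without building layers."""
    channels = in_channels
    for v in cfg:
        if v == 'M':
            # MaxPool2d(kernel_size=2, stride=2) halves each spatial dimension.
            height, width = height // 2, width // 2
        else:
            # Conv2d(kernel_size=3, padding=1) keeps height and width;
            # only the channel count changes to v.
            channels = v
    return channels, height, width


vgg16_cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

out_shape = trace_features(vgg16_cfg)
flat_features = out_shape[0] * out_shape[1] * out_shape[2]

print(out_shape)      # (512, 7, 7)
print(flat_features)  # 25088
```

Five pooling layers reduce 224 to 7, which matches the "N x 512 x 7 x 7" comment in forward and explains why the first nn.Linear layer expects 512*7*7 input features.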

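1.1 Using *args

*args collects the remaining positional arguments into a tuple. In the example below, name receives the string "Geek", and all remaining arguments are packed into a tuple bound to args.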

def fun(name, *args):
    print('Hello:', name)
    print(args)
    for i in args:
        print("Your pets include:", i)
fun("Geek", "dog", "cat")
"""
Hello: Geek
('dog', 'cat')
Your pets include: dog
Your pets include: cat
"""
# Alternatively, build the tuple first and prefix it with * when passing it
args = ("dog", "cat")
fun("Geek", *args)
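
To connect this back to nn.Sequential(*layers) in make_features, here is a minimal sketch of what the star does at the call site. The function sequential below is a hypothetical stand-in that only records what it receives; it is not the real nn.Sequential, and the strings stand in for layer objects.

```python
def sequential(*modules):
    """Hypothetical stand-in for nn.Sequential: returns the tuple it received."""
    return modules


layers = ["conv", "relu", "pool"]

# With the star, each list element becomes its own positional argument,
# exactly as if we had written sequential("conv", "relu", "pool").
unpacked = sequential(*layers)

# Without the star, the whole list arrives as a single argument.
not_unpacked = sequential(layers)

print(unpacked)      # ('conv', 'relu', 'pool')
print(not_unpacked)  # (['conv', 'relu', 'pool'],)
print(len(unpacked), len(not_unpacked))  # 3 1
```

This is why make_features passes *layers rather than layers: the callee should see one argument per layer, not one list.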

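1.2 Using **kwargs

**kwargs packs the keyword arguments it receives into a dictionary. In the example below, kwargs receives the dictionary {'Geek': 'cat', 'cat': 'box'}. (The keyword name Geek used in the call becomes the string key 'Geek'.)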

def fun(**kwargs):
    print(kwargs)
    for key, value in kwargs.items():
        print("{0} likes {1}".format(key, value))
fun(Geek="cat", cat="box")
"""
{'Geek': 'cat', 'cat': 'box'}
Geek likes cat
cat likes box
"""
# Alternatively, build the dictionary first and prefix it with ** when passing it
kwargs = {'Geek':"cat", 'cat':"box"}
fun(**kwargs)
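
The vgg() function in the model file uses the same mechanism to forward options such as num_classes and init_weights to the VGG constructor. The sketch below illustrates that forwarding with stand-ins: FakeVGG and build are hypothetical names that mirror the signatures of VGG.__init__ and vgg() but create no layers.

```python
class FakeVGG:
    """Hypothetical stand-in for VGG: same constructor signature, no layers."""

    def __init__(self, features, num_classes=1000, init_weights=False):
        self.features = features
        self.num_classes = num_classes
        self.init_weights = init_weights


def build(model_name="vgg16", **kwargs):
    # Inside the function, kwargs is a plain dict,
    # e.g. {'num_classes': 5, 'init_weights': True}.
    print("forwarding:", kwargs)
    # **kwargs unpacks the dict back into keyword arguments for the constructor.
    return FakeVGG(features=model_name, **kwargs)


model = build("vgg16", num_classes=5, init_weights=True)
default_model = build("vgg16")

print(model.num_classes, model.init_weights)                  # 5 True
print(default_model.num_classes, default_model.init_weights)  # 1000 False
```

Because the dictionary is passed straight through, any keyword the constructor does not accept raises a TypeError at the constructor call rather than being silently ignored.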

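1.3 Using *args and **kwargs together

Note that the parameters must appear in this order: regular parameters -> *args -> **kwargs.
In the example below, num1 and num2 take the first and second arguments; the remaining positional arguments are packed into a tuple bound to args, and arguments passed in the A='a' form are packed into a dictionary bound to kwargs.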

def fun(num1, num2, *args, **kwargs):
    print(num1)
    print(num2)
    print('args=', args)
    print('kwargs=', kwargs)
fun(1, 2, 3, 4, A='a', B='b', C='c', D='d')
"""
1
2
args= (3, 4)
kwargs= {'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd'}
"""