Inception-v4 (GoogLeNet-v4) Model Architecture (PyTorch)

I. Introduction

Inception-v4, also known as GoogLeNet-v4, comes from the paper Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Besides refining Inception-v3 into Inception-v4 and comparing the two, the paper combines the Inception architecture with ResNet-style residual connections, yielding the Inception-ResNet-v1 and Inception-ResNet-v2 models (whose building blocks are named Inception-ResNet-A/B/C). This post reproduces only the Inception-v4 part of the architecture.

II. Overall Architecture Diagram

(Figure: overall Inception-v4 architecture diagram, omitted here.)

III. Architecture Diagrams of Each Component

1. Stem


[Note] Layers marked "V" use valid padding (padding=0); all other layers use same padding, and the padding value required at each such layer must be worked out by hand.
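
As a quick sanity check of that padding arithmetic, PyTorch's Conv2d output-size rule can be wrapped in a tiny helper (illustrative only; conv_out is not part of the model code):

def conv_out(size, kernel, stride=1, padding=0):
    #PyTorch output-size rule for Conv2d / pooling: floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

print(conv_out(299, 3, stride=2))      # 149 -- the first "V" conv in the Stem
print(conv_out(35, 3, padding=1))      # 35  -- a 3*3 same-padding conv keeps the size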

2. Inception-A

(Figure: Inception-A block diagram, omitted here.)
[Note] The Avg Pooling here has kernel_size=3, padding=1, stride=1, so it preserves the spatial size; the same applies below.
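
This can be verified directly; with these settings the pooling leaves the feature map unchanged (the 384-channel 35*35 tensor below simply mirrors Inception-A's position in the network):

import torch
import torch.nn as nn

pool = nn.AvgPool2d(kernel_size=3, padding=1, stride=1)
print(pool(torch.randn(1, 384, 35, 35)).shape)  # torch.Size([1, 384, 35, 35])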

3. Inception-B

(Figure: Inception-B block diagram, omitted here.)
[Note] Asymmetric kernels require matching asymmetric padding: a 1*7 conv uses padding=(0,3) and a 7*1 conv uses padding=(3,0); similarly, the 1*3 and 3*1 convs use padding=(0,1) and (1,0) respectively.
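
A minimal check, reusing the imports above, that this padding scheme preserves the 17*17 grid Inception-B works on (the channel counts are illustrative):

conv17 = nn.Conv2d(192, 224, kernel_size=(1, 7), padding=(0, 3))
conv71 = nn.Conv2d(224, 256, kernel_size=(7, 1), padding=(3, 0))
print(conv71(conv17(torch.randn(1, 192, 17, 17))).shape)  # torch.Size([1, 256, 17, 17])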

4. Inception-C

(Figure: Inception-C block diagram, omitted here.)

5. Reduction-A

(Figure: Reduction-A block diagram, omitted here.)
[Note] Following the table in the paper, k=192, l=224, m=256, n=384; the concatenated output therefore has 384 (max-pooled input) + 384 (n) + 256 (m) = 1024 channels.

6. Reduction-B

(Figure: Reduction-B block diagram, omitted here.)

IV. Code Reproduction

import torch
import torch.nn as nn
import torch.nn.functional as F

First, define a basic convolution module (a Conv2d followed by Batch Normalization and a ReLU activation):

class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels)
        
    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x)
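
A quick usage example (the shapes are illustrative, matching a 3*3 same-padding conv inside Inception-A):

layer = BasicConv2d(64, 96, kernel_size=3, padding=1)
print(layer(torch.randn(1, 64, 35, 35)).shape)  # torch.Size([1, 96, 35, 35])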

The Inception-A block:

class InceptionA(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(InceptionA, self).__init__()
        #branch1: avgpool --> conv1*1(96)
        self.b1_1 = nn.AvgPool2d(kernel_size=3, padding=1, stride=1)
        self.b1_2 = BasicConv2d(in_channels, 96, kernel_size=1)
        
        #branch2: conv1*1(96)
        self.b2 = BasicConv2d(in_channels, 96, kernel_size=1)
        
        #branch3: conv1*1(64) --> conv3*3(96)
        self.b3_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.b3_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        
        #branch4: conv1*1(64) --> conv3*3(96) --> conv3*3(96)
        self.b4_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.b4_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        self.b4_3 = BasicConv2d(96, 96, kernel_size=3, padding=1)
        
    def forward(self, x):
        y1 = self.b1_2(self.b1_1(x))
        y2 = self.b2(x)
        y3 = self.b3_2(self.b3_1(x))
        y4 = self.b4_3(self.b4_2(self.b4_1(x)))
        
        outputsA = [y1, y2, y3, y4]
        return torch.cat(outputsA, 1)
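
Each of the four branches outputs 96 channels, so the block maps 384 input channels to 4*96 = 384 while preserving the 35*35 grid, which a quick shape check confirms:

blockA = InceptionA(384, 384)
print(blockA(torch.randn(1, 384, 35, 35)).shape)  # torch.Size([1, 384, 35, 35])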

The Inception-B block:

class InceptionB(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(InceptionB, self).__init__()
        #branch1: avgpool --> conv1*1(128)
        self.b1_1 = nn.AvgPool2d(kernel_size=3, padding=1, stride=1)
        self.b1_2 = BasicConv2d(in_channels, 128, kernel_size=1)
        
        #branch2: conv1*1(384)
        self.b2 = BasicConv2d(in_channels, 384, kernel_size=1)
        
        #branch3: conv1*1(192) --> conv1*7(224) --> conv1*7(256)
        self.b3_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.b3_2 = BasicConv2d(192, 224, kernel_size=(1,7), padding=(0,3))
        self.b3_3 = BasicConv2d(224, 256, kernel_size=(1,7), padding=(0,3))
        
        #branch4: conv1*1(192) --> conv1*7(192) --> conv7*1(224) --> conv1*7(224) --> conv7*1(256)
        self.b4_1 = BasicConv2d(in_channels, 192, kernel_size=1, stride=1)
        self.b4_2 = BasicConv2d(192, 192, kernel_size=(1,7), padding=(0,3))
        self.b4_3 = BasicConv2d(192, 224, kernel_size=(7,1), padding=(3,0))
        self.b4_4 = BasicConv2d(224, 224, kernel_size=(1,7), padding=(0,3))
        self.b4_5 = BasicConv2d(224, 256, kernel_size=(7,1), padding=(3,0))
        
    def forward(self, x):
        y1 = self.b1_2(self.b1_1(x))
        y2 = self.b2(x)
        y3 = self.b3_3(self.b3_2(self.b3_1(x)))
        y4 = self.b4_5(self.b4_4(self.b4_3(self.b4_2(self.b4_1(x)))))
        
        outputsB = [y1, y2, y3, y4]
        return torch.cat(outputsB, 1)
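
Here the concatenation gives 128 + 384 + 256 + 256 = 1024 channels on the 17*17 grid:

blockB = InceptionB(1024, 1024)
print(blockB(torch.randn(1, 1024, 17, 17)).shape)  # torch.Size([1, 1024, 17, 17])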

The Inception-C block:

class InceptionC(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(InceptionC, self).__init__()
        #branch1: avgpool --> conv1*1(256)
        self.b1_1 = nn.AvgPool2d(kernel_size=3, padding=1, stride=1)
        self.b1_2 = BasicConv2d(in_channels, 256, kernel_size=1)
        
        #branch2: conv1*1(256)
        self.b2 = BasicConv2d(in_channels, 256, kernel_size=1)
        
        #branch3: conv1*1(384) --> conv1*3(256) & conv3*1(256)
        self.b3_1 = BasicConv2d(in_channels, 384, kernel_size=1)
        self.b3_2_1 = BasicConv2d(384, 256, kernel_size=(1,3), padding=(0,1))
        self.b3_2_2 = BasicConv2d(384, 256, kernel_size=(3,1), padding=(1,0))
        
        #branch4: conv1*1(384) --> conv1*3(448) --> conv3*1(512) --> conv3*1(256) & conv1*3(256)
        self.b4_1 = BasicConv2d(in_channels, 384, kernel_size=1, stride=1)
        self.b4_2 = BasicConv2d(384, 448, kernel_size=(1,3), padding=(0,1))
        self.b4_3 = BasicConv2d(448, 512, kernel_size=(3,1), padding=(1,0))
        self.b4_4_1 = BasicConv2d(512, 256, kernel_size=(3,1), padding=(1,0))
        self.b4_4_2 = BasicConv2d(512, 256, kernel_size=(1,3), padding=(0,1))
        
    def forward(self, x):
        y1 = self.b1_2(self.b1_1(x))
        y2 = self.b2(x)
        #compute each shared branch prefix once instead of re-running it per split
        b3 = self.b3_1(x)
        y3_1 = self.b3_2_1(b3)
        y3_2 = self.b3_2_2(b3)
        b4 = self.b4_3(self.b4_2(self.b4_1(x)))
        y4_1 = self.b4_4_1(b4)
        y4_2 = self.b4_4_2(b4)
        
        outputsC = [y1, y2, y3_1, y3_2, y4_1, y4_2]
        return torch.cat(outputsC, 1)
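
The six concatenated outputs are 256 channels each, so the block maps 1536 channels to 6*256 = 1536 on the 8*8 grid:

blockC = InceptionC(1536, 1536)
print(blockC(torch.randn(1, 1536, 8, 8)).shape)  # torch.Size([1, 1536, 8, 8])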

The Reduction-A block:

class ReductionA(nn.Module):
    def __init__(self, in_channels, out_channels, k, l, m, n):
        super(ReductionA, self).__init__()
        #branch1: maxpool3*3(stride2 valid)
        self.b1 = nn.MaxPool2d(kernel_size=3, stride=2)
        
        #branch2: conv3*3(n stride2 valid)
        self.b2 = BasicConv2d(in_channels, n, kernel_size=3, stride=2)
        
        #branch3: conv1*1(k) --> conv3*3(l) --> conv3*3(m stride2 valid)
        self.b3_1 = BasicConv2d(in_channels, k, kernel_size=1)
        self.b3_2 = BasicConv2d(k, l, kernel_size=3, padding=1)
        self.b3_3 = BasicConv2d(l, m, kernel_size=3, stride=2)
        
    def forward(self, x):
        y1 = self.b1(x)
        y2 = self.b2(x)
        y3 = self.b3_3(self.b3_2(self.b3_1(x)))
        
        outputsRedA = [y1, y2, y3]
        return torch.cat(outputsRedA, 1)
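
With the values from the note above, the output concatenates 384 (max-pooled input) + n=384 + m=256 = 1024 channels while shrinking the grid from 35*35 to 17*17:

redA = ReductionA(384, 1024, 192, 224, 256, 384)
print(redA(torch.randn(1, 384, 35, 35)).shape)  # torch.Size([1, 1024, 17, 17])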

The Reduction-B block:

class ReductionB(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ReductionB, self).__init__()
        #branch1: maxpool3*3(stride2 valid)
        self.b1 = nn.MaxPool2d(kernel_size=3, stride=2)
        
        #branch2: conv1*1(192) --> conv3*3(192 stride2 valid)
        self.b2_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.b2_2 = BasicConv2d(192, 192, kernel_size=3, stride=2)
        
        #branch3: conv1*1(256) --> conv1*7(256) --> conv7*1(320) --> conv3*3(320 stride2 valid)
        self.b3_1 = BasicConv2d(in_channels, 256, kernel_size=1)
        self.b3_2 = BasicConv2d(256, 256, kernel_size=(1,7), padding=(0,3))
        self.b3_3 = BasicConv2d(256, 320, kernel_size=(7,1), padding=(3,0))
        self.b3_4 = BasicConv2d(320, 320, kernel_size=3, stride=2)
        
    def forward(self, x):
        y1 = self.b1(x)
        y2 = self.b2_2(self.b2_1(x))
        y3 = self.b3_4(self.b3_3(self.b3_2(self.b3_1(x))))
        
        outputsRedB = [y1, y2, y3]
        return torch.cat(outputsRedB, 1)
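
The output concatenates 1024 (max-pooled input) + 192 + 320 = 1536 channels, with the grid reduced from 17*17 to 8*8:

redB = ReductionB(1024, 1536)
print(redB(torch.randn(1, 1024, 17, 17)).shape)  # torch.Size([1, 1536, 8, 8])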

The Stem block:

class Stem(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Stem, self).__init__()
        #conv3*3(32 stride2 valid)
        self.conv1 = BasicConv2d(in_channels, 32, kernel_size=3, stride=2)
        #conv3*3(32 valid)
        self.conv2 = BasicConv2d(32, 32, kernel_size=3)
        #conv3*3(64)
        self.conv3 = BasicConv2d(32, 64, kernel_size=3, padding=1)
        #maxpool3*3(stride2 valid) & conv3*3(96 stride2 valid)
        self.maxpool4 = nn.MaxPool2d(kernel_size=3, stride=2)
        self.conv4 = BasicConv2d(64, 96, kernel_size=3, stride=2)
        
        #conv1*1(64) --> conv3*3(96 valid)
        self.conv5_1_1 = BasicConv2d(160, 64, kernel_size=1)
        self.conv5_1_2 = BasicConv2d(64, 96, kernel_size=3)
        #conv1*1(64) --> conv7*1(64) --> conv1*7(64) --> conv3*3(96 valid)
        self.conv5_2_1 = BasicConv2d(160, 64, kernel_size=1)
        self.conv5_2_2 = BasicConv2d(64, 64, kernel_size=(7,1), padding=(3,0))
        self.conv5_2_3 = BasicConv2d(64, 64, kernel_size=(1,7), padding=(0,3))
        self.conv5_2_4 = BasicConv2d(64, 96, kernel_size=3)
        
        #conv3*3(192 stride2 valid)
        self.conv6 = BasicConv2d(192, 192, kernel_size=3, stride=2)
        #maxpool3*3(stride2 valid)
        self.maxpool6 = nn.MaxPool2d(kernel_size=3, stride=2)
        
    def forward(self, x):
        #run the shared trunk once, then feed both branches
        trunk = self.conv3(self.conv2(self.conv1(x)))
        y1_1 = self.maxpool4(trunk)
        y1_2 = self.conv4(trunk)
        y1 = torch.cat([y1_1, y1_2], 1)
        
        y2_1 = self.conv5_1_2(self.conv5_1_1(y1))
        y2_2 = self.conv5_2_4(self.conv5_2_3(self.conv5_2_2(self.conv5_2_1(y1))))
        y2 = torch.cat([y2_1, y2_2], 1)
        
        y3_1 = self.conv6(y2)
        y3_2 = self.maxpool6(y2)
        y3 = torch.cat([y3_1, y3_2], 1)
        
        return y3
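
For the standard 299*299 RGB input, the Stem ends with a 35*35 map of 384 channels, exactly what the first Inception-A block expects:

stem = Stem(3, 384)
print(stem(torch.randn(1, 3, 299, 299)).shape)  # torch.Size([1, 384, 35, 35])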

Finally, assemble the modules above into the full network following the architecture diagram. Note that each repetition of an Inception block must be a separate instance: calling one module object several times (e.g. self.icpA four times in a row) would share a single set of weights across all four positions, which is not what the paper describes. The repeats are therefore stacked with nn.Sequential:

class Googlenetv4(nn.Module):
    def __init__(self, num_classes=1000):
        super(Googlenetv4, self).__init__()
        self.stem = Stem(3, 384)
        #InceptionA * 4 (independent instances, no weight sharing)
        self.icpA = nn.Sequential(*[InceptionA(384, 384) for _ in range(4)])
        self.redA = ReductionA(384, 1024, 192, 224, 256, 384)
        #InceptionB * 7
        self.icpB = nn.Sequential(*[InceptionB(1024, 1024) for _ in range(7)])
        self.redB = ReductionB(1024, 1536)
        #InceptionC * 3
        self.icpC = nn.Sequential(*[InceptionC(1536, 1536) for _ in range(3)])
        self.avgpool = nn.AvgPool2d(kernel_size=8)
        #the paper keeps units with probability 0.8, i.e. nn.Dropout drop probability 0.2
        self.dropout = nn.Dropout(p=0.2)
        self.linear = nn.Linear(1536, num_classes)
        
    def forward(self, x):
        #Stem Module
        out = self.stem(x)
        #InceptionA Module * 4
        out = self.icpA(out)
        #ReductionA Module
        out = self.redA(out)
        #InceptionB Module * 7
        out = self.icpB(out)
        #ReductionB Module
        out = self.redB(out)
        #InceptionC Module * 3
        out = self.icpC(out)
        #Average Pooling + flatten
        out = self.avgpool(out)
        out = out.view(out.size(0), -1)
        #Dropout
        out = self.dropout(out)
        #Linear head (returns logits; apply softmax externally if probabilities are needed)
        out = self.linear(out)
        
        return out

V. Testing and Results

def test():
    x = torch.randn(1, 3, 299, 299)
    net = Googlenetv4()
    y = net(x)
    print(y.size())
test()
Output: torch.Size([1, 1000])
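
As an additional sanity check, the trainable parameters can be counted (the exact number depends on the configuration above, so no figure is quoted here):

net = Googlenetv4()
print(sum(p.numel() for p in net.parameters() if p.requires_grad))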