Determining the Fully Connected Layer's Input Dimension When Going from Convolutional Layers to Fully Connected Layers in PyTorch

1. The code method:

The modified AlexNet architecture:

The input has shape $(N, 1, 100, 100)$, where $N$ is the number of input samples.

import torch.nn as nn
import torch.nn.functional as F
import torch
from torchsummary import summary

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 96, 11, stride=2)   # input channels changed from 3 to 1
        self.conv2 = nn.Conv2d(96, 256, 5, stride=1, padding=2)
        self.conv3 = nn.Conv2d(256, 384, 3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(384, 384, 3, stride=1, padding=1)
        self.conv5 = nn.Conv2d(384, 256, 3, stride=1, padding=1)
        self.fc1 = nn.Linear(256*6*6, 4096)            # input size to be determined below
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, 1000)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv3(out))
        out = F.relu(self.conv4(out))
        out = F.relu(self.conv5(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out

Take the AlexNet architecture as an example, as in the code above. Starting from AlexNet, the input channel count of the first convolutional layer is changed to 1. The code method is quite direct: generate random data of the right shape and trace it through the convolutional part for debugging. We choose input shape $(1, 100, 100)$ and feed a batch of two such images:

net = AlexNet()
image = torch.randn(2, 1, 100, 100)   # random batch of 2 single-channel 100x100 images
out = F.relu(net.conv1(image))        # -> (2, 96, 45, 45)
out = F.max_pool2d(out, 2)            # -> (2, 96, 22, 22)
out = F.relu(net.conv2(out))          # -> (2, 256, 22, 22)
out = F.max_pool2d(out, 2)            # -> (2, 256, 11, 11)
out = F.relu(net.conv3(out))          # -> (2, 384, 11, 11)
out = F.relu(net.conv4(out))          # -> (2, 384, 11, 11)
out = F.relu(net.conv5(out))          # -> (2, 256, 11, 11)
out = F.max_pool2d(out, 2)            # -> (2, 256, 5, 5)
out.size()

The output is:

torch.Size([2, 256, 5, 5])

The output shape of the convolutional stack is (2, 256, 5, 5), so the input dimension of the fully connected layer is $256 \times 5 \times 5$. The following line in AlexNet:

self.fc1 = nn.Linear(256*6*6, 4096)

must be changed to:

self.fc1 = nn.Linear(256*5*5, 4096)

The input dimension of the fully connected layer is the total number of elements in each sample's flattened feature map (channels × height × width).
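
As a sanity check, here is a minimal sketch (assuming the AlexNet class defined above, with fc1 swapped for the corrected layer) showing that the network now runs end to end:

net = AlexNet()
net.fc1 = nn.Linear(256*5*5, 4096)   # swap in the corrected layer
image = torch.randn(2, 1, 100, 100)
print(net(image).size())             # torch.Size([2, 1000])
print(256*5*5)                       # 6400, the flattened size per sample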

For comparison, the standard AlexNet with a 3-channel $227 \times 227$ input:

import torch.nn as nn
import torch.nn.functional as F
import torch
from torchsummary import summary

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 96, 11, stride=4)
        self.conv2 = nn.Conv2d(96, 256, 5, stride=1, padding=2)
        self.conv3 = nn.Conv2d(256, 384, 3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(384, 384, 3, stride=1, padding=1)
        self.conv5 = nn.Conv2d(384, 256, 3, stride=1, padding=1)
        self.fc1 = nn.Linear(256*6*6, 4096)
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, 1000)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv3(out))
        out = F.relu(self.conv4(out))
        out = F.relu(self.conv5(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out

The debugging code is:

net = AlexNet()
image = torch.randn(2, 3, 227, 227)   # random batch of 2 RGB 227x227 images
out = F.relu(net.conv1(image))        # -> (2, 96, 55, 55)
out = F.max_pool2d(out, 2)            # -> (2, 96, 27, 27)
out = F.relu(net.conv2(out))          # -> (2, 256, 27, 27)
out = F.max_pool2d(out, 2)            # -> (2, 256, 13, 13)
out = F.relu(net.conv3(out))          # -> (2, 384, 13, 13)
out = F.relu(net.conv4(out))          # -> (2, 384, 13, 13)
out = F.relu(net.conv5(out))          # -> (2, 256, 13, 13)
out = F.max_pool2d(out, 2)            # -> (2, 256, 6, 6)
out.size()

The output is:

torch.Size([2, 256, 6, 6])

The input dimension of the fully connected layer is $256 \times 6 \times 6$, matching the 256*6*6 already used in fc1.
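
The manual tracing above can also be wrapped in a small helper that works for any input size; a minimal sketch (the flat_features name is my own, not from the original):

def flat_features(net, c, h, w):
    # Trace a dummy one-sample batch through the convolutional part of
    # the AlexNet defined above; numel() is then the flattened size.
    with torch.no_grad():
        out = torch.randn(1, c, h, w)
        out = F.max_pool2d(F.relu(net.conv1(out)), 2)
        out = F.max_pool2d(F.relu(net.conv2(out)), 2)
        out = F.relu(net.conv3(out))
        out = F.relu(net.conv4(out))
        out = F.max_pool2d(F.relu(net.conv5(out)), 2)
    return out.numel()

print(flat_features(AlexNet(), 3, 227, 227))   # 9216 == 256*6*6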

2. The formula method:

torch.nn.Conv2d(in_channels,
                out_channels,
                kernel_size,
                stride=1,
                padding=0,
                dilation=1,
                groups=1,
                bias=True,
                padding_mode='zeros',
                device=None,
                dtype=None)

The parameters are:

  • in_channels (int): number of channels in the input image;
  • out_channels (int): number of channels produced by the convolution;
  • kernel_size (int or tuple): size of the convolution kernel;
  • stride (int or tuple, optional): step the kernel takes as it scans the input; default 1;
  • padding (int, tuple or str, optional): padding added to all four sides of the input; default 0 (no padding);
  • padding_mode (string, optional): one of 'zeros', 'reflect', 'replicate' or 'circular'; default 'zeros' (pad with zeros);
  • dilation (int or tuple, optional): spacing between kernel elements; default 1;
  • groups (int, optional): number of blocked connections from input channels to output channels; default 1;
  • bias (bool, optional): if True, adds a learnable bias to the output; default True.

The relationship between a convolutional layer's input and output sizes is given in the PyTorch documentation as

$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$

$$W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$$

where $H_{out}$ and $W_{out}$ are the height and width of the layer's output.
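
Applying the formula by hand reproduces the result above without running any tensors; a minimal sketch (the conv2d_out and pool2d_out helper names are my own):

import math

def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    # Output spatial size of nn.Conv2d, same formula as the PyTorch docs.
    return math.floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride + 1)

def pool2d_out(size, kernel):
    # Output size of F.max_pool2d with the default stride (= kernel size).
    return size // kernel

h = 227
h = pool2d_out(conv2d_out(h, 11, stride=4), 2)   # conv1 + pool -> 27
h = pool2d_out(conv2d_out(h, 5, padding=2), 2)   # conv2 + pool -> 13
h = conv2d_out(h, 3, padding=1)                  # conv3 -> 13
h = conv2d_out(h, 3, padding=1)                  # conv4 -> 13
h = pool2d_out(conv2d_out(h, 3, padding=1), 2)   # conv5 + pool -> 6
print(h, 256*h*h)                                # 6 9216

Since the input is square, tracking one spatial side is enough.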
