PyTorch (4) -- conv3d

I. Preface

    This post covers how conv3d works in PyTorch, along with an example network, C3D.

Reference: https://blog.csdn.net/weixin_43844219/article/details/104134838

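II. Principles

   3D convolution is well suited to extracting features from temporal data such as video sequences and multi-frame images. The PyTorch interface is torch.nn.Conv3d.

The input size is (N, Cin, D, H, W) and the output size is (N, Cout, Dout, Hout, Wout). Assume a kernel of kernel_size=(Kd, Kh, Kw), a stride of stride=(Sd, Sh, Sw), and padding of padding=(Pd, Ph, Pw).

   torch.nn.Conv3d(c_in, c_out, kernel_size, stride, padding)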

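The input dimensions are:

  1. N: batch_size, the number of samples in one training step
  2. Cin: the number of input channels, which is 3 for an ordinary RGB image; this is the value passed as c_in
  3. D: this dimension has no counterpart in 2D convolution and is the key to capturing temporal information. It is the number of frames used for temporal feature extraction; for example, if the input is 16 consecutive images, D is 16
  4. H/W: the height and width of a single frame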
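As a minimal sketch of the input layout described above, the snippet below builds a single nn.Conv3d layer and passes one random clip through it. The concrete sizes (a batch of 1, 16 frames of 32x32 pixels, 64 output channels) are illustrative assumptions chosen to keep memory use small, not values from the original post.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions): 1 clip, 3 channels, 16 frames of 32x32 pixels.
clip = torch.randn(1, 3, 16, 32, 32)  # layout: (N, Cin, D, H, W)

# A 3x3x3 kernel with padding 1 and stride 1 preserves D, H, and W.
conv = nn.Conv3d(3, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))

out = conv(clip)
print(tuple(out.shape))  # layout: (N, Cout, Dout, Hout, Wout)
```

Only the channel dimension changes here (3 to 64), because the kernel size, stride, and padding were chosen so that the temporal and spatial sizes stay the same.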

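The output dimensions are:

  1. N: batch_size, unchanged
  2. Cout: the number of output channels, set directly by c_out
  3. Dout: computed from a formula. When Sd is 1 it is D - Kd + 2*Pd + 1; in general it is (D - Kd + 2*Pd)/Sd + 1, rounded down when the division is not exact: np.floor((n + 2p - f)/s + 1), where n is the input size, p the padding, f the kernel size, and s the stride (dilation is assumed to be 1)
  4. Hout/Wout: computed the same way as item 3, using the H and W components of the kernel size, stride, and padding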
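The output-size formula above can be sketched in plain Python as follows. The helper name conv3d_output_size is hypothetical (it is not part of PyTorch), and the sketch assumes a dilation of 1.

```python
def conv3d_output_size(in_size, kernel_size, stride=(1, 1, 1), padding=(0, 0, 0)):
    """Return (Dout, Hout, Wout) for an input of size (D, H, W).

    Hypothetical helper: applies floor((n + 2p - k) / s) + 1 to each of the
    three dimensions, assuming dilation = 1.
    """
    return tuple(
        (n + 2 * p - k) // s + 1  # integer division rounds down
        for n, k, s, p in zip(in_size, kernel_size, stride, padding)
    )


# Stride 1 with kernel 3 and padding 1 keeps every dimension unchanged.
same = conv3d_output_size((16, 112, 112), (3, 3, 3), (1, 1, 1), (1, 1, 1))

# Stride 2 roughly halves every dimension; the division is rounded down.
halved = conv3d_output_size((16, 112, 112), (3, 3, 3), (2, 2, 2), (1, 1, 1))

print(same)
print(halved)
```

With stride 2 the D dimension gives (16 + 2 - 3) // 2 + 1 = 8, which shows the rounding-down case: 15 / 2 is not exact.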

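   The 3D convolution process: a 2D convolution moves a 2D kernel across a 2D image, whereas a 3D convolution moves a 3D kernel along the D, H, and W dimensions of the feature map.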
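To make the sliding of the kernel along D, H, and W concrete, here is a deliberately naive pure-Python sketch for a single channel with stride 1 and no padding. The function name conv3d_naive is hypothetical. It is a simplification, not how PyTorch implements the operation: nn.Conv3d additionally sums over all input channels and adds a bias. Like nn.Conv3d, the sketch computes a cross-correlation, so the kernel is not flipped.

```python
def conv3d_naive(volume, kernel):
    """Single-channel 3D cross-correlation with stride 1 and no padding.

    volume and kernel are nested lists indexed as [d][h][w].
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    Kd, Kh, Kw = len(kernel), len(kernel[0]), len(kernel[0][0])

    out = []
    for d in range(D - Kd + 1):              # slide along the frame axis
        plane = []
        for h in range(H - Kh + 1):          # slide along the height
            row = []
            for w in range(W - Kw + 1):      # slide along the width
                acc = 0
                # Multiply the kernel with the block it currently covers.
                for kd in range(Kd):
                    for kh in range(Kh):
                        for kw in range(Kw):
                            acc += volume[d + kd][h + kh][w + kw] * kernel[kd][kh][kw]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Three 3x3 frames of ones and a 2x2x2 kernel of ones.
volume = [[[1] * 3 for _ in range(3)] for _ in range(3)]
kernel = [[[1] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d_naive(volume, kernel)
print(result)
```

Each output value covers 2 x 2 x 2 = 8 input values, and the output has size D - Kd + 1 = 2 in every dimension, in line with the stride-1 formula given earlier.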

III. Code Implementation

An example network in PyTorch: https://github.com/jjboy/c3d-pytorch

C3D is implemented as follows:

import torch
import torch.nn as nn


class C3D(nn.Module):
    '''
    Per-sample shapes written as D*C*H*W (batch dimension omitted);
    the actual tensor layout is (N, C, D, H, W).

    conv1  in:16*3*112*112   out:16*64*112*112
    pool1  in:16*64*112*112  out:16*64*56*56
    conv2  in:16*64*56*56    out:16*128*56*56
    pool2  in:16*128*56*56   out:8*128*28*28
    conv3a in:8*128*28*28    out:8*256*28*28
    conv3b in:8*256*28*28    out:8*256*28*28
    pool3  in:8*256*28*28    out:4*256*14*14
    conv4a in:4*256*14*14    out:4*512*14*14
    conv4b in:4*512*14*14    out:4*512*14*14
    pool4  in:4*512*14*14    out:2*512*7*7
    conv5a in:2*512*7*7      out:2*512*7*7
    conv5b in:2*512*7*7      out:2*512*7*7
    pool5  in:2*512*7*7      out:1*512*4*4
    '''

    def __init__(self):
        super(C3D, self).__init__()

        self.conv1 = nn.Conv3d(3, 64, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool1 = nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))

        self.conv2 = nn.Conv3d(64, 128, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool2 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv3a = nn.Conv3d(128, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv3b = nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool3 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv4a = nn.Conv3d(256, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv4b = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool4 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv5a = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv5b = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool5 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2), padding=(0, 1, 1))

        self.fc6 = nn.Linear(8192, 4096)
        self.fc7 = nn.Linear(4096, 4096)
        self.fc8 = nn.Linear(4096, 101)

        self.dropout = nn.Dropout(p=0.5)
        self.relu = nn.ReLU()

    def init_weight(self):
        for name, para in self.named_parameters():
            if name.find('weight') != -1:
                nn.init.xavier_normal_(para.data)
            else:
                nn.init.constant_(para.data, 0)

    def forward(self, x):
        h = self.relu(self.conv1(x))
        h = self.pool1(h)

        h = self.relu(self.conv2(h))
        h = self.pool2(h)

        h = self.relu(self.conv3a(h))
        h = self.relu(self.conv3b(h))
        h = self.pool3(h)

        h = self.relu(self.conv4a(h))
        h = self.relu(self.conv4b(h))
        h = self.pool4(h)

        h = self.relu(self.conv5a(h))
        h = self.relu(self.conv5b(h))
        h = self.pool5(h)

        h = h.view(-1, 8192)

        h = self.relu(self.fc6(h))
        h = self.dropout(h)

        h = self.relu(self.fc7(h))
        h = self.dropout(h)

        logits = self.fc8(h)

        return logits
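
The layer shapes in the C3D docstring, and the 8192 input features of fc6, can be checked without running the network. The sketch below applies the output-size formula to each conv and pool layer in turn, assuming a 3-channel input clip of 16 frames at 112x112, a dilation of 1, and the default round-down behavior of MaxPool3d. The names out_size, LAYERS, and trace_c3d are hypothetical helpers, not part of the linked repository.

```python
def out_size(size, kernel, stride, padding):
    """floor((n + 2p - k) / s) + 1 for each of D, H, W (dilation = 1)."""
    return tuple((n + 2 * p - k) // s + 1
                 for n, k, s, p in zip(size, kernel, stride, padding))


# (kernel, stride, padding) shared by every conv layer and by pool2..pool4.
CONV = ((3, 3, 3), (1, 1, 1), (1, 1, 1))
POOL = ((2, 2, 2), (2, 2, 2), (0, 0, 0))

# (name, output channels or None for pooling, (kernel, stride, padding))
LAYERS = [
    ("conv1", 64, CONV),
    ("pool1", None, ((1, 2, 2), (1, 2, 2), (0, 0, 0))),  # keeps all 16 frames
    ("conv2", 128, CONV),
    ("pool2", None, POOL),
    ("conv3a", 256, CONV),
    ("conv3b", 256, CONV),
    ("pool3", None, POOL),
    ("conv4a", 512, CONV),
    ("conv4b", 512, CONV),
    ("pool4", None, POOL),
    ("conv5a", 512, CONV),
    ("conv5b", 512, CONV),
    ("pool5", None, ((2, 2, 2), (2, 2, 2), (0, 1, 1))),  # pads H and W only
]


def trace_c3d(channels=3, size=(16, 112, 112)):
    """Return [(layer name, (C, D, H, W))] for one input sample."""
    shapes = []
    for name, out_channels, (kernel, stride, padding) in LAYERS:
        if out_channels is not None:  # pooling leaves the channel count alone
            channels = out_channels
        size = out_size(size, kernel, stride, padding)
        shapes.append((name, (channels, *size)))
    return shapes


shapes = trace_c3d()
for name, shape in shapes:
    print(name, shape)

c, d, h, w = shapes[-1][1]
flat = c * d * h * w  # number of features per sample entering fc6
print(flat)
```

The padding of (0, 1, 1) in pool5 is what turns the 7x7 map into 4x4 rather than 3x3, giving 512 * 1 * 4 * 4 = 8192 features per sample, which matches both nn.Linear(8192, 4096) and h.view(-1, 8192).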

 
