pytorch(4)--conv3d

一、Preface

    This post records the principle of 3D convolution (conv3d) in PyTorch and an example network, C3D.

Reference: https://blog.csdn.net/weixin_43844219/article/details/104134838

二、Principle

   3D convolution is well suited to feature extraction from temporal data such as video sequences or stacks of consecutive frames. The PyTorch interface is torch.nn.Conv3d.

The input has size (N, Cin, D, H, W) and the output has size (N, Cout, Dout, Hout, Wout). Assume the kernel is kernel_size=(Kd, Kh, Kw), the stride is stride=(Sd, Sh, Sw), and the padding is padding=(Pd, Ph, Pw).

   torch.nn.Conv3d(c_in, c_out, kernel_size, stride, padding)

  1. N: batch_size, the number of samples in one training batch
  2. Cin: the number of input channels, 3 for an ordinary RGB image; this is the value passed as c_in
  3. D: a dimension that 2D convolution does not have and the key to capturing temporal information; it is the number of frames used for temporal feature extraction (with 16 consecutive input frames, D is 16)
  4. H/W: the height and width of a single frame

The output is:

  1. N: batch_size, the number of samples in one training batch, unchanged
  2. Cout: the number of output channels, specified directly by c_out
  3. Dout: given by the general formula Dout = floor((D + 2*Pd - Kd) / Sd + 1), which reduces to D - Kd + 2*Pd + 1 when Sd = 1 (see the sketch after this list)
  4. Hout/Wout: computed in the same way as Dout, using the corresponding kernel, stride, and padding values
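
As a quick check of the formula, the following minimal sketch (the tensor sizes and layer parameters are chosen purely for illustration) computes the expected output shape by hand and compares it with what nn.Conv3d actually produces:

```python
import math
import torch
import torch.nn as nn

def conv3d_out(size, kernel, stride, padding):
    """Apply Dout = floor((D + 2*P - K) / S + 1) to each of D, H, W."""
    return tuple(math.floor((n + 2 * p - k) / s + 1)
                 for n, k, s, p in zip(size, kernel, stride, padding))

# Illustrative sizes: a batch of 2 clips, 3 channels, 16 frames of 112x112.
x = torch.randn(2, 3, 16, 112, 112)
conv = nn.Conv3d(3, 64, kernel_size=(3, 3, 3), stride=(1, 2, 2), padding=(1, 1, 1))

expected = conv3d_out((16, 112, 112), (3, 3, 3), (1, 2, 2), (1, 1, 1))
print(expected)        # (16, 56, 56)
print(conv(x).shape)   # torch.Size([2, 64, 16, 56, 56])
```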

   The 3D convolution process:

In other words, 2D convolution slides a 2D kernel over the two spatial dimensions of an image, while 3D convolution slides a 3D kernel along the D, H, and W dimensions of the feature map.
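
Another way to see the difference is to compare the weight shapes of the two layer types; in this small sketch (layer sizes are arbitrary), the Conv2d kernel spans only (Kh, Kw), while the Conv3d kernel additionally spans Kd:

```python
import torch.nn as nn

conv2d = nn.Conv2d(3, 64, kernel_size=3)
conv3d = nn.Conv3d(3, 64, kernel_size=3)

print(conv2d.weight.shape)  # torch.Size([64, 3, 3, 3])     -> (Cout, Cin, Kh, Kw)
print(conv3d.weight.shape)  # torch.Size([64, 3, 3, 3, 3])  -> (Cout, Cin, Kd, Kh, Kw)
```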

三、Code Implementation

An example PyTorch network, taken from https://github.com/jjboy/c3d-pytorch

The C3D implementation is as follows:

```python
import torch
import torch.nn as nn


class C3D(nn.Module):
    '''
    conv1  in:16*3*112*112   out:16*64*112*112
    pool1  in:16*64*112*112  out:16*64*56*56
    conv2  in:16*64*56*56    out:16*128*56*56
    pool2  in:16*128*56*56   out:8*128*28*28
    conv3a in:8*128*28*28    out:8*256*28*28
    conv3b in:8*256*28*28    out:8*256*28*28
    pool3  in:8*256*28*28    out:4*256*14*14
    conv4a in:4*256*14*14    out:4*512*14*14
    conv4b in:4*512*14*14    out:4*512*14*14
    pool4  in:4*512*14*14    out:2*512*7*7
    conv5a in:2*512*7*7      out:2*512*7*7
    conv5b in:2*512*7*7      out:2*512*7*7
    pool5  in:2*512*7*7      out:1*512*4*4
    '''

    def __init__(self):
        super(C3D, self).__init__()

        self.conv1 = nn.Conv3d(3, 64, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool1 = nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))

        self.conv2 = nn.Conv3d(64, 128, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool2 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv3a = nn.Conv3d(128, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv3b = nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool3 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv4a = nn.Conv3d(256, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv4b = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool4 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2))

        self.conv5a = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.conv5b = nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1))
        self.pool5 = nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2), padding=(0, 1, 1))

        self.fc6 = nn.Linear(8192, 4096)  # 8192 = 512 * 1 * 4 * 4 after pool5
        self.fc7 = nn.Linear(4096, 4096)
        self.fc8 = nn.Linear(4096, 101)   # 101 output classes (e.g. UCF101)

        self.dropout = nn.Dropout(p=0.5)
        self.relu = nn.ReLU()

    def init_weight(self):
        for name, para in self.named_parameters():
            if name.find('weight') != -1:
                nn.init.xavier_normal_(para.data)
            else:
                nn.init.constant_(para.data, 0)

    def forward(self, x):
        h = self.relu(self.conv1(x))
        h = self.pool1(h)

        h = self.relu(self.conv2(h))
        h = self.pool2(h)

        h = self.relu(self.conv3a(h))
        h = self.relu(self.conv3b(h))
        h = self.pool3(h)

        h = self.relu(self.conv4a(h))
        h = self.relu(self.conv4b(h))
        h = self.pool4(h)

        h = self.relu(self.conv5a(h))
        h = self.relu(self.conv5b(h))
        h = self.pool5(h)

        h = h.view(-1, 8192)

        h = self.relu(self.fc6(h))
        h = self.dropout(h)

        h = self.relu(self.fc7(h))
        h = self.dropout(h)

        logits = self.fc8(h)

        return logits
```
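
A quick forward-pass check, continuing from the definition above (a sketch; the single-clip batch simply matches the 16-frame 112x112 input assumed in the docstring):

```python
model = C3D()
model.init_weight()

# One clip of 16 RGB frames at 112x112, laid out as (N, C, D, H, W).
clip = torch.randn(1, 3, 16, 112, 112)
logits = model(clip)
print(logits.shape)  # torch.Size([1, 101])
```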

Below is a complete 3D convolution example for classifying 3D volumetric images.

1. Building the dataset

We use a 3D volumetric dataset named "Brain tumor dataset" with two classes: normal (class 0) and tumor (class 1). Each sample is a 155x240x240 volume stored as a .npy file. Because torchvision.transforms operates on 2D PIL images rather than volumes, each volume is simply converted to a tensor and normalized.

```python
import os
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset


class CustomDataset(Dataset):
    def __init__(self, data_dir, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        self.file_list = os.listdir(data_dir)

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img_path = os.path.join(self.data_dir, self.file_list[idx])
        img = np.load(img_path)  # (155, 240, 240) volume
        if self.transform:
            img = self.transform(img)
        # File names are expected to look like "<id>_<label>.npy".
        label = int(self.file_list[idx].split("_")[1].split(".npy")[0])
        return img, label


def create_datasets(data_dir, batch_size):
    def transform(volume):
        # Convert the raw volume to a (1, D, H, W) float tensor and normalize it.
        volume = torch.from_numpy(volume).float().unsqueeze(0)
        return (volume - volume.mean()) / (volume.std() + 1e-8)

    dataset = CustomDataset(data_dir, transform)
    train_size = int(len(dataset) * 0.8)
    test_size = len(dataset) - train_size
    train_dataset, test_dataset = torch.utils.data.random_split(dataset, [train_size, test_size])

    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader
```

2. Building the 3D CNN model

The model stacks several 3D convolution and max-pooling layers, followed by two fully connected layers.

```python
class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv1 = nn.Conv3d(1, 32, kernel_size=3, stride=1, padding=1)
        self.activation1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool3d(kernel_size=2)

        self.conv2 = nn.Conv3d(32, 64, kernel_size=3, stride=1, padding=1)
        self.activation2 = nn.ReLU(inplace=True)
        self.pool2 = nn.MaxPool3d(kernel_size=2)

        self.conv3 = nn.Conv3d(64, 128, kernel_size=3, stride=1, padding=1)
        self.activation3 = nn.ReLU(inplace=True)
        self.pool3 = nn.MaxPool3d(kernel_size=2)

        self.conv4 = nn.Conv3d(128, 256, kernel_size=3, stride=1, padding=1)
        self.activation4 = nn.ReLU(inplace=True)
        self.pool4 = nn.MaxPool3d(kernel_size=2)

        # Four 2x poolings shrink a 155x240x240 input to 9x15x15.
        self.fc1 = nn.Linear(256 * 9 * 15 * 15, 512)
        self.activation5 = nn.ReLU(inplace=True)
        self.fc2 = nn.Linear(512, 2)

    def forward(self, x):
        x = self.conv1(x)
        x = self.activation1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.activation2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        x = self.activation3(x)
        x = self.pool3(x)
        x = self.conv4(x)
        x = self.activation4(x)
        x = self.pool4(x)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.activation5(x)
        x = self.fc2(x)
        return x
```
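
As a sanity check on the flattened feature size (a small sketch, assuming the 155x240x240 volumes described above), the effect of the four poolings can be computed directly:

```python
def after_pools(n, num_pools=4):
    # MaxPool3d(kernel_size=2) halves a dimension with floor rounding.
    for _ in range(num_pools):
        n = n // 2
    return n

print(after_pools(155), after_pools(240), after_pools(240))  # 9 15 15
print(256 * after_pools(155) * after_pools(240) ** 2)        # 518400 = 256 * 9 * 15 * 15
```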
3. Training the model

Next we train the model with the Adam optimizer and a cross-entropy loss. A ReduceLROnPlateau scheduler lowers the learning rate when the test loss stops improving, and the best model (by test accuracy) is saved to disk. A CUDA-capable GPU is assumed.

```python
def train(model, train_loader, test_loader, num_epochs, learning_rate=0.001, weight_decay=0.0):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)

    best_acc = 0.0
    for epoch in range(num_epochs):
        # Training pass.
        model.train()
        train_loss = 0.0
        train_acc = 0.0
        for inputs, labels in train_loader:
            inputs, labels = inputs.float().cuda(), labels.cuda()
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            train_loss += loss.item() * inputs.size(0)
            _, preds = torch.max(outputs, 1)
            train_acc += torch.sum(preds == labels)

        train_acc = train_acc.double() / len(train_loader.dataset)
        train_loss = train_loss / len(train_loader.dataset)
        print('Epoch [{}/{}], Train Loss: {:.4f}, Train Acc: {:.4f}'.format(
            epoch + 1, num_epochs, train_loss, train_acc))

        # Evaluation pass.
        model.eval()
        test_loss = 0.0
        test_acc = 0.0
        with torch.no_grad():
            for inputs, labels in test_loader:
                inputs, labels = inputs.float().cuda(), labels.cuda()
                outputs = model(inputs)
                loss = criterion(outputs, labels)

                test_loss += loss.item() * inputs.size(0)
                _, preds = torch.max(outputs, 1)
                test_acc += torch.sum(preds == labels)

        test_acc = test_acc.double() / len(test_loader.dataset)
        test_loss = test_loss / len(test_loader.dataset)
        scheduler.step(test_loss)

        if test_acc > best_acc:
            best_acc = test_acc
            torch.save(model.state_dict(), 'best_model.pth')

        print('Epoch [{}/{}], Test Loss: {:.4f}, Test Acc: {:.4f}'.format(
            epoch + 1, num_epochs, test_loss, test_acc))
```

4. Running the model

Finally, build the data loaders and the model, and start training:

```python
def main():
    data_dir = 'Brain_tumor_dataset'
    batch_size = 8
    num_epochs = 100

    train_loader, test_loader = create_datasets(data_dir, batch_size)
    model = ConvNet().cuda()
    train(model, train_loader, test_loader, num_epochs)


if __name__ == '__main__':
    main()
```
