MTCNN PyTorch Implementation

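This article walks through a PyTorch implementation of MTCNN (Multi-task Cascaded Convolutional Networks): how the three networks, P-net, R-net and O-net, are built, and how each one is trained and tested. P-net is fully convolutional, while R-net and O-net add fully connected layers on top of their convolutional layers. Every network ends in a Sigmoid output for the confidence and linear outputs for the regression predictions.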

MTCNN network architecture:

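P-net

Fully convolutional network

Intermediate layers:

    Convolutional layers: 2D convolution, PReLU activation

    Pooling layer: max pooling

Confidence output: Sigmoid activation

Bounding-box and landmark regression outputs: linear (no activation)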

P_net    in_shape          in_channels  out_channels  kernel_size  stride  padding  out_shape
conv1    [batch,3,12,12]   3            10            3            1       0        [batch,10,10,10]
pool     [batch,10,10,10]  10           10            2            2       0        [batch,10,5,5]
conv2    [batch,10,5,5]    10           16            3            1       0        [batch,16,3,3]
conv3    [batch,16,3,3]    16           32            3            1       0        [batch,32,1,1]
conv4_1  [batch,32,1,1]    32           1             1            1       0        [batch,1,1,1]
conv4_2  [batch,32,1,1]    32           4             1            1       0        [batch,4,1,1]
conv4_3  [batch,32,1,1]    32           10            1            1       0        [batch,10,1,1]
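
Because P-net has no fully connected layer, it can also be run on an image larger than 12x12, and each cell of its output map then scores one 12x12 window of the input. The sketch below only illustrates that arithmetic in plain Python and is not part of the original implementation. It applies the usual size formula floor((n + 2p - k) / s) + 1 with the kernel sizes and strides used in the P-net code (pooling without padding). The helper names `out_size`, `pnet_map_size` and `cell_to_window` are made up for this example, and the overall stride of 2 and window size of 12 are derived from the layer parameters rather than stated in the source.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv / max-pool layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel_size, stride, padding) of the shared P-net trunk
PNET_TRUNK = [
    ("conv1", 3, 1, 0),
    ("pool", 2, 2, 0),
    ("conv2", 3, 1, 0),
    ("conv3", 3, 1, 0),
]


def pnet_map_size(n):
    """Side length of the P-net output map for an n x n input."""
    for _name, kernel, stride, padding in PNET_TRUNK:
        n = out_size(n, kernel, stride, padding)
    return n


def cell_to_window(row, col, stride=2, cell=12):
    """Map an output cell to its input window as (x1, y1, x2, y2).

    stride=2 comes from the single stride-2 pooling layer and cell=12 is
    the receptive field of conv3 (both are derived values, see lead-in).
    """
    x1, y1 = col * stride, row * stride
    return x1, y1, x1 + cell, y1 + cell


print(pnet_map_size(12))       # a 12x12 crop yields a single output cell
print(pnet_map_size(100))      # a larger image yields a grid of cells
print(cell_to_window(44, 44))  # window covered by the last cell of that grid
```

For a 100x100 input this gives a 45x45 output map, and the last cell (44, 44) covers the window from (88, 88) to (100, 100).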

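R-net

R-net = convolutional layers + fully connected layers

Intermediate layers:

    Convolutional layers: 2D convolution, PReLU activation

    Pooling layers: max pooling

    Fully connected layer: PReLU activation

Confidence output: Sigmoid activation

Bounding-box and landmark regression outputs: linear (no activation)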

R_net    in_shape          in_channels  out_channels  kernel_size  stride  padding  out_shape
conv1    [batch,3,24,24]   3            28            3            1       0        [batch,28,22,22]
pool1    [batch,28,22,22]  28           28            3            2       1        [batch,28,11,11]
conv2    [batch,28,11,11]  28           48            3            1       0        [batch,48,9,9]
pool2    [batch,48,9,9]    48           48            3            2       0        [batch,48,4,4]
conv3    [batch,48,4,4]    48           64            2            1       0        [batch,64,3,3]

linear   in_units  out_units
line1    64*3*3    128
line2_1  128       1
line2_2  128       4
line2_3  128       10
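
The spatial sizes in the R-net table can be checked by hand with floor((n + 2p - k) / s) + 1, and the result fixes the input width of the first fully connected layer. The following is a small plain-Python sketch of that check, not code from the original post. `out_size`, `trace` and `RNET_LAYERS` are names invented for the example, and the layer parameters are copied from the R-net code (pool1 with padding 1, everything else unpadded).

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv / max-pool layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, out_channels, kernel_size, stride, padding)
RNET_LAYERS = [
    ("conv1", 28, 3, 1, 0),
    ("pool1", 28, 3, 2, 1),
    ("conv2", 48, 3, 1, 0),
    ("pool2", 48, 3, 2, 0),
    ("conv3", 64, 2, 1, 0),
]


def trace(size, layers):
    """Return [(name, channels, size)] after each layer for a square input."""
    shapes = []
    for name, channels, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        shapes.append((name, channels, size))
    return shapes


shapes = trace(24, RNET_LAYERS)
for name, channels, size in shapes:
    print(f"{name}: [batch, {channels}, {size}, {size}]")

# Flattened feature count after conv3, i.e. the input width of line1
_, channels, size = shapes[-1]
flat_features = channels * size * size
print("line1 in_units:", flat_features)
```

The trace passes through side lengths 22, 11, 9, 4 and 3, so the flattened conv3 output has 64*3*3 = 576 values, which is the `in_units` of line1.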

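O-net

O-net = convolutional layers + fully connected layers

Intermediate layers:

    Convolutional layers: 2D convolution, PReLU activation

    Pooling layers: max pooling

    Fully connected layer: PReLU activation

Confidence output: Sigmoid activation

Bounding-box and landmark regression outputs: linear (no activation)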

O_net    in_shape          in_channels  out_channels  kernel_size  stride  padding  out_shape
conv1    [batch,3,48,48]   3            32            3            1       0        [batch,32,46,46]
pool1    [batch,32,46,46]  32           32            2            2       1        [batch,32,24,24]
conv2    [batch,32,24,24]  32           64            3            1       0        [batch,64,22,22]
pool2    [batch,64,22,22]  64           64            3            2       0        [batch,64,10,10]
conv3    [batch,64,10,10]  64           64            3            1       0        [batch,64,8,8]
pool3    [batch,64,8,8]    64           64            2            2       0        [batch,64,4,4]
conv4    [batch,64,4,4]    64           128           2            1       0        [batch,128,3,3]

linear   in_units  out_units
line1    128*3*3   256
line2_1  256       1
line2_2  256       4
line2_3  256       10
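
The channel counts and kernel sizes in the O-net table also determine how many weights each layer holds. As a rough illustration only, the plain-Python sketch below counts them under two assumptions that are not stated in the source: every layer has a bias term (the PyTorch default for `nn.Conv2d` and `nn.Linear`), and the PReLU parameters are left out. conv3 is taken as a 3x3 convolution, the kernel size that maps 10x10 to 8x8. All names in the block are invented for the example.

```python
# (name, in_channels, out_channels, kernel_size) of the O-net conv layers.
# Assumption: conv3 is 3x3, which is what a 10x10 -> 8x8 output implies.
ONET_CONVS = [
    ("conv1", 3, 32, 3),
    ("conv2", 32, 64, 3),
    ("conv3", 64, 64, 3),
    ("conv4", 64, 128, 2),
]

# (name, in_units, out_units) of the O-net fully connected layers
ONET_LINEARS = [
    ("line1", 128 * 3 * 3, 256),
    ("line2_1", 256, 1),
    ("line2_2", 256, 4),
    ("line2_3", 256, 10),
]


def conv_params(in_channels, out_channels, kernel):
    """Weights plus one bias per output channel of a square 2D convolution."""
    return out_channels * (in_channels * kernel * kernel + 1)


def linear_params(in_units, out_units):
    """Weights plus one bias per output unit of a fully connected layer."""
    return out_units * (in_units + 1)


counts = {name: conv_params(i, o, k) for name, i, o, k in ONET_CONVS}
counts.update({name: linear_params(i, o) for name, i, o in ONET_LINEARS})

for name, n in counts.items():
    print(f"{name}: {n}")

total = sum(counts.values())
print("total (PReLU parameters excluded):", total)
```

Most of the weights sit in line1, the layer that consumes the flattened 128*3*3 feature map.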

Code:

import torch
import torch.nn as nn


class PNet(nn.Module):
    """Proposal network: fully convolutional, sized for 12x12 RGB inputs."""

    def __init__(self):
        super(PNet, self).__init__()

        # Shared trunk: [batch,3,12,12] -> [batch,32,1,1]
        self.conv_layer = nn.Sequential(
            nn.Conv2d(3, 10, kernel_size=3, stride=1),  # conv1
            nn.PReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # pool1
            nn.Conv2d(10, 16, kernel_size=3, stride=1),  # conv2
            nn.PReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=1),  # conv3
            nn.PReLU()
        )

        # Three 1x1 convolution heads on the shared features
        self.conv4_1 = nn.Conv2d(32, 1, kernel_size=1, stride=1)  # confidence
        self.conv4_2 = nn.Conv2d(32, 4, kernel_size=1, stride=1)  # box offsets
        self.conv4_3 = nn.Conv2d(32, 10, kernel_size=1, stride=1)  # landmark offsets

    def forward(self, x):
        x = self.conv_layer(x)

        # torch.sigmoid replaces the deprecated F.sigmoid
        cond = torch.sigmoid(self.conv4_1(x))  # confidence in (0, 1)
        box_offset = self.conv4_2(x)  # linear output
        land_offset = self.conv4_3(x)  # linear output

        return cond, box_offset, land_offset


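class RNet(nn.Module):
    def __init__(self):
        super(RNet, self).__init__()
        self.conv_layer = nn.Sequential(
            nn.Conv2d(3, 28, kernel_size=3, stride=1),  # conv1
            nn.PReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),  # pool1
            nn.Conv2d(28, 48, kernel_size=3, stride=1),  # conv2
            nn.PReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),  # pool2
            nn.Conv2d(48, 64, kernel_size=2, stride=1),  # conv3
            nn.PReLU()
        )
        # The source listing breaks off inside line1. Everything from here on
        # is reconstructed from the R-net table and mirrors PNet.forward.
        self.line1 = nn.Sequential(
            nn.Linear(64 * 3 * 3, 128),  # line1
            nn.PReLU()
        )
        self.line2_1 = nn.Linear(128, 1)  # confidence
        self.line2_2 = nn.Linear(128, 4)  # box offsets
        self.line2_3 = nn.Linear(128, 10)  # landmark offsets

    def forward(self, x):
        x = self.conv_layer(x)
        x = x.view(x.size(0), -1)  # flatten to [batch, 64*3*3]
        x = self.line1(x)

        cond = torch.sigmoid(self.line2_1(x))
        box_offset = self.line2_2(x)
        land_offset = self.line2_3(x)

        return cond, box_offset, land_offset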