GaitGLN Study Notes (Backbone + Code Reproduction)

Paper

1 Introduction

Gait Lateral Network (GLN)

Key contributions:

① Lateral Connection: exploits the feature pyramid inherent in deep CNNs to learn a more discriminative gait representation; features extracted at different layers capture different details. Building on GaitSet, the silhouette-level features extracted at each stage are merged with the set-level features via lateral connections, yielding a more discriminative representation.

② Compact Block: the representation learned through HPM is redundant. The paper treats the high-dimensional representation as a collection of low-dimensional ones, uses dropout to select a small subset, and maps it into a compact space through a fully connected layer, achieving a form of distillation.

[figure]

2 Related Work

Inherent Feature Pyramid: inspired by FPN (Feature Pyramid Networks, originally used for object detection).

[Figure: FPN (Feature Pyramid Network)]

The paper differs from FPN in three respects:

① GLN has two branches (silhouette level and set level), and lateral connections merge the features of both levels simultaneously.

② FPN assigns different training labels according to the receptive field, whereas the labels in GLN are the same across stages.

③ FPN shares the head parameters across stages, whereas GLN gives the subsequent layers at each stage independent parameters.

3 Our Approach

3.1 Backbone Overview

GaitGLN is an improved version of GaitSet. The figure below shows the GaitSet network, which the authors divide into three stages, Stage 0, 1 and 2 (with a few modifications, discussed below):

[Figure: the GaitSet network, divided into Stage 0, 1, 2]

On top of this, the authors add their lateral connection network:

[Figure: GLN with lateral connections]

Note: in the figure above, the blue and orange arrows have both already gone through Set Pooling! Moreover, the two feature maps are first concatenated and then passed through a 1×1 conv.

3.2 Lateral Connections

**Purpose:** learn a more discriminative gait representation through the CNN's feature pyramid.

**Fine-tune:** the order of MaxPooling and SP (Set Pooling) in Stage 1 is swapped, presumably to make it easier to attach the lateral connection network.

Before the change, MP first, then SP:

[Figure: before the change (MP then SP)]

After the change, SP first, then MP:

[Figure: after the change (SP then MP)]

※ The paper specifically points out the difference between MaxPooling and Set Pooling: the former operates on the h and w dimensions ([30, 64, 44] => [30, 32, 22]), while the latter operates on the set dimension s ([30, 64, 44] => [1, 64, 44]).
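A minimal sketch of the two operations with the batch and channel dimensions written out (batch 8, 30 frames, 32 channels, taken from the shape comments in the reproduction code further down):

import torch
import torch.nn.functional as F

x = torch.randn(8, 30, 32, 64, 44)                    # [n, s, c, h, w]
n, s, c, h, w = x.size()

# MaxPooling: 2x2 max over the spatial dims h, w -> [8, 30, 32, 32, 22]
mp = F.max_pool2d(x.view(-1, c, h, w), 2).view(n, s, c, h // 2, w // 2)

# Set Pooling: max over the set (frame) dim s -> [8, 32, 64, 44]
sp = torch.max(x, dim=1)[0]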

**How it works:** the different stages of the backbone extract different details of the gait silhouette set (bottom-up), and the lateral connection network fuses these details (top-down).

① 1×1 Conv: re-adjusts the channel dimension so that the deeper features match the channel count of the level above; fixed to 256 in the paper.

② 2× up: 2× upsampling (enlarging the feature map), implemented with nearest-neighbor upsampling.

③ SMO: a smoothing layer that alleviates the aliasing introduced by upsampling and the semantic gap between stages; its channel count is also 256, implemented with a 3×3 convolution.

[Figure: lateral connection components]
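A minimal sketch of one top-down merge step built from the three components above (the layer names are made up for illustration; channel counts follow the paper's 256 and the shape comments in the reproduction code below):

import torch
import torch.nn as nn

adjust = nn.Conv2d(128, 256, kernel_size=1, bias=False)              # 1x1 Conv: match channels to 256
up2x = nn.Upsample(scale_factor=2, mode='nearest')                    # 2x up: nearest-neighbor upsampling
smooth = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)    # SMO: 3x3 smoothing conv

deep = torch.randn(8, 256, 16, 11)       # feature from the deeper stage (already 256-d)
shallow = torch.randn(8, 128, 32, 22)    # feature from the shallower stage

merged = adjust(shallow) + up2x(deep)    # lateral connection: adjust channels, upsample, add
out = smooth(merged)                     # smooth before HPM -> [8, 256, 32, 22]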

3.3 Compact Block

**Purpose:** compress GaitSet's original 15,872-dimensional feature vector (62 HPM strips × 256 dims) down to only 256 dimensions, cutting storage by roughly two orders of magnitude (explained below).

**How it works:** the authors observe that the HPM representations across different scales encode duplicated information; for example, the first strip at scale 4 and the 1st and 2nd strips at scale 8 cover the same region of the silhouette. They therefore conjecture that the high-dimensional representation contains a fair amount of redundancy.

① HPM: identical to GaitSet, with strips at 1, 2, 4, 8 and 16 scales.

② Compact Block:

[Figures: Compact Block]

Note: the Dropout layer is enabled only during the train phase, with the dropout rate set to 0.9.
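A quick check of this behaviour on a bare nn.Dropout(0.9) (toy tensor, just to illustrate the train/eval switch):

import torch
import torch.nn as nn

drop = nn.Dropout(0.9)
x = torch.ones(2, 8)

drop.train()
print(drop(x))   # ~90% of the elements are zeroed, the survivors are scaled by 1 / (1 - 0.9)

drop.eval()
print(drop(x))   # identity: all ones, Dropout is disabled at inference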

3.4 Training Strategy

[Figure: training strategy]

As the figure shows, training involves two loss functions: a triplet loss + a cross-entropy loss.

The triplet loss is used to pre-train the lateral connection network (with the same training details as GaitSet), while the sum of the triplet loss and the cross-entropy loss is used for global training. Concretely, each subject is treated as its own class (with label smoothing), and the cross-entropy loss trains this multi-class classification problem.

Label smoothing: in standard cross-entropy training the label is a one-hot encoding. Here, to discourage the model from becoming over-confident on the training set, the target probabilities are tweaked slightly (the 0 probabilities become a small positive value):

[Figures: label smoothing formulation]
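A minimal sketch of label-smoothed cross-entropy (the smoothing factor eps and the class count are placeholders, not the paper's values; recent PyTorch versions also expose this directly via the label_smoothing argument of nn.CrossEntropyLoss):

import torch
import torch.nn.functional as F

def smoothed_ce(logits, target, eps=0.1):
    # the smoothed target keeps probability (1 - eps) on the true class
    # and spreads eps uniformly over all classes
    n_class = logits.size(1)
    log_p = F.log_softmax(logits, dim=1)
    one_hot = F.one_hot(target, n_class).float()
    smoothed = one_hot * (1 - eps) + eps / n_class
    return -(smoothed * log_p).sum(dim=1).mean()

logits = torch.randn(8, 74)             # e.g. 74 training subjects
target = torch.randint(0, 74, (8,))
loss = smoothed_ce(logits, target)      # added to the triplet loss during global training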

A small detail: the Compact Block output is fixed at 256 dimensions, but for a large dataset like OUMVLP with 10,307 classes, 256 dimensions cannot cover the one-hot label space. Indeed, the paper appends one more linear layer at the very end, but at inference the output of that layer must not be taken as the final representation; the 256-dimensional feature remains the final representation. In short, this linear layer only assists the cross-entropy training and is not the final representation.

[figure]
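A sketch of what such an auxiliary head could look like (the name classifier and its hook-up are assumptions for illustration, not the authors' code): the 256-d compact feature is mapped to class logits used only by the cross-entropy loss, while retrieval still uses the 256-d feature itself:

import torch
import torch.nn as nn

num_classes = 10307                        # e.g. the OUMVLP subjects
classifier = nn.Linear(256, num_classes)   # training-only head (hypothetical)

feature_cb = torch.randn(8, 256)           # Compact Block output
logits = classifier(feature_cb)            # fed to the (label-smoothed) cross-entropy loss

# At inference, feature_cb, not logits, is used as the gallery/probe representation.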

Code Reproduction (backbone only)

'''BackBone'''

import torch
import torch.nn as nn
import torch.nn.functional as F
from model.network.basic_blocks import SetBlock, BasicConv2d, BasicConv1x1, SMO, Compact_Block


class GaitGLN(nn.Module):
    def __init__(self, hidden_dim):
        super(GaitGLN, self).__init__()
        self.hidden_dim = hidden_dim
        self.batch_frame = None

        _set_in_channels = 1
        _set_channels = [32, 64, 128]
        self.set_layer1 = SetBlock(BasicConv2d(_set_in_channels, _set_channels[0], 5, padding=2))
        self.set_layer2 = SetBlock(BasicConv2d(_set_channels[0], _set_channels[0], 3, padding=1))
        self.set_layer3 = SetBlock(BasicConv2d(_set_channels[0], _set_channels[1], 3, padding=1))
        self.set_layer4 = SetBlock(BasicConv2d(_set_channels[1], _set_channels[1], 3, padding=1))
        self.set_layer5 = SetBlock(BasicConv2d(_set_channels[1], _set_channels[2], 3, padding=1))
        self.set_layer6 = SetBlock(BasicConv2d(_set_channels[2], _set_channels[2], 3, padding=1))

        _gl_in_channels = 32
        _gl_channels = [64, 128]
        self.gl_layer1 = BasicConv2d(_gl_in_channels, _gl_channels[0], 3, padding=1)
        self.gl_layer2 = BasicConv2d(_gl_channels[0], _gl_channels[0], 3, padding=1)
        self.gl_layer3 = BasicConv2d(_gl_channels[0], _gl_channels[1], 3, padding=1)
        self.gl_layer4 = BasicConv2d(_gl_channels[1], _gl_channels[1], 3, padding=1)
        self.gl_pooling = nn.MaxPool2d(2)

        self.conv1x1_layer1 = BasicConv1x1(_set_channels[0])
        self.conv1x1_layer2 = BasicConv1x1(_set_channels[1] * 2)
        self.conv1x1_layer3 = BasicConv1x1(_set_channels[2] * 2)

        self.smo = SMO()
        self.upsample = nn.Upsample(scale_factor=2, mode='nearest')
        self.bin_num = [1, 2, 4]  # HPM scales used in this reproduction (the paper uses 1, 2, 4, 8, 16)
        self.cb = Compact_Block(21, 1)  # 21 = (1 + 2 + 4) strips x 3 pyramid levels

        for m in self.modules():
            if isinstance(m, (nn.Conv2d, nn.Conv1d)):
                nn.init.xavier_uniform_(m.weight.data)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight.data)
                nn.init.constant_(m.bias.data, 0.0)
            elif isinstance(m, (nn.BatchNorm2d, nn.BatchNorm1d)):
                nn.init.normal_(m.weight.data, 1.0, 0.02)
                nn.init.constant_(m.bias.data, 0.0)

    def pooling(self, x):
        # MaxPooling over the spatial dims (h, w), applied frame by frame
        n, s, c, h, w = x.size()
        x = F.max_pool2d(x.view(-1, c, h, w), 2)
        _, c, h, w = x.size()
        return x.view(n, s, c, h, w)

    def sp(self, x):
        # Set Pooling: max over the set (frame) dimension
        return torch.max(x, 1)

    def hpm(self, x):
        # Horizontal Pyramid Mapping: split the feature map into horizontal
        # strips at each scale, then pool every strip with mean + max
        feature = list()
        n, c, h, w = x.size()

        for num_bin in self.bin_num:  # scales 1, 2, 4
            z = x.view(n, c, num_bin, -1)
            z = z.mean(3) + z.max(3)[0]
            feature.append(z)

        feature = torch.cat(feature, 2).permute(2, 0, 1).contiguous()  # [strips, n, c]
        return feature

    def forward(self, silho):

        x = silho.unsqueeze(2)  # add channel dim: [n, s, h, w] -> [n, s, 1, h, w]
        del silho

        # Stage 0
        x = self.set_layer1(x)  # Conv  [8, 30, 32, 64, 44]
        x = self.set_layer2(x)  # Conv  [8, 30, 32, 64, 44]

        # Lateral Connection 0
        lc0 = self.conv1x1_layer1(self.sp(x)[0])  # SP + Adjust [8, 256, 64, 44]

        # MP 0
        x = self.pooling(x)  # MP  [8, 30, 32, 32, 22]

        # Stage 1
        gl = self.gl_layer1(self.sp(x)[0])  # SP + Conv  [8, 64, 32, 22]
        gl = self.gl_layer2(gl)  # Conv  [8, 64, 32, 22]
        x = self.set_layer3(x)  # Conv  [8, 30, 64, 32, 22]
        x = self.set_layer4(x)  # Conv  [8, 30, 64, 32, 22]
        gl = gl + self.sp(x)[0]  # SP + Add  [8, 64, 32, 22]

        # Lateral Connection 1
        lc1 = torch.cat((self.sp(x)[0], gl), dim=1)  # Concat [8, 128, 32, 22]
        lc1 = self.conv1x1_layer2(lc1)  # Adjust [8, 256, 32, 22]

        # MP 1
        x = self.pooling(x)  # MP  [8, 30, 64, 16, 11]
        gl = self.gl_pooling(gl)  # MP  [8, 64, 16, 11]

        # Stage 2
        gl = self.gl_layer3(gl)  # Conv  [8, 128, 16, 11]
        gl = self.gl_layer4(gl)  # Conv  [8, 128, 16, 11]
        x = self.set_layer5(x)  # Conv  [8, 30, 128, 16, 11]
        x = self.set_layer6(x)  # Conv  [8, 30, 128, 16, 11]
        gl = gl + self.sp(x)[0]  # SP + Add  [8, 128, 16, 11]

        # Lateral Connection 2
        lc2 = torch.cat((self.sp(x)[0], gl), dim=1)  # Concat [8, 256, 16, 11]
        lc2 = self.conv1x1_layer3(lc2)  # Adjust [8, 256, 16, 11]
        feature1 = self.hpm(self.smo(lc2))  # SMO + HPM  [7, 8, 256]

        lc2 = self.upsample(lc2)  # Upsample [8, 256, 32, 22]

        lc1 = lc1 + lc2  # Add [8, 256, 32, 22]
        feature2 = self.hpm(self.smo(lc1))  # SMO + HPM  [7, 8, 256]

        lc1 = self.upsample(lc1)  # Upsample [8, 256, 64, 44]

        lc0 = lc0 + lc1  # Add [8, 256, 64, 44]
        feature3 = self.hpm(self.smo(lc0))  # SMO + HPM  [7, 8, 256]

        feature = torch.cat((feature1, feature2, feature3), dim=0).permute(1, 0, 2).contiguous()  # Concat  [8, 21, 256]

        # Compact Block
        feature_cb = self.cb(feature)  # [8, 256]

        return feature, feature_cb

'''basic_blocks'''

import torch
import torch.nn as nn
import torch.nn.functional as F


class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, bias=False, **kwargs)

    def forward(self, x):
        x = self.conv(x)
        return F.leaky_relu(x, inplace=True)


class BasicConv1x1(nn.Module):
    def __init__(self, in_channels):
        super(BasicConv1x1, self).__init__()
        self.conv = nn.Conv2d(in_channels, 256, 1, bias=False)

    def forward(self, x):
        x = self.conv(x)
        return x


class SMO(nn.Module):
    def __init__(self):
        super(SMO, self).__init__()
        self.conv = nn.Conv2d(256, 256, 3, bias=False, padding=1)

    def forward(self, x):
        x = self.conv(x)
        return x


class SetBlock(nn.Module):
    def __init__(self, forward_block, pooling=False):
        super(SetBlock, self).__init__()
        self.forward_block = forward_block
        self.pooling = pooling
        if pooling:
            self.pool2d = nn.MaxPool2d(2)

    def forward(self, x):
        n, s, c, h, w = x.size()
        x = self.forward_block(x.view(-1, c, h, w))
        if self.pooling:
            x = self.pool2d(x)
        _, c, h, w = x.size()
        return x.view(n, s, c, h, w)


class Compact_Block(nn.Module):
    def __init__(self, input_size, common_size):
        super(Compact_Block, self).__init__()
        # two separate BN layers, one before and one after the FC
        self.bn1 = nn.BatchNorm1d(256)
        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(0.9)  # active only in train mode
        self.fc = nn.Linear(input_size, common_size)
        self.bn2 = nn.BatchNorm1d(256)

    def forward(self, x):
        x = x.permute(0, 2, 1).contiguous()  # [n, strips, 256] -> [n, 256, strips]
        x = self.bn1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc(x)                       # [n, 256, strips] -> [n, 256, 1]
        x = self.bn2(x)
        x = x.squeeze(-1)                    # [n, 256]
        return x
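
A quick smoke test of the reproduced backbone, assuming the two files above are importable from model/network (batch 8, 30 frames, 64×44 silhouettes, matching the shape comments in the forward pass; hidden_dim=256 is an arbitrary choice, since the argument is stored but not used here):

if __name__ == '__main__':
    model = GaitGLN(hidden_dim=256)
    model.eval()                            # turn off the 0.9 dropout in the Compact Block
    silho = torch.randn(8, 30, 64, 44)      # [n, s, h, w] silhouette sequences
    feature, feature_cb = model(silho)
    print(feature.shape)                    # torch.Size([8, 21, 256])
    print(feature_cb.shape)                 # torch.Size([8, 256])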
