PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer

qq_49746822

Tags: transformer, deep learning, artificial intelligence

Article link: https://blog.csdn.net/qq_49746822/article/details/125698175

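PhysFormer uses a temporal difference transformer to measure physiological signals from facial video, and it stresses the importance of long-range spatio-temporal perception. Earlier deep learning methods mine subtle skin color changes with convolutional networks whose spatio-temporal receptive fields are limited; PhysFormer instead captures fine-grained temporal skin color differences through a global spatio-temporal attention mechanism. It also introduces label distribution learning and a curriculum-learning-inspired dynamic constraint in the frequency domain, which provide fine-grained supervision and alleviate overfitting.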

Abstract / Introduction / Related Work

Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction for rPPG modeling. (Key point: long-range temporal relationships.)

Counterexample: Unifying frame rate and temporal dilations for improved remote pulse detection (in my view, a weak paper from an SCI tier-3 journal)

the temporal difference transformers

Proposed: global spatio-temporal attention based on the fine-grained temporal skin color differences

What makes the task different:

subtle skin color changes
long-time monitoring task
a video sequence to signal sequence problem

we also propose the label distribution learning and a curriculum learning inspired dynamic constraint in frequency domain, which provide elaborate supervisions for PhysFormer and alleviate overfitting.
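The two supervision ideas quoted above can be illustrated with a minimal sketch. It assumes that label distribution learning means replacing the single ground-truth heart rate with a Gaussian soft label over integer heart-rate bins, compared against the prediction with a KL divergence, and that the dynamic constraint is a weight on the frequency-domain loss that grows over training. The function names, the 40 to 179 bpm bin range, `sigma`, and the exponential schedule with `beta0` and `eta` are all illustrative assumptions, not values taken from these notes.

```python
import math


def hr_label_distribution(hr_bpm, hr_min=40, hr_max=180, sigma=1.0):
    """Gaussian soft label over integer HR bins hr_min..hr_max-1 (assumed range)."""
    bins = range(hr_min, hr_max)
    weights = [math.exp(-((k - hr_bpm) ** 2) / (2 * sigma ** 2)) for k in bins]
    total = sum(weights)
    return [w / total for w in weights]  # normalized so the label sums to 1


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) between two discrete distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted))


def frequency_loss_weight(epoch, total_epochs, beta0=1.0, eta=5.0):
    """Assumed curriculum schedule: weight grows exponentially with the epoch."""
    return beta0 * eta ** ((epoch - 1) / total_epochs)


if __name__ == "__main__":
    soft_label = hr_label_distribution(72)
    uniform = [1.0 / len(soft_label)] * len(soft_label)
    print("KL vs. uniform prediction:", kl_divergence(soft_label, uniform))
    print("weights:", [round(frequency_loss_weight(e, 25), 3) for e in (1, 13, 25)])
```

Under these assumptions the frequency-domain terms contribute little early in training, when the predicted signal is still noisy, and more later on; the soft label penalizes a prediction of 71 bpm less than one of 120 bpm, which a one-hot label cannot do.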

Network Architecture

TDC Module

A teaser for next time: "Applications of Difference Convolution in Computer Vision" - Zhihu (zhihu.com)
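Before looking at the 3D module, the temporal-difference idea can be shown on a 1D signal. The sketch below assumes the center-difference formulation in which a vanilla convolution is computed first and then the center frame, scaled by the sum of the kernel taps on the two adjacent time steps, is subtracted with weight `theta`. The function name and the restriction to a 3-tap kernel are simplifications for illustration; this is not the author's implementation.

```python
def temporal_difference_conv1d(signal, kernel, theta=0.6):
    """3-tap temporal conv minus theta * center value * (sum of outer taps)."""
    if len(kernel) != 3:
        raise ValueError("this sketch assumes a 3-tap temporal kernel")
    w_prev, w_center, w_next = kernel
    padded = [0.0] + list(signal) + [0.0]  # zero padding keeps the length
    output = []
    for t in range(1, len(padded) - 1):
        # Vanilla convolution response at time step t.
        vanilla = (w_prev * padded[t - 1]
                   + w_center * padded[t]
                   + w_next * padded[t + 1])
        # Difference term: center value weighted by the adjacent-time taps.
        difference = padded[t] * (w_prev + w_next)
        output.append(vanilla - theta * difference)
    return output


if __name__ == "__main__":
    constant = [2.0, 2.0, 2.0, 2.0]
    # theta=0 reduces to a plain convolution.
    print(temporal_difference_conv1d(constant, [1.0, 0.0, 1.0], theta=0.0))
    # theta=1 responds only to change between neighboring frames.
    print(temporal_difference_conv1d(constant, [1.0, 0.0, 1.0], theta=1.0))
```

With `theta=1` and kernel `[1, 0, 1]` the output at each interior step is `x[t-1] + x[t+1] - 2*x[t]`, so a constant signal yields zero there: the operator reacts to frame-to-frame change rather than absolute intensity, which is the property that matters for subtle skin color variation.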

import torch.nn as nn


class CDC_T(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1,
                 padding=1, dilation=1, groups=1, bias=False, theta=0.6):

        super(CDC_T, self).__init__()
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding,
                              dilation=dilation, groups=groups, bias=bias)
        self.theta = theta

    def forward(self