论文阅读 [TPAMI-2022] Learning Representations for Facial Actions From Unlabeled Videos

Keywords

Gold; Face; Feature extraction; Videos; Magnetic heads; Task analysis; Facial action unit detection; self-supervised learning; representation learning; feature disentanglement; encoder-decoder structure

Machine Learning; Computer Vision; Natural Language Processing

Supervised Learning; Self-Supervised Learning; Facial Expression Recognition; Autoencoder; Encoder-Decoder; Language Representation Learning; Image Retrieval

Abstract

Facial actions are usually encoded as anatomy-based action units (AUs), the labelling of which demands expertise and thus is time-consuming and expensive.

To alleviate the labelling demand, we propose to leverage the large number of unlabelled videos by proposing a twin-cycle autoencoder (TAE) to learn discriminative representations for facial actions.

TAE is inspired by the fact that facial actions are embedded in the pixel-wise displacements between two sequential face images (hereinafter, source and target) in the video.

Therefore, learning the representations of facial actions can be achieved by learning the representations of the displacements.

However, the displacements induced by facial actions are entangled with those induced by head motions.

TAE is thus trained to disentangle the two kinds of movements by evaluating the quality of the synthesized images when either the facial actions or head pose is changed, aiming to reconstruct the target image.

Experiments on AU detection show that TAE can achieve accuracy comparable to other existing AU detection methods including some supervised methods, thus validating the discriminant capacity of the representations learned by TAE.

TAE’s ability in decoupling the action-induced and pose-induced movements is also validated by visualizing the generated images and analyzing the facial image retrieval results qualitatively and quantitatively…

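The core premise above — that the pixel-wise displacement between a source and a target face decomposes into an action-induced part and a pose-induced part — can be illustrated with a toy NumPy sketch. This is not the paper's model (TAE learns the two fields with an encoder-decoder and trains by image reconstruction); `warp`, `action_flow`, and `pose_flow` here are hypothetical names for a nearest-neighbour warp and two hand-made constant displacement fields.

```python
import numpy as np

def warp(image, flow):
    """Warp an H x W image with a per-pixel displacement field
    (backward nearest-neighbour mapping).
    flow[..., 0] / flow[..., 1] are row / column offsets."""
    H, W = image.shape
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_r = np.clip(np.round(rows - flow[..., 0]).astype(int), 0, H - 1)
    src_c = np.clip(np.round(cols - flow[..., 1]).astype(int), 0, W - 1)
    return image[src_r, src_c]

# Toy source image and two hypothetical displacement fields: one for the
# facial action, one for the head motion. TAE's premise is that the total
# source-to-target displacement is the combination of these two parts.
rng = np.random.default_rng(0)
source = rng.random((8, 8))
action_flow = np.zeros((8, 8, 2)); action_flow[..., 0] = 1.0  # e.g. a brow raise
pose_flow   = np.zeros((8, 8, 2)); pose_flow[..., 1] = 2.0    # e.g. a head turn

# Target synthesized by applying the combined motion to the source.
target = warp(source, action_flow + pose_flow)

# Applying the two warps one after another matches the combined warp
# for constant fields like these; TAE exploits this by judging the
# reconstruction quality when only one of the two motions is applied.
recon = warp(warp(source, action_flow), pose_flow)
print(np.allclose(recon, target))  # True
```

In the actual TAE, both displacement fields are predicted by the network from the (source, target) pair, and the disentanglement is enforced by the reconstruction losses described in the abstract rather than by construction as in this sketch.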
Authors

Yong Li, Jiabei Zeng, Shiguang Shan
