Deepfake Video Detection Using Recurrent Neural Networks: Paper Reading Notes


D. Güera and E. J. Delp, “Deepfake Video Detection Using Recurrent Neural Networks,” 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2018, pp. 1-6, doi: 10.1109/AVSS.2018.8639163.

Introduction

uses a convolutional neural network (CNN) to extract frame-level features


These features are then used to train a recurrent neural network (RNN) that learns to classify if a video has been subject to manipulation or not.


The main contributions of this work are summarized as follows. First, we propose a two-stage analysis composed of a CNN to extract features at the frame level followed by a temporally-aware RNN network to capture temporal inconsistencies between frames introduced by the face-swapping process. Second, we have used a collection of 600 videos to evaluate the proposed method, with half of the videos being deepfakes collected from multiple video hosting websites. Third, we show experimentally the effectiveness of the described approach, which allows us to detect if a suspect video is a deepfake manipulation with 94% more accuracy than a random detector baseline in a balanced setting.


Related Work

  • Digital Media Forensics

    two pre-trained deep CNNs

    two different face swapping manipulations using a two-stream network

  • Face-based Video Manipulation Methods

    Face2Face: a real-time facial reenactment system, capable of altering facial movements in different types of video streams

    Generative adversarial networks (GANs): GANs show remarkable results in altering face attributes such as age, facial hair or mouth expressions.

  • Recurrent Neural Networks

    LSTM networks

    When a deep learning architecture combines a CNN with an LSTM, it is typically considered "deep in space" and "deep in time" respectively, which can be seen as two distinct system modalities.



Training procedure (U-Net neural networks are trained in the same way)

Two sets of training images are required:

  • the original face

  • the desired face

Generation procedure

pass a latent representation of a face generated from the original subject present in the video to the decoder network trained on faces of the subject we want to insert in the video
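The shared-encoder/per-subject-decoder swap described above can be sketched as follows. This is a minimal numpy illustration of the data flow only, with made-up linear weights and dimensions; a real deepfake autoencoder uses trained deep convolutional networks operating on image tensors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (real systems operate on image tensors).
FACE_DIM, LATENT_DIM = 64 * 64, 128

# Shared encoder weights: both identities are compressed by the SAME network,
# so the latent space captures pose and expression rather than identity.
W_enc = rng.standard_normal((LATENT_DIM, FACE_DIM)) * 0.01

# One decoder per identity: each learns to reconstruct only its own subject.
W_dec_A = rng.standard_normal((FACE_DIM, LATENT_DIM)) * 0.01
W_dec_B = rng.standard_normal((FACE_DIM, LATENT_DIM)) * 0.01

def encode(face):
    """Shared encoder: face -> latent representation."""
    return np.tanh(W_enc @ face)

def decode(latent, W_dec):
    """Subject-specific decoder: latent -> reconstructed face."""
    return W_dec @ latent

# Face swap at generation time: encode a frame of subject A, but decode it
# with subject B's decoder, producing B's face with A's pose/expression.
face_A = rng.standard_normal(FACE_DIM)
latent = encode(face_A)
swapped = decode(latent, W_dec_B)
print(swapped.shape)  # (4096,)
```

Because each decoder only ever sees its own subject during training, feeding it the other subject's latent code is what produces the swap.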

Defects:

Boundary effects

Because the encoder is not aware of the skin or other scene information, it is very common to have boundary effects due to a seamed fusion between the new face and the rest of the frame.

Inherent to the generation process of the final video itself

Because the autoencoder is used frame-by-frame, it is completely unaware of any previously generated face that it may have created.

Role of the CNN:

The most prominent is an inconsistent choice of illuminants between scenes with frames, which leads to a flickering phenomenon in the face region common to the majority of fake videos. Although this phenomenon can be hard to appreciate with the naked eye in the best manually-tuned deepfake manipulations, it is easily captured by a pixel-level CNN feature extractor.

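The intuition that frame-by-frame generation leaves temporal inconsistencies can be sketched numerically. This is a toy numpy illustration (not the paper's detector, which learns such cues with an LSTM): flicker shows up as larger jumps between consecutive frame feature vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

N_FRAMES, FEAT_DIM = 20, 2048  # 2048-d per-frame features as in the paper

# Toy stand-ins for CNN frame features: a smooth random walk (pristine video)
# and the same walk with per-frame jitter mimicking illumination flicker.
base = np.cumsum(rng.standard_normal((N_FRAMES, FEAT_DIM)) * 0.01, axis=0)
pristine = base
fake = base + rng.standard_normal((N_FRAMES, FEAT_DIM)) * 0.5

def mean_frame_delta(features):
    """Mean L2 distance between consecutive frame feature vectors."""
    diffs = np.diff(features, axis=0)
    return float(np.linalg.norm(diffs, axis=1).mean())

print(mean_frame_delta(pristine) < mean_frame_delta(fake))  # True
```

A hand-crafted statistic like this would be brittle; the point of the paper's recurrent stage is to let the network learn which temporal feature patterns separate pristine from manipulated sequences.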

Recurrent Network for Deepfake Detection


Convolutional LSTM

  1. CNN for frame feature extraction.


    The fully-connected layer at the top of the network is removed to directly output a deep representation of each frame using the ImageNet pre-trained model

    The 2048-dimensional feature vectors after the last pooling layers are then used as the sequential LSTM input.

  2. LSTM for temporal sequence analysis.


    2048-wide LSTM takes a sequence of 2048-dimensional ImageNet feature vectors

    512 fully-connected layer

    a softmax layer to compute the probabilities of the frame sequence being either pristine or deepfake

    without the need of auxiliary loss functions.
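The two-stage pipeline above (CNN features → 2048-wide LSTM → 512-unit fully-connected layer → 2-way softmax) can be sketched end to end in numpy. This is a shape-level illustration with untrained random weights and scaled-down dimensions (the paper's actual widths are 2048 for the LSTM and 512 for the FC layer); a real implementation would use a deep learning framework and pre-extracted ImageNet features.

```python
import numpy as np

rng = np.random.default_rng(2)

# Scaled-down dimensions for the sketch; the paper uses FEAT_DIM = HIDDEN = 2048
# and FC_DIM = 512.
SEQ_LEN, FEAT_DIM, HIDDEN, FC_DIM, N_CLASSES = 16, 128, 64, 32, 2

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Untrained random weights, scaled small; training would learn these.
W = rng.standard_normal((4 * HIDDEN, FEAT_DIM + HIDDEN)) * 0.01  # LSTM gates
b = np.zeros(4 * HIDDEN)
W_fc = rng.standard_normal((FC_DIM, HIDDEN)) * 0.01              # 512-unit FC
W_out = rng.standard_normal((N_CLASSES, FC_DIM)) * 0.01          # softmax head

def lstm_last_hidden(seq):
    """Run a single-layer LSTM over the frame features, return final hidden state."""
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for x in seq:
        z = W @ np.concatenate([x, h]) + b
        i, f, g, o = np.split(z, 4)                  # input/forget/cell/output gates
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def classify(frame_features):
    h = lstm_last_hidden(frame_features)             # temporal summary of the clip
    hidden = np.maximum(0.0, W_fc @ h)               # fully-connected layer + ReLU
    logits = W_out @ hidden
    e = np.exp(logits - logits.max())
    return e / e.sum()                               # softmax: P(pristine), P(deepfake)

features = rng.standard_normal((SEQ_LEN, FEAT_DIM))  # stand-in for CNN frame features
probs = classify(features)
print(probs.shape)  # (2,)
```

Note how the LSTM consumes the whole frame sequence before any classification happens, so the decision is made per clip rather than per frame; this is what lets the model exploit the temporal inconsistencies discussed earlier.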
