FaceForensics++ Dataset: Download, Video Frame Extraction, and Face Extraction Tutorial

This post summarizes how to download the FaceForensics++ multimedia-forensics dataset, extract video frames, and crop faces, and provides a complete, working walkthrough. The download script itself is included, which saves you the step of applying to the authors. The content is for reference only; feedback and corrections are welcome!

1. Introduction to the FaceForensics++ Dataset

FaceForensics++ is a forensics dataset consisting of 1000 original video sequences that have been manipulated with four automated face-manipulation methods: Deepfakes, Face2Face, FaceSwap, and NeuralTextures. The data comes from 977 YouTube videos, each containing a trackable, mostly frontal face without occlusions, which allows the automated manipulation methods to generate realistic forgeries. Since the dataset also provides binary masks, it can be used for image and video classification as well as segmentation. In addition, the authors provide 1000 Deepfakes models for generating and augmenting new data.
Paper: https://arxiv.org/abs/1901.08971
GitHub: https://github.com/ondyari/FaceForensics?tab=readme-ov-file

2. Downloading the FaceForensics++ Dataset

2.1 Applying Through the Official Channel

On the official GitHub page, users submit an application by filling out a Google Form; if the application is approved, the download script is sent to your email automatically. If you get no reply within a week, re-submit the application, and also check whether the automated email bounced back to your mailbox.
Google Form: https://docs.google.com/forms/d/e/1FAIpQLSdRRR3L5zAv6tQ_CKxmK4W96tAab_pfBu2EKAgQbeDVhmXagg/viewform
[Screenshot: approval email from the FaceForensics team]
If the application succeeds you will receive an email like this, and the download script is available through the link in the email.

2.2 Using the Download Script Provided Here

To save you the application step, this section provides the download script itself. Copy it into a Python file, set up the environment as needed, and you are ready to download.

#!/usr/bin/env python
""" Downloads FaceForensics++ and Deep Fake Detection public data release
Example usage:
    see -h or https://github.com/ondyari/FaceForensics
"""
# -*- coding: utf-8 -*-
import argparse
import os
import urllib
import urllib.request
import tempfile
import time
import sys
import json
import random
from tqdm import tqdm
from os.path import join


# URLs and filenames
FILELIST_URL = 'misc/filelist.json'
DEEPFEAKES_DETECTION_URL = 'misc/deepfake_detection_filenames.json'
DEEPFAKES_MODEL_NAMES = ['decoder_A.h5', 'decoder_B.h5', 'encoder.h5',]

# Parameters
DATASETS = {
    'original_youtube_videos': 'misc/downloaded_youtube_videos.zip',
    'original_youtube_videos_info': 'misc/downloaded_youtube_videos_info.zip',
    'original': 'original_sequences/youtube',
    'DeepFakeDetection_original': 'original_sequences/actors',
    'Deepfakes': 'manipulated_sequences/Deepfakes',
    'DeepFakeDetection': 'manipulated_sequences/DeepFakeDetection',
    'Face2Face': 'manipulated_sequences/Face2Face',
    'FaceShifter': 'manipulated_sequences/FaceShifter',
    'FaceSwap': 'manipulated_sequences/FaceSwap',
    'NeuralTextures': 'manipulated_sequences/NeuralTextures'
    }
ALL_DATASETS = ['original', 'DeepFakeDetection_original', 'Deepfakes',
                'DeepFakeDetection', 'Face2Face', 'FaceShifter', 'FaceSwap',
                'NeuralTextures']
COMPRESSION = ['raw', 'c23', 'c40']
TYPE = ['videos', 'masks', 'models']
SERVERS = ['EU', 'EU2', 'CA']


def parse_args():
    parser = argparse.ArgumentParser(
        description='Downloads FaceForensics v2 public data release.',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument('output_path', type=str, help='Output directory.')
    parser.add_argument('-d', '--dataset', type=str, default='all',
                        help='Which dataset to download, either pristine or '
                             'manipulated data or the downloaded youtube '
                             'videos.',
                        choices=list(DATASETS.keys()) + ['all']
                        )
    parser.add_argument('-c', '--compression', type=str, default='raw',
                        help='Which compression degree. All videos '
                             'have been generated with h264 with a varying '
                             'codec. Raw (c0) videos are lossless compressed.',
                        choices=COMPRESSION
                        )
    parser.add_argument('-t', '--type', type=str, default='videos',
                        help='Which file type, i.e. videos, masks, for our '
                             'manipulation methods, models, for Deepfakes.',
                        choices=TYPE
                        )
    parser.add_argument('-n', '--num_videos', type=int, default=None,
                        help='Select a number of videos number to '
                             "download if you don't want to download the full"
                             ' dataset.')
    parser.add_argument('--server', type=str, default='EU',
                        help='Server to download the data from. If you '
                             'encounter a slow download speed, consider '
                             'changing the server.',
                        choices=SERVERS
                        )
    args = parser.parse_args()

    # URLs
    server = args.server
    if server == 'EU':
        server_url = 'http://canis.vc.in.tum.de:8100/'
    elif server == 'EU2':
        server_url = 'http://kaldir.vc.in.tum.de/faceforensics/'
    elif server == 'CA':
        server_url = 'http://falas.cmpt.sfu.ca:8100/'
    else:
        raise Exception('Wrong server name. Choices: {}'.format(str(SERVERS)))
    args.tos_url = server_url + 'webpage/FaceForensics_TOS.pdf'
    args.base_url = server_url + 'v3/'
    args.deepfakes_model_url = server_url + 'v3/manipulated_sequences/' + \
                               'Deepfakes/models/'

    return args


def download_files(filenames, base_url, output_path, report_progress=True):
    os.makedirs(output_path, exist_ok=True)
    if report_progress:
        filenames = tqdm(filenames)
    for filename in filenames:
        download_file(base_url + filename, join(output_path, filename))


def reporthook(count, block_size, total_size):
    global start_time
    if count == 0:
        start_time = time.time()
        return
    duration = time.time() - start_time
    progress_size = int(count * block_size)
    speed = int(progress_size / (1024 * duration))
    percent = int(count * block_size * 100 / total_size)
    sys.stdout.write("\rProgress: %d%%, %d MB, %d KB/s, %d seconds passed" %
                     (percent, progress_size / (1024 * 1024), speed, duration))
    sys.stdout.flush()


def download_file(url, out_file, report_progress=False):
    out_dir = os.path.dirname(out_file)
    if not os.path.isfile(out_file):
        fh, out_file_tmp = tempfile.mkstemp(dir=out_dir)
        f = os.fdopen(fh, 'w')
        f.close()
        if report_progress:
            urllib.request.urlretrieve(url, out_file_tmp,
                                       reporthook=reporthook)
        else:
            urllib.request.urlretrieve(url, out_file_tmp)
        os.rename(out_file_tmp, out_file)
    else:
        tqdm.write('WARNING: skipping download of existing file ' + out_file)


def main(args):
    # TOS
    print('By pressing any key to continue you confirm that you have agreed '\
          'to the FaceForensics terms of use as described at:')
    print(args.tos_url)
    print('***')
    print('Press any key to continue, or CTRL-C to exit.')
    _ = input('')

    # Extract arguments
    c_datasets = [args.dataset] if args.dataset != 'all' else ALL_DATASETS
    c_type = args.type
    c_compression = args.compression
    num_videos = args.num_videos
    output_path = args.output_path
    os.makedirs(output_path, exist_ok=True)

    # Check for special dataset cases
    for dataset in c_datasets:
        dataset_path = DATASETS[dataset]
        # Special cases
        if 'original_youtube_videos' in dataset:
            # Here we download the original youtube videos zip file
            print('Downloading original youtube videos.')
            if 'info' not in dataset_path:
                print('Please be patient, this may take a while (~40gb)')
                suffix = ''
            else:
                suffix = 'info'
            download_file(args.base_url + '/' + dataset_path,
                          out_file=join(output_path,
                                        'downloaded_videos{}.zip'.format(
                                            suffix)),
                          report_progress=True)
            return

        # Else: regular datasets
        print('Downloading {} of dataset "{}"'.format(
            c_type, dataset_path
        ))

        # Get filelists and video lengths list from server
        if 'DeepFakeDetection' in dataset_path or 'actors' in dataset_path:
            filepaths = json.loads(urllib.request.urlopen(args.base_url + '/' +
                DEEPFEAKES_DETECTION_URL).read().decode("utf-8"))
            if 'actors' in dataset_path:
                filelist = filepaths['actors']
            else:
                filelist = filepaths['DeepFakesDetection']
        elif 'original' in dataset_path:
            # Load filelist from server
            file_pairs = json.loads(urllib.request.urlopen(args.base_url + '/' +
                FILELIST_URL).read().decode("utf-8"))
            filelist = []
            for pair in file_pairs:
                filelist += pair
        else:
            # Load filelist from server
            file_pairs = json.loads(urllib.request.urlopen(args.base_url + '/' +
                FILELIST_URL).read().decode("utf-8"))
            # Get filelist
            filelist = []
            for pair in file_pairs:
                filelist.append('_'.join(pair))
                if c_type != 'models':
                    filelist.append('_'.join(pair[::-1]))
        # Maybe limit number of videos for download
        if num_videos is not None and num_videos > 0:
            print('Downloading the first {} videos'.format(num_videos))
            filelist = filelist[:num_videos]

        # Server and local paths
        dataset_videos_url = args.base_url + '{}/{}/{}/'.format(
            dataset_path, c_compression, c_type)
        dataset_mask_url = args.base_url + '{}/masks/videos/'.format(
            dataset_path)

        if c_type == 'videos':
            dataset_output_path = join(output_path, dataset_path, c_compression,
                                       c_type)
            print('Output path: {}'.format(dataset_output_path))
            filelist = [filename + '.mp4' for filename in filelist]
            download_files(filelist, dataset_videos_url, dataset_output_path)
        elif c_type == 'masks':
            dataset_output_path = join(output_path, dataset_path, c_type,
                                       'videos')
            print('Output path: {}'.format(dataset_output_path))
            if 'original' in dataset:
                if args.dataset != 'all':
                    print('Only videos available for original data. Aborting.')
                    return
                else:
                    print('Only videos available for original data. '
                          'Skipping original.\n')
                    continue
            if 'FaceShifter' in dataset:
                print('Masks not available for FaceShifter. Aborting.')
                return
            filelist = [filename + '.mp4' for filename in filelist]
            download_files(filelist, dataset_mask_url, dataset_output_path)

        # Else: models for deepfakes
        else:
            if dataset != 'Deepfakes' and c_type == 'models':
                print('Models only available for Deepfakes. Aborting')
                return
            dataset_output_path = join(output_path, dataset_path, c_type)
            print('Output path: {}'.format(dataset_output_path))

            # Get Deepfakes models
            for folder in tqdm(filelist):
                folder_filelist = DEEPFAKES_MODEL_NAMES

                # Folder paths
                folder_base_url = args.deepfakes_model_url + folder + '/'
                folder_dataset_output_path = join(dataset_output_path,
                                                  folder)
                download_files(folder_filelist, folder_base_url,
                               folder_dataset_output_path,
                               report_progress=False)   # already done


if __name__ == "__main__":
    args = parse_args()
    main(args)

2.3 Running the Download Script

Since different studies need different parts of the dataset, this section only shows the command that downloads everything. For more customized commands, refer to the official download instructions.
Download instructions: https://github.com/ondyari/FaceForensics/blob/master/dataset/README.md

python download-Faceforensics.py <output path> -d all -c c23 -t videos

Here, -d selects the dataset to download, -c the compression level, and -t the file type (videos).
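For example, to fetch only the first 10 Deepfakes videos from the EU2 mirror, a sketch of a customized command using the flags defined in the script above (adjust the script name to match your own copy):

python download-Faceforensics.py ./data -d Deepfakes -c c23 -t videos -n 10 --server EU2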

3. Video Frame Extraction and Face Extraction

3.1 Preliminaries

3.1.1 Environment Setup

To keep environments consistent, this section walks through the full setup; configure only what you need. If you already have a complete environment, feel free to skip this section.

conda create -n facecrop python=3.8
conda activate facecrop
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
# cmake is needed to build dlib; install it with either apt or conda:
sudo apt-get install -y cmake   # or: conda install -c conda-forge cmake
pip install dlib
pip install pandas opencv-python tqdm imutils easydict zipp imgaug efficientnet_pytorch imagecorruptions flask
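A quick sanity check (a minimal one-liner that simply verifies the key packages import and prints their versions):

python -c "import torch, cv2, dlib; print('torch', torch.__version__, '| cv2', cv2.__version__, '| dlib', dlib.__version__)"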

3.1.2 Cloning the Code

1. git clone https://github.com/ternaus/retinaface.git 

# Replace the content in the requirements.txt file with the following:
albumentations>=1.0.0
torch>=1.9.0
torchvision>=0.10.0

cd retinaface && pip install -v -e .

2. git clone https://github.com/mapooon/SelfBlendedImages.git

3. mkdir src/utils/library
git clone https://github.com/AlgoHunt/Face-Xray.git src/utils/library

3.1.3 Preparing the Data and Config Files

1. Download the FF++ videos and place them in the ./data/ folder like this:
SelfBlendedImages
└── data
    ├── FaceForensics++
         ├── original_sequences
         │   ├── youtube
         │   │   ├── raw
         │   │   │   └── videos
         │   │   │       └── *.mp4
         │   │   └── c23
         │   │       └── videos
         │   │           └── *.mp4
         │   └── actors
         │       └── raw
         │           └── videos
         │               └── *.mp4
         ├── manipulated_sequences
         │   ├── Deepfakes
         │   │   └── raw
         │   │       └── videos
         │   │           └── *.mp4
         │   ├── Face2Face
         │   │   └── raw
         │   │       └── videos
         │   │           └── *.mp4
         │   ├── FaceSwap
         │   │   └── raw
         │   │       └── videos
         │   │           └── *.mp4
         │   ├── NeuralTextures
         │   │   └── raw
         │   │       └── videos
         │   │           └── *.mp4
         │   ├── FaceShifter
         │   │   └── raw
         │   │       └── videos
         │   │           └── *.mp4
         │   └── DeepFakeDetection
         │       └── raw
         │           └── videos
         │               └── *.mp4
         ├── train.json
         ├── val.json
         └── test.json

2. Download the landmark detector (shape_predictor_81_face_landmarks.dat) from
https://github.com/codeniko/shape_predictor_81_face_landmarks and place it
in the ./src/preprocess/ folder. A quick sanity check for the split files and the predictor is sketched below.
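The following minimal sketch (assuming the default paths above) verifies that the split files and the landmark model are in place; in the official release each split file is a JSON list of video-ID pairs:

import json
import dlib

# Each entry in train.json is a pair of video IDs (target/source)
with open('data/FaceForensics++/train.json') as f:
    pairs = json.load(f)
print(len(pairs), 'training pairs, e.g.', pairs[0])

# shape_predictor() fails loudly if the .dat file is missing or corrupt
predictor = dlib.shape_predictor('src/preprocess/shape_predictor_81_face_landmarks.dat')
print('landmark predictor loaded')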

3.2 Frame Extraction and Face Extraction

Note: the goal of this post is only to extract video frames and face crops, whereas Section 3.1 also covers the setup for facial-landmark extraction; choose whichever parts you actually need. If your goal matches this post's, use the simplified code below; otherwise, use the official SBI code.

3.2.1 DLIB Method (Simplified)

from glob import glob
import os
import cv2
from tqdm import tqdm
import numpy as np
import argparse
import dlib


def facecrop(org_path, save_path, face_detector, face_predictor, period=1, num_frames=10):
    cap_org = cv2.VideoCapture(org_path)
    frame_count_org = int(cap_org.get(cv2.CAP_PROP_FRAME_COUNT))

    # Sample num_frames evenly spaced frame indices across the whole video
    frame_idxs = np.linspace(0, frame_count_org - 1, num_frames, endpoint=True, dtype=int)
    for cnt_frame in range(frame_count_org):
        ret_org, frame_org = cap_org.read()
        # Check the read result before touching the frame: on a failed read
        # frame_org is None and any attribute access would crash
        if not ret_org:
            tqdm.write('Frame read {} Error! : {}'.format(cnt_frame, os.path.basename(org_path)))
            break
        
        if cnt_frame not in frame_idxs:
            continue
        
        frame = cv2.cvtColor(frame_org, cv2.COLOR_BGR2RGB)

        faces = face_detector(frame, 1)
        if len(faces)==0:
            tqdm.write('No faces in {}:{}'.format(cnt_frame,os.path.basename(org_path)))
            continue

        save_path_frames = os.path.join(save_path, 'frames', os.path.basename(org_path).replace('.mp4', ''))
        os.makedirs(save_path_frames, exist_ok=True)

        for face_idx, face in enumerate(faces):
            x0, y0, x1, y1 = face.left(), face.top(), face.right(), face.bottom()

            # Ensure the coordinates are within the image bounds
            x0 = max(0, x0)
            y0 = max(0, y0)
            x1 = min(frame_org.shape[1], x1)
            y1 = min(frame_org.shape[0], y1)

            # Crop the face region and save
            cropped_face = frame_org[y0:y1, x0:x1]
            face_image_path = os.path.join(save_path_frames, f'frame_{cnt_frame}_face.png')
            cv2.imwrite(face_image_path, cropped_face)

    cap_org.release()
    return


if __name__=='__main__':
    parser=argparse.ArgumentParser()
    parser.add_argument('-d',dest='dataset',choices=['DeepFakeDetection_original','DeepFakeDetection','FaceShifter','Face2Face','Deepfakes','FaceSwap','NeuralTextures','Original','Celeb-real','Celeb-synthesis','YouTube-real','DFDC','DFDCP'])
    parser.add_argument('-c',dest='comp',choices=['raw','c23','c40'],default='raw')
    parser.add_argument('-n',dest='num_frames',type=int,default=20)
    args=parser.parse_args()
    if args.dataset=='Original':
        dataset_path='data/FaceForensics++/original_sequences/youtube/{}/'.format(args.comp)
    elif args.dataset=='DeepFakeDetection_original':
        dataset_path='data/FaceForensics++/original_sequences/actors/{}/'.format(args.comp)
    elif args.dataset in ['DeepFakeDetection','FaceShifter','Face2Face','Deepfakes','FaceSwap','NeuralTextures']:
        dataset_path='data/FaceForensics++/manipulated_sequences/{}/{}/'.format(args.dataset,args.comp)
    elif args.dataset in ['Celeb-real','Celeb-synthesis','YouTube-real']:
        dataset_path='data/Celeb-DF-v2/{}/'.format(args.dataset)
    elif args.dataset in ['DFDC']:
        dataset_path='data/{}/'.format(args.dataset)
    else:
        raise NotImplementedError

    face_detector = dlib.get_frontal_face_detector()
    predictor_path = 'src/preprocess/shape_predictor_81_face_landmarks.dat'
    face_predictor = dlib.shape_predictor(predictor_path)
    
    movies_path=dataset_path+'videos/'

    movies_path_list = sorted(glob(movies_path + '*.mp4'))
    print("{} videos found in {}".format(len(movies_path_list), args.dataset))

    n_sample = len(movies_path_list)

    for i in tqdm(range(n_sample)):
        facecrop(movies_path_list[i], save_path=dataset_path, num_frames=args.num_frames,
                 face_predictor=face_predictor, face_detector=face_detector)
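As an aside, the np.linspace call above picks evenly spaced frame indices; a tiny illustration with a hypothetical 300-frame video and num_frames=10:

import numpy as np
print(np.linspace(0, 299, 10, endpoint=True, dtype=int))
# [  0  33  66  99 132 166 199 232 265 299]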

3.2.2 RetinaFace Method (Simplified)

from glob import glob
import os
import cv2
from tqdm import tqdm
import numpy as np
import argparse
import torch
from retinaface.pre_trained_models import get_model


def facecrop(model, org_path, save_path, period=1, num_frames=10):
	cap_org = cv2.VideoCapture(org_path)
	frame_count_org = int(cap_org.get(cv2.CAP_PROP_FRAME_COUNT))
	
	frame_idxs = np.linspace(0, frame_count_org - 1, num_frames, endpoint=True, dtype=int)
	for cnt_frame in range(frame_count_org):
		ret_org, frame_org = cap_org.read()
		# Check the read result before using the frame (frame_org is None on failure)
		if not ret_org:
			tqdm.write('Frame read {} Error! : {}'.format(cnt_frame, os.path.basename(org_path)))
			break
		
		if cnt_frame not in frame_idxs:
			continue
		
		frame = cv2.cvtColor(frame_org, cv2.COLOR_BGR2RGB)

		faces = model.predict_jsons(frame)
		# predict_jsons may return a single entry with an empty bbox when no
		# face is detected, so check the bbox in addition to the list length
		if len(faces) == 0 or len(faces[0]['bbox']) == 0:
			tqdm.write('No faces in {}:{}'.format(cnt_frame, os.path.basename(org_path)))
			continue

		save_path_frames = os.path.join(save_path, 'frames_retina', os.path.basename(org_path).replace('.mp4', ''))
		os.makedirs(save_path_frames, exist_ok=True)

		for face_idx, face in enumerate(faces):
			# Extract bounding box coordinates from the dictionary
			bbox = face['bbox']
			x0, y0, x1, y1 = bbox[0], bbox[1], bbox[2], bbox[3]

			# Ensure the coordinates are within the image bounds
			x0 = max(0, int(x0))
			y0 = max(0, int(y0))
			x1 = min(frame_org.shape[1], int(x1))
			y1 = min(frame_org.shape[0], int(y1))

			# Crop the face region and save
			cropped_face = frame_org[y0:y1, x0:x1]
			face_image_path = os.path.join(save_path_frames, f'frame_{cnt_frame}_face.png')
			cv2.imwrite(face_image_path, cropped_face)

	cap_org.release()
	return


if __name__=='__main__':
	parser=argparse.ArgumentParser()
	parser.add_argument('-d',dest='dataset',choices=['DeepFakeDetection_original','DeepFakeDetection','FaceShifter','Face2Face','Deepfakes','FaceSwap','NeuralTextures','Original','Celeb-real','Celeb-synthesis','YouTube-real','DFDC','DFDCP'])
	parser.add_argument('-c',dest='comp',choices=['raw','c23','c40'],default='raw')
	parser.add_argument('-n',dest='num_frames',type=int,default=20)
	args=parser.parse_args()
	if args.dataset=='Original':
		dataset_path='data/FaceForensics++/original_sequences/youtube/{}/'.format(args.comp)
	elif args.dataset=='DeepFakeDetection_original':
		dataset_path='data/FaceForensics++/original_sequences/actors/{}/'.format(args.comp)
	elif args.dataset in ['DeepFakeDetection','FaceShifter','Face2Face','Deepfakes','FaceSwap','NeuralTextures']:
		dataset_path='data/FaceForensics++/manipulated_sequences/{}/{}/'.format(args.dataset,args.comp)
	elif args.dataset in ['Celeb-real','Celeb-synthesis','YouTube-real']:
		dataset_path='data/Celeb-DF-v2/{}/'.format(args.dataset)
	elif args.dataset in ['DFDC','DFDCVal']:
		dataset_path='data/{}/'.format(args.dataset)
	else:
		raise NotImplementedError

	# Fall back to CPU when no GPU is available
	device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
	model = get_model("resnet50_2020-07-20", max_size=2048,device=device)
	model.eval()

	movies_path=dataset_path+'videos/'
	movies_path_list = sorted(glob(movies_path + '*.mp4'))
	print("{} videos found in {}".format(len(movies_path_list), args.dataset))

	n_sample = len(movies_path_list)
	for i in tqdm(range(n_sample)):
		facecrop(model, movies_path_list[i], save_path=dataset_path, num_frames=args.num_frames)

3.2.3 Run Commands

python3 src/preprocess/crop_dlib_ff.py -d Original -c c40 -n 20
or
CUDA_VISIBLE_DEVICES=0 python3 src/preprocess/crop_retina_ff.py -d Original -c c40 -n 20
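If everything runs, the crops are written next to the source videos. A sketch of the expected layout for -d Original -c c40 with the scripts above (frames_retina/ instead of frames/ for the RetinaFace version):

data/FaceForensics++/original_sequences/youtube/c40/
├── videos/
│   └── 000.mp4 ...
└── frames/
    └── 000/
        └── frame_0_face.png ...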

4. Reproduction Notes

4.1 Dataset Download

The authors provide three download servers, EU, EU2, and CA, all located in Europe or Canada, so you may need a proxy to download. An unstable proxy can cause 502 errors mid-download; if that happens, simply restart the download. Don't worry: the script detects already-downloaded videos and skips them, so nothing gets downloaded twice.
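Because finished files are skipped on restart, a simple retry loop can babysit a flaky connection (a sketch; substitute your own script name and arguments, and note the piped empty line that answers the terms-of-use prompt):

until echo | python download-Faceforensics.py ./data -d all -c c23 -t videos; do
    echo "download interrupted, retrying in 30 s..."
    sleep 30
done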

4.2 Directory Layout of the Cloned Repositories

This post only gives the clone commands and does not prescribe where each repository should live; arrange them as suits your setup, unless a specific location is explicitly required. Note, however, that a different directory layout may affect the retinaface import statements used in Section 3.2.2.
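That said, if retinaface was installed with pip install -v -e . as in Section 3.1.2, the package is registered with pip and should import from any directory; a quick check:

python -c "from retinaface.pre_trained_models import get_model; print('retinaface import OK')"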
