High-Resolution Face Dataset: FFHQ

Latest recommended article published on 2025-04-16 09:59:37

studyeboy

Article link: https://blog.csdn.net/studyeboy/article/details/121694975

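FFHQ, short for Flickr-Faces-High-Quality (Flickr-Faces-HQ), was created by NVIDIA as a benchmark for generative adversarial networks (GANs) and served as the training dataset for StyleGAN. It was open-sourced in 2019. FFHQ is a high-quality face dataset of 70,000 PNG images at 1024×1024 resolution. The images vary considerably in age, ethnicity, and image background, and they cover a wide range of facial attributes: age, gender, ethnicity, skin tone, expression, face shape, hairstyle, and head pose. They also include many accessories such as eyeglasses, sunglasses, hats, hair ornaments, and scarves, so the dataset can also be used to develop face attribute classification or face semantic segmentation models. The images were crawled from Flickr, and only images with a suitable license were downloaded. They were aligned and cropped using dlib, and the set was then pruned to remove non-real faces such as statues, paintings, and photos of photos.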

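Flickr [1-2] is a photo-sharing website, formerly a Yahoo property. It is an online service that offers free and paid digital photo storage and sharing, and it also works as an online community platform. Its key characteristic is that it extends interpersonal relationships and organizes content around social networks. Its features go beyond those of a typical image host and include, for example, photo services, contact services, and group services.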

The FFHQ dataset is hosted primarily on Google Drive.

Dataset statistics:

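For use cases that require separate training and validation sets, the first 60,000 images are designated for training and the remaining 10,000 for validation. In the StyleGAN paper, however, all 70,000 images were used for training.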
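
The split described above can be expressed by image index alone. The following is a minimal sketch, assuming images are numbered from 0 to 69,999 in dataset order; the names `split_of` and `split_indices` are hypothetical helpers, not part of the FFHQ tooling.

```python
NUM_IMAGES = 70_000  # total number of aligned images in FFHQ
NUM_TRAIN = 60_000   # the first 60,000 are suggested for training


def split_of(index):
    """Return the suggested split name for a 0-based image index."""
    if not 0 <= index < NUM_IMAGES:
        raise ValueError(f"index out of range: {index}")
    return "training" if index < NUM_TRAIN else "validation"


def split_indices():
    """Return (training, validation) index ranges for the suggested split."""
    return range(0, NUM_TRAIN), range(NUM_TRAIN, NUM_IMAGES)


train, val = split_indices()
print(len(train), len(val))
```

To train on all 70,000 images, as the StyleGAN paper did, simply iterate over `range(NUM_IMAGES)` instead.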

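Care was taken to ensure that the dataset contains no duplicate images. Note, however, that the in-the-wild folder may contain multiple copies of the same image when several distinct faces were extracted from it.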

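The data can be obtained directly from Google Drive or with the provided download script. The script makes this easier by automatically downloading all requested files, verifying their checksums, retrying each file several times on error, and using multiple concurrent connections to maximize bandwidth.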

> python download_ffhq.py -h
usage: download_ffhq.py [-h] [-j] [-s] [-i] [-t] [-w] [-r] [-a]
                        [--num_threads NUM] [--status_delay SEC]
                        [--timing_window LEN] [--chunk_size KB]
                        [--num_attempts NUM]

Download Flickr-Face-HQ (FFHQ) dataset to current working directory.

optional arguments:
  -h, --help           show this help message and exit
  -j, --json           download metadata as JSON (254 MB)
  -s, --stats          print statistics about the dataset
  -i, --images         download 1024x1024 images as PNG (89.1 GB)
  -t, --thumbs         download 128x128 thumbnails as PNG (1.95 GB)
  -w, --wilds          download in-the-wild images as PNG (955 GB)
  -r, --tfrecords      download multi-resolution TFRecords (273 GB)
  -a, --align          recreate 1024x1024 images from in-the-wild images
  --num_threads NUM    number of concurrent download threads (default: 32)
  --status_delay SEC   time between download status prints (default: 0.2)
  --timing_window LEN  samples for estimating download eta (default: 50)
  --chunk_size KB      chunk size for each download thread (default: 128)
  --num_attempts NUM   number of download attempts per file (default: 10)
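
To illustrate the checksum verification and per-file retries mentioned above, here is a minimal sketch. It is not the script's actual code: `fetch` is a hypothetical callable that writes a file to `path`, and `fetch_verified` and `md5_of_file` are names invented for this example. The defaults only mirror the `--num_attempts` and `--chunk_size` values shown in the help text.

```python
import hashlib
import time


def md5_of_file(path, chunk_size=128 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_verified(fetch, path, expected_md5, num_attempts=10, delay=0.0):
    """Call fetch(path) until the file matches expected_md5.

    Returns the number of attempts used, or raises RuntimeError once
    all attempts are exhausted. `fetch` is a hypothetical downloader.
    """
    last_error = None
    for attempt in range(1, num_attempts + 1):
        try:
            fetch(path)
            actual = md5_of_file(path)
            if actual == expected_md5:
                return attempt
            last_error = OSError(f"checksum mismatch: {actual} != {expected_md5}")
        except OSError as err:
            last_error = err
        time.sleep(delay)  # optional pause before the next attempt
    raise RuntimeError(f"failed after {num_attempts} attempts") from last_error
```

The expected checksum for each file would come from the `file_md5` field of the JSON metadata.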

Fetch the JSON metadata, which contains the image links, and then download the images:

> python ..\download_ffhq.py --json --images
Downloading JSON metadata...
\ 100.00% done  2/2 files  0.25/0.25 GB   43.21 MB/s  ETA: done
Parsing JSON metadata...
Downloading 70000 files...
| 100.00% done  70001/70001 files  89.19 GB/89.19 GB  59.87 MB/s  ETA: done

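The script also serves as a reference implementation of the automated scheme used to align and crop the images. After downloading the original images with python download_ffhq.py --wilds, you can run python download_ffhq.py --align to reproduce exact replicas of the aligned 1024×1024 images using the facial landmark locations included in the metadata.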
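
As an illustration of how facial landmarks can drive alignment, the sketch below derives an oriented square around a face from 68 dlib-style landmarks. It is a simplified approximation, not the script's exact implementation: the scale factors `EYE_SCALE`, `MOUTH_SCALE`, and `CENTER_SHIFT` are assumptions, and the real pipeline also has to resample the quad region to 1024×1024 pixels, which is omitted here. Only the landmark index ranges (36-41 and 42-47 for the eyes, 48-59 for the outer mouth) follow the standard 68-point layout.

```python
import math

EYE_SCALE = 2.0      # assumed: quad half-size relative to eye distance
MOUTH_SCALE = 1.8    # assumed: quad half-size relative to eye-mouth distance
CENTER_SHIFT = 0.1   # assumed: shift of the center from the eyes toward the mouth


def _mean(points):
    """Average a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def face_quad(landmarks):
    """Return corners (top-left, bottom-left, bottom-right, top-right)."""
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks")
    eye_left = _mean(landmarks[36:42])
    eye_right = _mean(landmarks[42:48])
    mouth_left, mouth_right = landmarks[48], landmarks[54]

    eye_avg = _mean([eye_left, eye_right])
    mouth_avg = _mean([mouth_left, mouth_right])
    eye_to_eye = (eye_right[0] - eye_left[0], eye_right[1] - eye_left[1])
    eye_to_mouth = (mouth_avg[0] - eye_avg[0], mouth_avg[1] - eye_avg[1])

    # Horizontal axis: eye line combined with the perpendicular of the
    # eye-to-mouth line, so that both cues determine the rotation.
    x = (eye_to_eye[0] + eye_to_mouth[1], eye_to_eye[1] - eye_to_mouth[0])
    norm = math.hypot(*x)
    if norm == 0:
        raise ValueError("degenerate landmarks")
    scale = max(math.hypot(*eye_to_eye) * EYE_SCALE,
                math.hypot(*eye_to_mouth) * MOUTH_SCALE)
    x = (x[0] / norm * scale, x[1] / norm * scale)
    y = (-x[1], x[0])  # vertical axis: x rotated by 90 degrees
    c = (eye_avg[0] + eye_to_mouth[0] * CENTER_SHIFT,
         eye_avg[1] + eye_to_mouth[1] * CENTER_SHIFT)
    return [
        (c[0] - x[0] - y[0], c[1] - x[1] - y[1]),
        (c[0] - x[0] + y[0], c[1] - x[1] + y[1]),
        (c[0] + x[0] + y[0], c[1] + x[1] + y[1]),
        (c[0] + x[0] - y[0], c[1] + x[1] - y[1]),
    ]
```

The result corresponds conceptually to the `face_quad` field in the metadata, although the values produced by this sketch are not guaranteed to match it.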

The ffhq-dataset-v2.json file contains the following information for each image in machine-readable format:

{
  "0": {                                                 # Image index
    "category": "training",                              # Training or validation
    "metadata": {                                        # Info about the original Flickr photo:
      "photo_url": "https://www.flickr.com/photos/...",  # - Flickr URL
      "photo_title": "DSCF0899.JPG",                     # - File name
      "author": "Jeremy Frumkin",                        # - Author
      "country": "",                                     # - Country where the photo was taken
      "license": "Attribution-NonCommercial License",    # - License name
      "license_url": "https://creativecommons.org/...",  # - License detail URL
      "date_uploaded": "2007-08-16",                     # - Date when the photo was uploaded to Flickr
      "date_crawled": "2018-10-10"                       # - Date when the photo was crawled from Flickr
    },
    "image": {                                           # Info about the aligned 1024x1024 image:
      "file_url": "https://drive.google.com/...",        # - Google Drive URL
      "file_path": "images1024x1024/00000/00000.png",    # - Google Drive path
      "file_size": 1488194,                              # - Size of the PNG file in bytes
      "file_md5": "ddeaeea6ce59569643715759d537fd1b",    # - MD5 checksum of the PNG file
      "pixel_size": [1024, 1024],                        # - Image dimensions
      "pixel_md5": "47238b44dfb87644460cbdcc4607e289",   # - MD5 checksum of the raw pixel data
      "face_landmarks": [...]                            # - 68 face landmarks reported by dlib
    },
    "thumbnail": {                                       # Info about the 128x128 thumbnail:
      "file_url": "https://drive.google.com/...",        # - Google Drive URL
      "file_path": "thumbnails128x128/00000/00000.png",  # - Google Drive path
      "file_size": 29050,                                # - Size of the PNG file in bytes
      "file_md5": "bd3e40b2ba20f76b55dc282907b89cd1",    # - MD5 checksum of the PNG file
      "pixel_size": [128, 128],                          # - Image dimensions
      "pixel_md5": "38d7e93eb9a796d0e65f8c64de8ba161"    # - MD5 checksum of the raw pixel data
    },
    "in_the_wild": {                                     # Info about the in-the-wild image:
      "file_url": "https://drive.google.com/...",        # - Google Drive URL
      "file_path": "in-the-wild-images/00000/00000.png", # - Google Drive path
      "file_size": 3991569,                              # - Size of the PNG file in bytes
      "file_md5": "1dc0287e73e485efb0516a80ce9d42b4",    # - MD5 checksum of the PNG file
      "pixel_size": [2016, 1512],                        # - Image dimensions
      "pixel_md5": "86b3470c42e33235d76b979161fb2327",   # - MD5 checksum of the raw pixel data
      "face_rect": [667, 410, 1438, 1181],               # - Axis-aligned rectangle of the face region
      "face_landmarks": [...],                           # - 68 face landmarks reported by dlib
      "face_quad": [...]                                 # - Aligned quad of the face region
    }
  },
  ...
}
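
A structure like the one above is straightforward to process once loaded. The following sketch counts images per category and per license. It assumes the actual file is plain JSON without the `#` annotations shown above; `load_metadata` and `summarize` are hypothetical helper names, and `sample` is a tiny hand-written stand-in rather than real dataset content.

```python
import json
from collections import Counter


def load_metadata(path="ffhq-dataset-v2.json"):
    """Load the metadata file into a dict keyed by image index (as a string)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def summarize(metadata):
    """Count images per category and per license name."""
    categories = Counter()
    licenses = Counter()
    for item in metadata.values():
        categories[item["category"]] += 1
        licenses[item["metadata"]["license"]] += 1
    return categories, licenses


# Tiny in-memory sample with the same shape as the file, for illustration only.
sample = {
    "0": {"category": "training",
          "metadata": {"license": "Attribution-NonCommercial License"}},
    "1": {"category": "validation",
          "metadata": {"license": "Attribution-NonCommercial License"}},
}
categories, licenses = summarize(sample)
print(dict(categories))
print(dict(licenses))
```

With the real file, `summarize(load_metadata())` would be the equivalent call. Note that the metadata file is roughly 250 MB, so loading it in one piece needs a corresponding amount of memory.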