Driving a Single Portrait with a 3DMM Model: "3D-FM GAN: Towards 3D-Controllable Face Manipulation"

Main Contributions

we propose 3D-FM GAN, a novel conditional GAN framework designed specifically for 3D-controllable Face Manipulation, and does not require any tuning after the end-to-end learning phase.
Face manipulation is built on a conditional GAN.
A StyleGAN conditional generator then takes in both the original image and the manipulated face rendering to synthesize the edited face.
A StyleGAN generator is introduced that takes both the real photo and the rendered face model as input.

Two training strategies are introduced so that the model preserves the face identity while remaining editable.
Moreover, we develop two essential training strategies, reconstruction and disentangled training, to help our model gain abilities of identity preservation and 3D editability.
A multiplicative co-modulation architecture is also introduced to balance the two.
As we find an interesting trade-off between identity and editability in the network structure and the simple encoding strategy is sub-optimal, we propose a novel multiplicative co-modulation architecture for our framework.

Method

Overall Pipeline

The framework has three components: the generator G, the face reconstruction network FR, and the renderer Rd.
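
To make the roles of the three components concrete, here is a minimal sketch of how an edit might flow through them at inference time. It assumes FR returns some editable set of face parameters and that the "manipulated face rendering" is produced by rendering the modified parameters. The function `edit_face`, the `manipulate` callback, and all argument names are hypothetical and are not taken from the paper's code.

```python
from typing import Any, Callable


def edit_face(
    photo: Any,
    face_recon: Callable[[Any], Any],      # stands in for FR
    renderer: Callable[[Any], Any],        # stands in for Rd
    generator: Callable[[Any, Any], Any],  # stands in for G
    manipulate: Callable[[Any], Any],      # hypothetical user edit
) -> Any:
    """Assumed inference flow: P -> FR -> manipulate -> Rd -> G(P, R_edit)."""
    params = face_recon(photo)               # estimate face parameters from P
    edited_params = manipulate(params)       # e.g. change pose or expression
    edited_render = renderer(edited_params)  # manipulated face rendering
    # The generator is conditioned on both the original photo and the render.
    return generator(photo, edited_render)
```

The components are passed in as plain callables so the sketch stays independent of any particular network implementation.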

Datasets

FFHQ. FFHQ [23] is a human face photo dataset, where most identities only have one corresponding image. For each training image P, we extract its render counterpart by R = Rd(FR(P)) to form the (P, R) pair.
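
The pairing rule R = Rd(FR(P)) can be sketched as a small helper. This is only an illustration under the assumption that FR and Rd are available as callables; `build_pairs` and its parameter names are hypothetical.

```python
from typing import Any, Callable, Iterable, List, Tuple


def build_pairs(
    photos: Iterable[Any],
    face_recon: Callable[[Any], Any],  # stands in for FR
    renderer: Callable[[Any], Any],    # stands in for Rd
) -> List[Tuple[Any, Any]]:
    """Return (P, R) pairs with R = Rd(FR(P)) for every photo P."""
    pairs = []
    for photo in photos:
        render = renderer(face_recon(photo))  # R = Rd(FR(P))
        pairs.append((photo, render))
    return pairs
```

Because most FFHQ identities have a single image, each pair produced this way shows the same face state in both the photo and the render.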

Synthetic Dataset. We also require a dataset where each identity has multiple images with various attributes of expression, pose, and illumination. Such a dataset is crucial for the model to learn editing. Since this kind of high-quality dataset is not publicly available, we leverage DiscoFaceGAN [10], Gd, to synthesize one as follows.