BabyGAN: Generating a Child's Photo from the Parents' Photos

Author: 在下李某

Last modified: 2024-04-23 19:29:24

Tags: python, machine learning, deep learning

First published: 2024-04-22 23:28:00

Article link: https://blog.csdn.net/qq_52761874/article/details/138095620

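This article is the author's coursework for the "Pattern Recognition" course taken in the third year of university, uploaded here for readers to learn from.

Note: this case must be run on a GPU. See the "ModelArts JupyterLab Hardware Specification Guide" to learn how to switch hardware specifications.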

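I. Case Overview

Given one frontal photo each of the father and the mother, this case generates a photo of their child. You can also adjust parameters to see what the child might look like at different genders and ages.

For best results, upload parent photos that clearly show the facial features against a light background wherever possible.

This case is for learning and exchange only; please do not use it for any other purpose.

Because the technique is imperfect, the generated child photo may be warped or distorted. You can swap in different parent photos and regenerate the child photo until you are satisfied with the result.

The steps below walk through running the case.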

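II. Background on the Method

BabyGAN is a StyleGAN-based predictor of a child's appearance. Using an encoder and a generator, it takes images of the father and the mother as input, processes them with neural networks, and generates a prediction of what their future child might look like.

A GAN contains two models: a generative model and a discriminative model. The generative model's task is to produce instances that look natural and real and that resemble the original data. The discriminative model's task is to judge whether a given instance is real or forged (real instances come from the dataset, forged instances come from the generative model).

This can be viewed as a zero-sum game. The original GAN paper explains it with an analogy: the generative model is like "a team of counterfeiters trying to produce and use fake currency", while the discriminative model is like "the police trying to detect the counterfeit currency". The generator tries to fool the discriminator, and the discriminator tries not to be fooled. The two models are optimized alternately, so both improve, but what we ultimately want is a highly capable generative model (the counterfeiters) whose output is hard to tell apart from the real thing.

During GAN training, the generator network's goal is to produce images realistic enough to fool the discriminator network, while the discriminator network's goal is to tell generated images apart from real ones. The two networks therefore form a dynamic "game". In the ideal outcome, the generative model recovers the distribution of the training data (its samples are indistinguishable from real data), the discriminative model can no longer tell them apart and its accuracy falls to 50%, and the system reaches a Nash equilibrium.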
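To make the 50% figure concrete, here is a minimal sketch on a toy discrete distribution. It relies on a standard result from the original GAN paper: for a fixed generator, the best possible discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)). The four-outcome distributions and the function names are made up for illustration and have nothing to do with BabyGAN's actual networks.

```python
import math


def optimal_discriminator(p_data, p_g):
    """Best discriminator for a fixed generator: D*(x) = p_data / (p_data + p_g)."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def value_function(p_data, p_g):
    """GAN objective evaluated at D*: E_data[log D*] + E_g[log(1 - D*)]."""
    d_star = optimal_discriminator(p_data, p_g)
    real_term = sum(pd * math.log(d) for pd, d in zip(p_data, d_star))
    fake_term = sum(pg * math.log(1.0 - d) for pg, d in zip(p_g, d_star))
    return real_term + fake_term


# Toy distributions over four outcomes (hypothetical numbers).
p_data = [0.1, 0.2, 0.3, 0.4]
p_poor_generator = [0.4, 0.3, 0.2, 0.1]

# A poor generator leaves the discriminator something to exploit.
print(optimal_discriminator(p_data, p_poor_generator))
# A perfect generator forces D* to 0.5 on every outcome.
print(optimal_discriminator(p_data, p_data))
# At that point the objective equals -log(4), up to rounding.
print(value_function(p_data, p_data), -math.log(4))
```

When the generator matches the data distribution, the discriminator's best answer is 0.5 for every input, which is the "cannot tell them apart" situation described above.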

III. Experiment Steps

1. Install the required modules

This step takes about 4 minutes.

!pip install imutils==0.5.4 moviepy==1.0.3 dlib==19.22.0 imageio==2.9.0

2. Download the code and model files

import os
import moxing as mox

root_dir = '/home/ma-user/work/ma_share/'
code_dir = os.path.join(root_dir, 'BabyGAN')
if not os.path.exists(os.path.join(root_dir, 'BabyGAN.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/case_zoo/babyGAN/BabyGAN.zip', os.path.join(root_dir, 'BabyGAN.zip'))
    os.system('cd %s; unzip BabyGAN.zip' % root_dir)

os.chdir(code_dir)

3. Load the required modules and the model

import cv2
import math
import pickle
import imageio
import warnings
import PIL.Image
import numpy as np
from glob import glob
from PIL import Image
import tensorflow as tf
from random import randrange
import moviepy.editor as mpy
import matplotlib.pyplot as plt
from IPython.display import clear_output
from moviepy.video.io.ffmpeg_writer import FFMPEG_VideoWriter

import config
import dnnlib
import dnnlib.tflib as tflib
from encoder.generator_model import Generator

%matplotlib inline
warnings.filterwarnings("ignore")

Load the model files. This code block can only be executed once; if an error occurs, restart the kernel and rerun all the code.

tflib.init_tf()
URL_FFHQ = "./karras2019stylegan-ffhq-1024x1024.pkl"
with dnnlib.util.open_url(URL_FFHQ, cache_dir=config.cache_dir) as f:
    generator_network, discriminator_network, Gs_network = pickle.load(f)
generator = Generator(Gs_network, batch_size=1, randomize_noise=False)
model_scale = int(2 * (math.log(1024, 2) - 1))

age_direction = np.load('./ffhq_dataset/latent_directions/age.npy')
horizontal_direction = np.load('./ffhq_dataset/latent_directions/angle_horizontal.npy')
vertical_direction = np.load('./ffhq_dataset/latent_directions/angle_vertical.npy')
eyes_open_direction = np.load('./ffhq_dataset/latent_directions/eyes_open.npy')
gender_direction = np.load('./ffhq_dataset/latent_directions/gender.npy')
smile_direction = np.load('./ffhq_dataset/latent_directions/smile.npy')

def get_watermarked(pil_image: Image) -> Image:
    try:
        image = cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2BGR)
        (h, w) = image.shape[:2]
        image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])
        pct = 0.08
        full_watermark = cv2.imread('./media/logo.png', cv2.IMREAD_UNCHANGED)
        (fwH, fwW) = full_watermark.shape[:2]
        wH = int(pct * h * 2)
        wW = int((wH * fwW) / fwH * 0.1)
        watermark = cv2.resize(full_watermark, (wH, wW), interpolation=cv2.INTER_AREA)
        overlay = np.zeros((h, w, 4), dtype="uint8")
        (wH, wW) = watermark.shape[:2]
        overlay[h - wH - 10: h - 10, 10: 10 + wW] = watermark
        output = image.copy()
        cv2.addWeighted(overlay, 0.5, output, 1.0, 0, output)
        rgb_image = cv2.cvtColor(output, cv2.COLOR_BGR2RGB)
        return Image.fromarray(rgb_image)
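    except Exception:
        # If the watermark cannot be applied (e.g. the logo file is missing), return the image unchanged
        return pil_image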


def generate_final_images(latent_vector, direction, coeffs, i):
    new_latent_vector = latent_vector.copy()
    new_latent_vector[:8] = (latent_vector + coeffs * direction)[:8]
    new_latent_vector = new_latent_vector.reshape((1, 18, 512))
    generator.set_dlatents(new_latent_vector)
    img_array = generator.generate_images()[0]
    img = PIL.Image.fromarray(img_array, 'RGB')
    if size[0] >= 512: img = get_watermarked(img)
    img_path = "./for_animation/" + str(i) + ".png"
    img.thumbnail(animation_size, PIL.Image.ANTIALIAS)
    img.save(img_path)
    face_img.append(imageio.imread(img_path))
    clear_output()
    return img


def generate_final_image(latent_vector, direction, coeffs):
    new_latent_vector = latent_vector.copy()
    new_latent_vector[:8] = (latent_vector + coeffs * direction)[:8]
    new_latent_vector = new_latent_vector.reshape((1, 18, 512))
    generator.set_dlatents(new_latent_vector)
    img_array = generator.generate_images()[0]
    img = PIL.Image.fromarray(img_array, 'RGB')
    if size[0] >= 512: img = get_watermarked(img)
    img.thumbnail(size, PIL.Image.ANTIALIAS)
    img.save("face.png")
    if download_image: files.download("face.png")  # note: `files` is never imported in this notebook, so leave download_image as False
    return img


def plot_three_images(imgB, fs=10):
    f, axarr = plt.subplots(1, 3, figsize=(fs, fs))
    axarr[0].imshow(Image.open('./aligned_images/father_01.png'))
    axarr[0].title.set_text("Father's photo")
    axarr[1].imshow(imgB)
    axarr[1].title.set_text("Child's photo")
    axarr[2].imshow(Image.open('./aligned_images/mother_01.png'))
    axarr[2].title.set_text("Mother's photo")
    plt.setp(plt.gcf().get_axes(), xticks=[], yticks=[])
    plt.show()
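The latent edit inside generate_final_image and generate_final_images above can be sketched in isolation as follows. This is an illustration under assumptions, not the project's code: random arrays stand in for a real face latent and for age.npy, the direction is assumed to have the same (18, 512) shape as the latent, and edit_latent is a hypothetical helper name.

```python
import numpy as np

NUM_LAYERS, LATENT_DIM = 18, 512  # the (18, 512) shape the cell above reshapes to


def edit_latent(latent, direction, coeff, num_layers=8):
    """Shift only the first `num_layers` style layers along `direction`."""
    edited = latent.copy()
    edited[:num_layers] = (latent + coeff * direction)[:num_layers]
    return edited


rng = np.random.default_rng(0)
latent = rng.standard_normal((NUM_LAYERS, LATENT_DIM))     # stand-in for a face latent
direction = rng.standard_normal((NUM_LAYERS, LATENT_DIM))  # stand-in for age.npy

edited = edit_latent(latent, direction, coeff=4.0)

# Layers 8..17 are left exactly as they were.
print(np.array_equal(edited[8:], latent[8:]))
# Layers 0..7 moved by coeff * direction.
print(np.allclose(edited[:8] - latent[:8], 4.0 * direction[:8]))
```

Only the first 8 of the 18 layers are shifted, so the remaining layers of the blended face are carried over untouched.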

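4. Prepare the father's and mother's photos

This case already provides one default photo each for the father and the mother. In the file browser in the left sidebar, go to the ma_share/BabyGAN directory and then open the father_image or mother_image directory to see the provided photos, as shown below:

[Figure: view_photos]

If you need to change the parents' photos, see step 11, "Replace the father's and mother's photos".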

if len(glob(os.path.join('./father_image', '*.jpg'))) != 1 or (not os.path.exists('./father_image/father.jpg')):
    raise Exception('Please place exactly one photo of the father, named father.jpg, in the ma_share/BabyGAN/father_image directory')

if len(glob(os.path.join('./mother_image', '*.jpg'))) != 1 or (not os.path.exists('./mother_image/mother.jpg')):
    raise Exception('Please place exactly one photo of the mother, named mother.jpg, in the ma_share/BabyGAN/mother_image directory')

5. Extract the father's face region and align the face

!python align_images.py ./father_image ./aligned_images

View the father's face

if os.path.isfile('./aligned_images/father_01.png'):
    pil_father = Image.open('./aligned_images/father_01.png')
    (fat_width, fat_height) = pil_father.size
    resize_fat = max(fat_width, fat_height) / 256
    display(pil_father.resize((int(fat_width / resize_fat), int(fat_height / resize_fat))))
else:
    raise ValueError('No face was found or there is more than one in the photo.')

6. Extract the mother's face region and align the face

!python align_images.py ./mother_image ./aligned_images

View the mother's face

if os.path.isfile('./aligned_images/mother_01.png'):
    pil_mother = Image.open('./aligned_images/mother_01.png')
    (mot_width, mot_height) = pil_mother.size
    resize_mot = max(mot_width, mot_height) / 256
    display(pil_mother.resize((int(mot_width / resize_mot), int(mot_height / resize_mot))))
else:
    raise ValueError('No face was found or there is more than one in the photo.')
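Both preview cells above shrink the aligned face so that its longer side is 256 pixels while keeping the aspect ratio. As a small sketch, the arithmetic can be factored into one helper; fit_longest_side is a hypothetical name, not something the BabyGAN code provides.

```python
def fit_longest_side(width, height, target=256):
    """Return (new_width, new_height) with the longer side scaled to `target`."""
    scale = max(width, height) / target
    return int(width / scale), int(height / scale)


print(fit_longest_side(1024, 1024))  # (256, 256)
print(fit_longest_side(1024, 512))   # (256, 128)
```

With such a helper, each preview would reduce to something like display(pil_father.resize(fit_longest_side(*pil_father.size))).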

7. Extract facial features

This step takes about 3 minutes.

!python encode_images.py \
    --early_stopping False \
    --lr=0.25 \
    --batch_size=2 \
    --iterations=100 \
    --output_video=False \
    ./aligned_images \
    ./generated_images \
    ./latent_representations

if len(glob(os.path.join('./generated_images', '*.png'))) == 2:
    first_face = np.load('./latent_representations/father_01.npy')
    second_face = np.load('./latent_representations/mother_01.npy')
    print("Generation of latent representation is complete! Now comes the fun part.")
else:
    raise ValueError('Something wrong. It may be impossible to read the face in the photos. Upload other photos and try again.')

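8. Generate the family photo

Modify the genes_influence and person_age parameters in the code below:

genes_influence: the gene influence factor, range [0.01, 0.99]. The closer the value is to 0, the stronger the father's influence on the child's appearance; the closer to 1, the stronger the mother's.
person_age: the age influence factor, range [10, 50]. The child's appearance is generated for the age set here.

After changing either value, rerun the code block below to generate a new photo of the child.

genes_influence = 0.2  # gene influence factor, range [0.01, 0.99]; closer to 0 means stronger father influence, closer to 1 stronger mother influence
person_age = 10  # age influence factor, range [10, 50]; the child's appearance is generated for this age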

style = "Default"
if style == "Father's photo":
    lr = ((np.arange(1, model_scale + 1) / model_scale) ** genes_influence).reshape((model_scale, 1))
    rl = 1 - lr
    hybrid_face = (lr * first_face) + (rl * second_face)
elif style == "Mother's photo":
    lr = ((np.arange(1, model_scale + 1) / model_scale) ** (1 - genes_influence)).reshape((model_scale, 1))
    rl = 1 - lr
    hybrid_face = (rl * first_face) + (lr * second_face)
else:
    hybrid_face = ((1 - genes_influence) * first_face) + (genes_influence * second_face)

intensity = -((person_age / 5) - 6)
resolution = "512"
size = int(resolution), int(resolution)

download_image = False
face = generate_final_image(hybrid_face, age_direction, intensity)
plot_three_images(face, fs=15)
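The arithmetic in the cell above is easier to follow on small numbers. The sketch below re-implements the three formulas in plain Python: the age-to-intensity mapping, the single-weight blend used by the "Default" style, and the per-layer weights used by the "Father's photo" style. The helper names and the two-element toy "latents" are illustrative assumptions; the real latents are (18, 512) NumPy arrays.

```python
def age_to_intensity(person_age):
    """Same formula as above: age 30 maps to 0, younger ages to positive values."""
    return -((person_age / 5) - 6)


def blend(father, mother, genes_influence):
    """'Default' style: one weight for all layers; 0 is father only, 1 is mother only."""
    return [(1 - genes_influence) * f + genes_influence * m
            for f, m in zip(father, mother)]


def layer_weights(genes_influence, model_scale=18):
    """'Father's photo' style: the father's weight for each of the 18 layers."""
    return [(k / model_scale) ** genes_influence for k in range(1, model_scale + 1)]


for age in (10, 30, 50):
    print(age, age_to_intensity(age))

# Toy two-element "latents" instead of real (18, 512) arrays.
print(blend([1.0, 0.0], [0.0, 1.0], genes_influence=0.2))

weights = layer_weights(0.2)
print(round(weights[0], 3), weights[-1])  # the weight grows towards 1.0 at the last layer
```

Over the documented age range [10, 50] the intensity runs from 4 down to -4, so age 10 pushes the latent furthest in the positive age direction.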

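9. View the child's appearance at different ages

Modify the gender_influence parameter in the code below. It is the gender influence factor, range [0.01, 0.99]: the closer the value is to 0, the stronger the father's influence on the child's appearance; the closer to 1, the stronger the mother's.

After changing the value, rerun the code block below.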

gender_influence = 0.5  # gender influence factor, range [0.01, 0.99]; closer to 0 means stronger father influence (0.5 is an assumed default)

!rm -rf ./for_animation
!mkdir ./for_animation
face_img = []
animation_resolution = "512"
animation_size = int(animation_resolution), int(animation_resolution)
frames_number = 50  # number of frames in the animation, range [10, 50]
download_image = False

# The parents' blend stays fixed; only the age changes from frame to frame.
# Assumption: a linear sweep over the documented age range [10, 50].
hybrid_face = ((1 - gender_influence) * first_face) + (gender_influence * second_face)
for i in range(1, frames_number):
    person_age = 10 + 40 * i / frames_number
    intensity = -((person_age / 5) - 6)
    face = generate_final_images(hybrid_face, age_direction, intensity, i)
    clear_output()
    print(str(i) + " of {} photo generated".format(str(frames_number)))

for j in reversed(face_img):
    face_img.append(j)

animation_name = "age_transition.mp4"
imageio.mimsave('./for_animation/' + animation_name, face_img)
clear_output()
display(mpy.ipython_display('./for_animation/' + animation_name, height=400, autoplay=1, loop=1))

[Video: BabyGAN: generating a child's photo from the parents' photos, different ages]

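10. View the child's appearance at different genders

Modify the person_age parameter in the code below. It is the age influence factor, range [10, 50]; the child's appearance is generated for the age set here.

After changing the value, rerun the code block below.

person_age = 10  # the child's age, range [10, 50]; the child's appearance is generated for this age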

!rm -rf ./for_animation
!mkdir ./for_animation
face_img = []
intensity = -((person_age / 5) - 6)
animation_resolution = "512"
animation_size = int(animation_resolution), int(animation_resolution)
frames_number = 50  # number of frames in the animation, range [10, 50]
download_image = False

for i in range(1, frames_number):
    gender_influence = i / frames_number
    hybrid_face = ((1 - gender_influence) * first_face) + (gender_influence * second_face)
    face = generate_final_images(hybrid_face, age_direction, intensity, i)
    clear_output()
    print(str(i) + " of {} photo generated".format(str(frames_number)))

for j in reversed(face_img):
    face_img.append(j)

animation_name = str(person_age) + "_years.mp4"
imageio.mimsave('./for_animation/' + animation_name, face_img)
clear_output()
display(mpy.ipython_display('./for_animation/' + animation_name, height=400, autoplay=1, loop=1))

[Video: BabyGAN: generating a child's photo from the parents' photos, different genders]
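The animation cells above append the frames in reverse order to the same list, so the video plays forward and then back again. A minimal sketch of that "ping-pong" step, written without modifying a list while iterating over it, could look like this; ping_pong is a hypothetical helper and the integers stand in for image frames.

```python
def ping_pong(frames):
    """Return the frames followed by the same frames in reverse order."""
    frames = list(frames)
    return frames + frames[::-1]


print(ping_pong([1, 2, 3]))  # [1, 2, 3, 3, 2, 1]
```

In the notebook this would correspond to passing ping_pong(face_img) to imageio.mimsave instead of extending face_img in place.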

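11. Replace the father's and mother's photos

Next, you can upload parent photos of your own choosing to the father_image and mother_image directories and rerun the code to generate a new photo of the child.

Follow these rules and steps:

1) Following the figure below, go to the ma_share/BabyGAN directory;

[Figure: view_photos]

2) Prepare a photo of the father and upload it to the father_image directory; it must be named father.jpg (if you do not know how to upload files to JupyterLab, see the linked document);

3) Prepare a photo of the mother and upload it to the mother_image directory; it must be named mother.jpg;

4) The father_image and mother_image directories may each contain only one photo;

5) Rerun the code in steps 4 to 10.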

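IV. Summary

During the experiment, step 3 (loading the required modules and the model) failed with ModuleNotFoundError: No module named 'numpy.typing'; the error was eventually resolved by following the teaching assistant's advice.
After changing the father's and mother's photos, the generated photo of the son did not change. On rerunning, I found that I had probably skipped step 7 (extracting facial features), so the default facial features were still in use and the generated photo stayed the same.
This experiment gave me a basic understanding of how generative adversarial networks (GANs) work and a better understanding of deep learning algorithms, though it did little to improve my coding ability.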
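The stale-features pitfall described above can be caught before step 8 with a simple timestamp check. The sketch below is only a suggestion: latent_is_stale is a hypothetical helper, and the demo uses temporary files in place of ./aligned_images/father_01.png and ./latent_representations/father_01.npy.

```python
import os
import tempfile


def latent_is_stale(image_path, latent_path):
    """True if the latent file is missing or older than the aligned image."""
    if not os.path.exists(latent_path):
        return True
    return os.path.getmtime(latent_path) < os.path.getmtime(image_path)


# Self-contained demo: temporary files stand in for the real ones.
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "father_01.png")
    latent = os.path.join(tmp, "father_01.npy")
    open(image, "w").close()
    print(latent_is_stale(image, latent))  # latent has not been created yet

    open(latent, "w").close()
    os.utime(image, (2000, 2000))
    os.utime(latent, (1000, 1000))         # latent older than the image
    print(latent_is_stale(image, latent))

    os.utime(latent, (3000, 3000))         # latent newer than the image
    print(latent_is_stale(image, latent))
```

If the check reports a stale latent for either parent, step 7 needs to be rerun before generating the child's photo.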