ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes

Taihong Xiao, Jiapeng Hong, and Jinwen Ma

Abstract

task: face attribute transfer
existing method: image-to-image translation
limitations: (1) cannot generate images from exemplars; (2) cannot handle multiple face attributes simultaneously; (3) low-quality generated images
solution: a novel model that takes two images with different attributes as input. All attributes are encoded in the latent space in a disentangled manner (somewhat similar to NICE and Glow?). The model learns residual images to facilitate training on higher-resolution images, and generates high-quality images with finer details and fewer artifacts.

Introduction

transferring face attributes: a form of conditional image generation. The source face image is modified to contain the target attribute, while the person's identity is preserved during the transfer.

existing method:

| method | principle | drawbacks |
| --- | --- | --- |
| Deep Manifold Traversal | approximates the natural image manifold and computes the attribute vector from the source domain to the target domain using maximum mean discrepancy (MMD) | prohibitive time and memory cost |
| Visual Analogy-Making | uses a pair of reference images of the same person in different states to specify the attribute vector. Under the linear feature space assumption, image transfer can be formulated as $I_2 = f^{-1}(f(I_1) + v)$, where $f$ is the encoding/feature-extraction function and $v$ is the attribute vector | the attribute can differ between classes |
| GAN-based image-to-image translation | dual learning | inappropriate for GANs according to the invariance of domain theorem |
| conditional image generation | receives image labels as the condition for generating images with the desired attributes | cannot generate images from exemplars |
| BicycleGAN | introduces a noise term to increase diversity | cannot generate images with specified attributes |
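
To make the linear feature space formula $I_2 = f^{-1}(f(I_1) + v)$ concrete, here is a toy sketch in Python. The encoder `f` is a hypothetical invertible elementwise affine map, chosen only so that its inverse has a closed form; it is not the feature extractor used by any of the methods above, and the attribute vector `v` is made up for illustration.

```python
# Toy illustration of I2 = f^{-1}(f(I1) + v).
# SCALE, SHIFT, f and f_inv are hypothetical; real methods use deep features.

SCALE = 2.0
SHIFT = 1.0


def f(image):
    """Encode: map each pixel value x to SCALE * x + SHIFT."""
    return [SCALE * x + SHIFT for x in image]


def f_inv(features):
    """Decode: exact inverse of f."""
    return [(y - SHIFT) / SCALE for y in features]


def transfer(image, v):
    """Add the attribute vector v in feature space, then map back to image space."""
    shifted = [y + dv for y, dv in zip(f(image), v)]
    return f_inv(shifted)


I1 = [0.0, 0.5, 1.0]
v = [0.2, 0.0, -0.4]  # made-up attribute vector
I2 = transfer(I1, v)
```

Because this toy `f` is affine, the result is simply `I1` plus `v / SCALE` elementwise, so `I2` is approximately `[0.1, 0.5, 0.8]`. With a learned, nonlinear `f` the effect of `v` in image space is no longer this simple.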

Purpose and Intuition of Our Work

| purpose | method |
| --- | --- |
| image generation by exemplars | receive a reference for conditional image generation as latent variable/feature |
| deal with multiple face attributes simultaneously | disentangle multiple attributes |
| satisfying quality | residual learning |

Our Method

The ELEGANT Model

$A$ ∈ positive set, with the $i$-th attribute
$B$ ∈ negative set, without the $i$-th attribute
$A$ and $B$ are not matched (unpaired)
use an encoder Enc to obtain the latent encodings of images $A$ and $B$

$$z = \mathrm{Enc}(x), \quad x = A, B, \quad z \in \mathbb{R}^n$$

$z[i]$ encodes the information of the $i$-th attribute of the image: the tensor $z_A$ is split into $n$ parts along its channel dimension
problem: such a disentangled representation has to be learned
solution: an iterative training strategy (train the model with respect to a single attribute at a time and cycle through all attributes). Given that $A$ and $B$ differ in the $i$-th attribute (regardless of the attributes at other positions), exchange the $i$-th part of their latent encodings $z_A, z_B$: $z_C, z_D = \mathrm{swap}(z_A, z_B, i)$
decode: learn the residual images rather than the original images.
$$
\begin{aligned}
A' &= A + \mathrm{Dec}(\mathrm{concat}(z_A, z_A)) \\
C &= A + \mathrm{Dec}(\mathrm{concat}(z_C, z_A)) \\
B' &= B + \mathrm{Dec}(\mathrm{concat}(z_B, z_B)) \\
D &= B + \mathrm{Dec}(\mathrm{concat}(z_D, z_B))
\end{aligned}
$$
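
The split, swap, and residual decoding steps can be sketched on plain Python lists. This is only an illustration under stated assumptions: `toy_enc` and `toy_dec` are hypothetical stand-ins for the learned Enc and Dec (the real ones are convolutional networks), and the encoding is a flat list rather than a tensor with a channel dimension.

```python
# Sketch of latent swap + residual decoding on flat lists.
# toy_enc / toy_dec are hypothetical placeholders, not the paper's networks.


def split(z, n):
    """Split encoding z into n equal parts (stands in for the channel split)."""
    size = len(z) // n
    return [z[k * size:(k + 1) * size] for k in range(n)]


def flatten(parts):
    """Concatenate a list of parts back into one flat encoding."""
    return [x for part in parts for x in part]


def swap(z_a, z_b, i, n):
    """Exchange the i-th part of z_a and z_b; return (z_c, z_d)."""
    parts_a, parts_b = split(z_a, n), split(z_b, n)
    parts_c = parts_a[:i] + [parts_b[i]] + parts_a[i + 1:]
    parts_d = parts_b[:i] + [parts_a[i]] + parts_b[i + 1:]
    return flatten(parts_c), flatten(parts_d)


def toy_enc(image):
    """Hypothetical encoder: identity on a flat list of pixel values."""
    return list(image)


def toy_dec(z_pair):
    """Hypothetical decoder: residual = first half minus second half of the concat."""
    half = len(z_pair) // 2
    return [p - q for p, q in zip(z_pair[:half], z_pair[half:])]


def add(image, residual):
    """Add a residual image to an input image elementwise."""
    return [x + r for x, r in zip(image, residual)]


def generate(a, b, i, n):
    """Return (A', B', C, D) following the four residual equations."""
    z_a, z_b = toy_enc(a), toy_enc(b)
    z_c, z_d = swap(z_a, z_b, i, n)
    a_rec = add(a, toy_dec(z_a + z_a))  # A' = A + Dec(concat(z_A, z_A))
    c = add(a, toy_dec(z_c + z_a))      # C  = A + Dec(concat(z_C, z_A))
    b_rec = add(b, toy_dec(z_b + z_b))  # B' = B + Dec(concat(z_B, z_B))
    d = add(b, toy_dec(z_d + z_b))      # D  = B + Dec(concat(z_D, z_B))
    return a_rec, b_rec, c, d


A = [1.0, 2.0, 3.0, 4.0]
B = [5.0, 6.0, 7.0, 8.0]
A_rec, B_rec, C, D = generate(A, B, i=1, n=2)
```

With this toy decoder the residual is zero whenever the two concatenated encodings are identical, so `A_rec` equals `A` and `B_rec` equals `B`, while `C` and `D` differ from their inputs only in the swapped part. Calling `generate` with `i` cycling over `range(n)` mirrors the iterative training strategy of handling one attribute at a time.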

generator = encoder + decoder

discriminator: multi-scale. Two discriminators with identical network structure operate at different image scales. $D_1$ works at the larger scale, guiding Enc and Dec to produce finer details; $D_2$ works at the smaller scale, handling the overall image content so as to avoid generating grimaces.
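
As a rough sketch of what feeding two scales means, the snippet below prepares the inputs for both discriminators. The 2x average pooling in `downsample2x` is an assumption made for illustration; the notes only say that the second discriminator sees a smaller image, not how that image is produced.

```python
# Preparing inputs for two discriminators at different scales.
# Assumption: the smaller input is a 2x average-pooled copy of the image.


def downsample2x(image):
    """Average-pool a 2D image (list of rows) with a 2x2 window and stride 2."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def multiscale_inputs(image):
    """Return (input for D1 at full scale, input for D2 at reduced scale)."""
    return image, downsample2x(image)


image = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
]
full, small = multiscale_inputs(image)
```

Here `full` is the original 4x4 image and `small` is its 2x2 pooled version. Since both discriminators share one architecture, the one fed `small` covers a larger fraction of the image per receptive field, which fits its role of judging overall content.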

Loss Functions

$$
\begin{aligned}
L_{D_i} &= \sum_{X = A, B} \mathbb{E}\left(-\log D_i(X \mid Y^X)\right) + \sum_{X = C, D} \mathbb{E}\left(-\log\left(1 - D_i(X \mid Y^X)\right)\right), \quad i = 1, 2 \\
L_{\mathrm{reconstruct}} &= \sum_{X = A, B} \lVert X - X' \rVert \\
L_{\mathrm{adv}} &= \sum_{i = 1, 2} \sum_{X = C, D} \mathbb{E}\left(-\log D_i(X \mid Y^X)\right) \\
L_G &= L_{\mathrm{reconstruct}} + L_{\mathrm{adv}}
\end{aligned}
$$
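
The four losses can be evaluated numerically with a small Python sketch. Several things here are assumptions for illustration: the discriminator outputs are hypothetical probabilities passed in as plain numbers, each expectation is replaced by a single-sample estimate, and the unspecified norm in the reconstruction loss is taken to be the L1 norm.

```python
import math


def d_loss(real_probs, fake_probs):
    """L_{D_i}: -log D_i on real images (A, B) plus -log(1 - D_i) on generated ones (C, D)."""
    real_term = sum(-math.log(p) for p in real_probs)
    fake_term = sum(-math.log(1.0 - p) for p in fake_probs)
    return real_term + fake_term


def reconstruct_loss(pairs):
    """L_reconstruct: sum of ||X - X'|| over (X, X') pairs, using L1 (an assumption)."""
    return sum(
        sum(abs(x - y) for x, y in zip(original, reconstructed))
        for original, reconstructed in pairs
    )


def adv_loss(fake_probs_per_disc):
    """L_adv: -log D_i on generated images, summed over both discriminators."""
    return sum(-math.log(p) for probs in fake_probs_per_disc for p in probs)


def g_loss(pairs, fake_probs_per_disc):
    """L_G = L_reconstruct + L_adv."""
    return reconstruct_loss(pairs) + adv_loss(fake_probs_per_disc)


# Hypothetical discriminator outputs: D_i(C), D_i(D) for i = 1, 2.
fake_probs = [[0.5, 0.5], [0.5, 0.5]]
# Hypothetical (X, X') pairs for X = A, B.
pairs = [([1.0, 2.0], [1.0, 2.5]), ([0.0, 1.0], [0.0, 1.0])]

loss_d1 = d_loss(real_probs=[0.5, 0.5], fake_probs=fake_probs[0])
loss_g = g_loss(pairs, fake_probs)
```

Note the opposite signs on the generated images: each discriminator is penalized for assigning them a high probability, while the generator is penalized for a low one, which is the usual adversarial setup.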
