Introduction
- We propose a novel, projection-based way to incorporate the conditional information into the discriminator of GANs that respects the role of the conditional information in the underlying probabilistic model (i.e., a function that measures the information-theoretic distance between the generative distribution and the target distribution).
- By construction, any assumption about the form of the distribution would act as a regularization on the choice of the discriminator. In this paper, we propose a specific form of the discriminator, a form motivated by a probabilistic model in which the distribution of the conditional variable $\boldsymbol y$ given $\boldsymbol x$ is discrete or a unimodal continuous distribution.
- As we will explain in the next section, adhering to this assumption will give rise to a structure of the discriminator that requires us to take an inner product between the embedded condition vector $\boldsymbol y$ and the feature vector (Figure 1d).
The Architecture of the cGAN Discriminator with a Probabilistic Model Assumption
Notation
- $\boldsymbol x$: input vector
- $\boldsymbol y$: conditional information (when $\boldsymbol y$ is discrete label information, we can assume it is encoded as a one-hot vector)
- $D(\boldsymbol x, \boldsymbol y; \theta) := \mathcal A(f(\boldsymbol x, \boldsymbol y; \theta))$: cGAN discriminator, where $\mathcal A$ is an activation function
- $q$: the true distribution
- $p$: the generated distribution
$f^*(\boldsymbol x, \boldsymbol y)$
- The standard adversarial loss for the discriminator is given by:
$$\mathcal L(D) = -\mathbb E_{q(\boldsymbol y)}\left[\mathbb E_{q(\boldsymbol x \mid \boldsymbol y)}\left[\log D(\boldsymbol x, \boldsymbol y)\right]\right] - \mathbb E_{p(\boldsymbol y)}\left[\mathbb E_{p(\boldsymbol x \mid \boldsymbol y)}\left[\log (1 - D(\boldsymbol x, \boldsymbol y))\right]\right],$$
with $\mathcal A$ in $D$ representing the sigmoid function.
- As in the derivation for the original GAN, if we assume that $D$ can represent an arbitrary function, we can derive the optimal discriminator $D^*(x,y)$:
$$D^*(x,y)=\frac{q(x,y)}{q(x,y)+p(x,y)}$$
Since the activation function is assumed to be the sigmoid, we have
$$\mathcal A(f(x,y;\theta))=\frac{1}{1+\exp(-f^*(x,y))}=D^*(x,y)=\frac{q(x,y)}{q(x,y)+p(x,y)},$$
and therefore
$$f^*(\boldsymbol x,\boldsymbol y)=\log\frac{q(\boldsymbol x,\boldsymbol y)}{p(\boldsymbol x,\boldsymbol y)}=\log \frac{q(\boldsymbol{x} \mid \boldsymbol{y}) q(\boldsymbol{y})}{p(\boldsymbol{x} \mid \boldsymbol{y}) p(\boldsymbol{y})}=\log \frac{q(\boldsymbol{y} \mid \boldsymbol{x})}{p(\boldsymbol{y} \mid \boldsymbol{x})}+\log \frac{q(\boldsymbol{x})}{p(\boldsymbol{x})}:=r(\boldsymbol{y} \mid \boldsymbol{x})+r(\boldsymbol{x})$$
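The two identities above can be checked numerically on small discrete toy distributions (the table sizes and seed below are arbitrary choices for illustration, not from the paper):

```python
import numpy as np

# Toy joint distributions q(x, y) and p(x, y) over 4 x-values and 3 y-values.
rng = np.random.default_rng(0)
q = rng.random((4, 3)); q /= q.sum()
p = rng.random((4, 3)); p /= p.sum()

# Marginals q(x), p(x), kept as column vectors for broadcasting.
q_x, p_x = q.sum(axis=1, keepdims=True), p.sum(axis=1, keepdims=True)

# f*(x, y) = log q(x,y)/p(x,y) should equal r(y|x) + r(x).
f_star = np.log(q / p)
decomp = np.log((q / q_x) / (p / p_x)) + np.log(q_x / p_x)
assert np.allclose(f_star, decomp)

# And sigmoid(f*) recovers the optimal discriminator q / (q + p).
assert np.allclose(1.0 / (1.0 + np.exp(-f_star)), q / (q + p))
```

Both assertions hold exactly (up to floating point), since they are algebraic identities in $q$ and $p$.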
Motivation behind the Projection Discriminator
Log-linear model
- The log-linear model is the most popular model for $p(y|x)$. Assume that $y$ is a categorical variable taking a value in $\{1, \dots, C\}$.
- If we use a softmax to compute the probability that $x$ belongs to each class, then
$$p(y=c|x)=\frac{\exp(o_c)}{\sum_{j=1}^C\exp(o_j)},$$
where $o_j$ is the output of the final fully-connected layer of a neural network. We can decompose this output into the product of the layer's weight matrix $V^{pT}$ (size $C \times d^L$; the superscript $p$ indicates that these weights parameterize the distribution $p$) with the input vector $\phi(x)$ (size $d^L \times 1$, the feature extracted from $x$). The vector $o$ of class logits can then be written as $o = V^{pT}\phi(x)$, with each $o_j$ given by:
$$o_j=v_j^{pT}\phi(x).$$
Therefore,
$$\begin{aligned}\log p(y=c|x)&=\log \frac{\exp(v_c^{pT}\phi(x))}{\sum_{j=1}^C\exp(v_j^{pT}\phi(x))} \\&=v_c^{pT}\phi(x)-\log\Big(\sum_{j=1}^C\exp(v_j^{pT}\phi(x))\Big) \end{aligned}$$
Defining $Z^p(\phi(x)):=\sum_{j=1}^C\exp(v_j^{pT}\phi(x))$, we get
$$\log p(y=c|x)=v_c^{pT}\phi(x)-\log Z^p(\phi(x))$$
- Assuming that $\log q(y=c|x)$ can also be expressed in the form above, with the same $\phi$, the log-likelihood ratio becomes
$$\begin{aligned}\log\frac{q(y=c|x)}{p(y=c|x)}&=v_c^{qT}\phi(x)-\log Z^q(\phi(x))-v_c^{pT}\phi(x)+\log Z^p(\phi(x)) \\&=(v_c^{q}-v_c^{p})^T\phi(x)-\big(\log Z^q(\phi(x))-\log Z^p(\phi(x))\big) \end{aligned}$$
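This ratio identity can also be verified numerically under the shared-$\phi$ assumption (the matrices `Vq`, `Vp` and the feature `phi` below are random stand-ins, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
C, dL = 3, 5
Vq = rng.normal(size=(C, dL))   # rows are v_c^{qT}, the weights modeling q(y|x)
Vp = rng.normal(size=(C, dL))   # rows are v_c^{pT}, the weights modeling p(y|x)
phi = rng.normal(size=dL)       # shared feature vector phi(x)

def log_softmax(logits):
    # Numerically stable log-softmax: logits - logsumexp(logits).
    z = logits - logits.max()
    return z - np.log(np.exp(z).sum())

log_q = log_softmax(Vq @ phi)   # log q(y=c | x) for all c
log_p = log_softmax(Vp @ phi)   # log p(y=c | x) for all c

# log Z^q and log Z^p, the log-partition functions.
log_Zq = np.log(np.exp(Vq @ phi).sum())
log_Zp = np.log(np.exp(Vp @ phi).sum())

# Right-hand side: (v_c^q - v_c^p)^T phi(x) - (log Z^q - log Z^p).
rhs = (Vq - Vp) @ phi - (log_Zq - log_Zp)
assert np.allclose(log_q - log_p, rhs)
```

The assertion holds because both sides are the same expression after the $\log$ of the softmax is expanded; the partition terms are what $\psi$ will later absorb.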
- Substituting the log-linear model into $f^*(x,y)$ gives
$$\begin{aligned}f^*(x,y=c)&=\log \frac{q({y=c} \mid {x})}{p({y=c} \mid {x})}+\log \frac{q({x})}{p({x})} \\&=(v_c^{q}-v_c^{p})^T\phi(x)-\big(\log Z^q(\phi(x))-\log Z^p(\phi(x))\big)+\log \frac{q({x})}{p({x})} \end{aligned}$$
- Let $v_c := v_c^{q}-v_c^{p}$ and $\psi(\phi(x)) := -\big(\log Z^q(\phi(x))-\log Z^p(\phi(x))\big)+\log \frac{q(x)}{p(x)}$; then
$$f^*(x,y=c)=v_c^T\phi(x)+\psi(\phi(x))$$
- Let $V$ be the matrix whose $c$-th row is $v_c^T$. Since $y$ is a one-hot vector,
$$f^*(x,y)=y^TV\phi(x)+\psi(\phi(x))=(V^Ty)\cdot \phi(x)+\psi(\phi(x))$$
This yields the structure shown in the figure below (the left path can be seen as judging whether $x$ is real, and the right path as judging whether $x$ belongs to class $y$):
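As a minimal numpy sketch of this output head (not the paper's code: in the actual architecture $\phi$ is a convolutional feature extractor, $V$ a learned class-embedding matrix, and $\psi$ a scalar-valued network on $\phi(x)$; here all three are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(2)
C, dL = 10, 64
V = rng.normal(size=(C, dL))    # class-embedding matrix, rows are v_c^T
w_psi = rng.normal(size=dL)     # weights of a linear stand-in for psi

def f_projection(phi_x, y_onehot):
    # (V^T y) . phi(x): inner product between the embedded condition
    # and the feature vector -- the "does x belong to class y?" branch.
    projection = (V.T @ y_onehot) @ phi_x
    # psi(phi(x)): the unconditional "is x real?" branch.
    psi = w_psi @ phi_x
    return projection + psi

phi_x = rng.normal(size=dL)     # pretend output of the feature extractor
y = np.zeros(C); y[3] = 1.0     # one-hot condition for class 3
out = f_projection(phi_x, y)

# Equivalent matrix form y^T V phi(x) + psi(phi(x)):
assert np.isclose(out, y @ V @ phi_x + w_psi @ phi_x)
```

Because $y$ is one-hot, `(V.T @ y)` simply selects the embedding row $v_c$ for the conditioned class, which is why this head is cheap regardless of $C$.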
We refer to this model of the discriminator as projection for short.