Paper Reading: (DCGAN) Unsupervised representation learning with deep

小吴同学真棒

Article link: https://blog.csdn.net/qq_36627158/article/details/117399024

Unsupervised representation learning with deep convolutional generative adversarial networks

(ICLR 2016)

Alec Radford, Luke Metz, Soumith Chintala

Notes

Contributions

We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN). Moreover, we use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms. Finally, we show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.
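
To make the "vector arithmetic" claim concrete, here is a minimal sketch in plain Python. The latent codes are made-up 3-dimensional toy values (a real DCGAN latent vector is much larger), and the helper names are hypothetical. The paper's well-known example averages the z vectors of a few samples per concept and then computes "smiling woman - neutral woman + neutral man", which is what the sketch mimics.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized latent vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# Hypothetical latent codes, three exemplars per concept (toy values only).
smiling_woman = mean_vector([[1.0, 0.0, 1.0], [1.0, 0.2, 0.8], [1.0, 0.1, 0.9]])
neutral_woman = mean_vector([[0.0, 0.0, 1.0], [0.0, 0.2, 0.8], [0.0, 0.1, 0.9]])
neutral_man = mean_vector([[0.0, 1.0, 0.0], [0.0, 0.8, 0.2], [0.0, 0.9, 0.1]])

# smiling woman - neutral woman + neutral man
z_result = add(sub(smiling_woman, neutral_woman), neutral_man)
print(z_result)
```

Feeding `z_result` to a trained generator would then be expected to produce a "smiling man" sample. The sketch only covers the arithmetic, not the generation step.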

Method

We identified a family of architectures that resulted in stable training across a range of datasets and allowed for training higher resolution and deeper generative models. These are the architecture guidelines for stable Deep Convolutional GANs:

Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
Use batchnorm in both the generator and the discriminator.
Remove fully connected hidden layers for deeper architectures.
Use ReLU activation in generator for all layers except for the output, which uses Tanh.
Use LeakyReLU activation in the discriminator for all layers.
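
The first guideline is easiest to see through the spatial-size arithmetic. The sketch below uses the standard output-size formulas for a strided convolution and a fractional-strided (transposed) convolution. The kernel size 4, stride 2, padding 1 are assumptions chosen for illustration, not values taken from these notes.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a strided convolution."""
    return (size + 2 * pad - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride, pad):
    """Spatial output size of a fractional-strided (transposed) convolution,
    assuming no extra output padding."""
    return (size - 1) * stride - 2 * pad + kernel

# Assumed hyperparameters for illustration only.
KERNEL, STRIDE, PAD = 4, 2, 1

# Discriminator side: each strided convolution halves the resolution.
down = [64]
for _ in range(4):
    down.append(conv_out(down[-1], KERNEL, STRIDE, PAD))

# Generator side: each fractional-strided convolution doubles it.
up = [4]
for _ in range(4):
    up.append(conv_transpose_out(up[-1], KERNEL, STRIDE, PAD))

print(down)  # [64, 32, 16, 8, 4]
print(up)    # [4, 8, 16, 32, 64]
```

With these settings the network learns its own downsampling and upsampling, which is the role that pooling layers would otherwise play.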

Training Process

code：https://github.com/eriklindernoren/Keras-GAN/blob/master/dcgan/dcgan.py

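# Excerpt from DCGAN.__init__ in the linked file. `optimizer` and
# `self.discriminator` are created earlier in that method and are not shown here.
# Build the generator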
self.generator = self.build_generator()

# The generator takes noise as input and generates imgs
z = Input(shape=(self.latent_dim,))
img = self.generator(z)

# For the combined model we will only train the generator
self.discriminator.trainable = False

# The discriminator takes generated images as input and determines validity
valid = self.discriminator(img)

# The combined model (stacked generator and discriminator)
# Trains the generator to fool the discriminator
self.combined = Model(z, valid)
self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)
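
To see what "trains the generator to fool the discriminator" means for the loss, here is a small worked example in plain Python. It assumes the combined model is trained with the target label 1 ("valid") for generated images, so binary cross-entropy reduces to -log D(G(z)). The discriminator outputs 0.9 and 0.1 are hypothetical numbers.

```python
import math

def binary_crossentropy(target, prediction, eps=1e-7):
    """Binary cross-entropy for a single prediction in (0, 1)."""
    p = min(max(prediction, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

# Generator step: generated images are labelled 1 ("valid"),
# so the loss is -log D(G(z)).
fooled = binary_crossentropy(1.0, 0.9)  # D believes the fake is real: small loss
caught = binary_crossentropy(1.0, 0.1)  # D rejects the fake: large loss

# Discriminator step on the same fake image uses the label 0 instead.
d_loss_on_fake = binary_crossentropy(0.0, 0.1)

print(fooled, caught, d_loss_on_fake)
```

Because the discriminator is frozen inside the combined model, only the generator's weights move to lower this loss.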

Classifying CIFAR-10 using GANs as a feature extractor

To evaluate the quality of the representations learned by DCGANs for supervised tasks, we train on Imagenet-1k and then use the discriminator’s convolutional features from all layers, maxpooling each layer's representation to produce a 4 × 4 spatial grid. These features are then flattened and concatenated to form a 28672 dimensional vector and a regularized linear L2-SVM classifier is trained on top of them.
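
The pooling and concatenation step can be sketched in plain Python. The feature maps below are toy data with made-up sizes, and `max_pool_to_grid` and `feature_vector` are hypothetical helper names. The last lines only do the arithmetic implied by the text: a 28672-dimensional vector built from 4 × 4 grids corresponds to 28672 / 16 = 1792 feature maps in total.

```python
def max_pool_to_grid(fmap, grid=4):
    """Max-pool a square H x W map (list of lists) down to grid x grid.
    Assumes H is divisible by grid."""
    step = len(fmap) // grid
    return [
        [
            max(
                fmap[r][c]
                for r in range(i * step, (i + 1) * step)
                for c in range(j * step, (j + 1) * step)
            )
            for j in range(grid)
        ]
        for i in range(grid)
    ]

def feature_vector(layers, grid=4):
    """Pool every channel of every layer to grid x grid, flatten, concatenate."""
    vec = []
    for channels in layers:
        for fmap in channels:
            pooled = max_pool_to_grid(fmap, grid)
            vec.extend(value for row in pooled for value in row)
    return vec

# Toy activations: one layer with 2 channels of 8 x 8, one with 3 channels of 4 x 4.
layer_a = [[[r * 8 + c for c in range(8)] for r in range(8)] for _ in range(2)]
layer_b = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]

vec = feature_vector([layer_a, layer_b])
print(len(vec))  # 80 = (2 + 3) channels * 16 grid cells

total_channels = 28672 // (4 * 4)
print(total_channels)  # 1792
```

The resulting vector is what a linear classifier such as the L2-SVM would be trained on.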

Code for the Generator & Discriminator

Code: https://github.com/eriklindernoren/Keras-GAN/blob/master/dcgan/dcgan.py

Generator

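from keras.layers import Input, Dense, Reshape, UpSampling2D, Conv2D
from keras.layers import BatchNormalization, Activation
from keras.models import Sequential, Model

# Method of the DCGAN class in the linked file
def build_generator(self):

    model = Sequential()

    # Project the noise vector to a 7 x 7 feature map with 128 channels
    model.add(Dense(128 * 7 * 7, activation="relu", input_dim=self.latent_dim))
    model.add(Reshape((7, 7, 128)))
    # 7 x 7 -> 14 x 14. Upsampling followed by a convolution is used here
    # instead of a fractional-strided convolution
    model.add(UpSampling2D())
    model.add(Conv2D(128, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    # 14 x 14 -> 28 x 28
    model.add(UpSampling2D())
    model.add(Conv2D(64, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    # Map to the image channels; tanh keeps pixel values in [-1, 1]
    model.add(Conv2D(self.channels, kernel_size=3, padding="same"))
    model.add(Activation("tanh"))

    model.summary()

    # Wrap the Sequential stack as a functional Model: noise -> image
    noise = Input(shape=(self.latent_dim,))
    img = model(noise)

    return Model(noise, img)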
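
As a sanity check on build_generator, the output shapes and parameter counts can be worked out by hand. The sketch below does this in plain Python without Keras. `LATENT_DIM = 100` and `CHANNELS = 1` are assumed values (the notes do not state them), and the helper functions are hypothetical names for the usual counting formulas.

```python
# Assumed values, not stated in the notes.
LATENT_DIM = 100
CHANNELS = 1

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a square-kernel 2D convolution."""
    return kernel * kernel * c_in * c_out + c_out

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels

# (layer, output shape, parameter count), in the order used by build_generator
layers = [
    ("Dense + Reshape", (7, 7, 128), dense_params(LATENT_DIM, 128 * 7 * 7)),
    ("UpSampling2D", (14, 14, 128), 0),
    ("Conv2D 128", (14, 14, 128), conv_params(3, 128, 128)),
    ("BatchNormalization", (14, 14, 128), batchnorm_params(128)),
    ("UpSampling2D", (28, 28, 128), 0),
    ("Conv2D 64", (28, 28, 64), conv_params(3, 128, 64)),
    ("BatchNormalization", (28, 28, 64), batchnorm_params(64)),
    ("Conv2D out", (28, 28, CHANNELS), conv_params(3, 64, CHANNELS)),
]

for name, shape, params in layers:
    print(f"{name:<20} {str(shape):<16} {params:>8}")

total_params = sum(params for _, _, params in layers)
print("total:", total_params)  # 856193 with the assumed values
```

Most of the parameters sit in the initial Dense projection, which is worth keeping in mind when comparing this implementation with the guideline about removing fully connected hidden layers.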

Discriminator

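from keras.layers import Input, Dense, Flatten, Dropout, Conv2D
from keras.layers import BatchNormalization, ZeroPadding2D, LeakyReLU
from keras.models import Sequential, Model

# Method of the DCGAN class in the linked file.
# Shape comments assume a 28 x 28 input image.
def build_discriminator(self):

    model = Sequential()

    # Strided convolutions downsample in place of pooling: 28 x 28 -> 14 x 14
    model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=self.img_shape, padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    # 14 x 14 -> 7 x 7, then pad bottom and right to 8 x 8
    model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
    model.add(ZeroPadding2D(padding=((0,1),(0,1))))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    # 8 x 8 -> 4 x 4
    model.add(Conv2D(128, kernel_size=3, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    # Stride 1 keeps the 4 x 4 resolution and only widens the channels
    model.add(Conv2D(256, kernel_size=3, strides=1, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    # Single sigmoid unit: probability that the input image is real
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    model.summary()

    # Wrap the Sequential stack as a functional Model: image -> validity
    img = Input(shape=self.img_shape)
    validity = model(img)

    return Model(img, validity)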
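
The spatial sizes inside build_discriminator can be traced with a few lines of plain Python. The sketch assumes a 28 × 28 input (not stated in the notes) and uses the rule that a convolution with padding="same" and stride s maps a size n to ceil(n / s). `same_conv` is a hypothetical helper name.

```python
import math

def same_conv(size, stride):
    """Output size of a padding="same" convolution: ceil(size / stride)."""
    return math.ceil(size / stride)

INPUT_SIZE = 28  # assumed input height and width

# (description, function mapping input size to output size)
steps = [
    ("Conv2D 32, stride 2", lambda s: same_conv(s, 2)),
    ("Conv2D 64, stride 2", lambda s: same_conv(s, 2)),
    ("ZeroPadding2D ((0,1),(0,1))", lambda s: s + 1),
    ("Conv2D 128, stride 2", lambda s: same_conv(s, 2)),
    ("Conv2D 256, stride 1", lambda s: same_conv(s, 1)),
]

size = INPUT_SIZE
trace = [size]
for name, step in steps:
    size = step(size)
    trace.append(size)
    print(f"{name:<30} -> {size} x {size}")

# Flatten feeds size * size * 256 values into the final Dense(1) layer.
flat_features = size * size * 256
print(trace, flat_features)  # [28, 14, 7, 8, 4, 4] 4096
```

Under this assumption the ZeroPadding2D layer turns the odd 7 × 7 map into 8 × 8, so the next stride-2 convolution halves it cleanly to 4 × 4.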