Deep Learning with Python, Chapter 12: Generative Deep Learning

Overview

Chapter 12 explores generative deep learning: text generation, DeepDream, neural style transfer, variational autoencoders (VAEs), and generative adversarial networks (GANs). Through these techniques, the reader learns how to apply deep learning to creative tasks such as generating text, transferring the style of one image onto another, and synthesizing entirely new images.

Main Content

  1. Text Generation

    • Sequence data generation: use a recurrent neural network (RNN) or a Transformer to predict the next word or character in a sequence.
    • Sampling strategy: control the randomness of the generated text by adjusting the softmax temperature.
    • Language models: train a model to predict the next word in a sequence, then generate new text by sampling from it repeatedly.
  2. DeepDream

    • How it works: run a convolutional neural network "in reverse," maximizing the activations of chosen layers to produce dream-like, artistic images.
    • Implementation: use a pretrained InceptionV3 model and gradient ascent to optimize the input image so that it maximizes the activations of selected layers.
  3. Neural Style Transfer

    • Content and style losses: extract content and style features from images with a pretrained convolutional network (such as VGG19).
    • Optimization: minimize the combined content and style losses with gradient descent, producing an image that keeps the content of the target image in the style of the reference image.
  4. Variational Autoencoders (VAE)

    • How it works: an encoder maps the input image to the parameters of a distribution in latent space, and a decoder reconstructs an image from a latent point.
    • Sampling layer: sample from the latent space to generate new images.
    • Applications: image generation and editing, such as generating MNIST handwritten digits.
  5. Generative Adversarial Networks (GAN)

    • How it works: a generator produces images while a discriminator learns to distinguish real images from generated ones.
    • Training: adversarial training gradually pushes the distribution of generated images toward that of real images.

Key Code and Algorithms

The snippets below assume the chapter's standard imports: import tensorflow as tf and from tensorflow import keras.

1.1 Text generation callback

# Callback that generates sample text from the language model at the end of
# each epoch. `text_vectorization`, `sample_next`, and `tokens_index` (the
# vectorizer, the temperature-based sampler, and the index-to-word mapping)
# are defined elsewhere in the chapter.
class TextGenerator(keras.callbacks.Callback):
    def __init__(self, prompt, generate_length, model_input_length, temperatures=(1.,), print_freq=1):
        self.prompt = prompt                    # Seed text to start generation from
        self.generate_length = generate_length  # Number of words to generate
        self.model_input_length = model_input_length
        self.temperatures = temperatures        # Softmax temperatures to try
        self.print_freq = print_freq

    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % self.print_freq != 0:
            return
        for temperature in self.temperatures:
            print(f"== Generating with temperature {temperature}")
            sentence = self.prompt
            for i in range(self.generate_length):
                # Feed the current text back into the model
                tokenized_sentence = text_vectorization([sentence])
                predictions = self.model(tokenized_sentence)
                # Sample the next token from the distribution at position i,
                # reweighted by the current temperature
                next_token = sample_next(predictions[0, i, :], temperature)
                sampled_token = tokens_index[next_token]
                sentence += " " + sampled_token
            print(sentence)
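
The callback depends on a sample_next helper that is not shown above. Below is a minimal sketch of temperature-based sampling consistent with how the callback uses it; the small clamp guarding against log(0) is an addition for numerical safety.

import numpy as np

def sample_next(predictions, temperature=1.0):
    # Reweight the model's probability distribution: temperature < 1 makes it
    # sharper (more predictable), temperature > 1 flatter (more surprising).
    predictions = np.asarray(predictions).astype("float64")
    predictions = np.log(np.maximum(predictions, 1e-10)) / temperature
    exp_preds = np.exp(predictions)
    predictions = exp_preds / np.sum(exp_preds)
    # Draw a single token index from the reweighted distribution
    probas = np.random.multinomial(1, predictions, 1)
    return np.argmax(probas)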

1.2 DeepDream implementation

def gradient_ascent_step(image, learning_rate):
    # Compute the DeepDream loss and its gradient with respect to the image
    with tf.GradientTape() as tape:
        tape.watch(image)
        loss = compute_loss(image)
    grads = tape.gradient(loss, image)
    # Normalize gradients so the update size is independent of their magnitude
    grads = tf.math.l2_normalize(grads)
    # Gradient *ascent*: move the image in the direction that increases the loss
    image += learning_rate * grads
    return loss, image

def gradient_ascent_loop(image, iterations, learning_rate, max_loss=None):
    for i in range(iterations):
        loss, image = gradient_ascent_step(image, learning_rate)
        # Stop early once the loss passes max_loss to avoid degenerate artifacts
        if max_loss is not None and loss > max_loss:
            break
        print(f"... Loss value at step {i}: {loss:.2f}")
    return image
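
gradient_ascent_step calls a compute_loss that is not shown. Here is a sketch in the spirit of the chapter's InceptionV3 setup, where feature_extractor is assumed to be a model returning a dict of layer activations and layer_settings a dict mapping each layer name to its contribution weight.

def compute_loss(input_image):
    # Extract the activations of the selected layers for the current image
    features = feature_extractor(input_image)
    loss = tf.zeros(shape=())
    for name in features.keys():
        coeff = layer_settings[name]  # How much this layer contributes
        activation = features[name]
        # Trim border pixels to avoid edge artifacts, then reward high activation
        loss += coeff * tf.reduce_mean(tf.square(activation[:, 2:-2, 2:-2, :]))
    return loss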

1.3 Neural style transfer

def compute_loss(combination_image, base_image, style_reference_image):
    # Run all three images through the feature extractor in one batch:
    # index 0 = content image, 1 = style image, 2 = combination image
    input_tensor = tf.concat([base_image, style_reference_image, combination_image], axis=0)
    features = feature_extractor(input_tensor)
    loss = tf.zeros(shape=())
    # Content loss: compare content-layer activations of base and combination images
    layer_features = features[content_layer_name]
    base_image_features = layer_features[0, :, :, :]
    combination_features = layer_features[2, :, :, :]
    loss = loss + content_weight * content_loss(base_image_features, combination_features)
    # Style loss: compare Gram matrices across all style layers
    for layer_name in style_layer_names:
        layer_features = features[layer_name]
        style_reference_features = layer_features[1, :, :, :]
        combination_features = layer_features[2, :, :, :]
        style_loss_value = style_loss(style_reference_features, combination_features)
        loss += (style_weight / len(style_layer_names)) * style_loss_value
    # Total variation loss keeps the generated image spatially smooth
    loss += total_variation_weight * total_variation_loss(combination_image)
    return loss
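
The loss above references content_loss, style_loss, and total_variation_loss helpers that are not shown. A sketch of the standard definitions follows, assuming global img_height and img_width giving the generated image's size.

def gram_matrix(x):
    # The Gram matrix captures correlations between feature channels,
    # which is what "style" means in this context
    x = tf.transpose(x, (2, 0, 1))
    features = tf.reshape(x, (tf.shape(x)[0], -1))
    return tf.matmul(features, tf.transpose(features))

def style_loss(style_img, combination_img):
    S = gram_matrix(style_img)
    C = gram_matrix(combination_img)
    channels = 3
    size = img_height * img_width
    return tf.reduce_sum(tf.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))

def content_loss(base_img, combination_img):
    # Content is preserved by matching high-level activations directly
    return tf.reduce_sum(tf.square(combination_img - base_img))

def total_variation_loss(x):
    # Penalize differences between neighboring pixels to keep the result smooth
    a = tf.square(x[:, : img_height - 1, : img_width - 1, :] - x[:, 1:, : img_width - 1, :])
    b = tf.square(x[:, : img_height - 1, : img_width - 1, :] - x[:, : img_height - 1, 1:, :])
    return tf.reduce_sum(tf.pow(a + b, 1.25))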

1.4 VAE implementation

class VAE(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder
        self.sampler = Sampler()  # Reparameterization-trick layer (see below)
        # Track the three losses as running means across batches
        self.total_loss_tracker = keras.metrics.Mean(name="total_loss")
        self.reconstruction_loss_tracker = keras.metrics.Mean(name="reconstruction_loss")
        self.kl_loss_tracker = keras.metrics.Mean(name="kl_loss")

    def train_step(self, data):
        with tf.GradientTape() as tape:
            # Encode the input into the parameters of a latent Gaussian
            z_mean, z_log_var = self.encoder(data)
            # Sample a latent point and decode it back to image space
            z = self.sampler(z_mean, z_log_var)
            reconstruction = self.decoder(z)
            # Reconstruction loss: per-pixel binary cross-entropy, summed per image
            reconstruction_loss = tf.reduce_mean(
                tf.reduce_sum(
                    keras.losses.binary_crossentropy(data, reconstruction),
                    axis=(1, 2)
                )
            )
            # KL divergence regularizes the latent space toward a standard normal
            kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
            total_loss = reconstruction_loss + tf.reduce_mean(kl_loss)
        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        self.total_loss_tracker.update_state(total_loss)
        self.reconstruction_loss_tracker.update_state(reconstruction_loss)
        self.kl_loss_tracker.update_state(kl_loss)
        return {
            "total_loss": self.total_loss_tracker.result(),
            "reconstruction_loss": self.reconstruction_loss_tracker.result(),
            "kl_loss": self.kl_loss_tracker.result(),
        }
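
The VAE instantiates a Sampler layer that is not shown. A minimal sketch of the reparameterization trick it needs:

from tensorflow.keras import layers

class Sampler(layers.Layer):
    def call(self, z_mean, z_log_var):
        # Reparameterization trick: z = mean + sigma * epsilon with
        # epsilon ~ N(0, I), so gradients can flow through the sampling step
        batch_size = tf.shape(z_mean)[0]
        z_size = tf.shape(z_mean)[1]
        epsilon = tf.random.normal(shape=(batch_size, z_size))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon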

1.5 GAN implementation

class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
        self.d_loss_metric = keras.metrics.Mean(name="d_loss")
        self.g_loss_metric = keras.metrics.Mean(name="g_loss")

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super().compile()
        self.d_optimizer = d_optimizer  # Separate optimizers for the two networks
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    def train_step(self, real_images):
        # --- Discriminator step: learn to tell generated images from real ones ---
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        generated_images = self.generator(random_latent_vectors)
        combined_images = tf.concat([generated_images, real_images], axis=0)
        # Label generated images 1 and real images 0
        labels = tf.concat([tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))], axis=0)
        # Add random noise to the labels, an important trick for training stability
        labels += 0.05 * tf.random.uniform(tf.shape(labels))
        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(zip(grads, self.discriminator.trainable_weights))
        # --- Generator step: try to fool the discriminator ---
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        # "Misleading" labels claim the fakes are real (label 0)
        misleading_labels = tf.zeros((batch_size, 1))
        with tf.GradientTape() as tape:
            predictions = self.discriminator(self.generator(random_latent_vectors))
            g_loss = self.loss_fn(misleading_labels, predictions)
        # Update only the generator's weights, never the discriminator's
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))
        self.d_loss_metric.update_state(d_loss)
        self.g_loss_metric.update_state(g_loss)
        return {"d_loss": self.d_loss_metric.result(), "g_loss": self.g_loss_metric.result()}

Notable Quotes

  1. "The potential of artificial intelligence to emulate human thought processes goes beyond passive tasks such as object recognition and mostly reactive tasks such as driving a car. It extends well into creative activities."
    Note: The promise of generative deep learning is not limited to passive or reactive tasks; it reaches into creative work, which is the theme of this chapter.

  2. "Language models are all form and no substance."
    Note: Deep-learning language models capture the statistical structure of language rather than its underlying meaning; they have no true semantic understanding.

  3. "VAEs result in highly structured, continuous latent representations. For this reason, they work well for doing all sorts of image editing in latent space."
    Note: The continuous latent space that a VAE learns is what makes latent-space image editing possible.

  4. "GANs enable the generation of fairly realistic synthetic images by forcing the generated images to be statistically almost indistinguishable from real ones."
    Note: This is the core idea of adversarial training.

  5. "Training a GAN is a dynamic process rather than a simple gradient descent process with a fixed loss landscape."
    Note: This is why GAN training is hard: the generator and discriminator must stay in balance, and there is no fixed minimum to converge to.

Summary

This chapter presents the core techniques of generative deep learning: text generation, DeepDream, neural style transfer, VAEs, and GANs. Together they provide powerful tools for artistic creation and content generation, and they demonstrate the creative potential of deep learning.
