Deep Learning with Python, Chapter 12: Generative Deep Learning
Overview
Chapter 12 explores generative deep learning: text generation, DeepDream, neural style transfer, variational autoencoders (VAEs), and generative adversarial networks (GANs). It shows how these techniques can be applied to creative work, from generating new text to transferring image styles and synthesizing entirely new images.
Main Topics
- Text generation
  - Sequence generation: use a recurrent neural network (RNN) or a Transformer to predict the next word or character in a sequence.
  - Sampling strategy: control the randomness of the generated text by adjusting the softmax temperature (see the sketch after this list).
  - Language models: train a model to predict the next token in a sequence, then generate new text by repeatedly sampling from its predictions.
- DeepDream
  - How it works: run a convolutional neural network "in reverse", maximizing the activations of chosen layers to produce dream-like, artistic images.
  - Implementation: start from a pretrained InceptionV3 model and apply gradient ascent to the input image to maximize those activations.
- Neural style transfer
  - Content and style losses: extract content and style features of images with a pretrained convolutional network (such as VGG19).
  - Optimization: minimize the combined content and style losses via gradient descent, producing a new image that keeps the target image's content in the style of the reference image.
- Variational autoencoders (VAEs)
  - How they work: an encoder maps an input image to the parameters of a distribution over a latent space; a decoder reconstructs the image from that space.
  - Sampling layer: new images are generated by sampling points from the latent space.
  - Applications: image generation and editing, for example generating MNIST handwritten digits.
- Generative adversarial networks (GANs)
  - How they work: a generator produces images while a discriminator tries to tell real images from generated ones.
  - Training: adversarial training gradually pushes the generator's output distribution toward the distribution of real images.
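To make the softmax-temperature point in the text-generation item concrete, here is a minimal sketch of distribution reweighting in the spirit of the chapter's reweight_distribution: divide the log-probabilities by a temperature and renormalize. A higher temperature flattens the distribution (more surprising samples); a lower one sharpens it (more predictable samples).

import numpy as np

def reweight_distribution(original_distribution, temperature=0.5):
    # original_distribution: 1D array of probabilities summing to 1.
    distribution = np.log(original_distribution) / temperature
    distribution = np.exp(distribution)
    # Renormalize so the reweighted probabilities sum to 1 again.
    return distribution / np.sum(distribution)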
Key Code and Algorithms
1.1 Text-Generation Callback
# Shared imports assumed by all code listings in this section.
import tensorflow as tf
from tensorflow import keras

class TextGenerator(keras.callbacks.Callback):
    def __init__(self, prompt, generate_length, model_input_length,
                 temperatures=(1.,), print_freq=1):
        self.prompt = prompt                        # Seed text to start generation from
        self.generate_length = generate_length      # Number of tokens to generate
        self.model_input_length = model_input_length
        self.temperatures = temperatures            # Sampling temperatures to try
        self.print_freq = print_freq

    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % self.print_freq != 0:
            return
        for temperature in self.temperatures:
            print(f"== Generating with temperature {temperature}")
            sentence = self.prompt
            for i in range(self.generate_length):
                # text_vectorization and tokens_index are defined elsewhere
                # in the chapter (see the sketch after this listing).
                tokenized_sentence = text_vectorization([sentence])
                predictions = self.model(tokenized_sentence)
                # Pass the current temperature so the outer loop takes effect.
                next_token = sample_next(predictions[0, i, :], temperature)
                sampled_token = tokens_index[next_token]
                sentence += " " + sampled_token
            print(sentence)
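The callback above depends on helpers defined elsewhere in the chapter: a fitted text_vectorization layer, a tokens_index mapping from token IDs back to words (e.g., built from text_vectorization.get_vocabulary()), and a sample_next function. A plausible sketch of the sampler, assuming predictions is a softmax probability vector:

import numpy as np

def sample_next(predictions, temperature=1.0):
    # Reweight the distribution by the temperature, then renormalize.
    predictions = np.asarray(predictions).astype("float64")
    predictions = np.log(predictions) / temperature
    exp_preds = np.exp(predictions)
    predictions = exp_preds / np.sum(exp_preds)
    # Draw a single token index from the reweighted distribution.
    probas = np.random.multinomial(1, predictions, 1)
    return np.argmax(probas)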
1.2 DeepDream Implementation
def gradient_ascent_step(image, learning_rate):
    with tf.GradientTape() as tape:
        tape.watch(image)                   # image is a plain tensor, so watch it explicitly
        loss = compute_loss(image)          # How strongly the chosen layers activate
    grads = tape.gradient(loss, image)
    grads = tf.math.l2_normalize(grads)     # Normalize gradients for stable steps
    image += learning_rate * grads          # Gradient *ascent*: increase the loss
    return loss, image

def gradient_ascent_loop(image, iterations, learning_rate, max_loss=None):
    for i in range(iterations):
        loss, image = gradient_ascent_step(image, learning_rate)
        if max_loss is not None and loss > max_loss:
            break                           # Stop before artifacts get too strong
        print(f"... Loss value at step {i}: {float(loss):.2f}")
    return image
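The compute_loss called by gradient_ascent_step is defined separately in the chapter. A minimal sketch, assuming feature_extractor is a Keras model that returns a dict of InceptionV3 layer activations and layer_settings maps those layer names to contribution weights:

def compute_loss(input_image):
    features = feature_extractor(input_image)
    losses = []
    for name in features.keys():
        coeff = layer_settings[name]        # Weight of this layer's contribution
        activation = features[name]
        # Exclude border pixels to avoid edge artifacts in the loss.
        loss = coeff * tf.reduce_mean(tf.square(activation[:, 2:-2, 2:-2, :]))
        losses.append(loss)
    return tf.reduce_sum(losses)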
1.3 Neural Style Transfer
def compute_loss(combination_image, base_image, style_reference_image):
    # Run base, style, and combination images through the feature extractor
    # in one batch: index 0 = base, 1 = style reference, 2 = combination.
    input_tensor = tf.concat(
        [base_image, style_reference_image, combination_image], axis=0)
    features = feature_extractor(input_tensor)
    loss = tf.zeros(shape=())
    # Content loss: compare base and combination features at one high-level layer.
    layer_features = features[content_layer_name]
    base_image_features = layer_features[0, :, :, :]
    combination_features = layer_features[2, :, :, :]
    loss = loss + content_weight * content_loss(
        base_image_features, combination_features)
    # Style loss: compare style-reference and combination features across layers.
    for layer_name in style_layer_names:
        layer_features = features[layer_name]
        style_reference_features = layer_features[1, :, :, :]
        combination_features = layer_features[2, :, :, :]
        style_loss_value = style_loss(style_reference_features, combination_features)
        loss += (style_weight / len(style_layer_names)) * style_loss_value
    # Total variation loss keeps the generated image spatially smooth.
    loss += total_variation_weight * total_variation_loss(combination_image)
    return loss
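The three component losses used above are defined separately in the chapter. Sketches of the standard formulations, assuming img_height and img_width globals hold the working image size: content loss is a squared feature distance, style loss compares Gram matrices (channel-wise feature correlations), and total variation loss penalizes differences between neighboring pixels to keep the result smooth.

def gram_matrix(x):
    # Correlations between feature channels of a (height, width, channels) map.
    x = tf.transpose(x, (2, 0, 1))
    features = tf.reshape(x, (tf.shape(x)[0], -1))
    return tf.matmul(features, tf.transpose(features))

def content_loss(base_img, combination_img):
    return tf.reduce_sum(tf.square(combination_img - base_img))

def style_loss(style_img, combination_img):
    S = gram_matrix(style_img)
    C = gram_matrix(combination_img)
    channels = 3
    size = img_height * img_width
    return tf.reduce_sum(tf.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))

def total_variation_loss(x):
    a = tf.square(x[:, : img_height - 1, : img_width - 1, :] -
                  x[:, 1:, : img_width - 1, :])
    b = tf.square(x[:, : img_height - 1, : img_width - 1, :] -
                  x[:, : img_height - 1, 1:, :])
    return tf.reduce_sum(tf.pow(a + b, 1.25))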
1.4 VAE Implementation
class VAE(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder
        self.sampler = Sampler()   # Draws z from the predicted latent distribution
        self.total_loss_tracker = keras.metrics.Mean(name="total_loss")
        self.reconstruction_loss_tracker = keras.metrics.Mean(
            name="reconstruction_loss")
        self.kl_loss_tracker = keras.metrics.Mean(name="kl_loss")

    @property
    def metrics(self):
        # Listing the trackers here lets Keras reset them at each epoch.
        return [self.total_loss_tracker,
                self.reconstruction_loss_tracker,
                self.kl_loss_tracker]

    def train_step(self, data):
        with tf.GradientTape() as tape:
            # The encoder maps inputs to the parameters of a latent distribution.
            z_mean, z_log_var = self.encoder(data)
            z = self.sampler(z_mean, z_log_var)
            reconstruction = self.decoder(z)
            # Reconstruction loss: summed over spatial dims, averaged over the batch.
            reconstruction_loss = tf.reduce_mean(
                tf.reduce_sum(
                    keras.losses.binary_crossentropy(data, reconstruction),
                    axis=(1, 2)
                )
            )
            # KL divergence regularizes the latent space toward a standard normal.
            kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
            total_loss = reconstruction_loss + tf.reduce_mean(kl_loss)
        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        self.total_loss_tracker.update_state(total_loss)
        self.reconstruction_loss_tracker.update_state(reconstruction_loss)
        self.kl_loss_tracker.update_state(kl_loss)
        return {
            "total_loss": self.total_loss_tracker.result(),
            "reconstruction_loss": self.reconstruction_loss_tracker.result(),
            "kl_loss": self.kl_loss_tracker.result(),
        }
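The Sampler used in __init__ implements the reparameterization trick: z is computed as z_mean + exp(0.5 * z_log_var) * epsilon with epsilon drawn from a standard normal, so the sampling step stays differentiable with respect to the encoder's outputs. A minimal sketch:

class Sampler(keras.layers.Layer):
    def call(self, z_mean, z_log_var):
        batch_size = tf.shape(z_mean)[0]
        z_size = tf.shape(z_mean)[1]
        # The randomness lives in epsilon, outside the learned parameters,
        # so gradients can flow through z_mean and z_log_var.
        epsilon = tf.random.normal(shape=(batch_size, z_size))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon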
1.5 GAN Implementation
class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
        self.d_loss_metric = keras.metrics.Mean(name="d_loss")
        self.g_loss_metric = keras.metrics.Mean(name="g_loss")

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super().compile()
        self.d_optimizer = d_optimizer   # Separate optimizers for the two networks
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    @property
    def metrics(self):
        return [self.d_loss_metric, self.g_loss_metric]

    def train_step(self, real_images):
        # --- Train the discriminator ---
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(
            shape=(batch_size, self.latent_dim))
        generated_images = self.generator(random_latent_vectors)
        combined_images = tf.concat([generated_images, real_images], axis=0)
        # Label generated images 1 and real images 0.
        labels = tf.concat(
            [tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))], axis=0)
        # Add random noise to the labels (an important stabilization trick).
        labels += 0.05 * tf.random.uniform(tf.shape(labels))
        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(
            zip(grads, self.discriminator.trainable_weights))

        # --- Train the generator: gradients are taken only with respect to
        # the generator's weights, so the discriminator is not updated here. ---
        random_latent_vectors = tf.random.normal(
            shape=(batch_size, self.latent_dim))
        # "Misleading" labels claim all generated images are real.
        misleading_labels = tf.zeros((batch_size, 1))
        with tf.GradientTape() as tape:
            predictions = self.discriminator(
                self.generator(random_latent_vectors))
            g_loss = self.loss_fn(misleading_labels, predictions)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(
            zip(grads, self.generator.trainable_weights))

        self.d_loss_metric.update_state(d_loss)
        self.g_loss_metric.update_state(g_loss)
        return {"d_loss": self.d_loss_metric.result(),
                "g_loss": self.g_loss_metric.result()}
Notable Quotes
- "The potential of artificial intelligence to emulate human thought processes goes beyond passive tasks such as object recognition and mostly reactive tasks such as driving a car. It extends well into creative activities."
  - Why it matters: generative deep learning applies not just to passive or reactive tasks but to creative work.
- "Language models are all form and no substance."
  - Why it matters: deep-learning language models capture the statistical structure of language rather than its meaning; they have no true semantic understanding.
- "VAEs result in highly structured, continuous latent representations. For this reason, they work well for doing all sorts of image editing in latent space."
  - Why it matters: because a VAE learns a continuous, structured latent space, images can be edited by moving through that space.
- "GANs enable the generation of fairly realistic synthetic images by forcing the generated images to be statistically almost indistinguishable from real ones."
  - Why it matters: this is the core idea of GANs: adversarial pressure makes generated images realistic.
- "Training a GAN is a dynamic process rather than a simple gradient descent process with a fixed loss landscape."
  - Why it matters: GAN training requires balancing the generator and the discriminator, which makes it notably harder than ordinary optimization.
Summary
This chapter covers the core techniques of generative deep learning: text generation, DeepDream, neural style transfer, VAEs, and GANs. Together they provide powerful tools for artistic creation and content generation, and they demonstrate deep learning's potential in creative domains.