The Deep Dream Model

Latest recommended article published on 2024-03-19 09:52:12

纸上得来终觉浅～

Category: Image Processing. Tags: image processing, deep learning, tensorflow, inception model, DeepDream

Article link: https://blog.csdn.net/qq_32172681/article/details/91047016

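This article shows how to implement the Deep Dream technique with TensorFlow, generating images by optimizing the activations of a channel in a convolutional layer of the Inception model. It first imports the Inception model, then generates an image by using gradient ascent to maximize the mean of a chosen channel. It then processes the image in tiles to work around memory limits when optimizing large images. It also covers generating higher-quality images, using Gaussian and Laplacian pyramids together with gradient normalization to boost the low-frequency components so that the result looks smoother. Finally, it shows how to start from a background image to produce richer Deep Dream effects.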

Summary generated by CSDN using AI technology.

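Deep Dream is an interesting technique that Google published in 2015. This article generates images by maximizing the mean of one channel of a convolutional layer, and then looks at how to generate larger and higher-quality images.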

1. Importing the Inception model

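The original Deep Dream model only needs to optimize the activations of one channel in a convolutional layer of an ImageNet model, so the first step is to load an ImageNet image-recognition model into TensorFlow. The Inception model is used as the example here. Create a new file, load_inception.py.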

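# Basic imports
from __future__ import print_function  # makes print behave the same on Python 2 and 3
import numpy as np
import tensorflow as tf  # this listing uses the TensorFlow 1.x API

# Create the graph and the session
graph = tf.Graph()
sess = tf.InteractiveSession(graph=graph)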

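TensorFlow supports a special file type with the ".pb" extension: a model can be exported to a pb file in advance and loaded again whenever it is needed. For the Inception model, the pb file is tensorflow_inception_graph.pb, which stores both the Inception network structure and its trained parameters. The following code loads the Inception model into TensorFlow.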

# Load the Inception model definition
file_name = 'tensorflow_inception_graph.pb'
with tf.gfile.GFile(file_name, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# Placeholder for the input image
input_data = tf.placeholder(np.float32, name='input_data')

# Preprocessing: subtract the mean, then add a batch dimension
mean_value = 117.0
data_processed = tf.expand_dims(input_data - mean_value, 0)

# Import the model, feeding the preprocessed image to its 'input' node
tf.import_graph_def(graph_def, {'input': data_processed})

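Why add a dimension? The (height, width, channel) format can only describe a single image, but neural networks are usually fed several images at once during training, so a batch dimension is added at the front. The input then has the format (batch, height, width, channel), which is what the Inception model expects.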

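Why subtract the mean? The Inception model was trained on inputs that had the mean subtracted, so the same preprocessing must be applied here to keep the input consistent. This Inception model subtracts a fixed mean of 117, so the program defines mean_value = 117.0 and subtracts it from input_data.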
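The two preprocessing steps can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not part of the original program: the helper name preprocess and the constant-valued test image are assumptions made for illustration only.

```python
import numpy as np

MEAN_VALUE = 117.0  # same constant as mean_value in the TensorFlow listing


def preprocess(image):
    """Subtract the fixed mean, then add a leading batch axis."""
    centered = image.astype(np.float32) - MEAN_VALUE
    return np.expand_dims(centered, 0)  # (H, W, C) -> (1, H, W, C)


# Hypothetical test image: every pixel equals the mean, so the result is all zeros
image = np.full((224, 224, 3), 117.0, dtype=np.float32)
batch = preprocess(image)
print(batch.shape)  # (1, 224, 224, 3)
```

np.expand_dims plays the same role here as tf.expand_dims in the listing: it adds a batch axis of size 1 so that a single image matches the (batch, height, width, channel) layout.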

2. Generating an image

Create a new file, gen_naive.py, and start by importing the Inception model in the same way as above.

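# Basic imports
from __future__ import print_function  # makes print behave the same on Python 2 and 3
import numpy as np
import tensorflow as tf  # this listing uses the TensorFlow 1.x API
import scipy.misc  # scipy.misc.toimage (used below) requires SciPy < 1.2

# Create the graph and the session
graph = tf.Graph()
sess = tf.InteractiveSession(graph=graph)
# Load the Inception model definition
file_name = 'tensorflow_inception_graph.pb'
with tf.gfile.GFile(file_name, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# Placeholder for the input image
input_data = tf.placeholder(np.float32, name='t_input')
# Preprocessing: subtract the mean, then add a batch dimension
mean_value = 117.0
data_processed = tf.expand_dims(input_data - mean_value, 0)
# Import the model, feeding the preprocessed image to its 'input' node
tf.import_graph_def(graph_def, {'input': data_processed})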

Next, define a function that saves a numpy.ndarray to an image file.

# Save an image
# Note: scipy.misc.toimage was removed in SciPy 1.2, so this needs an older SciPy
def save_image(image_array, image_name):
    scipy.misc.toimage(image_array).save(image_name)
    print("%s saved" % image_name)
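On newer SciPy versions, where scipy.misc.toimage no longer exists, the same job can be done with Pillow. The sketch below is one possible replacement rather than the original author's code: the helper to_uint8 is a hypothetical name, and Pillow is assumed to be installed. It also behaves slightly differently, because toimage stretched the array's value range to 0-255 by default, whereas this version simply clips values to that range.

```python
import numpy as np


def to_uint8(image_array):
    """Clip values to [0, 255] and convert to 8-bit integers."""
    return np.clip(image_array, 0, 255).astype(np.uint8)


def save_image(image_array, image_name):
    """Save an (H, W, 3) array as an image file using Pillow."""
    from PIL import Image  # assumption: Pillow is installed
    Image.fromarray(to_uint8(image_array)).save(image_name)
    print("%s saved" % image_name)


# Small check of the conversion step on a single hypothetical pixel
pixels = to_uint8(np.array([[[-5.0, 100.4, 300.0]]]))
```

Clipping keeps absolute brightness stable between saved images, which may or may not be what you want; to mimic the old behavior, rescale the array to 0-255 before converting.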

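Take the output of the mixed4d_3x3_bottleneck_pre_relu layer and pick any one of its channels as the optimization target. Then define an initial image from a uniform distribution as the starting point of the optimization.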

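# Get the output of the mixed4d_3x3_bottleneck_pre_relu layer; one channel of it is the optimization target
name = 'mixed4d_3x3_bottleneck_pre_relu'
channel = 139  # the layer has 144 channels; one is picked arbitrarily here
layer_output = graph.get_tensor_by_name('import/%s:0' % name)
# Initial input image: uniform noise in [0, 1) shifted up by 100
img_init = np.random.uniform(size=[224, 224, 3]) + 100.0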
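The idea of picking one channel and pushing its mean activation upward can be tried on a toy problem without the Inception model. The sketch below is purely illustrative: it replaces the real convolutional layer with a hypothetical per-pixel linear map (random weights), so the gradient can be written by hand. Only the overall pattern, the mean of one channel as the score and a normalized gradient-ascent step, mirrors the approach described here.

```python
import numpy as np


def channel_score(img, weights, channel):
    """Mean activation of one channel of a toy per-pixel linear 'layer'."""
    activations = img @ weights  # (H, W, 3) @ (3, C) -> (H, W, C)
    return activations[..., channel].mean()


def channel_score_grad(img, weights, channel):
    """Analytic gradient of channel_score with respect to img."""
    h, w, _ = img.shape
    # d(mean) / d img[y, x, k] = weights[k, channel] / (h * w)
    return np.broadcast_to(weights[:, channel] / (h * w), img.shape).copy()


rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 144))  # hypothetical stand-in for a real layer
channel = 139
img = rng.uniform(size=(8, 8, 3)) + 100.0  # noise image, as in img_init

score_before = channel_score(img, weights, channel)
for _ in range(20):
    grad = channel_score_grad(img, weights, channel)
    grad /= grad.std() + 1e-8  # normalize so the step size does not depend on scale
    img += grad * 1.0  # gradient ascent: move the image uphill
score_after = channel_score(img, weights, channel)
```

In the real model the gradient comes from tf.gradients rather than a hand-written formula, but the update applied to the image has the same form.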

Define the training function.

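# Training
def train(t_obj, t_img, iter_n=20, lr=1.0):
    # The objective t_score is the mean of t_obj. The larger t_score is, the higher
    # the mean activation of the chosen channel in the convolutional layer.
    t_score = tf.reduce_mean(t_obj)
    # Gradient of the objective with respect to the input image
    t_grad = tf.gradients(t_score, input_data)[0]

    img = t_img.copy()
    for i in range(iter_n):
        grad, score = sess.run([t_grad, t_score], {input_data: img})
        grad /= grad.std() + 1e-8  # normalize the gradient so the step size is stable
        img += grad * lr  # apply the gradient to the image (gradient ascent)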