Neural Style Transfer

Latest recommended article published on 2023-08-03 15:03:27

深山里的菜叶子


Category: Deep Learning. Tags: deep learning, tensorflow

Article link: https://blog.csdn.net/m0_57292867/article/details/125602668


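References:

Deep Learning Project: Style Transfer (codingfishc's blog, CSDN): https://blog.csdn.net/weixin_38189929/article/details/96882894

Summary from that post: Neural Style Transfer (NST) is one of the most interesting techniques in deep learning. It merges two images, a "content" image (C) and a "style" image (S), to create a "generated" image (G). The generated image G combines the "content" of image C with image S...

A Simple Example of Training a Neural Network (Python Implementation on TensorFlow) (Jaster_wisdom's blog, CSDN): https://blog.csdn.net/Jaster_wisdom/article/details/78018653

While working on the neural style transfer assignment I consulted many posts online, but in practice I kept running into errors, and many of those posts had problems of their own. After a series of fixes I finally got the assignment working, so I am writing up this summary for later review.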

Environment: Python 3.6; TensorFlow 1.14.0; SciPy 1.2.1; NumPy 1.19.5

Goals:

Implement the neural style transfer algorithm
Use the algorithm to generate new artistic images

1. Import packages

import os
import sys
# Data input/output (for example, loading .mat files)
import scipy.io
# Conversion between arrays and images
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import nst_utils
import numpy as np
import tensorflow as tf

%matplotlib inline

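SciPy is a high-level scientific computing library organized into task-specific submodules that cover, for example, interpolation, optimization, image processing, and statistics. It is closely tied to NumPy: SciPy depends on NumPy and generally operates on NumPy arrays.

Its functionality includes optimization, linear algebra, integration, interpolation, curve fitting, special functions, fast Fourier transforms, signal processing, image processing, and ordinary differential equation solvers.

Use cases: SciPy is a scientific computing toolkit used in mathematics, science, and engineering.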

2. Load the VGG model (load parameters from the pretrained VGG model)

model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
print(model)

3. Build the neural style transfer algorithm

content_image = scipy.misc.imread("images/louvre.jpg")
imshow(content_image)

Result:

4. Compute the content cost

def compute_content_cost(a_C, a_G):
    # Get the dimensions of a_G: (batch size, height, width, channels)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    # Unroll a_C and a_G from 4-D (1, n_H, n_W, n_C) to 2-D (n_H*n_W, n_C)
    a_C_unrolled = tf.reshape(a_C, [n_H*n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [n_H*n_W, n_C])
    # Content cost: scaled sum of squared activation differences
    J_content = 1./(4 * n_H * n_W *n_C)*tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled)))

    return J_content

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_content = compute_content_cost(a_C, a_G)
    print("J_content = " + str(J_content.eval()))

Result:

J_content = 6.7655926
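To make the content cost formula easier to check by hand, here is a minimal NumPy sketch of the same computation. The name `content_cost_np` and the toy inputs are illustrative assumptions, not part of the original assignment code.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """NumPy version of the content cost for activations shaped (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Sum of squared differences, scaled by 1 / (4 * n_H * n_W * n_C)
    return np.sum((a_C - a_G) ** 2) / (4.0 * n_H * n_W * n_C)

# Toy activations (hypothetical values chosen so the result is easy to verify)
a_C = np.ones((1, 2, 2, 3))
a_G = np.zeros((1, 2, 2, 3))

print(content_cost_np(a_C, a_G))  # 12 squared differences of 1 -> 12 / 48 = 0.25
print(content_cost_np(a_C, a_C))  # identical activations -> 0.0
```

The cost is zero only when the generated activations match the content activations exactly, which is why minimizing it pulls G toward the content of C.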

5. Compute the style cost: view the style image

style_image = scipy.misc.imread("images/stone_style.jpg")
imshow(style_image)

Style image:

6. Compute the style matrix (Gram matrix)

def gram_matrix(A):
    # A has shape (n_C, n_H*n_W); the Gram matrix A * A^T has shape (n_C, n_C)
    GA = tf.matmul(A, tf.transpose(A))
    return GA

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    A = tf.random_normal([3, 2*1], mean=1, stddev=4)
    GA = gram_matrix(A)

    print("GA = " + str(GA.eval()))

Result:

GA = [[ 6.422305 -4.429122 -2.096682]
 [-4.429122 19.465837 19.563871]
 [-2.096682 19.563871 20.686462]]
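The Gram matrix can also be checked on a small hand-computable input. The following is a minimal NumPy sketch; `gram_matrix_np` and the example matrix are illustrative assumptions. Entry (i, j) is the dot product of rows i and j, so the result is symmetric and its diagonal holds each row's squared norm.

```python
import numpy as np

def gram_matrix_np(A):
    """Gram matrix of A with shape (n_C, n_H * n_W): G = A A^T."""
    return A @ A.T

# Three "channels", each with two unrolled spatial positions (toy values)
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [0.0, 1.0]])

G = gram_matrix_np(A)
print(G)
# [[ 5. 11.  2.]
#  [11. 25.  4.]
#  [ 2.  4.  1.]]
```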

7. Compute the style cost (weighted sum over several hidden layers)

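# STYLE_LAYERS is used in step 9 but never defined in this post. The layer names
# and equal weights below are an assumption, following the values commonly used
# with this VGG-19 assignment; adjust them as needed.
STYLE_LAYERS = [('conv1_1', 0.2), ('conv2_1', 0.2), ('conv3_1', 0.2),
                ('conv4_1', 0.2), ('conv5_1', 0.2)]

def compute_style_cost(model, STYLE_LAYERS):
    # Relies on the global session `sess` and on compute_layer_style_cost,
    # both defined in step 9; call it after the style image has been assigned
    # to model["input"].
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:
        # Output tensor of the selected layer
        out = model[layer_name]
        # Style activations, evaluated now on the style image
        a_S = sess.run(out)
        # Generated-image activations stay symbolic and are evaluated later
        a_G = out
        J_style_layer = compute_layer_style_cost(a_S, a_G)
        # Add this layer's cost, weighted by its coefficient
        J_style += coeff * J_style_layer

    return J_style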

8. Define the total cost to optimize

def total_cost(J_content, J_style, alpha = 10, beta = 40):
    # alpha weights the content cost, beta weights the style cost
    J = alpha * J_content + beta * J_style
    return J

tf.reset_default_graph()

with tf.Session() as test:
    np.random.seed(3)
    J_content = np.random.randn()    
    J_style = np.random.randn()
    J = total_cost(J_content, J_style)
    print("J = " + str(J))

Result:

J = 35.34667875478276
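The value above can be reproduced without TensorFlow, since the test only draws two random scalars and combines them. This is a small sketch; `total_cost_np` is an illustrative name, and the second call uses made-up inputs to show how `alpha` and `beta` trade content against style.

```python
import numpy as np

def total_cost_np(J_content, J_style, alpha=10, beta=40):
    """Weighted sum of the content and style costs."""
    return alpha * J_content + beta * J_style

# Same seed and draw order as the test in the post
np.random.seed(3)
J_content = np.random.randn()
J_style = np.random.randn()
print(total_cost_np(J_content, J_style))

# Hypothetical costs: 10 * 1.0 + 40 * 0.5 = 30.0
print(total_cost_np(1.0, 0.5))
```

With the defaults, one unit of style cost counts four times as much as one unit of content cost.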

9. Solve the optimization problem

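# Start an interactive session
tf.reset_default_graph()
sess = tf.InteractiveSession()

# Load, reshape, and normalize the "content" image
content_image = scipy.misc.imread("images/louvre.jpg")
content_image = reshape_and_normalize_image(content_image)

# Load, reshape, and normalize the "style" image
style_image = scipy.misc.imread("images/stone_style.jpg")
style_image = reshape_and_normalize_image(style_image)

# Initialize the generated image as random noise based on the content image
generated_image = generate_noise_image(content_image)
imshow(generated_image[0])

# Reload the VGG model: tf.reset_default_graph() above discarded the graph
# that held the model loaded in step 2
model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")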
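The helpers `reshape_and_normalize_image` and `generate_noise_image` come from `nst_utils`, which this post does not show. As a rough sketch of what such helpers typically do, assuming ImageNet channel-mean subtraction, uniform noise in [-20, 20], and a noise ratio of 0.6 (all assumptions, as are the `_sketch` function names), the preprocessing might look like this:

```python
import numpy as np

# Assumed per-channel (RGB) means for VGG preprocessing; a stand-in for the
# configuration inside nst_utils, which is not shown in the post.
MEANS = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))

def reshape_and_normalize_sketch(image):
    """Add a batch dimension and subtract the channel means."""
    image = np.reshape(image, (1,) + image.shape)
    return image - MEANS

def generate_noise_image_sketch(content_image, noise_ratio=0.6, seed=None):
    """Blend uniform noise with the content image (noise_ratio is an assumed default)."""
    rng = np.random.RandomState(seed)
    noise = rng.uniform(-20, 20, content_image.shape).astype("float32")
    return noise * noise_ratio + content_image * (1 - noise_ratio)

# A flat gray 4x5 RGB image stands in for a real photo
rgb = np.full((4, 5, 3), 128.0)
content = reshape_and_normalize_sketch(rgb)
generated = generate_noise_image_sketch(content, seed=0)
print(content.shape, generated.shape)  # (1, 4, 5, 3) (1, 4, 5, 3)
```

Starting from noise that is still correlated with the content image, rather than from pure noise, tends to help the generated image match the content more quickly.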

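# Feed the content image to the model and pick conv4_2 as the content layer
sess.run(model['input'].assign(content_image))
out = model['conv4_2']
# a_C: content activations, evaluated now as a fixed array
a_C = sess.run(out)
# a_G: the same layer's tensor, evaluated later on the generated image
a_G = out
J_content = compute_content_cost(a_C, a_G)

# Feed the style image to the model before computing the style cost
sess.run(model["input"].assign(style_image))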

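def compute_layer_style_cost(a_S, a_G):
    # Style cost of a single hidden layer
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll to (n_C, n_H*n_W) so that each row holds one channel
    a_S = tf.transpose(tf.reshape(a_S, [n_H * n_W, n_C]))
    a_G = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))

    # Gram matrices of the style and generated activations
    GS = gram_matrix(a_S)
    GG = gram_matrix(a_G)

    # Scaled sum of squared differences between the two Gram matrices
    J_style_layer = 1/(4*n_C*n_C*n_H*n_H*n_W*n_W)*tf.reduce_sum(tf.square(tf.subtract(GS, GG)))

    return J_style_layer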
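The per-layer style cost can be sketched in NumPy to see what it measures. The function name `layer_style_cost_np` and the inputs are illustrative assumptions. The second example flips an activation map spatially: the Gram matrix sums over all positions, so the style cost stays (numerically) zero even though the arrangement changed.

```python
import numpy as np

def layer_style_cost_np(a_S, a_G):
    """Style cost of one layer for activations shaped (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll to (n_C, n_H * n_W): one row per channel
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    # Gram matrices capture channel-to-channel correlations
    GS = S @ S.T
    GG = G @ G.T
    return np.sum((GS - GG) ** 2) / (4.0 * n_C ** 2 * (n_H * n_W) ** 2)

# Hand-checkable case: GS is a 2x2 matrix of 4s, GG is all zeros
a_S = np.ones((1, 2, 2, 2))
a_G = np.zeros((1, 2, 2, 2))
print(layer_style_cost_np(a_S, a_G))  # 64 / 256 = 0.25

# Spatial rearrangement does not change the Gram matrix
rng = np.random.RandomState(0)
x = rng.rand(1, 3, 3, 2)
x_flipped = x[:, ::-1, :, :]
print(layer_style_cost_np(x, x_flipped))  # approximately 0
```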

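# Style cost (evaluated with the style image as input) and total cost
J_style = compute_style_cost(model, STYLE_LAYERS)
J = total_cost(J_content, J_style, alpha = 10, beta = 40)

# Define the optimizer with a learning rate of 2.0
optimizer = tf.train.AdamOptimizer(2.0)

# Define the training objective: minimize the total cost
train_step = optimizer.minimize(J)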

def model_nn(sess, input_image, num_iterations = 200, is_print_info = True,
             is_plot = True, is_save_process_image = True,
             save_last_image_to = "output1/generated_image1.jpg"):
    # Initialize global variables (is_plot is accepted but not used here)
    sess.run(tf.global_variables_initializer())
    # Feed the initial noisy image to the model input
    sess.run(model["input"].assign(input_image))

    for i in range(num_iterations):
        # One optimizer step; it updates the pixels of the input image
        sess.run(train_step)

        # Read back the current generated image
        generated_image = sess.run(model["input"])

        if is_print_info and i % 20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + ":" +
                  "  total cost = " + str(Jt) +
                  "  content cost = " + str(Jc) +
                  "  style cost = " + str(Js))
        if is_save_process_image:
            # Saves an image on every iteration; the "output/" directory should already exist
            nst_utils.save_image("output/" + str(i) + ".png", generated_image)

    # The directory of save_last_image_to should already exist as well
    nst_utils.save_image(save_last_image_to, generated_image)

    return generated_image


# CPU-only run, about 25-30 min
generated_image = model_nn(sess, generated_image)
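The key idea in the loop above is that the optimizer updates the pixels of the input image, not the network weights. The toy sketch below shows that idea in isolation under strong simplifying assumptions: the "features" are the pixels themselves rather than VGG activations, only a content-style cost is used, and plain gradient descent with a hand-derived gradient replaces Adam. All names and values are illustrative.

```python
import numpy as np

def toy_cost_and_grad(G, C):
    """Content-like cost on raw pixels and its gradient with respect to G."""
    n = G.size
    diff = G - C
    cost = np.sum(diff ** 2) / (4.0 * n)
    grad = diff / (2.0 * n)  # derivative of the cost above with respect to G
    return cost, grad

rng = np.random.RandomState(0)
C = rng.rand(1, 4, 4, 3)   # stand-in for the content target
G = rng.rand(1, 4, 4, 3)   # "generated image", initialized as noise

learning_rate = 20.0        # arbitrary value that is stable for this toy cost
history = []
for i in range(50):
    cost, grad = toy_cost_and_grad(G, C)
    history.append(cost)
    # Update the image itself; there are no weights to train here
    G = G - learning_rate * grad

print("first cost:", history[0])
print("last cost:", history[-1])
```

In the real setup the gradient flows back through the frozen VGG layers to the input image, and the style term is included, but the update target is the same: the image.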

10. Results