A neural algorithm of artistic style

imperfect00

Article link: https://blog.csdn.net/u011961856/article/details/77920495

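The goal of the paper is, given a style template, to transfer its style onto an input image. The style transfer uses the conv1 to conv5 layers of VGG. The network structure is as follows: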

[Figure: network structure (image not available)]

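In the figure, the images at the bottom are, from left to right, the style image, the result image (the style-transferred output), and the content image (the image before style transfer). The network on the left makes the style features of the style image and the result image as similar as possible. Style features are represented by the correlations between the feature maps of a conv layer, so the style loss for each conv layer is:
$\begin{aligned}E_l(a,x)=\frac{1}{4N_l^2M_l^2}\sum_{i,j}(G_{ij}^l-A_{ij}^l)^2\end{aligned}$
Here $G^l$ and $A^l$ are the Gram matrices (feature-map correlations) of the result image $x$ and the style image $a$ at layer $l$, $N_l$ is the number of feature maps in that layer, and $M_l$ is the size (height times width) of each feature map.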
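
The per-layer style loss can be sketched in NumPy as follows. This is a minimal illustration, assuming feature maps in the (1, height, width, channels) layout that the TensorFlow code later in this post flattens with `[-1, num_filters]`. The function names `gram_matrix` and `style_layer_loss` and the random inputs are made up for the example and do not come from the paper or the repository.

```python
import numpy as np

def gram_matrix(feature_map):
    """Return the N_l x N_l Gram matrix of a (1, H, W, N_l) feature map."""
    num_filters = feature_map.shape[-1]
    flat = feature_map.reshape(-1, num_filters)  # shape (M_l, N_l)
    return flat.T @ flat

def style_layer_loss(result_features, style_features):
    """E_l = sum_ij (G_ij - A_ij)^2 / (4 * N_l^2 * M_l^2)."""
    num_filters = result_features.shape[-1]               # N_l
    num_positions = result_features.size // num_filters   # M_l
    G = gram_matrix(result_features)
    A = gram_matrix(style_features)
    return np.sum((G - A) ** 2) / (4.0 * num_filters**2 * num_positions**2)

# Random stand-ins for real VGG feature maps (illustrative only).
rng = np.random.default_rng(0)
style = rng.standard_normal((1, 8, 8, 4))
result = rng.standard_normal((1, 8, 8, 4))

loss_same = style_layer_loss(style, style)    # identical features
loss_diff = style_layer_loss(result, style)   # different features
print(loss_same, loss_diff)
```

The loss is zero when both inputs have the same Gram matrix and positive otherwise, which is the quantity the optimization later drives down.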

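The network on the right makes the features of the result image and the content image as similar as possible, that is, the feature maps of the two images should match at each conv layer. With $F^l$ the feature maps of the result image and $P^l$ those of the content image at layer $l$, the content loss for each layer is:
$\begin{aligned}l_{content}(l)=\frac{1}{2}\sum_{i,j}(F_{ij}^l-P_{ij}^l)^2\end{aligned}$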
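
As a small worked example of the content loss, here is a NumPy sketch with toy 2x2 feature values (the numbers and the name `content_layer_loss` are invented for illustration). It computes the same half sum of squared differences that `tf.nn.l2_loss` computes in the code later in this post.

```python
import numpy as np

def content_layer_loss(result_features, content_features):
    """l_content(l) = 1/2 * sum_ij (F_ij - P_ij)^2."""
    diff = result_features - content_features
    return 0.5 * np.sum(diff ** 2)

P = np.array([[1.0, 2.0], [3.0, 4.0]])  # content-image features (toy values)
F = np.array([[1.0, 2.0], [3.0, 6.0]])  # result-image features (toy values)

loss = content_layer_loss(F, P)
print(loss)  # 2.0, since only one entry differs, by 2
```
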
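The paper uses the conv1 to conv5 layers of VGG. Summing the per-layer style losses and content losses over these five layers gives $l_{style}$ and $l_{content}$, and the total loss is their weighted sum:
$\begin{aligned}l_{total}=\alpha\, l_{content}+\beta\, l_{style}\end{aligned}$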
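
A short sketch of how the per-layer losses combine, following the weighting used in the code later in this post (`alpha` on the content term, `beta` on the style term). The per-layer numbers and the weights are made up, and `total_loss` is a hypothetical helper.

```python
def total_loss(content_losses, style_losses, alpha, beta):
    """Weighted sum of the summed per-layer content and style losses."""
    return alpha * sum(content_losses) + beta * sum(style_losses)

# Toy per-layer losses for five conv layers (made-up numbers).
content_losses = [4.0, 2.0, 1.0, 0.5, 0.5]
style_losses = [0.2, 0.1, 0.1, 0.05, 0.05]

loss = total_loss(content_losses, style_losses, alpha=1.0, beta=100.0)
print(loss)
```

The ratio of the two weights controls how strongly the result leans toward the content image or toward the style image.
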
Training the network and solving for the result image:

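Ordinary CNN optimization uses supervised training to optimize the parameters of each layer, such as w and b. In this paper the optimized parameters are no longer the network's w and b. Instead, the result image is initialized as a noise image x (it can also be initialized to the content image), and gradient descent is used to find the result image that minimizes the loss. That image is the final style transfer result.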
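
The idea of optimizing the image instead of the weights can be illustrated with a toy problem. In this sketch a fixed random matrix `W` stands in for the pretrained network (an assumption made only to keep the example small; the paper uses VGG), and plain gradient descent updates the pixels `x` so that their features approach those of a toy content image.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))     # fixed "network" weights, never updated
W_initial = W.copy()
content = rng.standard_normal(4)    # toy content image with 4 pixels
P = W @ content                     # target features of the content image

x = rng.standard_normal(4)          # result image, initialized as noise
initial_loss = 0.5 * np.sum((W @ x - P) ** 2)

learning_rate = 0.02
for _ in range(5000):
    residual = W @ x - P            # F - P
    grad_x = W.T @ residual         # gradient of the loss w.r.t. the image
    x -= learning_rate * grad_x     # update the image, not the weights

final_loss = 0.5 * np.sum((W @ x - P) ** 2)
weights_unchanged = np.array_equal(W, W_initial)
print(initial_loss, final_loss, weights_unchanged)
```

Only `x` changes during the loop. The weights play the role of a fixed feature extractor, which is the same arrangement as in the style transfer setup.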

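After reading the paper, the key points to understand are:

The correlations between feature maps can represent the style features of an image;
During optimization, the parameters being optimized are no longer the network's w and b. Minimizing the loss updates the output image itself;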
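
One way to see why feature-map correlations describe style rather than layout: the Gram matrix sums over all spatial positions, so it does not change when the positions are shuffled. The sketch below checks this on random stand-in features (the shapes and values are arbitrary assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((16, 3))    # 16 spatial positions, 3 channels
shuffled = features[rng.permutation(16)]   # same responses, scrambled layout

gram = features.T @ features               # 3 x 3 channel correlations
gram_shuffled = shuffled.T @ shuffled

same_gram = np.allclose(gram, gram_shuffled)
same_layout = np.allclose(features, shuffled)
print(same_gram, same_layout)
```

The spatial arrangement differs, yet the channel correlations are identical, so matching Gram matrices constrains texture statistics without pinning down where things appear.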

With the principle covered, the code is easy to follow. Only the main code is posted here to aid understanding:
https://github.com/woodrush/neural-art-tf/blob/master/neural-art-tf.py

import tensorflow as tf
import numpy as np
from models import VGG16, I2V
from utils import read_image, save_image, parseArgs, getModel, add_mean
import argparse

import time
content_image_path, style_image_path, params_path, modeltype, width, alpha, beta, num_iters, device, args = parseArgs()

# The actual calculation
print("Read images...")
content_image = read_image(content_image_path, width)
style_image   = read_image(style_image_path, width)
g = tf.Graph()
with g.device(device), g.as_default(), tf.Session(graph=g, config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print("Load content values...")
    image = tf.constant(content_image)  # compute the feature maps of the content image
    model = getModel(image, params_path, modeltype)
    content_image_y_val = [sess.run(y_l) for y_l in model.y()]  # sess.run(y_l) is a constant numpy array

    print("Load style values...")
    image = tf.constant(style_image)
    model = getModel(image, params_path, modeltype)
    y = model.y()  # feature maps of the style image
    style_image_st_val = []
    for l in range(len(y)):
        num_filters = content_image_y_val[l].shape[3]
        st_shape = [-1, num_filters]
        st_ = tf.reshape(y[l], st_shape)
        st = tf.matmul(tf.transpose(st_), st_)  # Gram matrix: style features of the style image at layer l
        style_image_st_val.append(sess.run(st))  # sess.run(st) is a constant numpy array

    print("Construct graph...")
    # Start from white noise
    # gen_image = tf.Variable(tf.truncated_normal(content_image.shape, stddev=20), trainable=True, name='gen_image')
    # Start from the original image
    gen_image = tf.Variable(tf.constant(np.array(content_image, dtype=np.float32)), trainable=True, name='gen_image')  # initialize the result image to the content image
    model = getModel(gen_image, params_path, modeltype)
    y = model.y()
    L_content = 0.0
    L_style   = 0.0
    for l in range(len(y)):
        # Content loss
        L_content += model.alpha[l]*tf.nn.l2_loss(y[l] - content_image_y_val[l])  # feature loss between the result image and the content image
        # Style loss
        num_filters = content_image_y_val[l].shape[3]
        st_shape = [-1, num_filters]
        st_ = tf.reshape(y[l], st_shape)
        st = tf.matmul(tf.transpose(st_), st_)  # Gram matrix: style features of the result image at layer l
        N = np.prod(content_image_y_val[l].shape).astype(np.float32)
        L_style += model.beta[l]*tf.nn.l2_loss(st - style_image_st_val[l])/N**2/len(y)  # style loss between the result image and the style image
    # The loss
    L = alpha * L_content + beta * L_style  # total loss

    # The optimizer: gradient-based optimization (Adam) of the result image
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(learning_rate=2.0, global_step=global_step, decay_steps=100, decay_rate=0.94, staircase=True)
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(L, global_step=global_step)
    # A more simple optimizer
    # train_step = tf.train.AdamOptimizer(learning_rate=2.0).minimize(L)

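    print("Start calculation...")
    # The optimizer has variables that require initialization as well
    # (later TensorFlow 1.x releases name this tf.global_variables_initializer())
    sess.run(tf.initialize_all_variables())
    for i in range(num_iters):
        if i % 10 == 0:
            # Save the current result image and report both losses every 10 iterations
            gen_image_val = sess.run(gen_image)
            save_image(gen_image_val, i, args.out_dir)
            L_content_val, L_style_val = sess.run([L_content, L_style])
            print("L_content, L_style: %s %s" % (L_content_val, L_style_val))
        print("Iter: %d" % i)
        sess.run(train_step)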
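
The learning-rate schedule in the code above can be reproduced in plain Python. With `staircase=True`, `tf.train.exponential_decay` multiplies the base rate by `decay_rate` once every `decay_steps` steps. The helper name `staircase_decay` is made up for this sketch, and its defaults copy the values passed in the code.

```python
def staircase_decay(step, base_rate=2.0, decay_steps=100, decay_rate=0.94):
    """Learning rate after `step` updates with staircase exponential decay."""
    return base_rate * decay_rate ** (step // decay_steps)

# The rate stays constant within each block of 100 steps, then drops by 6%.
for step in (0, 99, 100, 250, 1000):
    print(step, staircase_decay(step))
```

The rate therefore starts at 2.0 and shrinks slowly, which takes large steps on the image early on and smaller ones as the result settles.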
