A neural algorithm of artistic style

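The goal of the paper is to transfer the style of a given style template onto an input image. Style transfer uses conv layers 1 to 5 of VGG; the network structure is as follows: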

[Figure: network structure for style transfer]

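In the figure, the images along the bottom are, from left to right, the style image, the style-transferred result image, and the untransformed content image. The network on the left makes the style features of the style image and the result image as similar as possible. Style features are represented here by the correlations between the feature maps of a conv layer, which gives the following style loss for each conv layer: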
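$$E_l(a, x) = \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left(G^l_{ij} - A^l_{ij}\right)^2$$
where $G^l$ and $A^l$ are the Gram matrices (feature-map correlations) of the result image $x$ and the style image $a$ at layer $l$, $N_l$ is the number of feature maps in that layer, and $M_l$ is the size (height times width) of each feature map.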
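As a rough illustration of the per-layer style loss, the NumPy sketch below builds a Gram matrix from a feature map and compares two of them. The function names `gram_matrix` and `style_layer_loss` and the (height, width, channels) layout are assumptions made for this example; they do not come from the paper or the linked repository.

```python
import numpy as np


def gram_matrix(features):
    """Return the N_l x N_l Gram matrix of a (H, W, N_l) feature map."""
    h, w, n = features.shape
    flat = features.reshape(h * w, n)  # rows: M_l positions, columns: N_l filters
    return flat.T @ flat


def style_layer_loss(style_features, result_features):
    """Sum of squared Gram differences divided by 4 * N_l^2 * M_l^2."""
    h, w, n = result_features.shape
    m = h * w
    a = gram_matrix(style_features)   # Gram matrix of the style image
    g = gram_matrix(result_features)  # Gram matrix of the result image
    return np.sum((g - a) ** 2) / (4.0 * n ** 2 * m ** 2)
```

Because the Gram matrix sums over all spatial positions, it discards where a feature occurs and keeps only which features occur together, which is why it works as a style descriptor.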

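The network on the right makes the features of the result image and the content image as similar as possible, that is, the feature maps of each conv layer of the two images should be as close as possible. The content loss for each layer is therefore: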
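$$L_{\text{content}}(l) = \frac{1}{2} \sum_{i,j} \left(F^l_{ij} - P^l_{ij}\right)^2$$
where $F^l$ and $P^l$ are the feature maps of the result image and the content image at layer $l$.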
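The per-layer content loss is simple enough to write directly. This is a minimal NumPy sketch; `content_layer_loss` is a hypothetical name chosen for the example. It computes the same quantity as `tf.nn.l2_loss` applied to the difference of the two feature maps, which is half the sum of squares.

```python
import numpy as np


def content_layer_loss(content_features, result_features):
    """Half the sum of squared differences between two feature maps."""
    diff = result_features - content_features
    return 0.5 * np.sum(diff ** 2)
```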
The paper uses conv1 through conv5 of VGG. Adding up loss_style and loss_content over the 5 layers gives the total loss:
$$L_{\text{total}} = \alpha L_{\text{content}} + \beta L_{\text{style}}$$
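A minimal sketch of how the per-layer terms combine, assuming (as in the code further down) that `alpha` scales the content term and `beta` scales the style term. `total_loss` and its list arguments are hypothetical names used only for this example.

```python
def total_loss(content_losses, style_losses, alpha, beta):
    """Weighted sum of the per-layer content and style losses."""
    return alpha * sum(content_losses) + beta * sum(style_losses)
```

The ratio of `alpha` to `beta` decides whether the result stays closer to the content image or to the style image.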
Training the network and solving for the result image:

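Ordinary CNN optimization uses supervised training to optimize the parameters of each layer, such as w and b. In this paper the optimized parameters are no longer the network's w and b. Instead, the result image is initialized as a noise image x (it can also be initialized to the content image), and gradient descent finds the result image that minimizes the loss. That image is the final style-transferred result.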
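To make "optimize the image, not the weights" concrete, here is a toy NumPy sketch. It assumes a single fixed linear layer `W` as a stand-in for VGG and uses only a content-style squared loss; `optimize_input`, `W`, and the sample vectors are all hypothetical and chosen for illustration.

```python
import numpy as np


def optimize_input(weights, target_features, x_init, lr=0.5, steps=100):
    """Gradient descent on the input x; `weights` is never modified."""
    x = x_init.astype(np.float64).copy()
    for _ in range(steps):
        residual = weights @ x - target_features  # F(x) - P
        grad_x = weights.T @ residual  # gradient of 0.5 * ||F(x) - P||^2 w.r.t. x
        x -= lr * grad_x
    return x


# Hypothetical fixed "network": one linear layer standing in for VGG.
W = np.array([[1.0, 0.5], [0.0, 1.0], [0.5, 0.0]])
content = np.array([3.0, -1.0])
P = W @ content  # target features computed from the "content image"
result = optimize_input(W, P, x_init=np.zeros(2))
```

The weights act as a fixed feature extractor and the gradient flows all the way back to the input. In this toy case the input converges to `content` because `W` has full column rank.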

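After reading the paper, I think the key points to understand are:

  • The correlations between feature maps can represent the style features of an image;

  • During optimization, the parameters being updated are no longer the network's w and b; minimizing the loss updates the output image itself.

With the principle explained, the code is easy to follow. Only the main code is posted here to aid understanding: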
https://github.com/woodrush/neural-art-tf/blob/master/neural-art-tf.py

import tensorflow as tf
import numpy as np
from models import VGG16, I2V
from utils import read_image, save_image, parseArgs, getModel, add_mean
import argparse

import time
content_image_path, style_image_path, params_path, modeltype, width, alpha, beta, num_iters, device, args = parseArgs()

# The actual calculation
print "Read images..."
content_image = read_image(content_image_path, width)
style_image   = read_image(style_image_path, width)
g = tf.Graph()
with g.device(device), g.as_default(), tf.Session(graph=g, config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print "Load content values..."
    image = tf.constant(content_image)  # input used to get the content image's feature maps
    model = getModel(image, params_path, modeltype)
    content_image_y_val = [sess.run(y_l) for y_l in model.y()]  # sess.run(y_l) is a constant numpy array

    print "Load style values..."
    image = tf.constant(style_image)
    model = getModel(image, params_path, modeltype)
    y = model.y()  # feature maps of the style image
    style_image_st_val = []
    for l in range(len(y)):
        num_filters = content_image_y_val[l].shape[3]
        st_shape = [-1, num_filters]
        st_ = tf.reshape(y[l], st_shape)
        st = tf.matmul(tf.transpose(st_), st_)  # style features (Gram matrix) of each layer of the style image
        style_image_st_val.append(sess.run(st))  # sess.run(st) is a constant numpy array

    print "Construct graph..."
    # Start from white noise
    # gen_image = tf.Variable(tf.truncated_normal(content_image.shape, stddev=20), trainable=True, name='gen_image')
    # Start from the original image
    gen_image = tf.Variable(tf.constant(np.array(content_image, dtype=np.float32)), trainable=True, name='gen_image')  # initialize the result image to the content image
    model = getModel(gen_image, params_path, modeltype)
    y = model.y()
    L_content = 0.0
    L_style   = 0.0
    for l in range(len(y)):
        # Content loss
        L_content += model.alpha[l]*tf.nn.l2_loss(y[l] - content_image_y_val[l])  # content loss between the result image and the content image
        # Style loss
        num_filters = content_image_y_val[l].shape[3]
        st_shape = [-1, num_filters]
        st_ = tf.reshape(y[l], st_shape)
        st = tf.matmul(tf.transpose(st_), st_)  # style features (Gram matrix) of each layer of the result image
        N = np.prod(content_image_y_val[l].shape).astype(np.float32)
        L_style += model.beta[l]*tf.nn.l2_loss(st - style_image_st_val[l])/N**2/len(y)  # style loss between the result image and the style image
    # The loss
    L = alpha * L_content + beta * L_style  # total loss

    # The optimizer: gradient descent on the result image
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(learning_rate=2.0, global_step=global_step, decay_steps=100, decay_rate=0.94, staircase=True)
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(L, global_step=global_step)
    # A more simple optimizer
    # train_step = tf.train.AdamOptimizer(learning_rate=2.0).minimize(L)

    print "Start calculation..."
    # The optimizer has variables that require initialization as well
    sess.run(tf.initialize_all_variables())
    for i in range(num_iters):
        if i % 10 == 0:
            gen_image_val = sess.run(gen_image)
            save_image(gen_image_val, i, args.out_dir)
            print "L_content, L_style:", sess.run(L_content), sess.run(L_style)
        print "Iter:", i
        sess.run(train_step)
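
The learning-rate schedule in the code above can be worked out by hand. With `staircase=True`, `tf.train.exponential_decay` multiplies the base rate by `decay_rate` once every `decay_steps` steps. The sketch below reproduces that formula in plain Python; `staircase_learning_rate` is a hypothetical helper, with defaults copied from the values used in the code.

```python
def staircase_learning_rate(step, base_lr=2.0, decay_steps=100, decay_rate=0.94):
    """Learning rate after `step` updates under staircase exponential decay."""
    return base_lr * decay_rate ** (step // decay_steps)
```

For example, the rate stays at 2.0 for the first 100 iterations and then drops to 2.0 * 0.94 = 1.88 for the next 100.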