(Personal) Deep-Learning-Based Style Transfer for Traditional Chinese Images: Innovation Training, Week 2 (1)

ImageStyleTransfer

Category: CNN. Tags: Tensorflow

Article link: https://blog.csdn.net/buhuijavaaaa/article/details/79948492

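This article describes a hands-on implementation of deep-learning style transfer for traditional images using Tensorflow. It covers: using the COCO dataset, building the training network, using VGG16 as the loss network, using ReLU as the activation function, defining the style and content loss functions, using a batch size of 50, and using total_variation_loss to measure how much the image varies. With these steps, the training process was implemented successfully.

(Summary generated automatically by CSDN.)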

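The main task this week was to implement the method used in the paper.

The training data is Microsoft's COCO dataset, which is about 17 GB.

The first step is to build the training network.

1. A yml file is used for input, which makes it quick to supply the settings. The yml file must contain the information about the style image whose style is to be transferred, including the image's location, size, and so on: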

style_image: img/shuimo.jpg 
naming: "shuimo" 
model_path: models
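
To illustrate how such a file could be turned into settings, here is a minimal sketch that reads flat `key: value` lines into a dictionary. The function name `parse_flat_yaml` is hypothetical and the parser only handles the flat layout shown above; a real project would more likely rely on a YAML library.

```python
def parse_flat_yaml(text):
    """Parse flat `key: value` lines into a dict (not a full YAML parser)."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        # Drop surrounding whitespace and optional single or double quotes.
        config[key.strip()] = value.strip().strip("\"'")
    return config


example = """
style_image: img/shuimo.jpg
naming: "shuimo"
model_path: models
"""
config = parse_flat_yaml(example)
print(config["style_image"], config["naming"], config["model_path"])
```

Keeping the style image path and the model name in one file means a new style can be trained by editing the configuration only, without touching the training code.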

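2. Following the paper, VGG16 is used as the loss network. Training images are fed in and passed through the convolutional layers as described in the paper, and the output at layer 4_1 gives the best representation of style. Note that the code below is not VGG16 itself; it appears to be the image transformation network whose output is scored by the loss network. It consists of three convolutional layers, five residual blocks, and three upsampling/convolution layers: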

with tf.variable_scope('conv1'):
    conv1 = relu(instance_norm(conv2d(image, 3, 32, 9, 1)))
with tf.variable_scope('conv2'):
    conv2 = relu(instance_norm(conv2d(conv1, 32, 64, 3, 2)))
with tf.variable_scope('conv3'):
    conv3 = relu(instance_norm(conv2d(conv2, 64, 128, 3, 2)))
with tf.variable_scope('res1'):
    res1 = residual(conv3, 128, 3, 1)
with tf.variable_scope('res2'):
    res2 = residual(res1, 128, 3, 1)
with tf.variable_scope('res3'):
    res3 = residual(res2, 128, 3, 1)
with tf.variable_scope('res4'):
    res4 = residual(res3, 128, 3, 1)
with tf.variable_scope('res5'):
    res5 = residual(res4, 128, 3, 1)
# print(res5.get_shape())
with tf.variable_scope('deconv1'):
    # deconv1 = relu(instance_norm(conv2d_transpose(res5, 128, 64, 3, 2)))
    deconv1 = relu(instance_norm(resize_conv2d(res5, 128, 64, 3, 2, training)))
with tf.variable_scope('deconv2'):
    # deconv2 = relu(instance_norm(conv2d_transpose(deconv1, 64, 32, 3, 2)))
    deconv2 = relu(instance_norm(resize_conv2d(deconv1, 64, 32, 3, 2, training)))
with tf.variable_scope('deconv3'):
    # deconv_test = relu(instance_norm(conv2d(deconv2, 32, 32, 2, 1)))
    deconv3 = tf.nn.tanh(instance_norm(conv2d(deconv2, 32, 3, 9, 1)))

y = (deconv3 + 1) * 127.5

# Remove border effect reducing padding.
height = tf.shape(y)[1]
width = tf.shape(y)[2]
y = tf.slice(y, [0, 10, 10, 0], tf.stack([-1, height - 20, width - 20, -1]))
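
The following sketch traces one spatial dimension through the layers above and shows how the final tanh output is rescaled to pixel values. It rests on assumptions that the listing does not confirm: every convolution uses 'SAME'-style padding, each `resize_conv2d` with stride 2 doubles the size, and the input was padded by 10 pixels per side before `conv1` (the padding step is not shown, but the final crop of 20 pixels suggests it). The helper names are hypothetical.

```python
import math


def same_conv(size, stride):
    """Spatial size after a convolution with 'SAME' padding."""
    return math.ceil(size / stride)


def trace_sizes(input_size, pad=10):
    """Follow one spatial dimension through the layers listed above."""
    sizes = {"padded": input_size + 2 * pad}       # assumed 10-pixel pad per side
    sizes["conv1"] = same_conv(sizes["padded"], 1)
    sizes["conv2"] = same_conv(sizes["conv1"], 2)
    sizes["conv3"] = same_conv(sizes["conv2"], 2)  # res1..res5 keep this size
    sizes["deconv1"] = sizes["conv3"] * 2          # assumed 2x upsampling
    sizes["deconv2"] = sizes["deconv1"] * 2
    sizes["deconv3"] = same_conv(sizes["deconv2"], 1)
    sizes["cropped"] = sizes["deconv3"] - 2 * pad  # the tf.slice step
    return sizes


def tanh_to_pixel(t):
    """Map a tanh output in [-1, 1] to a pixel value in [0, 255]."""
    return (t + 1) * 127.5


sizes = trace_sizes(256)
print(sizes)
print(tanh_to_pixel(-1.0), tanh_to_pixel(0.0), tanh_to_pixel(1.0))
```

Under these assumptions a 256-pixel side is processed at 69 pixels inside the residual blocks and comes back out at 256 pixels after the crop, so the stylized image has the same size as the input.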

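3. ReLU is used as the activation function:

def relu(input):
    relu = tf.nn.relu(input)
    # NaN is never equal to itself, so tf.equal(relu, relu) is False exactly
    # where the activation is NaN; those entries are replaced with zeros.
    nan_to_zero = tf.where(tf.equal(relu, relu), relu, tf.zeros_like(relu))
    return nan_to_zero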
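
The self-equality test used in that function can be tried out in plain Python. This is only an illustration of the trick, not the TensorFlow code path: `relu_keep_nan` is a hypothetical helper that deliberately lets NaN through so that the guard has something to catch.

```python
import math


def relu_keep_nan(v):
    """ReLU that passes NaN through, to mimic a NaN reaching the output."""
    return v if (v != v or v > 0) else 0.0


def nan_to_zero(v):
    """Replace NaN with 0.0; NaN is the only float not equal to itself."""
    return v if v == v else 0.0


inputs = [-2.0, 0.5, math.nan]
outputs = [nan_to_zero(relu_keep_nan(v)) for v in inputs]
print(outputs)  # [0.0, 0.5, 0.0]
```

Replacing NaN with zero keeps a single bad activation from propagating through the rest of the network, although it also hides whatever produced the NaN in the first place.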

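4. The batch size is set to 50, so the network's weights and biases are updated once every 50 images. The fragment below is part of a normalization function; it computes the mean and variance of the current batch and folds them into moving averages:

# Learnable shift and scale, one value per channel.
beta = tf.Variable(tf.zeros([size]), name='beta')
scale = tf.Variable(tf.ones([size]), name='scale')
# Running (population) statistics, updated during training.
pop_mean = tf.Variable(tf.zeros([size]))
pop_var = tf.Variable(tf.ones([size]))
epsilon = 1e-3

# Per-channel mean and variance over the batch, height and width axes.
batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2])
# Exponential moving averages of the batch statistics.
train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay))
train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay))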
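
The moving-average update in the last two lines can be worked through with plain numbers. This is a minimal sketch; the function name and the decay value of 0.9 are assumptions chosen for illustration, since the fragment does not show how `decay` is set.

```python
def update_moving_average(pop_value, batch_value, decay):
    """One step of the exponential moving average used for pop_mean/pop_var."""
    return pop_value * decay + batch_value * (1 - decay)


pop_mean = 0.0  # same starting point as tf.zeros above
history = []
for batch_mean in [10.0, 10.0, 10.0]:
    pop_mean = update_moving_average(pop_mean, batch_mean, decay=0.9)
    history.append(round(pop_mean, 3))
print(history)
```

With three batches whose mean is 10, the running value moves to roughly 1.0, 1.9 and 2.71: a decay close to 1 makes the population statistics change slowly, which smooths out noise from individual batches.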

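5. Next, the loss functions for the style image and the content image are defined. Because the fast style transfer method is used, the two are computed separately: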

def style_loss(endpoints_dict, style_features_t, style_layers):
    style_loss = 0
    style_loss_summary = {}
    for style_gram, layer in zip(style_features_t, style_layers):
        generated_images, _ = tf.split(endpoints_dict[layer], 2, 0)
        size = tf.size(generated_images)
        layer_style_loss = tf.nn.l2_loss(gram(generated_images) - style_gram) * 2 / tf.to_float(size)
        style_loss_summary[layer] = layer_style_loss
        style_loss += layer_style_loss
    return style_loss, style_loss_summary

def content_loss(endpoints_dict, content_layers):
    content_loss = 0
    for layer in content_layers:
        generated_images, content_images = tf.split(endpoints_dict[layer], 2, 0)
        size = tf.size(generated_images)
        content_loss += tf.nn.l2_loss(generated_images - content_images) * 2 / tf.to_float(size)  # remain the same as in the paper
    return content_loss
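
The `gram` helper called in `style_loss` is not shown in this post. The sketch below gives one plausible version in plain Python together with the per-layer arithmetic of both losses, for a single image. The normalization of the Gram matrix by (positions × channels) is an assumption, and all helper names other than `gram` are hypothetical.

```python
def gram(features):
    """Gram matrix of one feature map.

    `features` is a list of spatial positions, each a list of C channel
    values. Dividing by (positions * channels) is an assumed normalization.
    """
    n, c = len(features), len(features[0])
    return [
        [sum(pos[i] * pos[j] for pos in features) / (n * c) for j in range(c)]
        for i in range(c)
    ]


def flatten(matrix):
    return [v for row in matrix for v in row]


def sum_squared_diff(a, b):
    """Same value as tf.nn.l2_loss(a - b) * 2, since l2_loss is sum(t**2) / 2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def style_layer_loss(generated, style_gram):
    size = len(generated) * len(generated[0])  # tf.size of the feature map
    return sum_squared_diff(flatten(gram(generated)), flatten(style_gram)) / size


def content_layer_loss(generated, content):
    size = len(generated) * len(generated[0])
    return sum_squared_diff(flatten(generated), flatten(content)) / size


generated = [[1.0, 0.0], [0.0, 2.0]]  # 2 positions, 2 channels
style = [[1.0, 1.0], [1.0, 1.0]]
content = [[1.0, 0.0], [0.0, 0.0]]

print(gram(generated))
print(style_layer_loss(generated, gram(style)))
print(content_layer_loss(generated, content))
```

The Gram matrix discards where a feature occurs and keeps only how strongly channels co-occur, which is why it is compared for style, while the content loss compares the feature maps position by position.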

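6. Following the paper, the two loss functions are constrained together with an additional term, the total variation loss:

def total_variation_loss(layer):
    shape = tf.shape(layer)
    height = shape[1]
    width = shape[2]
    # Differences between vertically adjacent pixels (rows r and r + 1).
    y = tf.slice(layer, [0, 0, 0, 0], tf.stack([-1, height - 1, -1, -1])) - tf.slice(layer, [0, 1, 0, 0], [-1, -1, -1, -1])
    # Differences between horizontally adjacent pixels (columns c and c + 1).
    x = tf.slice(layer, [0, 0, 0, 0], tf.stack([-1, -1, width - 1, -1])) - tf.slice(layer, [0, 0, 1, 0], [-1, -1, -1, -1])
    loss = tf.nn.l2_loss(x) / tf.to_float(tf.size(x)) + tf.nn.l2_loss(y) / tf.to_float(tf.size(y))
    return loss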
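
To make the total variation arithmetic concrete, here is a plain-Python sketch of the same computation for a single-channel image stored as rows of floats. The function names are hypothetical, and the image must be at least 2×2 so that both difference lists are non-empty.

```python
def half_sum_of_squares(values):
    """Same value as tf.nn.l2_loss: sum(v ** 2) / 2."""
    return sum(v * v for v in values) / 2


def total_variation(image):
    """Total variation loss of a single-channel image (list of rows)."""
    height, width = len(image), len(image[0])
    # Differences between vertically adjacent pixels.
    dy = [image[r][c] - image[r + 1][c]
          for r in range(height - 1) for c in range(width)]
    # Differences between horizontally adjacent pixels.
    dx = [image[r][c] - image[r][c + 1]
          for r in range(height) for c in range(width - 1)]
    return (half_sum_of_squares(dx) / len(dx)
            + half_sum_of_squares(dy) / len(dy))


flat = [[1.0, 1.0], [1.0, 1.0]]  # no variation at all
edge = [[0.0, 1.0], [0.0, 1.0]]  # one vertical edge
print(total_variation(flat))  # 0.0
print(total_variation(edge))  # 0.5
```

A constant image scores zero and any sharp edge raises the value, so adding this term to the objective nudges the generated image toward spatial smoothness.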

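7. The network is now essentially complete. The next step is to feed in the training-set images. Because the images in the training set are numbered, they can be read in order directly. Images are decoded with TensorFlow's built-in decode_png method, or with decode_jpeg for files that are not PNG: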

def get_image(path, height, width, preprocess_fn):
    png = path.lower().endswith('png')
    img_bytes = tf.read_file(path)
    image = tf.image.decode_png(img_bytes, channels=3) if png else tf.image.decode_jpeg(img_bytes, channels=3)
    return preprocess_fn(image, height, width)
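
The extension check in `get_image` can be isolated to see which decoder each file name would get. This is a small sketch with a hypothetical helper name; it mirrors only the branching logic and does not read any files.

```python
def pick_decoder(path):
    """Return the name of the decoder that get_image would select for `path`."""
    return "decode_png" if path.lower().endswith("png") else "decode_jpeg"


for name in ["img/shuimo.jpg", "photo.PNG", "scan.jpeg"]:
    print(name, "->", pick_decoder(name))
```

The check is case-insensitive, and every name that does not end in `png` falls through to the JPEG decoder, so files in other formats would need their own branch.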