Implementing the Style-Transfer Algorithm
The Style-Transfer Process
We use VGG19 as the pretrained model. The main steps are as follows:
- Compute layer activations with the pretrained network: run the style-reference image, the target image, and the generated image through VGG19 and record their layer activations
- Define the loss function: use the layer activations computed on these three images to define the loss function described earlier
- Minimize the loss function: done via gradient descent
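The final step, minimizing a loss by gradient descent, can be illustrated with a toy example. This is only a sketch with a made-up scalar "image" and loss; the actual optimization later operates on the pixels of the generated image:

```python
# Toy illustration of gradient descent: minimize the loss (x - 3)^2,
# whose gradient with respect to x is 2 * (x - 3).
x = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2.0 * (x - 3.0)
    x -= learning_rate * grad

# x has converged toward the minimum at x = 3
print(round(x, 4))
```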
Preparation
Initializing Variables
- Paths to the target image and the style-reference image
- The image height is rescaled to 400 pixels, so that the processed images share similar dimensions, which makes the transfer more convenient
from keras.preprocessing.image import load_img, img_to_array
# Path to the target image
target_image_path = 'T1.jpg'
# Path to the style image
style_reference_image_path = 'S1.jpg'
# Load the image and get its size
width, height = load_img(target_image_path).size
# Fix the rescaled image height at 400
img_height = 400
# Scale the width proportionally along with the height
img_width = int(width * img_height / height)
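For example, with a hypothetical 800x600 source image (the actual dimensions depend on T1.jpg), the width scales with the height so the aspect ratio is preserved:

```python
# Hypothetical source image: 800 pixels wide, 600 pixels tall
width, height = 800, 600
img_height = 400
img_width = int(width * img_height / height)
# 800/600 and 533/400 are both (approximately) 4:3
print(img_width)  # 533
```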
Next we build helper functions for loading, preprocessing, and postprocessing images.
import numpy as np
from keras.applications import vgg19

def preprocess_image(image_path):
    # Load the image and resize it to the target size
    img = load_img(image_path, target_size=(img_height, img_width))
    # Convert it to a numpy array
    img = img_to_array(img)
    # Add a batch dimension
    img = np.expand_dims(img, axis=0)
    # Convert to a tensor suitable as VGG19 input;
    # vgg19.preprocess_input subtracts the ImageNet mean pixel values
    img = vgg19.preprocess_input(img)
    return img
def deprocess_image(x):
    # The inverse of vgg19.preprocess_input:
    # add back the ImageNet mean pixel values
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # BGR -> RGB
    x = x[:, :, ::-1]
    # Clip to [0, 255] so the image displays correctly
    x = np.clip(x, 0, 255).astype('uint8')
    return x
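As a sanity check, here is a numpy-only sketch (the helper names are my own) of the channel flip and mean subtraction that vgg19.preprocess_input performs in its default 'caffe' mode, together with its inverse, showing that the round trip recovers the original pixels:

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by vgg19.preprocess_input
MEANS_BGR = np.array([103.939, 116.779, 123.68])

def preprocess_np(img_rgb):
    # RGB -> BGR, then subtract the channel means
    return img_rgb[:, :, ::-1] - MEANS_BGR

def deprocess_np(x):
    # Inverse: add the means back, then BGR -> RGB
    return (x + MEANS_BGR)[:, :, ::-1]

img = np.random.randint(0, 256, size=(4, 4, 3)).astype('float64')
# The round trip recovers the original values (up to float rounding);
# the real deprocess_image additionally clips to [0, 255] and casts to uint8
assert np.allclose(deprocess_np(preprocess_np(img)), img)
```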
Building the VGG19 Model
We build the VGG19 model with keras.applications, using weights pretrained on ImageNet.
For the input tensor, the target image and the style-reference image are defined as constants (K.constant), since they do not change during optimization. The generated image is the variable that changes over time, so that tensor is defined with a placeholder.
from keras import backend as K

# Define the fixed images as constants
target_image = K.constant(preprocess_image(target_image_path))
style_reference_image = K.constant(preprocess_image(style_reference_image_path))
# Define the generated image with a placeholder
combination_image = K.placeholder((1, img_height, img_width, 3))
# Combine the three images into a single batch
input_tensor = K.concatenate([target_image,
                              style_reference_image,
                              combination_image], axis=0)
# Load the VGG19 model
model = vgg19.VGG19(input_tensor=input_tensor,
                    weights='imagenet',
                    include_top=False)
print('Model loaded')
model.summary()
Model loaded
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, None, None, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, None, None, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, None, None, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, None, None, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, None, None, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, None, None, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, None, None, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, None, None, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, None, None, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, None, None, 256) 590080
_________________________________________________________________
block3_conv4 (Conv2D) (None, None, None, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, None, None, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, None, None, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block4_conv4 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, None, None, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_conv4 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, None, None, 512) 0
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
Implementing the Three Losses
Content Loss
This ensures that the target image and the generated image produce similar results in the top layers of the VGG19 convolutional network.
def content_loss(base, combination):
    return K.sum(K.square(combination - base))
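The content loss is simply the sum of squared differences between two activation maps. A small numpy example with made-up 2x2 "activations" makes the arithmetic concrete:

```python
import numpy as np

# Toy activation maps for the target (base) and generated (combination) images
base = np.array([[1.0, 2.0], [3.0, 4.0]])
combination = np.array([[1.0, 0.0], [3.0, 1.0]])

# Sum of squared differences: (1-1)^2 + (0-2)^2 + (3-3)^2 + (1-4)^2
loss = np.sum(np.square(combination - base))
print(loss)  # 13.0
```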
Style Loss
First compute the Gram matrix of the input, i.e. the inner products of the flattened feature maps, which captures the correlations between the original features; then compare the Gram matrices of the two images.
def gram_matrix(x):
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram

def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_height * img_width
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))
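To see what gram_matrix computes, here is a numpy equivalent (my own sketch): each channel's feature map is flattened into a row, and the matrix of row inner products gives one correlation entry per pair of channels.

```python
import numpy as np

def gram_matrix_np(x):
    # x has shape (height, width, channels); move channels first, then
    # flatten each channel's feature map into one row of `features`
    features = x.transpose(2, 0, 1).reshape(x.shape[2], -1)
    # Inner products of the rows: one entry per channel pair
    return features @ features.T

x = np.random.rand(5, 4, 3)
g = gram_matrix_np(x)
assert g.shape == (3, 3)        # channels x channels
assert np.allclose(g, g.T)      # Gram matrices are symmetric
assert np.all(np.diag(g) >= 0)  # diagonal entries are squared norms
```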
Total Variation Loss
It encourages spatial continuity in the generated image, avoiding overly pixelated results. You can think of it as a regularization loss.
def total_variation_loss(x):
    a = K.square(
        x[:, :img_height - 1, :img_width - 1, :] -
        x[:, 1:, :img_width - 1, :])
    b = K.square(
        x[:, :img_height - 1, :img_width - 1, :] -
        x[:, :img_height - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
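A numpy version of the same computation (my own sketch, with the height and width passed in) shows the intended behavior: a perfectly flat image has zero total variation, while a noisy one does not.

```python
import numpy as np

def total_variation_np(x, h, w):
    # Squared differences between vertically (a) and horizontally (b)
    # adjacent pixels, summed after raising to the power 1.25,
    # mirroring the Keras function above
    a = np.square(x[:, :h - 1, :w - 1, :] - x[:, 1:, :w - 1, :])
    b = np.square(x[:, :h - 1, :w - 1, :] - x[:, :h - 1, 1:, :])
    return np.sum(np.power(a + b, 1.25))

flat = np.ones((1, 4, 4, 3))       # constant image: no variation at all
noisy = np.random.rand(1, 4, 4, 3)
assert total_variation_np(flat, 4, 4) == 0.0
assert total_variation_np(noisy, 4, 4) > 0.0
```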