Andrew Ng Deep Learning, deeplearning.ai (4-4): Programming Assignments

墨水河刘能


Article link: https://blog.csdn.net/weixin_47440593/article/details/108843488

This post follows He Kuan's blog post: https://blog.csdn.net/u013733326/article/details/80767079

First, here is the code for the first programming assignment.

from keras.models import Sequential
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.models import Model
from keras.layers.normalization import BatchNormalization
from keras.layers.pooling import MaxPooling2D, AveragePooling2D
from keras.layers.merge import Concatenate
from keras.layers.core import Lambda, Flatten, Dense
from keras.initializers import glorot_uniform
from keras.engine.topology import Layer
from keras import backend as K
#------------ Optional: used to draw the model details --------------#
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model
#------------------------------------------------#

K.set_image_data_format('channels_first')

import time
import cv2
import os
import numpy as np
from numpy import genfromtxt
import pandas as pd
import tensorflow as tf
import fr_utils
from inception_blocks_v2 import *
import matplotlib.pyplot as plt
plt.show()
import sys
np.set_printoptions(threshold=sys.maxsize)  # print arrays in full; newer NumPy versions reject threshold=np.nan

# Build the model
FRmodel = faceRecoModel(input_shape=(3,96,96))
# Print the total number of parameters
print("Total params: " + str(FRmodel.count_params()))


def triplet_loss(y_true, y_pred, alpha=0.2):
    """
    Implement the triplet loss defined by formula (4).

    Arguments:
        y_true -- true labels; Keras requires this argument when you define a loss function, but it is not used here.
        y_pred -- list containing:
            anchor -- encodings of the "anchor" images, shape (None,128)
            positive -- encodings of the "positive" images, shape (None,128)
            negative -- encodings of the "negative" images, shape (None,128)
        alpha -- hyperparameter, the margin

    Returns:
        loss -- real number, the value of the loss
    """
    # Get the encodings of anchor, positive, and negative
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]

    # Step 1: squared distance between the "anchor" and "positive" encodings; sum over axis=-1
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)

    # Step 2: squared distance between the "anchor" and "negative" encodings; sum over axis=-1
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)

    # Step 3: subtract the two distances and add alpha
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)

    # Take the maximum with zero and sum over the training examples
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0))

    return loss


with tf.Session() as test:
    tf.set_random_seed(1)
    y_true = (None, None, None)
    y_pred = (tf.random_normal([3, 128], mean=6, stddev=0.1, seed=1),
              tf.random_normal([3, 128], mean=1, stddev=1, seed=1),
              tf.random_normal([3, 128], mean=3, stddev=4, seed=1))
    loss = triplet_loss(y_true, y_pred)

    print("loss = " + str(loss.eval()))
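
To make the arithmetic of the triplet loss concrete, here is a minimal NumPy sketch of the same formula, sum over the batch of max(||a - p||² - ||a - n||² + alpha, 0). The function name `triplet_loss_np` and the 2-D toy encodings are assumptions made for illustration; the real model works with 128-dimensional encodings.

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.2):
    """Sum over the batch of max(||a - p||^2 - ||a - n||^2 + alpha, 0)."""
    pos_dist = np.sum(np.square(anchor - positive), axis=-1)  # squared distance anchor-positive
    neg_dist = np.sum(np.square(anchor - negative), axis=-1)  # squared distance anchor-negative
    basic_loss = pos_dist - neg_dist + alpha
    return float(np.sum(np.maximum(basic_loss, 0.0)))

# Two hypothetical triplets with 2-D encodings (illustrative values only).
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.0, 1.0], [1.0, 1.0]])
negative = np.array([[0.0, 0.5], [3.0, 3.0]])

loss = triplet_loss_np(anchor, positive, negative)
print(loss)  # first triplet contributes 1 - 0.25 + 0.2; the second is clipped to 0
```

The second triplet already satisfies the margin (the negative is far from the anchor), so only the first one contributes to the loss.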


# Start time
start_time = time.process_time()

# Compile the model
FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])

# Load the pretrained weights
fr_utils.load_weights_from_FaceNet(FRmodel)

# End time
end_time = time.process_time()

# Elapsed time in seconds
elapsed = end_time - start_time

print("Elapsed: " + str(int(elapsed / 60)) + " min " + str(int(elapsed % 60)) + " s")

Output:

Total params: 3743280
loss = 528.1432
Elapsed: 1 min 33 s

Next:

def img_to_encoding(image_path, model):
    img1 = cv2.imread(image_path, 1)
    img = img1[...,::-1]
    img = np.around(np.transpose(img, (2,0,1))/255.0, decimals=12)
    x_train = np.array([img])
    embedding = model.predict_on_batch(x_train)
    return embedding
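
The preprocessing inside img_to_encoding can be tried without OpenCV or the model. The sketch below repeats the three array steps (BGR to RGB, channels first, scaling to [0, 1]) on a synthetic image; the name `preprocess_face` and the all-blue test image are assumptions for illustration.

```python
import numpy as np

def preprocess_face(img_bgr):
    """Turn an (H, W, 3) BGR uint8 image into a (1, 3, H, W) float batch in [0, 1]."""
    img_rgb = img_bgr[..., ::-1]                 # reverse the channel axis: BGR -> RGB
    img_chw = np.transpose(img_rgb, (2, 0, 1))   # (H, W, C) -> (C, H, W), i.e. channels first
    img_scaled = np.around(img_chw / 255.0, decimals=12)
    return np.array([img_scaled])                # add the batch dimension

# Synthetic 96x96 image that is pure blue in BGR order
# (a hypothetical stand-in for what cv2.imread returns).
fake_bgr = np.zeros((96, 96, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255

batch = preprocess_face(fake_bgr)
print(batch.shape)        # shape expected by a channels-first model
print(batch[0, :, 0, 0])  # RGB values of the top-left pixel
```

After the channel flip, the blue value sits in the last channel, which matches the 'channels_first' RGB layout set with K.set_image_data_format.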

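Face verification

We build a database holding the encoding vector of every person allowed to enter. The encodings are generated with fr_utils.img_to_encoding(image_path, model), which runs a forward pass of the model on the given image.
The database is a dictionary that maps each person's name to the 128-dimensional encoding of their face.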

database = {}
database["danielle"] = img_to_encoding("images/danielle.png", FRmodel)
database["younes"] = img_to_encoding("images/younes.jpg", FRmodel)
database["tian"] = img_to_encoding("images/tian.jpg", FRmodel)
database["andrew"] = img_to_encoding("images/andrew.jpg", FRmodel)
database["kian"] = img_to_encoding("images/kian.jpg", FRmodel)
database["dan"] = img_to_encoding("images/dan.jpg", FRmodel)
database["sebastiano"] = img_to_encoding("images/sebastiano.jpg", FRmodel)
database["bertrand"] = img_to_encoding("images/bertrand.jpg", FRmodel)
database["kevin"] = img_to_encoding("images/kevin.jpg", FRmodel)
database["felix"] = img_to_encoding("images/felix.jpg", FRmodel)
database["benoit"] = img_to_encoding("images/benoit.jpg", FRmodel)
database["arnaud"] = img_to_encoding("images/arnaud.jpg", FRmodel)

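Now, when someone shows up at your door and swipes their ID card, you can look up their encoding in the database and use it to check whether the person at the door matches the name on the card.

We now implement verify() to check whether the camera picture (image_path) matches the name on the ID card. It takes these steps:

1. Compute the encoding of the image at image_path.
2. Compute the distance between that encoding and the encoding of the claimed identity stored in the database.
3. Open the door if the distance is less than 0.7; otherwise keep it closed.

As noted above, the distance is the L2 norm (np.linalg.norm). (Note: this implementation compares the L2 distance itself, not its square, with the 0.7 threshold.)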

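def verify(image_path, identity, database, model):
    """
    Verify that the encoding of "image_path" matches the encoding of "identity".

    Arguments:
        image_path -- path to the camera picture
        identity -- string, name of the person to verify
        database -- dictionary mapping member names to their encodings
        model -- an instance of the Keras model

    Returns:
        dist -- distance between the encoding of the camera picture and the encoding stored in the database
        is_door_open -- boolean, whether the door should open
    """
    # Step 1: compute the encoding of the image with fr_utils.img_to_encoding()
    encoding = fr_utils.img_to_encoding(image_path, model)

    # Step 2: compute the distance to the encoding stored in the database
    dist = np.linalg.norm(encoding - database[identity])

    # Step 3: decide whether to open the door
    if dist < 0.7:
        print("Welcome home, " + str(identity) + "!")
        is_door_open = True
    else:
        print("Verification failed: you are not " + str(identity) + "!")
        is_door_open = False

    return dist, is_door_open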
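
The decision rule in verify() is just a thresholded L2 distance, so it can be checked without the model. The following is a minimal sketch under stated assumptions: `verify_encoding`, `toy_db`, and the 3-D encodings are hypothetical and stand in for the 128-D encodings produced by FRmodel.

```python
import numpy as np

def verify_encoding(encoding, identity, database, threshold=0.7):
    """Return (distance, accepted) for a 1:1 check against one stored encoding."""
    dist = float(np.linalg.norm(encoding - database[identity]))
    return dist, dist < threshold

# Hypothetical 3-D encodings (illustrative values only).
toy_db = {
    "younes": np.array([[0.6, 0.0, 0.8]]),
    "kian": np.array([[0.0, 1.0, 0.0]]),
}
camera = np.array([[0.6, 0.3, 0.8]])

dist_ok, accepted = verify_encoding(camera, "younes", toy_db)
dist_bad, rejected = verify_encoding(camera, "kian", toy_db)
print(dist_ok, accepted)
print(dist_bad, rejected)
```

The same camera encoding is accepted for the identity it is close to and rejected for the other one, which is exactly the 1:1 check that verify() performs.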

Younes is now at the door, and the camera has saved his picture as "images/camera_0.jpg". Let's verify him:

verify("images/camera_0.jpg","younes",database,FRmodel)

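Benoit has been banned from the house and his entry has been removed from the database. He stole Kian's ID card and is trying to get through the door. Let's see whether he can get in: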
```
verify("images/camera_2.jpg", "kian", database, FRmodel)
```

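Face recognition

The face verification system mostly works well, but after Kian's ID card was stolen, he could not get in when he came back that night! To reduce this kind of mischief, you want to upgrade the face verification system to a face recognition system. People will no longer need to carry an ID card: an authorized person just walks up to the house and the front door opens for them!

We will implement a face recognition system that takes an image as input and determines whether it shows one of the authorized people (and if so, who). Unlike the face verification system, it does not take a person's name as part of the input.

We now implement who_is_it(), which takes these steps:

1. Compute the encoding of the image at image_path.
2. Find the database encoding with the smallest distance to the target encoding.
- Initialize min_dist to a large enough number (100); it tracks the distance to the closest encoding found so far.
- Loop over the names and encodings in the database, for example with for (name, db_enc) in database.items().
  - Compute the L2 distance between the target encoding and the current database encoding.
  - If the distance is less than min_dist, store it in min_dist and store the name in identity.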

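def who_is_it(image_path, database, model):
    """
    Run face recognition on the given image.

    Arguments:
        image_path -- path to the image
        database -- dictionary mapping names to encodings
        model -- an instance of the Keras model

    Returns:
        min_dist -- the smallest distance between the image encoding and the database encodings
        identity -- string, the name whose encoding is at distance min_dist (None if no entry was closer than the initial min_dist)
    """
    # Step 1: compute the encoding of the image with fr_utils.img_to_encoding()
    encoding = fr_utils.img_to_encoding(image_path, model)

    # Step 2: find the closest encoding
    ## Initialize min_dist to a large enough number (100 here)
    min_dist = 100
    ## Initialize identity so that it is defined even if nothing is closer than min_dist
    identity = None

    ## Loop over the database to find the closest encoding
    for (name, db_enc) in database.items():
        ### Compute the L2 distance between the target encoding and the current database encoding
        dist = np.linalg.norm(encoding - db_enc)

        ### If the distance is less than min_dist, update min_dist and identity
        if dist < min_dist:
            min_dist = dist
            identity = name

    # Check whether the person is in the database
    if min_dist > 0.7:
        print("Sorry, you are not in the database.")

    else:
        print("Name: " + str(identity) + "  distance: " + str(min_dist))

    return min_dist, identity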
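
The 1:K search can also be sketched without the model. This is an illustrative variation, not the function above: `nearest_identity` is a hypothetical helper that returns None for the name when the closest entry is beyond the threshold (who_is_it() returns the closest name in that case), and the 3-D encodings are made-up values.

```python
import numpy as np

def nearest_identity(encoding, database, threshold=0.7):
    """Return (min_dist, name); name is None when no entry is within the threshold."""
    min_dist, best = float("inf"), None
    for name, db_enc in database.items():
        dist = float(np.linalg.norm(encoding - db_enc))
        if dist < min_dist:
            min_dist, best = dist, name
    if min_dist > threshold:
        return min_dist, None
    return min_dist, best

# Hypothetical 3-D encodings (illustrative values only).
toy_db = {
    "younes": np.array([[0.6, 0.0, 0.8]]),
    "kian": np.array([[0.0, 1.0, 0.0]]),
}

known = nearest_identity(np.array([[0.6, 0.3, 0.8]]), toy_db)
stranger = nearest_identity(np.array([[-1.0, 0.0, 0.0]]), toy_db)
print(known)
print(stranger)
```

Starting from infinity instead of 100 avoids having to pick a "large enough" number, at the cost of differing slightly from the assignment code.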

Younes is at the front door and the camera took a picture of him ("images/camera_0.jpg"). Let's see whether who_is_it() recognizes Younes.

who_is_it("images/camera_0.jpg", database, FRmodel)


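Remember:

- Face verification solves the easier 1:1 matching problem; face recognition solves the harder 1:K matching problem.
- The triplet loss is an effective loss function for training a neural network to learn encodings of face images.
- The same encodings can be used for both verification and recognition. Measuring the distance between the encodings of two images tells you whether they show the same person.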

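Part 2 - Neural style transfer

Deep learning and art: neural style transfer

Here we will:

- Implement the neural style transfer algorithm
- Use the algorithm to generate new artistic images

So far we have optimized a cost function to obtain a set of parameter values. Here we optimize a cost function to obtain pixel values. Let's import the packages first: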

import time
import os
import sys
import scipy.io
import scipy.misc  # scipy.misc.imread was removed in SciPy 1.2; this code needs an older SciPy with Pillow installed
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
import nst_utils
import numpy as np
import tensorflow as tf

%matplotlib inline

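Transfer learning

Neural style transfer (NST) builds on top of a previously trained convolutional network. The idea of taking a network trained on a different task and applying it to a new task is called transfer learning.

Following the original NST paper (https://arxiv.org/abs/1508.06576), we use a VGG network, specifically VGG-19, the 19-layer version of VGG. The model has already been trained on the very large ImageNet database, so it has learned to recognize a variety of low-level features (in the shallower layers) and high-level features (in the deeper layers).

Run the following code to load the parameters of the VGG model. This may take a few seconds.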

model = nst_utils.load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")

print(model)

Output:

Colocations handled automatically by placer.
{'input': <tf.Variable 'Variable:0' shape=(1, 300, 400, 3) dtype=float32_ref>, 'conv1_1': <tf.Tensor 'Relu:0' shape=(1, 300, 400, 64) dtype=float32>, 'conv1_2': <tf.Tensor 'Relu_1:0' shape=(1, 300, 400, 64) dtype=float32>, 'avgpool1': <tf.Tensor 'AvgPool:0' shape=(1, 150, 200, 64) dtype=float32>, 'conv2_1': <tf.Tensor 'Relu_2:0' shape=(1, 150, 200, 128) dtype=float32>, 'conv2_2': <tf.Tensor 'Relu_3:0' shape=(1, 150, 200, 128) dtype=float32>, 'avgpool2': <tf.Tensor 'AvgPool_1:0' shape=(1, 75, 100, 128) dtype=float32>, 'conv3_1': <tf.Tensor 'Relu_4:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_2': <tf.Tensor 'Relu_5:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_3': <tf.Tensor 'Relu_6:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_4': <tf.Tensor 'Relu_7:0' shape=(1, 75, 100, 256) dtype=float32>, 'avgpool3': <tf.Tensor 'AvgPool_2:0' shape=(1, 38, 50, 256) dtype=float32>, 'conv4_1': <tf.Tensor 'Relu_8:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_2': <tf.Tensor 'Relu_9:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_3': <tf.Tensor 'Relu_10:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_4': <tf.Tensor 'Relu_11:0' shape=(1, 38, 50, 512) dtype=float32>, 'avgpool4': <tf.Tensor 'AvgPool_3:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_1': <tf.Tensor 'Relu_12:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_2': <tf.Tensor 'Relu_13:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_3': <tf.Tensor 'Relu_14:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_4': <tf.Tensor 'Relu_15:0' shape=(1, 19, 25, 512) dtype=float32>, 'avgpool5': <tf.Tensor 'AvgPool_4:0' shape=(1, 10, 13, 512) dtype=float32>}

Neural style transfer

We build the neural style transfer (NST) algorithm in three steps:

1. Build the content cost function $J_{content}(C,G)$.
2. Build the style cost function $J_{style}(S,G)$.
3. Put them together to get $J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$.

Computing the content cost

In our running example, the content image C is a picture of the Louvre Museum in Paris. Run the code below to see it:

content_image = scipy.misc.imread("images/louvre.jpg")
imshow(content_image)

Next:

def compute_content_cost(a_C, a_G):
    """
    Compute the content cost.

    Arguments:
        a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing the content of image C
        a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing the content of image G

    Returns:
        J_content -- scalar computed with formula (1)

    """

    # Retrieve the dimensions from a_G
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll a_C and a_G into 2-D matrices of shape (n_C, n_H * n_W)
    a_C_unrolled = tf.transpose(tf.reshape(a_C, [n_H * n_W, n_C]))
    a_G_unrolled = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))

    # Compute the content cost
    J_content = 1/(4*n_H*n_W*n_C)*tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled)))
    return J_content

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_content = compute_content_cost(a_C, a_G)
    print("J_content = " + str(J_content.eval()))
    
    test.close()
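
The content cost is the sum of squared activation differences divided by 4 * n_H * n_W * n_C. Here is a minimal NumPy sketch of that computation on arrays small enough to check by hand; `content_cost_np` is a hypothetical name and the all-zeros and all-ones activations are illustrative.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Sum of squared differences divided by 4 * n_H * n_W * n_C."""
    _, n_H, n_W, n_C = a_G.shape
    return float(np.sum(np.square(a_C - a_G)) / (4 * n_H * n_W * n_C))

# Illustrative activations: every entry differs by exactly 1.
a_C = np.zeros((1, 2, 2, 3))
a_G = np.ones((1, 2, 2, 3))

J_content = content_cost_np(a_C, a_G)
print(J_content)  # 12 squared differences of 1, divided by 4 * 2 * 2 * 3
```

Unrolling the activations, as the TensorFlow version does, is not needed for the value itself because the sum runs over all entries either way.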

Let's first look at the style image:

style_image = scipy.misc.imread("images/monet_800600.jpg")

imshow(style_image)


def gram_matrix(A):
    """
    Argument:
    A -- matrix of shape (n_C, n_H*n_W)

    Returns:
    GA -- Gram matrix of A, of shape (n_C, n_C)
    """

    ### START CODE HERE ### (≈1 line)
    GA = tf.matmul(A,tf.transpose(A))
    ### END CODE HERE ###

    return GA

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    A = tf.random_normal([3, 2*1], mean=1, stddev=4)
    GA = gram_matrix(A)

    print("GA = " + str(GA.eval()))
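
For intuition, the Gram matrix can be computed by hand for a tiny input. In this NumPy sketch (the name `gram_matrix_np` and the values are illustrative), each row of A plays the role of one channel unrolled over positions, and entry (i, j) of the result is the dot product of channels i and j.

```python
import numpy as np

def gram_matrix_np(A):
    """Gram matrix of A with shape (n_C, n_H*n_W): dot products between channel rows."""
    return A @ A.T

# 3 channels, 2 positions each (illustrative values).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

GA = gram_matrix_np(A)
print(GA)  # entry (0, 1) is 1*3 + 2*4 = 11
```

The result is always symmetric, and its diagonal holds the squared norm of each channel, which measures how active that filter is.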

# GRADED FUNCTION: compute_layer_style_cost
def compute_layer_style_cost(a_S, a_G):
    """
    Arguments:
    a_S -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image S
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image G

    Returns:
    J_style_layer -- tensor representing a scalar value, style cost defined above by equation (2)
    """

    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape the images to have them of shape (n_H*n_W, n_C) (≈2 lines)
    a_S = tf.reshape(a_S, [n_W*n_H, n_C])
    a_G = tf.reshape(a_G, [n_W*n_H, n_C])

    # Computing gram_matrices for both images S and G (≈2 lines)
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))

    # Computing the loss (≈1 line)
    # The normalizer is computed with Python ints: tf.square() on the int product
    # would give an int32 tensor, which can overflow for the large VGG layers.
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4.0 * (n_C * n_H * n_W) ** 2)

    ### END CODE HERE ###

    return J_style_layer

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_S = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_style_layer = compute_layer_style_cost(a_S, a_G)
    
    print("J_style_layer = " + str(J_style_layer.eval()))
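
The per-layer style cost compares the Gram matrices of S and G and divides by 4 * n_C² * (n_H * n_W)². A minimal NumPy sketch of the same steps, with a hypothetical name `layer_style_cost_np` and activations chosen so the result can be checked by hand:

```python
import numpy as np

def layer_style_cost_np(a_S, a_G):
    """Squared Gram-matrix difference divided by 4 * n_C^2 * (n_H * n_W)^2."""
    _, n_H, n_W, n_C = a_G.shape
    S = a_S.reshape(n_H * n_W, n_C).T   # unroll to (n_C, n_H*n_W)
    G = a_G.reshape(n_H * n_W, n_C).T
    GS, GG = S @ S.T, G @ G.T           # Gram matrices, shape (n_C, n_C)
    return float(np.sum(np.square(GS - GG)) / (4.0 * (n_C * n_H * n_W) ** 2))

# Illustrative activations with n_H=1, n_W=2, n_C=2.
a_S = np.array([[[[1.0, 0.0], [0.0, 1.0]]]])   # Gram matrix is the 2x2 identity
a_G = np.zeros((1, 1, 2, 2))                   # Gram matrix is all zeros

J_style_layer = layer_style_cost_np(a_S, a_G)
print(J_style_layer)  # 2 / (4 * (2 * 1 * 2) ** 2)
```

Because n_C, n_H, and n_W are plain Python ints here, the squared product in the denominator cannot overflow, whatever the layer size.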

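Style weights
So far you have captured the style from only one layer. We get better results if we "merge" the style costs from several different layers. After finishing this exercise, feel free to come back and experiment with different weights to see how they change the generated image G. The following is a reasonable default: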

STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]

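def compute_style_cost(model, STYLE_LAYERS):
    """
    Compute the overall style cost from several chosen layers.

    Arguments:
        model -- the loaded TensorFlow model
        STYLE_LAYERS -- list of tuples, each containing:
                        - the name of a layer to extract style from
                        - the coefficient (coeff) for that layer
    Returns:
        J_style -- tensor representing a scalar value, the style cost defined by formula (2).

    Note: this function uses the global session `sess`, which must exist before it is called.
    """
    # Initialize the overall style cost
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:

        # Select the output tensor of the current layer
        out = model[layer_name]

        # Run the session to set a_S to the activations of the chosen hidden layer.
        # sess.run() actually launches the computation. In TensorFlow, eval() and run()
        # are both ways to get the current value of a node: for a Tensor t,
        # t.eval() is equivalent to sess.run(t).
        a_S = sess.run(out)

        # Set a_G to the activations of the same layer. Here a_G references model[layer_name]
        # and is not evaluated yet. Later we assign the image G as the model input, so that
        # when we run the TensorFlow graph in model_nn(), a_G is evaluated and updated
        # at every iteration with G as input.
        a_G = out

        # Compute the style cost of the current layer
        J_style_layer = compute_layer_style_cost(a_S, a_G)

        # Add it to the overall style cost, weighted by its coefficient
        J_style += coeff * J_style_layer

    return J_style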
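
The loop above is a weighted sum of per-layer costs. The sketch below shows only that combination step in plain Python; `weighted_style_cost` and the per-layer cost values are hypothetical, chosen to show how changing the coefficients shifts the total.

```python
def weighted_style_cost(layer_costs, style_layers):
    """Sum of coeff * cost over the chosen layers."""
    return sum(coeff * layer_costs[name] for name, coeff in style_layers)

# Hypothetical per-layer style costs (illustrative numbers, not real measurements).
layer_costs = {"conv1_1": 10.0, "conv2_1": 20.0, "conv3_1": 30.0,
               "conv4_1": 40.0, "conv5_1": 50.0}

uniform = [("conv1_1", 0.2), ("conv2_1", 0.2), ("conv3_1", 0.2),
           ("conv4_1", 0.2), ("conv5_1", 0.2)]
deep_heavy = [("conv1_1", 0.0), ("conv2_1", 0.0), ("conv3_1", 0.0),
              ("conv4_1", 0.5), ("conv5_1", 0.5)]

J_uniform = weighted_style_cost(layer_costs, uniform)
J_deep = weighted_style_cost(layer_costs, deep_heavy)
print(J_uniform, J_deep)
```

With equal coefficients every layer contributes in proportion to its cost; moving the weight to the deeper layers makes the total depend only on those layers.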

# GRADED FUNCTION: total_cost

def total_cost(J_content, J_style, alpha = 10, beta = 40):
    """
    Computes the total cost function
    
    Arguments:
    J_content -- content cost coded above
    J_style -- style cost coded above
    alpha -- hyperparameter weighting the importance of the content cost
    beta -- hyperparameter weighting the importance of the style cost
    
    Returns:
    J -- total cost as defined by the formula above.
    """
    
    ### START CODE HERE ### (≈1 line)
    J = alpha * J_content + beta * J_style
    ### END CODE HERE ###
    
    return J

tf.reset_default_graph()

with tf.Session() as test:
    np.random.seed(3)
    J_content = np.random.randn()    
    J_style = np.random.randn()
    J = total_cost(J_content, J_style)
    print("J = " + str(J))

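Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at each iteration.
Let's go through each step in detail.
You have already implemented the total cost J(G). We now use TensorFlow to optimize it with respect to G. To do so, the program has to reset the graph and use an "Interactive Session". Unlike a regular session, an Interactive Session installs itself as the default session for building the graph. This lets you run variables without constantly referring to the session object, which simplifies the code.
Let's start the Interactive Session.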

# Reset the graph
tf.reset_default_graph()
# Start interactive session
sess = tf.InteractiveSession()

Let's load, reshape, and normalize the "content" image (the picture of the Louvre):

content_image = scipy.misc.imread("images/louvre_small.jpg")
content_image = nst_utils.reshape_and_normalize_image(content_image)

Let's load, reshape, and normalize the "style" image (Monet's painting):

style_image = scipy.misc.imread("images/monet.jpg")
style_image = nst_utils.reshape_and_normalize_image(style_image)

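Now we initialize the "generated" image as a noisy image created from the content image. Initializing the pixels of the generated image to be mostly noise, but still slightly correlated with the content image, helps the content of the "generated" image match the content of the "content" image more quickly. (See generate_noise_image(…) in nst_utils.py.)

generated_image = nst_utils.generate_noise_image(content_image)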
plt.imshow(generated_image[0])
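
To illustrate what "mostly noise, but still slightly correlated with the content image" can mean, here is a minimal sketch that blends uniform noise with the content image. The helper name `make_noisy_start`, the noise range of [-20, 20], and the noise ratio of 0.6 are assumptions made for this example; check nst_utils.py for what generate_noise_image actually does.

```python
import numpy as np

def make_noisy_start(content_image, noise_ratio=0.6, seed=0):
    """Blend uniform noise with the content image (assumed scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-20.0, 20.0, size=content_image.shape).astype("float32")
    # Weighted average: noise_ratio of noise, the rest from the content image.
    return noise * noise_ratio + content_image * (1.0 - noise_ratio)

# Small stand-in for a normalized content image of shape (1, H, W, 3), constant value 50.
content = np.full((1, 30, 40, 3), 50.0, dtype="float32")

start = make_noisy_start(content)
print(start.shape, float(start.min()), float(start.max()), float(start.mean()))
```

With these assumed values, every pixel ends up within 12 of 40% of the content value, so the start image is noisy but still carries a faint copy of the content.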

Next, as described in part (2), let's load the VGG-19 model.

model = nst_utils.load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")

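To have the program compute the content cost, we now assign a_C and a_G to the appropriate hidden layer activations. We use the conv4_2 layer to compute the content cost. The code below does the following:
1. Assign the content image as the input of the VGG model.
2. Set a_C to be the tensor of hidden layer activations for layer "conv4_2".
3. Set a_G to be the tensor of hidden layer activations for the same layer.
4. Compute the content cost using a_C and a_G.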

# Assign the content image to be the input of the VGG model.  
sess.run(model['input'].assign(content_image))

# Select the output tensor of layer conv4_2
out = model['conv4_2']

# Set a_C to be the hidden layer activation from the layer we have selected
a_C = sess.run(out)

# Set a_G to be the hidden layer activation from same layer. Here, a_G references model['conv4_2'] 
# and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
# when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
a_G = out

# Compute the content cost
J_content = compute_content_cost(a_C, a_G)
# Assign the input of the model to be the "style" image 
sess.run(model['input'].assign(style_image))

# Compute the style cost
J_style = compute_style_cost(model, STYLE_LAYERS)
### START CODE HERE ### (1 line)
J = total_cost(J_content, J_style, 10, 40)
### END CODE HERE ###
# Define the optimizer with a learning rate of 2.0
optimizer = tf.train.AdamOptimizer(2.0)
# Define the training step that minimizes the total cost J
train_step = optimizer.minimize(J)

def model_nn(sess, input_image, num_iterations = 200):

    # Initialize global variables (you need to run the session on the initializer)
    ### START CODE HERE ### (1 line)
    sess.run(tf.global_variables_initializer())
    ### END CODE HERE ###

    # Run the noisy input image (initial generated image) through the model. Use assign().
    ### START CODE HERE ### (1 line)
    sess.run(model["input"].assign(input_image))
    ### END CODE HERE ###

    for i in range(num_iterations):

        # Run the session on the train_step to minimize the total cost
        ### START CODE HERE ### (1 line)
        sess.run(train_step)
        ### END CODE HERE ###

        # Compute the generated image by running the session on the current model['input']
        ### START CODE HERE ### (1 line)
        generated_image =sess.run(model["input"])
        ### END CODE HERE ###

        # Print every 20 iteration.
        if i%20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            # save current generated image in the "/output" directory
            nst_utils.save_image("output/" + str(i) + ".png", generated_image)

    # save last generated image
    nst_utils.save_image('output/generated_image.jpg', generated_image)

    return generated_image
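
The key idea in model_nn() is that the optimizer updates the pixels of the input image, not any network weights. The toy sketch below shows that idea in isolation. It is a deliberate simplification and not the method above: it uses plain gradient descent instead of Adam, and a simple quadratic cost 0.5 * sum((G - target)²) instead of the NST cost; all names and values are hypothetical.

```python
import numpy as np

def optimize_pixels(start, target, learning_rate=0.1, num_iterations=200):
    """Gradient descent on the pixels G for the toy cost 0.5 * sum((G - target)^2)."""
    G = start.astype("float64").copy()
    costs = []
    for _ in range(num_iterations):
        diff = G - target
        costs.append(0.5 * float(np.sum(diff ** 2)))  # cost before this update
        G -= learning_rate * diff                     # gradient of the toy cost is (G - target)
    return G, costs

# Tiny "images": start from black pixels and move toward a constant target.
start = np.zeros((2, 2, 3))
target = np.full((2, 2, 3), 100.0)

G_final, costs = optimize_pixels(start, target)
print(costs[0], costs[-1])
```

Each step shrinks the pixel error by a factor of 0.9, so the cost falls steadily while the "image" itself changes, which is the same role model['input'] plays in the TensorFlow version.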

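Run the following to generate an artistic image. On a CPU, every 20 iterations take about 2 minutes, but you start to see attractive results after about 140 iterations. Neural style transfer is usually trained on GPUs.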

model_nn(sess, generated_image)

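After the program finishes, the generated images are in the output folder.

Testing with your own images
1. Resize your images to 400x300 (width 400, height 300, matching the model input shape (1, 300, 400, 3)).
2. Put the images in the images folder.
3. Change the code from section 3.4:

content_image = scipy.misc.imread("images/my_content.jpg")
style_image = scipy.misc.imread("images/my_style.jpg")

4. Run the program again.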
