Andrew Ng Deep Learning Notes - C4W4 - Special Applications: Face Recognition and Neural Style Transfer - Assignment 2 - Neural Style Transfer

This post outlines the main content and approach of the assignment; the complete assignment files are available at:

https://github.com/pandenghuang/Andrew-Ng-Deep-Learning-notes/tree/master/assignments/C4W4/Assignment/Neural_Style_Transfer

For complete screenshots of the assignment, see the end of this post: Full assignment screenshots.

Deep Learning & Art: Neural Style Transfer

Welcome to the second assignment of this week. In this assignment, you will learn about Neural Style Transfer. This algorithm was created by Gatys et al. (2015) (https://arxiv.org/abs/1508.06576).

In this assignment, you will:

  • Implement the neural style transfer algorithm
  • Generate novel artistic images using your algorithm

Most of the algorithms you've studied optimize a cost function to get a set of parameter values. In Neural Style Transfer, you'll optimize a cost function to get pixel values!

1 - Problem Statement

Neural Style Transfer (NST) is one of the most fun techniques in deep learning. As seen below, it merges two images, namely, a "content" image (C) and a "style" image (S), to create a "generated" image (G). The generated image G combines the "content" of the image C with the "style" of image S.

In this example, you are going to generate an image of the Louvre museum in Paris (content image C), mixed with a painting by Claude Monet, a leader of the impressionist movement (style image S).

...

2 - Transfer Learning

Neural Style Transfer (NST) uses a previously trained convolutional network, and builds on top of that. The idea of using a network trained on a different task and applying it to a new task is called transfer learning.

Following the original NST paper (https://arxiv.org/abs/1508.06576), we will use the VGG network. Specifically, we'll use VGG-19, a 19-layer version of the VGG network. This model has already been trained on the very large ImageNet database, and thus has learned to recognize a variety of low level features (at the earlier layers) and high level features (at the deeper layers).

Run the following code to load parameters from the VGG model. This may take a few seconds.
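As a rough sketch of this loading step (assuming the `load_vgg_model` helper from the assignment's `nst_utils` module and the weights file path used in the notebook, `pretrained-model/imagenet-vgg-verydeep-19.mat`; adjust both if your copy differs):

```python
import tensorflow as tf                  # the notebook targets TensorFlow 1.x
from nst_utils import load_vgg_model     # helper shipped with the assignment

tf.reset_default_graph()
# Returns a dict mapping layer names (e.g. 'conv1_1', 'conv4_2') to TensorFlow tensors,
# with the pretrained VGG-19 weights loaded as constants and the input image as a variable.
model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
print(model)
```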

3 - Neural Style Transfer

We will build the NST algorithm in three steps:

  • Build the content cost function $J_{content}(C, G)$
  • Build the style cost function $J_{style}(S, G)$
  • Put it together to get $J(G) = \alpha J_{content}(C, G) + \beta J_{style}(S, G)$

3.1 - Computing the content cost

In our running example, the content image C will be the picture of the Louvre Museum in Paris. Run the code below to see a picture of the Louvre.

...

**What you should remember**:
- The content cost takes a hidden layer activation of the neural network, and measures how different $a^{(C)}$ and $a^{(G)}$ are.
- When we minimize the content cost later, this will help make sure $G$ has similar content as $C$.
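For reference, a minimal sketch of the `compute_content_cost` exercise from this section, using the normalization constant $\frac{1}{4\, n_H n_W n_C}$ from the notebook (the exact constant only rescales the cost):

```python
import tensorflow as tf

def compute_content_cost(a_C, a_G):
    """Content cost between the activations a_C (content image) and a_G (generated
    image) of one chosen hidden layer, each of shape (1, n_H, n_W, n_C)."""
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll each activation volume into an (n_H*n_W, n_C) matrix
    a_C_unrolled = tf.reshape(a_C, [n_H * n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [n_H * n_W, n_C])

    # J_content(C, G) = 1/(4 * n_H * n_W * n_C) * sum((a_C - a_G)^2)
    J_content = tf.reduce_sum(tf.square(a_C_unrolled - a_G_unrolled)) / (4 * n_H * n_W * n_C)
    return J_content
```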

3.2 - Computing the style cost

For our running example, we will use the following style image:

...

3.2.1 - Style matrix

The style matrix is also called a "Gram matrix." In linear algebra, the Gram matrix $G$ of a set of vectors $(v_1, \ldots, v_n)$ is the matrix of dot products, whose entries are $G_{ij} = v_i^T v_j$ (in NumPy, `np.dot(v_i, v_j)`). In other words, $G_{ij}$ compares how similar $v_i$ is to $v_j$: if they are highly similar, you would expect them to have a large dot product, and thus for $G_{ij}$ to be large.

Note that there is an unfortunate collision in the variable names used here. We are following common terminology used in the literature, but $G$ is used to denote the Style matrix (or Gram matrix) as well as to denote the generated image $G$. We will try to make sure which $G$ we are referring to is always clear from the context.

In NST, you can compute the Style matrix by multiplying the "unrolled" filter matrix with its transpose:

...
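A minimal sketch of the `gram_matrix` exercise, where `A` is the $(n_C, n_H \times n_W)$ matrix of unrolled filter activations:

```python
import tensorflow as tf

def gram_matrix(A):
    """Gram (style) matrix G_A = A A^T, where A has shape (n_C, n_H*n_W).
    G_A[i, j] measures how correlated the activations of filters i and j are."""
    GA = tf.matmul(A, tf.transpose(A))
    return GA
```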

3.2.2 - Style cost

After generating the Style matrix (Gram matrix), your goal will be to minimize the distance between the Gram matrix of the "style" image S and that of the "generated" image G. For now, we are using only a single hidden layer $a^{[l]}$, and the corresponding style cost for this layer is defined as:

...
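A sketch of the single-layer style cost, building on `gram_matrix` above and using the normalization constant $\frac{1}{4\, n_C^2 (n_H n_W)^2}$ from the notebook:

```python
import tensorflow as tf

def compute_layer_style_cost(a_S, a_G):
    """Style cost for a single layer, from activations a_S (style image) and
    a_G (generated image), each of shape (1, n_H, n_W, n_C)."""
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape to (n_C, n_H*n_W) so each row is one filter's unrolled activation map
    a_S = tf.transpose(tf.reshape(a_S, [n_H * n_W, n_C]))
    a_G = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))

    # Gram matrices of the style and generated images
    GS = gram_matrix(a_S)
    GG = gram_matrix(a_G)

    # J_style^[l] = 1/(4 * n_C^2 * (n_H*n_W)^2) * sum((GS - GG)^2)
    J_style_layer = tf.reduce_sum(tf.square(GS - GG)) / (4 * (n_C ** 2) * ((n_H * n_W) ** 2))
    return J_style_layer
```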

3.2.3 - Style Weights

So far you have captured the style from only one layer. We'll get better results if we "merge" style costs from several different layers. After completing this exercise, feel free to come back and experiment with different weights to see how it changes the generated image $G$. But for now, this is a pretty reasonable default:

...
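The notebook's default assigns equal weight 0.2 to five conv layers. Below is a sketch of those weights and of the overall multi-layer style cost, building on `compute_layer_style_cost` above; it is a slight variant of the notebook's `compute_style_cost` that takes the TensorFlow session explicitly instead of relying on a global:

```python
# Default layer weights used in the notebook (equal weight on five conv layers)
STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]

def compute_style_cost(model, STYLE_LAYERS, sess):
    """Weighted sum of single-layer style costs over STYLE_LAYERS.
    Assumes the style image has already been assigned to the model's input."""
    J_style = 0
    for layer_name, coeff in STYLE_LAYERS:
        out = model[layer_name]     # activation tensor of the chosen layer
        a_S = sess.run(out)         # style activations, evaluated now (a constant)
        a_G = out                   # generated-image activations, left symbolic
        J_style += coeff * compute_layer_style_cost(a_S, a_G)
    return J_style
```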

**What you should remember**:
- The style of an image can be represented using the Gram matrix of a hidden layer's activations. However, we get even better results by combining this representation from multiple different layers. This is in contrast to the content representation, where usually using just a single hidden layer is sufficient.
- Minimizing the style cost will cause the image $G$ to follow the style of the image $S$.

...

3.3 - Defining the total cost to optimize

Finally, let's create a total cost function that combines both the style and the content cost; minimizing it will produce the generated image. The formula is:

$J(G) = \alpha J_{content}(C, G) + \beta J_{style}(S, G)$

...
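A sketch of the `total_cost` exercise, with the default weights $\alpha = 10$ and $\beta = 40$ used in the notebook (treat them as tunable hyperparameters):

```python
def total_cost(J_content, J_style, alpha=10, beta=40):
    """Total cost J(G) = alpha * J_content(C, G) + beta * J_style(S, G)."""
    J = alpha * J_content + beta * J_style
    return J
```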

4 - Solving the optimization problem

 

Finally, let's put everything together to implement Neural Style Transfer!

Here's what the program will have to do:

  1. Create an Interactive Session
  2. Load the content image
  3. Load the style image
  4. Randomly initialize the image to be generated
  5. Load the VGG-19 model
  6. Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content cost
    • Run the style image through the VGG-19 model and compute the style cost
    • Compute the total cost
    • Define the optimizer and the learning rate
  7. Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step.
Let's go through the individual steps in detail.

 

 

You've previously implemented the overall cost 𝐽(𝐺). We'll now set up TensorFlow to optimize this with respect to 𝐺. To do so, your program has to reset the graph and use an "Interactive Session". Unlike a regular session, the "Interactive Session" installs itself as the default session to build a graph. This allows you to run variables without constantly needing to refer to the session object, which simplifies the code.

Let's start the interactive session.
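In the TensorFlow 1.x API used by the notebook, that looks like:

```python
import tensorflow as tf

# Reset the default graph and start an Interactive Session, which registers itself
# as the default session so tensors can be evaluated without passing the session
# object around explicitly.
tf.reset_default_graph()
sess = tf.InteractiveSession()
```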

...
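The optimization loop itself can be sketched as below, continuing with the `tf` and `sess` from the previous snippet. This is a variant of the notebook's `model_nn` that takes its dependencies as arguments; it assumes `model['input']` is the `tf.Variable` holding the generated image's pixels, `J` is the total cost, and `train_step` is an optimizer step such as `tf.train.AdamOptimizer(2.0).minimize(J)`:

```python
def model_nn(sess, model, train_step, J, input_image, num_iterations=200):
    """Run the optimization: each Adam step updates the pixels of the generated
    image (the model's input variable), not the network's weights."""
    sess.run(tf.global_variables_initializer())
    # Initialize G, e.g. with the content image plus random noise
    sess.run(model['input'].assign(input_image))

    for i in range(num_iterations):
        sess.run(train_step)                         # one gradient step on the pixels
        generated_image = sess.run(model['input'])   # read back the current image
        if i % 20 == 0:
            print("Iteration %d, total cost = %g" % (i, sess.run(J)))

    return generated_image
```

Before calling it, the program assigns the content image to the model's input to build the content cost, then the style image to build the style cost, and finally passes the noisy initial image in as `input_image`.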

5 - Test with your own image (Optional/Ungraded)

Finally, you can also rerun the algorithm on your own images!

...

6 - Conclusion

Great job on completing this assignment! You are now able to use Neural Style Transfer to generate artistic images. This is also your first time building a model in which the optimization algorithm updates the pixel values rather than the neural network's parameters. Deep learning has many different types of models and this is only one of them!

**What you should remember**:
- Neural Style Transfer is an algorithm that, given a content image C and a style image S, can generate an artistic image.
- It uses representations (hidden layer activations) based on a pretrained ConvNet.
- The content cost function is computed using one hidden layer's activations.
- The style cost function for one layer is computed using the Gram matrix of that layer's activations. The overall style cost function is obtained using several hidden layers.
- Optimizing the total cost function results in synthesizing new images.

 

This was the final programming exercise of this course. Congratulations--you've finished all the programming exercises of this course on Convolutional Networks! We hope to also see you in Course 5, on Sequence models!

 

References:

The Neural Style Transfer algorithm is due to Gatys et al. (2015). Harish Narayanan and GitHub user "log0" also have highly readable write-ups from which we drew inspiration. The pre-trained network used in this implementation is a VGG network, which is due to Simonyan and Zisserman (2015). Pre-trained weights were from the work of the MatConvNet team.

Full assignment screenshots:
