Style Transfer（PyTorch）

长命百岁️

已于 2022-03-17 17:01:20 修改

阅读量429

点赞数

分类专栏：深度学习文章标签： pytorch 深度学习人工智能

于 2021-11-04 19:30:43 首次发布

本文链接：https://blog.csdn.net/qq_52852138/article/details/121149459

版权

深度学习专栏收录该内容

21 篇文章 2 订阅

订阅专栏

Style Transfer-PyTorch

Content Loss

content loss用来计算原图片和生成的图片之间像素的差距，这里用的是卷积层获取的 feature map 之间的差距

通过卷积层，有多少个卷积核就会生成多少个 feature_map（也就是一个卷积核的输出结果）

公式为： $L_c = w_c \times \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2$

$w_c$ 是当前层两张照片之间的差距的权重
$F^l$ 是当前图片的 feature_map
$P^l$ 是内容来源图片的 feature_map

在这里插入图片描述

def content_loss(content_weight, content_current, content_original):
    """
    Compute the content loss for style transfer.
    
    Inputs:
    - content_weight: Scalar giving the weighting for the content loss.
    - content_current: features of the current image; this is a PyTorch Tensor of shape
      (1, C_l, H_l, W_l).
    - content_target: features of the content image, Tensor with shape (1, C_l, H_l, W_l).
    
    Returns:
    - scalar content loss
    """
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    N,C,H,W = content_current.shape
    Fc = content_current.view(C,H*W) # view 相当于 reshape
    Pc = content_original.view(C,H*W)
    Lc = content_weight * (Fc - Pc).pow(2).sum()
    return Lc

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

Style Loss

这里我们使用格拉姆矩阵（Gram matrix G）来表示feature map每个通道（channel）之间的联系（也就是风格）。

Gram matrix G ： $G_{ij}^\ell = \sum_k F^{\ell}_{ik} F^{\ell}_{jk}$
- $F^l$ 是当前图片的 feature_map

在这里插入图片描述

Gram matrix 的核心就是以上两个矩阵进行矩阵乘法

def gram_matrix(features, normalize=True):
    """
    Compute the Gram matrix from features.
    
    Inputs:
    - features: PyTorch Tensor of shape (N, C, H, W) giving features for
      a batch of N images.
    - normalize: optional, whether to normalize the Gram matrix
        If True, divide the Gram matrix by the number of neurons (H * W * C)
    
    Returns:
    - gram: PyTorch Tensor of shape (N, C, C) giving the
      (optionally normalized) Gram matrices for the N input images.
    """
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    
    N,C,H,W = features.shape
    F = features.view(N , C , H * W) # N * C * M
    F_t = F.permute(0 , 2 , 1)  # N * M * C    permute的作用是交换维度
    gram = torch.matmul(F , F_t) # N * C * C
    if normalize:
      gram = gram / (C * H * W)
    return gram

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

第 l 层的 Loss： $L_s^\ell = w_\ell \sum_{i, j} \left(G^\ell_{ij} - A^\ell_{ij}\right)^2$
- $A^l$ 是特征来源图片的 Gram matrix
- 和 $G^l$ 的计算方法相同，只是换了输入
Loss function： $L_s = \sum_{\ell \in \mathcal{L}} L_s^\ell$
- 所有层的 loss 相加
torch.matmul
- torch.matmul 是tensor的乘法，输入可以是高维的。（超过两维的时候前面的都当做 batch ，最后两维进行矩阵乘法）
- 当输入是都是二维时，就是普通的矩阵乘法，和 tensor.mm 函数用法相同

a = torch.ones(3 , 4)
b = torch.ones(4 , 2)
c = torch.matmul(a , b)
c.shape
>> torch.Szie([3 , 2])

a = torch.ones(5 , 3 , 4)
b = torch.ones(4 , 2)
c = torch.matmul(a , b)
c.shape
>> torch.Size([5 , 3 , 2])

a = torch.ones(2 , 5 , 3)
b = torch.ones(1 , 3 , 4)
c = torch.matmul(a , b)
c.shape
>> torch.Size([2 , 5 , 4])

计算 Style Loss

# Now put it together in the style_loss function...
def style_loss(feats, style_layers, style_targets, style_weights):
    """
    Computes the style loss at a set of layers.
    
    Inputs:
    - feats: list of the features at every layer of the current image, as produced by
      the extract_features function.
    - style_layers: List of layer indices into feats giving the layers to include in the
      style loss.
    - style_targets: List of the same length as style_layers, where style_targets[i] is
      a PyTorch Tensor giving the Gram matrix of the source style image computed at
      layer style_layers[i].
    - style_weights: List of the same length as style_layers, where style_weights[i]
      is a scalar giving the weight for the style loss at layer style_layers[i].
      
    Returns:
    - style_loss: A PyTorch Tensor holding a scalar giving the style loss.
    """
    # Hint: you can do this with one for loop over the style layers, and should
    # not be very much code (~5 lines). You will need to use your gram_matrix function.
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    style_current = []
    #style_loss = torch.zeros([1],dtype=float)
    style_loss = 0
    for i,idx in enumerate(style_layers):
      style_current.append(gram_matrix(feats[idx].clone()))
      style_loss += (style_current[i] - style_targets[i]).pow(2).sum() * style_weights[i]
    
    return style_loss

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

Total-variation regularization

total variation loss可以使图像变得平滑，变得更加平滑被证明是有帮助的

具有过多和可能是虚假细节的信号具有高的总变化，即，信号的绝对梯度的积分是高的。根据该原理，减小信号的总变化，使其与原始信号紧密匹配，去除不需要的细节，同时保留诸如边缘的重要细节

Total Variation(TV)的方程是这样的： $\mathcal{R}_{V^{\beta}}(f)=\int_{\Omega}\left(\left(\frac{\partial f}{\partial u}(u, v)\right)^{2}+\left(\frac{\partial f}{\partial v}(u, v)\right)^{2}\right)^{\frac{\beta}{2}} d u d v$
在图像中，连续域的积分就变成了像素离散域中求和，所以可以这么算： $\mathcal{R}_{V^{\beta}}(\mathbf{x})=\sum_{i, j}\left(\left(x_{i, j+1}-x_{i j}\right)^{2}+\left(x_{i+1, j}-x_{i j}\right)^{2}\right)^{\frac{\beta}{2}}$
也就是说，求每一个像素和横向下一个像素的差的平方，加上纵向下一个像素的差的平方。然后开β/2次根

在本次实验中，我们的 Total Variation(TV) 公式为：

$L_{tv} = w_t \times \left(\sum_{c=1}^3\sum_{i=1}^{H-1}\sum_{j=1}^{W} (x_{i+1,j,c} - x_{i,j,c})^2 + \sum_{c=1}^3\sum_{i=1}^{H}\sum_{j=1}^{W - 1} (x_{i,j+1,c} - x_{i,j,c})^2\right)$

可以看出，和上面的式子一样，只是计算了三通道像素，并且乘以权重 $w_t$

def tv_loss(img, tv_weight):
    """
    Compute total variation loss.
    
    Inputs:
    - img: PyTorch Variable of shape (1, 3, H, W) holding an input image.
    - tv_weight: Scalar giving the weight w_t to use for the TV loss.
    
    Returns:
    - loss: PyTorch Variable holding a scalar giving the total variation loss
      for img weighted by tv_weight.
    """
    # Your implementation should be vectorized and not require any loops!
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    N,C,H,W = img.shape
    x1 = img[: , : , 0:H-1 , :]
    x2 = img[: , : , 1:H , :] # x1的所有元素的后一个的组合
    y1 = img[: , : , : , 0:W-1]
    y2 = img[: , : , : , 1:W] # y1的所有元素的后一个的组合
    loss = ((x2-x1).pow(2).sum() + (y2-y1).pow(2).sum()) * tv_weight
    return loss


    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

Generate some pretty pictures

Example 1

在这里插入图片描述

Example 2

在这里插入图片描述

Example 3

在这里插入图片描述

结论分析与体会

本次实验主要学习了利用 PyTorch 进行风格迁移训练（将一张图片的内容与另一张图片的风格相结合）

计算 content loss
- 将当前图片的 features map 与 content 原图片的 features map进行比较
- 每一层的 loss 公式为： $L_c = w_c \times \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2$
计算 style loss
- 计算当前图片的 features map 对应的 Gram matrix
  - Gram matrix G ： $G_{ij}^\ell = \sum_k F^{\ell}_{ik} F^{\ell}_{jk}$
- 计算 style 原图片的 features map 对应的 Gram matrix
  - Gram matrix A ： $A_{ij}^\ell = \sum_k P^{\ell}_{ik} P^{\ell}_{jk}$
- 利用当前图片的 Gram matrix 和 style 原图片的 Gram matrix 计算 loss
  - 第 l 层的 Loss： $L_s^\ell = w_\ell \sum_{i, j} \left(G^\ell_{ij} - A^\ell_{ij}\right)^2$
  - Loss function： $L_s = \sum_{\ell \in \mathcal{L}} L_s^\ell$
计算 Total-variation loss
- $L_{tv} = w_t \times \left(\sum_{c=1}^3\sum_{i=1}^{H-1}\sum_{j=1}^{W} (x_{i+1,j,c} - x_{i,j,c})^2 + \sum_{c=1}^3\sum_{i=1}^{H}\sum_{j=1}^{W - 1} (x_{i,j+1,c} - x_{i,j,c})^2\right)$
总的 Loss
- $l o s s = c o n t e n t L o s s + s t y l e L o s s + t v L o s s$
使用梯度下降的方法缩小 loss 对模型进行优化

长命百岁️

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
打赏
0
评论
Style Transfer（PyTorch）

Style Transfer-PyTorchContent Losscontent loss用来计算原图片和生成的图片之间像素的差距，这里用的是卷积层获取的 feature map 之间的差距通过卷积层，有多少个卷积核就会生成多少个 feature_map（也就是一个卷积核的输出结果）公式为：Lc=wc×∑i,j(Fijℓ−Pijℓ)2L_c = w_c \times \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2Lc=wc×∑i,j(Fijℓ−Pi
复制链接

扫一扫