Style Transfer(PyTorch)

Style Transfer-PyTorch

Content Loss

content loss用来计算原图片和生成的图片之间像素的差距,这里用的是卷积层获取的 feature map 之间的差距

通过卷积层,有多少个卷积核就会生成多少个 feature_map(也就是一个卷积核的输出结果)

公式为: L c = w c × ∑ i , j ( F i j ℓ − P i j ℓ ) 2 L_c = w_c \times \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2 Lc=wc×i,j(FijPij)2

  • w c w_c wc 是当前层两张照片之间的差距的权重
  • F l F^l Fl 是当前图片的 feature_map
  • P l P^l Pl 是内容来源图片的 feature_map

在这里插入图片描述

def content_loss(content_weight, content_current, content_original):
    """
    Compute the content loss for style transfer.
    
    Inputs:
    - content_weight: Scalar giving the weighting for the content loss.
    - content_current: features of the current image; this is a PyTorch Tensor of shape
      (1, C_l, H_l, W_l).
    - content_target: features of the content image, Tensor with shape (1, C_l, H_l, W_l).
    
    Returns:
    - scalar content loss
    """
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    N,C,H,W = content_current.shape
    Fc = content_current.view(C,H*W) # view 相当于 reshape
    Pc = content_original.view(C,H*W)
    Lc = content_weight * (Fc - Pc).pow(2).sum()
    return Lc

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

Style Loss

这里我们使用格拉姆矩阵(Gram matrix G)来表示feature map每个通道(channel)之间的联系(也就是风格)。

  • Gram matrix G : G i j ℓ = ∑ k F i k ℓ F j k ℓ G_{ij}^\ell = \sum_k F^{\ell}_{ik} F^{\ell}_{jk} Gij=kFikFjk

    • F l F^l Fl 是当前图片的 feature_map

在这里插入图片描述

在这里插入图片描述

  • Gram matrix 的核心就是以上两个矩阵进行矩阵乘法

    def gram_matrix(features, normalize=True):
        """
        Compute the Gram matrix from features.
        
        Inputs:
        - features: PyTorch Tensor of shape (N, C, H, W) giving features for
          a batch of N images.
        - normalize: optional, whether to normalize the Gram matrix
            If True, divide the Gram matrix by the number of neurons (H * W * C)
        
        Returns:
        - gram: PyTorch Tensor of shape (N, C, C) giving the
          (optionally normalized) Gram matrices for the N input images.
        """
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
        
        N,C,H,W = features.shape
        F = features.view(N , C , H * W) # N * C * M
        F_t = F.permute(0 , 2 , 1)  # N * M * C    permute的作用是交换维度
        gram = torch.matmul(F , F_t) # N * C * C
        if normalize:
          gram = gram / (C * H * W)
        return gram
    
        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
  • 第 l 层的 Loss: L s ℓ = w ℓ ∑ i , j ( G i j ℓ − A i j ℓ ) 2 L_s^\ell = w_\ell \sum_{i, j} \left(G^\ell_{ij} - A^\ell_{ij}\right)^2 Ls=wi,j(GijAij)2

    • A l A^l Al 是特征来源图片的 Gram matrix
    • G l G^l Gl 的计算方法相同 , 只是换了输入
  • Loss function: L s = ∑ ℓ ∈ L L s ℓ L_s = \sum_{\ell \in \mathcal{L}} L_s^\ell Ls=LLs

    • 所有层的 loss 相加
  • torch.matmul

    • torch.matmul 是tensor的乘法,输入可以是高维的。(超过两维的时候前面的都当做 batch , 最后两维进行矩阵乘法)
    • 当输入是都是二维时,就是普通的矩阵乘法,和 tensor.mm 函数用法相同
a = torch.ones(3 , 4)
b = torch.ones(4 , 2)
c = torch.matmul(a , b)
c.shape
>> torch.Szie([3 , 2])
a = torch.ones(5 , 3 , 4)
b = torch.ones(4 , 2)
c = torch.matmul(a , b)
c.shape
>> torch.Size([5 , 3 , 2])
a = torch.ones(2 , 5 , 3)
b = torch.ones(1 , 3 , 4)
c = torch.matmul(a , b)
c.shape
>> torch.Size([2 , 5 , 4])
  • 计算 Style Loss

    # Now put it together in the style_loss function...
    def style_loss(feats, style_layers, style_targets, style_weights):
        """
        Computes the style loss at a set of layers.
        
        Inputs:
        - feats: list of the features at every layer of the current image, as produced by
          the extract_features function.
        - style_layers: List of layer indices into feats giving the layers to include in the
          style loss.
        - style_targets: List of the same length as style_layers, where style_targets[i] is
          a PyTorch Tensor giving the Gram matrix of the source style image computed at
          layer style_layers[i].
        - style_weights: List of the same length as style_layers, where style_weights[i]
          is a scalar giving the weight for the style loss at layer style_layers[i].
          
        Returns:
        - style_loss: A PyTorch Tensor holding a scalar giving the style loss.
        """
        # Hint: you can do this with one for loop over the style layers, and should
        # not be very much code (~5 lines). You will need to use your gram_matrix function.
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
        style_current = []
        #style_loss = torch.zeros([1],dtype=float)
        style_loss = 0
        for i,idx in enumerate(style_layers):
          style_current.append(gram_matrix(feats[idx].clone()))
          style_loss += (style_current[i] - style_targets[i]).pow(2).sum() * style_weights[i]
        
        return style_loss
    
        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    
    

Total-variation regularization

total variation loss可以使图像变得平滑 ,变得更加平滑被证明是有帮助的

具有过多和可能是虚假细节的信号具有高的总变化,即,信号的绝对梯度的积分是高的。根据该原理,减小信号的总变化,使其与原始信号紧密匹配,去除不需要的细节,同时保留诸如边缘的重要细节

  • Total Variation(TV)的方程是这样的: R V β ( f ) = ∫ Ω ( ( ∂ f ∂ u ( u , v ) ) 2 + ( ∂ f ∂ v ( u , v ) ) 2 ) β 2 d u d v \mathcal{R}_{V^{\beta}}(f)=\int_{\Omega}\left(\left(\frac{\partial f}{\partial u}(u, v)\right)^{2}+\left(\frac{\partial f}{\partial v}(u, v)\right)^{2}\right)^{\frac{\beta}{2}} d u d v RVβ(f)=Ω((uf(u,v))2+(vf(u,v))2)2βdudv
  • 在图像中,连续域的积分就变成了像素离散域中求和,所以可以这么算: R V β ( x ) = ∑ i , j ( ( x i , j + 1 − x i j ) 2 + ( x i + 1 , j − x i j ) 2 ) β 2 \mathcal{R}_{V^{\beta}}(\mathbf{x})=\sum_{i, j}\left(\left(x_{i, j+1}-x_{i j}\right)^{2}+\left(x_{i+1, j}-x_{i j}\right)^{2}\right)^{\frac{\beta}{2}} RVβ(x)=i,j((xi,j+1xij)2+(xi+1,jxij)2)2β
  • 也就是说,求每一个像素和横向下一个像素的差的平方,加上纵向下一个像素的差的平方。然后开β/2次根

在本次实验中,我们的 Total Variation(TV) 公式为:

L t v = w t × ( ∑ c = 1 3 ∑ i = 1 H − 1 ∑ j = 1 W ( x i + 1 , j , c − x i , j , c ) 2 + ∑ c = 1 3 ∑ i = 1 H ∑ j = 1 W − 1 ( x i , j + 1 , c − x i , j , c ) 2 ) L_{tv} = w_t \times \left(\sum_{c=1}^3\sum_{i=1}^{H-1}\sum_{j=1}^{W} (x_{i+1,j,c} - x_{i,j,c})^2 + \sum_{c=1}^3\sum_{i=1}^{H}\sum_{j=1}^{W - 1} (x_{i,j+1,c} - x_{i,j,c})^2\right) Ltv=wt×(c=13i=1H1j=1W(xi+1,j,cxi,j,c)2+c=13i=1Hj=1W1(xi,j+1,cxi,j,c)2)

  • 可以看出,和上面的式子一样,只是计算了三通道像素,并且乘以权重 w t w_t wt
def tv_loss(img, tv_weight):
    """
    Compute total variation loss.
    
    Inputs:
    - img: PyTorch Variable of shape (1, 3, H, W) holding an input image.
    - tv_weight: Scalar giving the weight w_t to use for the TV loss.
    
    Returns:
    - loss: PyTorch Variable holding a scalar giving the total variation loss
      for img weighted by tv_weight.
    """
    # Your implementation should be vectorized and not require any loops!
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    N,C,H,W = img.shape
    x1 = img[: , : , 0:H-1 , :]
    x2 = img[: , : , 1:H , :] # x1的所有元素的后一个的组合
    y1 = img[: , : , : , 0:W-1]
    y2 = img[: , : , : , 1:W] # y1的所有元素的后一个的组合
    loss = ((x2-x1).pow(2).sum() + (y2-y1).pow(2).sum()) * tv_weight
    return loss


    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

Generate some pretty pictures

  • Example 1

在这里插入图片描述

  • Example 2

在这里插入图片描述

  • Example 3

在这里插入图片描述

结论分析与体会

本次实验主要学习了利用 PyTorch 进行风格迁移训练(将一张图片的内容与另一张图片的风格相结合)

  • 计算 content loss
    • 将当前图片的 features map 与 content 原图片的 features map进行比较
    • 每一层的 loss 公式为: L c = w c × ∑ i , j ( F i j ℓ − P i j ℓ ) 2 L_c = w_c \times \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2 Lc=wc×i,j(FijPij)2
  • 计算 style loss
    • 计算当前图片的 features map 对应的 Gram matrix
      • Gram matrix G : G i j ℓ = ∑ k F i k ℓ F j k ℓ G_{ij}^\ell = \sum_k F^{\ell}_{ik} F^{\ell}_{jk} Gij=kFikFjk
    • 计算 style 原图片的 features map 对应的 Gram matrix
      • Gram matrix A : A i j ℓ = ∑ k P i k ℓ P j k ℓ A_{ij}^\ell = \sum_k P^{\ell}_{ik} P^{\ell}_{jk} Aij=kPikPjk
    • 利用当前图片的 Gram matrix 和 style 原图片的 Gram matrix 计算 loss
      • 第 l 层的 Loss: L s ℓ = w ℓ ∑ i , j ( G i j ℓ − A i j ℓ ) 2 L_s^\ell = w_\ell \sum_{i, j} \left(G^\ell_{ij} - A^\ell_{ij}\right)^2 Ls=wi,j(GijAij)2
      • Loss function: L s = ∑ ℓ ∈ L L s ℓ L_s = \sum_{\ell \in \mathcal{L}} L_s^\ell Ls=LLs
  • 计算 Total-variation loss
    • L t v = w t × ( ∑ c = 1 3 ∑ i = 1 H − 1 ∑ j = 1 W ( x i + 1 , j , c − x i , j , c ) 2 + ∑ c = 1 3 ∑ i = 1 H ∑ j = 1 W − 1 ( x i , j + 1 , c − x i , j , c ) 2 ) L_{tv} = w_t \times \left(\sum_{c=1}^3\sum_{i=1}^{H-1}\sum_{j=1}^{W} (x_{i+1,j,c} - x_{i,j,c})^2 + \sum_{c=1}^3\sum_{i=1}^{H}\sum_{j=1}^{W - 1} (x_{i,j+1,c} - x_{i,j,c})^2\right) Ltv=wt×(c=13i=1H1j=1W(xi+1,j,cxi,j,c)2+c=13i=1Hj=1W1(xi,j+1,cxi,j,c)2)
  • 总的 Loss
    • l o s s = c o n t e n t L o s s + s t y l e L o s s + t v L o s s loss = contentLoss + styleLoss + tvLoss loss=contentLoss+styleLoss+tvLoss
  • 使用梯度下降的方法缩小 loss 对模型进行优化
  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

长命百岁️

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值