PyTorch Learning (10) --- Reading the Neural Style Code

This post walks through the official PyTorch Neural Style code and contrasts it with the traditional way of writing such models, pointing out where it is simpler: a custom Module needs no explicit backward, and the Gram matrix is computed inside forward. It also discusses retain_graph and detach: retain_graph keeps the computation graph alive so backward can be called more than once, while detach cuts a Variable out of the computation graph. The takeaway is that under an autograd framework, operating entirely on Variables avoids hand-written backpropagation, with detach and retain_graph playing key supporting roles.

Overview

I have previously written a walkthrough of the Torch7 version of the neural style code; see
Torch7学习(七)——Neural-Style代码解析. That framework, however, was built around the traditional layer-by-layer view, whereas everything today is expressed as a computation graph. The PyTorch version therefore reads quite differently from the old one; above all, it is much simpler. Compare the two and you will be struck by the difference.

The official PyTorch neural style code

The rest of the script is routine; the core code is what deserves attention:

```python
class ContentLoss(nn.Module):

    def __init__(self, target, weight):
        super(ContentLoss, self).__init__()
        # we 'detach' the target content from the tree used to dynamically
        # compute the gradient: this is a stated value, not a variable.
        # Otherwise the forward method of the criterion will throw an error.
        self.target = target.detach() * weight
        self.weight = weight
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.loss = self.criterion(input * self.weight, self.target)
        self.output = input
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss
```
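Note that forward returns its input unchanged, so ContentLoss can be spliced into the network as a transparent layer that merely records a loss as activations flow through it. The following is a rough sketch of how such a layer gets wired in; here cnn, content_img, and content_weight are assumed to come from the surrounding tutorial code, and attaching after the fourth convolution is only an illustrative choice:

```python
# Sketch: splice ContentLoss into a Sequential copy of a pretrained CNN.
model = nn.Sequential()
conv_idx = 0
for layer in cnn.children():
    model.add_module(str(len(model)), layer)
    if isinstance(layer, nn.Conv2d):
        conv_idx += 1
        if conv_idx == 4:
            # run the content image up to this point to get the target;
            # ContentLoss.__init__ detaches it from the graph
            target = model(content_img)
            content_loss = ContentLoss(target, content_weight)
            model.add_module("content_loss_4", content_loss)
```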

```python
class GramMatrix(nn.Module):

    def forward(self, input):
        a, b, c, d = input.size()  # a=batch size(=1)
        # b=number of feature maps
        # (c,d)=dimensions of a f. map (N=c*d)

        features = input.view(a * b, c * d)  # reshape F_XL into \hat F_XL

        G = torch.mm(features, features.t())  # compute the gram product

        # we 'normalize' the values of the gram matrix
        # by dividing by the number of elements in each feature map.
        return G.div(a * b * c * d)
```
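A quick sanity check (my own snippet, not from the post): for a batch of b feature maps, the Gram matrix is b×b regardless of the spatial size, since each entry is an inner product between two flattened feature maps:

```python
gram = GramMatrix()
features = torch.randn(1, 64, 32, 32)  # (batch, channels, height, width)
G = gram(features)
print(G.size())  # torch.Size([64, 64])
```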

```python
class StyleLoss(nn.Module):

    def __init__(self, target, weight):
        super(StyleLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.gram = GramMatrix()
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.output = input.clone()
        self.G = self.gram(input)
        self.G.mul_(self.weight)
        self.loss = self.criterion(self.G, self.target)
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss
```
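These per-layer backward methods are driven by the optimization loop. The following sketch stays close to the official tutorial's closure; input_param, model, optimizer, style_losses, and content_losses are assumed to be set up by the surrounding tutorial code:

```python
def closure():
    # keep the optimized image inside the valid pixel range
    input_param.data.clamp_(0, 1)

    optimizer.zero_grad()
    model(input_param)  # one forward pass fills in every loss module

    style_score = 0
    content_score = 0
    # each call backpropagates one loss; retain_graph=True (the default
    # in the loss modules above) keeps the graph alive so the next loss
    # can still backpropagate through it
    for sl in style_losses:
        style_score += sl.backward()
    for cl in content_losses:
        content_score += cl.backward()
    return style_score + content_score

optimizer.step(closure)
```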

Looking at this code, a few things seem odd at first (a short demo follows the list):
1. In PyTorch, a custom Module usually does not need to define backward at all. Since every operation in forward acts on Variables, autograd records the computation graph and derives the gradients automatically; ContentLoss and StyleLoss define a backward method here only as a convenience, so the training loop can trigger self.loss.backward() for each loss layer individually.
2. The Gram matrix is computed directly inside forward as an ordinary graph operation; there is no hand-written gradient code as in the layer-based Torch7 version.
3. backward is called with retain_graph=True so that the computation graph survives the call and can be backpropagated through again by the next loss layer. Conversely, target.detach() cuts the target Variable out of the graph, turning it into a constant that gradients never flow through.
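A minimal demo of both mechanisms (my own snippet, written against a recent PyTorch in which Variable has been merged into Tensor):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x * 3).sum()

# The graph is freed after the first backward unless we retain it;
# without retain_graph=True the second call would raise a RuntimeError.
y.backward(retain_graph=True)
y.backward()  # works only because the graph was retained above

# detach() returns a tensor cut out of the graph: gradients stop there.
z = x.detach() * 2
print(z.requires_grad)  # False -- z is treated as a constant
```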

For completeness, here is a full, self-contained example of neural style transfer in PyTorch. (The snippet as originally posted indexed the output of vgg(x) by layer name, which does not work because vgg(x) returns only the last feature map; the get_features helper below, using the standard torchvision VGG19 layer indices, fixes that. The content and style features are also hoisted out of the loop, since they never change.)

```python
import torch
import torchvision.transforms.functional as F
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load the feature extractor of a pretrained VGG19 and freeze it
vgg = torch.hub.load('pytorch/vision:v0.6.0', 'vgg19',
                     pretrained=True).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def load_image(filename, size=None, scale=None):
    img = Image.open(filename).convert('RGB')
    if size is not None:
        img = img.resize((size, size), Image.BICUBIC)
    elif scale is not None:
        img = img.resize((int(img.width / scale),
                          int(img.height / scale)), Image.BICUBIC)
    return F.to_tensor(img).unsqueeze(0).to(device)

def save_image(tensor, filename):
    image = tensor.cpu().clone().squeeze(0).clamp(0, 1)
    F.to_pil_image(image).save(filename)

def gram_matrix(tensor):
    batch, depth, height, width = tensor.size()
    tensor = tensor.view(batch * depth, height * width)
    gram = torch.mm(tensor, tensor.t())
    return gram.div(batch * depth * height * width)

# indices of the torchvision VGG19 `features` module for the layers used below
LAYERS = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
          '19': 'conv4_1', '21': 'conv4_2', '28': 'conv5_1'}

def get_features(image, model):
    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in LAYERS:
            features[LAYERS[name]] = x
    return features

def neural_transfer(content, style, alpha=1.0, beta=1e6, iterations=1000):
    content_tensor = load_image(content)
    style_tensor = load_image(style, size=content_tensor.size(-1))

    # start the generated image from the content image
    target = content_tensor.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([target], lr=0.01)

    # the content and style features never change; compute them once
    content_features = get_features(content_tensor, vgg)
    style_features = get_features(style_tensor, vgg)

    for i in range(iterations):
        target_features = get_features(target, vgg)

        # content loss: MSE between the conv4_2 activations
        content_loss = torch.mean((target_features['conv4_2'] -
                                   content_features['conv4_2']) ** 2)

        # style loss: MSE between Gram matrices over several layers
        style_loss = 0
        for layer in ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']:
            target_gram = gram_matrix(target_features[layer])
            style_gram = gram_matrix(style_features[layer])
            style_loss += torch.mean((target_gram - style_gram) ** 2)

        total_loss = alpha * content_loss + beta * style_loss

        # update the generated image
        optimizer.zero_grad()
        total_loss.backward()
        optimizer.step()

        if i % 100 == 0:
            print('Iteration [{}/{}], Content Loss: {:.4f}, '
                  'Style Loss: {:.4f}, Total Loss: {:.4f}'.format(
                      i + 1, iterations, content_loss.item(),
                      style_loss.item(), total_loss.item()))

    return target.detach()

content = 'content.jpg'
style = 'style.jpg'
output = 'output.jpg'
result = neural_transfer(content, style, iterations=1000)
save_image(result, output)
```

In this example we first load a pretrained VGG network and define the image-handling and Gram-matrix helpers. The neural_transfer function then iteratively updates the generated image by minimizing a weighted sum of the content and style losses, and the result is finally saved to a file. Note that this code only demonstrates the basic principle of neural style transfer; real applications usually need more parameter tuning and optimization.