Image Style Transfer with Deep Convolutional Networks (Part 3): Code Analysis

Understanding the Deep Photo Style Transfer source code

Taylor Guo, May 17, 2017

Where do you download the code? You'll have to get it yourself……

  • Lua provides the require function for loading libraries:
    1. require searches a list of paths to locate and load the file;
    2. require checks whether the file has already been loaded, to avoid loading the same file twice.

    The require function loads other Lua files, much like #include in C++ or import in Java. The path require uses is not an ordinary path: instead of a list of directories, it is a list of patterns, each of which describes one way to turn a virtual file name (require's modname argument) into a real file name. More precisely, each pattern is a file name containing optional question marks (?). To match, Lua first substitutes the virtual file name for each ? and checks whether a file with the resulting name exists; if not, it tries the next pattern in the same way. A minimal sketch follows the example below.

    – For example, given the path: ?; ?.lua; c:\windows\?; /usr/local/lua/?/?.lua
    calling require("add") will try to open these files:
    add
    add.lua
    c:\windows\add
    /usr/local/lua/add/add.lua
    Reference: http://blog.csdn.net/xxxxyyyy2012/article/details/41675345
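
    A minimal sketch of the two behaviors above, assuming a module file ./lib/foo.lua exists (the path and module name are illustrative):

    package.path = package.path .. ';./lib/?.lua'  -- add a pattern to the search list
    local foo = require('foo')     -- replaces ? with 'foo' and loads ./lib/foo.lua
    local again = require('foo')   -- cached in package.loaded; not loaded again
    assert(foo == again)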

  • –torch: torch is the main package of Torch7. Its data structures define the multi-dimensional Tensor and the mathematical operations on it; it also provides many utilities for handling files, serializing objects of arbitrary types, and other useful tools. A short sketch follows the references.
    References:
    https://github.com/torch/torch7
    https://github.com/torch/torch7/wiki/Cheatsheet#cuda
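
    A small sketch of those basics, using the standard torch7 API (not code from this repository):

    require 'torch'
    local a = torch.Tensor(3, 4):fill(1)   -- 3x4 tensor of ones
    local b = torch.randn(3, 4)            -- Gaussian-initialized tensor
    local c = torch.add(a, b)              -- elementwise a + b
    torch.save('c.t7', c)                  -- serialize any torch/Lua object
    local c2 = torch.load('c.t7')          -- ...and read it back
    print(c2:size())                       -- 3 4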

  • –nn: the neural network package, built from modules. Module is the abstract base class; its subclass is Container, and Container in turn has the three subclasses most important for building networks: Sequential, Parallel and Concat. A network built from these containers can contain simple layers such as Linear, Mean, Max and Reshape, as well as convolution layers and activation functions such as Tanh and ReLU. A minimal sketch follows the figure below.
    References:
    http://blog.csdn.net/hungryof/article/details/52022415
    https://github.com/soumith/cvpr2015/blob/master/Deep%20Learning%20with%20Torch.ipynb

    [Figure: the subclasses of the torch nn package used to build neural networks]
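
    A minimal sketch of building and running a network with these classes (layer sizes are arbitrary):

    require 'nn'
    local mlp = nn.Sequential()          -- container: runs modules in order
    mlp:add(nn.Linear(10, 20))           -- simple layer
    mlp:add(nn.ReLU())                   -- activation
    mlp:add(nn.Linear(20, 2))
    local out = mlp:forward(torch.randn(10))
    print(out)                           -- a 2-element output tensor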

  • –image: the image-processing package in the Torch7 distribution, demonstrated in the sketch after this list.
    It includes:
    saving and loading JPEG, PNG, PPM and PGM images;
    simple transformations: translation, scaling and rotation;
    parameterized transformations: convolution and warping;
    simple drawing: text and rectangles;
    graphical interfaces: display and window;
    color-space conversions: to and from RGB, YUV, LAB and HSL;
    tensor constructors: Lenna, Fabio, and Gaussian and Laplacian kernels.
    Reference: https://github.com/torch/image
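
    A short sketch of these operations, assuming an input file input.jpg exists (the file name is illustrative):

    require 'image'
    local img = image.load('input.jpg', 3)            -- 3xHxW, values in [0,1]
    local small = image.scale(img, 256, 256, 'bilinear')
    local yuv = image.rgb2yuv(small)                  -- color-space conversion
    local lenna = image.lena()                        -- built-in test image
    image.save('output.png', small)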

  • –optim: several optimization methods and a logger for Torch7; a usage sketch follows the list.
    Optimization algorithms:
    Stochastic Gradient Descent
    Averaged Stochastic Gradient Descent
    L-BFGS
    Conjugate Gradients
    AdaDelta
    AdaGrad
    Adam
    AdaMax
    FISTA with backtracking line search
    Nesterov’s Accelerated Gradient method
    RMSprop
    Rprop
    CMAES
    Reference: https://github.com/torch/optim
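
    Every optimizer shares the same interface: it takes a closure returning (loss, gradient) at a point, plus the current point and a state table. A minimal sketch on an illustrative quadratic:

    require 'optim'
    local x = torch.Tensor{3, 3}               -- starting point
    local function feval(x)
      return 0.5 * x:dot(x), x:clone()         -- f(x) = ||x||^2/2, df/dx = x
    end
    local state = {learningRate = 0.1}
    for i = 1, 100 do
      optim.sgd(feval, x, state)               -- updates x in place
    end
    print(x)                                   -- close to (0, 0)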

  • –loadcaffe: loads Caffe networks in Torch7 with no Caffe dependency; only protobuf needs to be installed. A loading sketch follows the references.
    References:
    https://github.com/szagoruyko/loadcaffe
    https://github.com/torch/torch7/wiki/Cheatsheet
    https://github.com/torch/rocks/blob/master/loadcaffe-1.0-0.rockspec
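
    A sketch of the loading call (the same one the script below uses), with the model paths this repository expects under models/:

    require 'loadcaffe'
    -- backend may be 'nn' or 'cudnn', depending on what is installed
    local cnn = loadcaffe.load('models/VGG_ILSVRC_19_layers_deploy.prototxt',
                               'models/VGG_ILSVRC_19_layers.caffemodel', 'nn')
    print(cnn)   -- prints the converted nn model, layer by layer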

  • –libcuda_utils: the custom CUDA utility library that ships with the deep-photo-styletransfer repository.
    Reference:
    https://github.com/luanfujun/deep-photo-styletransfer

  • –cutorch: the CUDA backend for Torch7.
    It provides:

    • a new tensor type, torch.CudaTensor, which runs on the GPU; cutorch supports most tensor operations;
    • other GPU tensor types, with more limited functionality;
    • cutorch.*: functions to get and set the current GPU, query device properties, memory usage and so on. A small sketch follows the reference.
      Reference: https://github.com/torch/cutorch
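
    A small sketch of these calls (standard cutorch API; requires a CUDA-capable GPU):

    require 'cutorch'
    print(cutorch.getDeviceCount())              -- number of visible GPUs
    cutorch.setDevice(1)                         -- GPU IDs are 1-indexed
    local t = torch.CudaTensor(100, 100):fill(1) -- allocated on the GPU
    print(t:sum())                               -- 10000
    print(cutorch.getMemoryUsage(1))             -- free/total bytes on device 1
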
  • –cunn: the CUDA backend implementation of the neural network package.
    It provides CUDA implementations of the modules in the nn package; see the sketch after the reference.
    Reference:
    https://github.com/torch/cunn
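
    Moving an nn model and its input to the GPU, as a minimal sketch:

    require 'cunn'
    local mlp = nn.Sequential():add(nn.Linear(10, 2)):add(nn.ReLU())
    mlp:cuda()                                   -- parameters become CudaTensors
    local out = mlp:forward(torch.randn(10):cuda())
    print(out)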

  • –matio: a C library for reading and writing MATLAB .mat files; a sketch follows the references.
    References:
    https://github.com/soumith/matio-ffi.torch
    https://sourceforge.net/projects/matio/
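
    A sketch of reading a variable from a .mat file (the file and variable names are illustrative):

    local matio = require 'matio'
    local t = matio.load('data.mat', 'myvar')    -- load a single variable
    -- matio.load('data.mat') with no name loads all variables into a table
    print(t)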

Code

  • The main Lua files are deepmatting_seg.lua and neuralstyle_seg.lua.
    -deepmatting_seg.lua contains the full network construction, the Laplacian matting regularizer, the numerical optimization, and the loss computation.
    -neuralstyle_seg.lua contains the full network construction, the semantic segmentation masks, the numerical optimization, and the loss computation.

The neuralstyle_seg.lua code


require 'torch'
require 'nn'
require 'image'
require 'optim'

require 'loadcaffe'
require 'libcuda_utils'

require 'cutorch'
require 'cunn'


--Lua variables are global by default: a variable defined in one file can be accessed from every file, unless it is declared local.
--torch.CmdLine() is a class for parsing command-line arguments when a script is invoked with different options; it can also redirect printed output into a log file.
--option(name, default, help) registers an optional argument. The name starts with '-'.
--Official docs: http://torch7.readthedocs.io/en/latest/cmdline/index.html



local matio = require 'matio'
local cmd = torch.CmdLine()

-- Basic options
cmd:option('-style_image', 'examples/inputs/seated-nude.jpg', 'Style target image')
cmd:option('-content_image', 'examples/inputs/tubingen.jpg','Content target image')
cmd:option('-style_seg', '', 'Style segmentation')
cmd:option('-style_seg_idxs', '', 'Style seg idxs')
cmd:option('-content_seg', '', 'Content segmentation')
cmd:option('-content_seg_idxs', '', 'Content seg idxs')

cmd:option('-gpu', 0, 'Zero-indexed ID of the GPU to use; for CPU mode set -gpu = -1')

-- Optimization options
cmd:option('-content_weight', 5e0)
cmd:option('-style_weight', 1e2)
cmd:option('-tv_weight', 1e-3)
cmd:option('-num_iterations', 1000)

-- Output options
cmd:option('-print_iter', 1)
cmd:option('-save_iter', 100)
cmd:option('-output_image', 'out.png') 
cmd:option('-index', 1)
cmd:option('-serial', 'serial_example') 

-- Other options
cmd:option('-proto_file', 'models/VGG_ILSVRC_19_layers_deploy.prototxt')
cmd:option('-model_file', 'models/VGG_ILSVRC_19_layers.caffemodel')
cmd:option('-backend', 'nn', 'nn|cudnn|clnn')
cmd:option('-cudnn_autotune', false)
cmd:option('-seed', 612)

cmd:option('-content_layers', 'relu4_2', 'layers for content')
cmd:option('-style_layers',   'relu1_1,relu2_1,relu3_1,relu4_1,relu5_1', 'layers for style')

local function main(params)
--Select the GPU. With several GPUs you can switch the default device (the one on which CUDA tensors are allocated and computed).
-- GPU IDs are 1-indexed: with 4 GPUs you may call setDevice(1), setDevice(2), setDevice(3) or setDevice(4).
  cutorch.setDevice(params.gpu + 1)
  cutorch.setHeapTracking(true)

--When torch initializes, seed() can be used to seed the random number generator.
--It can be re-initialized with manualSeed().
--manualSeed([gen,] number) seeds the random number generator with the given number.
  torch.manualSeed(params.seed)

--getDevice(): returns the ID of the currently selected GPU.
  idx = cutorch.getDevice()
  print('gpu, idx = ', params.gpu, idx)

  -- content: pitie transferred input image
  --image.load(filename, [channels (1 for grayscale, 3 for color), tensor type (float, double or byte)]); the last two arguments are optional.
  -- preprocess is a helper function described in detail below.
  -- It rescales values from [0,1] to [0,255], converts RGB to BGR, and subtracts the mean pixel.
  -- local params = cmd:parse(arg) is invoked at the end of the file.
  local content_image = image.load(params.content_image, 3)
  local content_image_caffe = preprocess(content_image):float():cuda()
  local content_layers = params.content_layers:split(",")

  -- style: target model image
  local style_image = image.load(params.style_image, 3)
  local style_image_caffe = preprocess(style_image):float():cuda()
  local style_layers = params.style_layers:split(",")

  local c, h, w = content_image:size(1), content_image:size(2), content_image:size(3)
  local _, h2, w2 = style_image:size(1), style_image:size(2), style_image:size(3)
  local index = params.index

  -- segmentation images 
  -- Semantic segmentation: masks are applied to the content image and the style image separately; the colors labeling the objects inside the masks can be customized.
  --[
  local content_seg = image.load(params.content_seg, 3)
  content_seg = image.scale(content_seg, w, h, 'bilinear')
  local style_seg = image.load(params.style_seg, 3)
  style_seg = image.scale(style_seg, w2, h2, 'bilinear')
  local color_codes = {'blue', 'green', 'black', 'white', 'red', 'yellow', 'grey', 'lightblue', 'purple'}
  local color_content_masks, color_style_masks = {}, {}
  for j = 1, #color_codes do
    local content_mask_j = ExtractMask(content_seg, color_codes[j])
    local style_mask_j = ExtractMask(style_seg, color_codes[j])
    table.insert(color_content_masks, content_mask_j)
    table.insert(color_style_masks, style_mask_j)
  end 
  --]]

  -- Set up the network, inserting style and content loss modules
  local content_losses, style_losses = {}, {}
  local next_content_idx, next_style_idx = 1, 1
  local net = nn.Sequential()

  if params.tv_weight > 0 then
    local tv_mod = nn.TVLoss(params.tv_weight):float():cuda()
    net:add(tv_mod)
  end

  -- Load the VGG-19 network and insert the style and content loss modules.
  local cnn = loadcaffe.load(params.proto_file, params.model_file, params.backend):float():cuda()

  paths.mkdir(tostring(params.serial))
  print('Exp serial:', params.serial)

  for i = 1, #cnn do
    if next_content_idx <= #content_layers or next_style_idx <= #style_layers then
      local layer = cnn:get(i)
      local name = layer.name
      local layer_type = torch.type(layer)
      local is_pooling = (layer_type == 'nn.SpatialMaxPooling' or layer_type == 'cudnn.SpatialMaxPooling')
      local is_conv    = (layer_type == 'nn.SpatialConvolution' or layer_type == 'cudnn.SpatialConvolution')

      net:add(layer)

      if is_pooling then
        for k = 1, #color_codes do
          color_content_masks[k] = image.scale(color_content_masks[k], math.ceil(color_content_masks[k]:size(2)/2), math.ceil(color_content_masks[k]:size(1)/2))
          color_style_masks[k]   = image.scale(color_style_masks[k],   math.ceil(color_style_masks[k]:size(2)/2),   math.ceil(color_style_masks[k]:size(1)/2))
        end
      elseif is_conv then
        local sap = nn.SpatialAveragePooling(3,3,1,1,1,1):float()
        for k = 1, #color_codes do
          color_content_masks[k] = sap:forward(color_content_masks[k]:repeatTensor(1,1,1))[1]:clone()
          color_style_masks[k]   = sap:forward(color_style_masks[k]:repeatTensor(1,1,1))[1]:clone()
        end
      end 
      color_content_masks = deepcopy(color_content_masks)
      color_style_masks = deepcopy(color_style_masks)


      if name == content_layers[next_content_idx] then
        print("Setting up content layer", i, ":", layer.name)
        local target = net:forward(content_image_caffe):clone()
        local loss_module = nn.ContentLoss(params.content_weight, target, false):float():cuda()
        net:add(loss_module)
        table.insert(content_losses, loss_module)
        next_content_idx = next_content_idx + 1
      end

     if name == style_layers[next_style_idx] then
        print("Setting up style layer  ", i, ":", layer.name)
        local gram = GramMatrix():float():cuda()
        local target_features = net:forward(style_image_caffe):clone()

        local target_grams = {}

        for j = 1, #color_codes do 
          local l_style_mask_ori = color_style_masks[j]:clone():cuda()
          local l_style_mask = l_style_mask_ori:repeatTensor(1,1,1):expandAs(target_features)
          local l_style_mean = l_style_mask_ori:mean()

          local masked_target_features = torch.cmul(l_style_mask, target_features)
          local masked_target_gram = gram:forward(masked_target_features):clone()
          if l_style_mean > 0 then
            masked_target_gram:div(target_features:nElement() * l_style_mean)
          end 
          table.insert(target_grams, masked_target_gram)
        end 

        local loss_module = nn.StyleLossWithSeg(params.style_weight, target_grams, color_content_masks, color_codes, next_style_idx, false):float():cuda()

        net:add(loss_module)
        table.insert(style_losses, loss_module)
        next_style_idx = next_style_idx + 1
      end 

    end
  end

  -- We don't need the base CNN anymore, so clean it up to save memory.
  cnn = nil
  for i=1,#net.modules do
    local module = net.modules[i]
    if torch.type(module) == 'nn.SpatialConvolutionMM' then
        -- remove these, not used, but uses gpu memory
        module.gradWeight = nil
        module.gradBias = nil
    end
  end
  collectgarbage()

  local mean_pixel = torch.CudaTensor({103.939, 116.779, 123.68})
  local meanImage = mean_pixel:view(3, 1, 1):expandAs(content_image_caffe)

  local img = torch.randn(content_image:size()):float():mul(0.0001):cuda()

  -- Run it through the network once to get the proper size for the gradient
  -- All the gradients will come from the extra loss modules, so we just pass
  -- zeros into the top of the net on the backward pass.
  local y = net:forward(img)
  local dy = img.new(#y):zero()

  -- Declaring this here lets us access it in maybe_print
  local optim_state = {
      maxIter = params.num_iterations,
      tolX = 0, tolFun = -1,
      verbose=true, 
  }

  local function maybe_print(t, loss)
    local verbose = (params.print_iter > 0 and t % params.print_iter == 0)
    if verbose then
      print(string.format('Iteration %d / %d', t, params.num_iterations))
      for i, loss_module in ipairs(content_losses) do
        print(string.format('  Content %d loss: %f', i, loss_module.loss))
      end
      for i, loss_module in ipairs(style_losses) do
        print(string.format('  Style %d loss: %f', i, loss_module.loss))
      end
      print(string.format('  Total loss: %f', loss))
    end
  end

  local function maybe_save(t)
    local should_save = params.save_iter > 0 and t % params.save_iter == 0
    should_save = should_save or t == params.num_iterations
    if should_save then
      local disp = deprocess(img:double())
      disp = image.minmax{tensor=disp, min=0, max=1}
      local filename = params.serial .. '/out' .. tostring(index) .. '_t_' .. tostring(t) .. '.png'
      image.save(filename, disp)
    end
  end

  local num_calls = 0
  local function feval(AffineModel) 
    num_calls = num_calls + 1

    local output = torch.add(img, meanImage)
    local input  = torch.add(content_image_caffe, meanImage)

    net:forward(img)

    local gradient_VggNetwork = net:updateGradInput(img, dy)

    local grad = gradient_VggNetwork

    local loss = 0
    for _, mod in ipairs(content_losses) do
      loss = loss + mod.loss
    end
    for _, mod in ipairs(style_losses) do
      loss = loss + mod.loss
    end
    maybe_print(num_calls, loss)
    maybe_save(num_calls)

    collectgarbage()

    -- optim.lbfgs expects a vector for gradients
    return loss, grad:view(grad:nElement()) 
  end

  -- Run optimization.
  local x, losses = optim.lbfgs(feval, img, optim_state)  
end


function build_filename(output_image, iteration)
  local ext = paths.extname(output_image)
  local basename = paths.basename(output_image, ext)
  local directory = paths.dirname(output_image)
  return string.format('%s/%s_%d.%s',directory, basename, iteration, ext)
end

-- Preprocess an image before passing it to a Caffe model. 
-- We need to rescale from [0, 1] to [0, 255], convert from RGB to BGR,
-- and subtract the mean pixel.
function preprocess(img)
  local mean_pixel = torch.DoubleTensor({103.939, 116.779, 123.68})
  local perm = torch.LongTensor{3, 2, 1}
  img = img:index(1, perm):mul(256.0)
  mean_pixel = mean_pixel:view(3, 1, 1):expandAs(img)
  img:add(-1, mean_pixel)
  return img
end


-- Undo the above preprocessing.
function deprocess(img)
  local mean_pixel = torch.DoubleTensor({103.939, 116.779, 123.68})
  mean_pixel = mean_pixel:view(3, 1, 1):expandAs(img)
  img = img + mean_pixel
  local perm = torch.LongTensor{3, 2, 1}
  img = img:index(1, perm):div(256.0)
  return img
end
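
-- Round-trip sanity sketch (not part of the original script; uncomment to run):
--   local img = torch.rand(3, 8, 8)             -- fake RGB image in [0,1]
--   local back = deprocess(preprocess(img:clone()))
--   print((back - img):abs():max())             -- ~0, up to float error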

function deepcopy(orig)
    local orig_type = type(orig)
    local copy
    if orig_type == 'table' then
        copy = {}
        for orig_key, orig_value in next, orig, nil do
            copy[deepcopy(orig_key)] = deepcopy(orig_value)
        end
        setmetatable(copy, deepcopy(getmetatable(orig)))
    else -- number, string, boolean, etc
        copy = orig
    end
    return copy
end

-- Define an nn Module to compute content loss in-place
local ContentLoss, parent = torch.class('nn.ContentLoss', 'nn.Module')

function ContentLoss:__init(strength, target, normalize)
  parent.__init(self)
  self.strength = strength
  self.target = target
  self.normalize = normalize or false
  self.loss = 0
  self.crit = nn.MSECriterion()
end

function ContentLoss:updateOutput(input)
  if input:nElement() == self.target:nElement() then
    self.loss = self.crit:forward(input, self.target) * self.strength
  else
    print('WARNING: Skipping content loss')
  end
  self.output = input
  return self.output
end

function ContentLoss:updateGradInput(input, gradOutput)
  if input:nElement() == self.target:nElement() then
    self.gradInput = self.crit:backward(input, self.target)
  end
  if self.normalize then
    self.gradInput:div(torch.norm(self.gradInput, 1) + 1e-8)
  end
  self.gradInput:mul(self.strength)
  self.gradInput:add(gradOutput)
  return self.gradInput
end
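
-- Usage sketch (illustrative values, not part of the original script):
--   local target = torch.randn(8, 4, 4)
--   local cl = nn.ContentLoss(5.0, target, false)
--   cl:forward(torch.randn(8, 4, 4))    -- output equals the input (pass-through)
--   print(cl.loss)                      -- strength * MSE against the target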

-- Returns a network that computes the CxC Gram matrix from inputs
-- of size C x H x W
function GramMatrix()
  local net = nn.Sequential()
  net:add(nn.View(-1):setNumInputDims(2))
  local concat = nn.ConcatTable()
  concat:add(nn.Identity())
  concat:add(nn.Identity())
  net:add(concat)
  net:add(nn.MM(false, true))
  return net
end
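
-- Usage sketch (illustrative): the Gram matrix of C feature maps is the CxC
-- matrix of inner products between the flattened channels.
--   local feat = torch.randn(4, 5, 5)        -- C=4 channels of 5x5 features
--   local G = GramMatrix():forward(feat)     -- G is 4x4
--   print(G[1][2], feat[1]:dot(feat[2]))     -- the same value twice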


-- Define an nn Module to compute style loss in-place
local StyleLoss, parent = torch.class('nn.StyleLoss', 'nn.Module')

function StyleLoss:__init(strength, target, normalize)
  parent.__init(self)
  self.normalize = normalize or false
  self.strength = strength
  self.target = target
  self.loss = 0

  self.gram = GramMatrix()
  self.G = nil
  self.crit = nn.MSECriterion()
end

function StyleLoss:updateOutput(input)
  self.G = self.gram:forward(input)
  self.G:div(input:nElement())
  self.loss = self.crit:forward(self.G, self.target)
  self.loss = self.loss * self.strength
  self.output = input
  return self.output
end

function StyleLoss:updateGradInput(input, gradOutput)
  local dG = self.crit:backward(self.G, self.target)
  dG:div(input:nElement())
  self.gradInput = self.gram:backward(input, dG)
  if self.normalize then
    self.gradInput:div(torch.norm(self.gradInput, 1) + 1e-8)
  end
  self.gradInput:mul(self.strength)
  self.gradInput:add(gradOutput)
  return self.gradInput
end


function ExtractMask(seg, color)
  local mask = nil
  if color == 'green' then 
    mask = torch.lt(seg[1], 0.1)
    mask:cmul(torch.gt(seg[2], 1-0.1))
    mask:cmul(torch.lt(seg[3], 0.1))
  elseif color == 'black' then 
    mask = torch.lt(seg[1], 0.1)
    mask:cmul(torch.lt(seg[2], 0.1))
    mask:cmul(torch.lt(seg[3], 0.1))
  elseif color == 'white' then
    mask = torch.gt(seg[1], 1-0.1)
    mask:cmul(torch.gt(seg[2], 1-0.1))
    mask:cmul(torch.gt(seg[3], 1-0.1))
  elseif color == 'red' then 
    mask = torch.gt(seg[1], 1-0.1)
    mask:cmul(torch.lt(seg[2], 0.1))
    mask:cmul(torch.lt(seg[3], 0.1))
  elseif color == 'blue' then
    mask = torch.lt(seg[1], 0.1)
    mask:cmul(torch.lt(seg[2], 0.1))
    mask:cmul(torch.gt(seg[3], 1-0.1))
  elseif color == 'yellow' then
    mask = torch.gt(seg[1], 1-0.1)
    mask:cmul(torch.gt(seg[2], 1-0.1))
    mask:cmul(torch.lt(seg[3], 0.1))
  elseif color == 'grey' then 
    mask = torch.cmul(torch.gt(seg[1], 0.5-0.1), torch.lt(seg[1], 0.5+0.1))
    mask:cmul(torch.cmul(torch.gt(seg[2], 0.5-0.1), torch.lt(seg[2], 0.5+0.1)))
    mask:cmul(torch.cmul(torch.gt(seg[3], 0.5-0.1), torch.lt(seg[3], 0.5+0.1)))
  elseif color == 'lightblue' then
    mask = torch.lt(seg[1], 0.1)
    mask:cmul(torch.gt(seg[2], 1-0.1))
    mask:cmul(torch.gt(seg[3], 1-0.1))
  elseif color == 'purple' then 
    mask = torch.gt(seg[1], 1-0.1)
    mask:cmul(torch.lt(seg[2], 0.1))
    mask:cmul(torch.gt(seg[3], 1-0.1))
  else 
    print('ExtractMask(): color not recognized, color = ', color)
  end 
  return mask:float()
end
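
-- Usage sketch (illustrative): a tiny all-blue segmentation map yields an all-ones mask.
--   local seg = torch.zeros(3, 2, 2)
--   seg[3]:fill(1)                           -- R=0, G=0, B=1 everywhere
--   print(ExtractMask(seg, 'blue'))          -- 2x2 FloatTensor of ones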

-- Define style loss with segmentation 
local StyleLossWithSeg, parent = torch.class('nn.StyleLossWithSeg', 'nn.Module')

--function StyleLossWithSeg:__init(strength, target_grams, color_content_masks, content_seg_idxs, layer_id, normalize)
function StyleLossWithSeg:__init(strength, target_grams, color_content_masks, color_codes, layer_id, normalize)
  parent.__init(self)
  self.strength = strength
  self.target_grams = target_grams
  self.color_content_masks = deepcopy(color_content_masks)
  self.color_codes = color_codes
  --self.content_seg_idxs = content_seg_idxs
  self.normalize = normalize

  self.loss = 0
  self.gram = GramMatrix()
  self.crit = nn.MSECriterion()

  self.layer_id = layer_id
end 

function StyleLossWithSeg:updateOutput(input)
  self.output = input
  return self.output
end 

function StyleLossWithSeg:updateGradInput(input, gradOutput)
  self.loss = 0
  self.gradInput = gradOutput:clone()
  self.gradInput:zero()
  for j = 1, #self.color_codes do 
    local l_content_mask_ori = self.color_content_masks[j]:clone():cuda()
    local l_content_mask = l_content_mask_ori:repeatTensor(1,1,1):expandAs(input) 
    local l_content_mean = l_content_mask_ori:mean()

    local masked_input_features = torch.cmul(l_content_mask, input)
    local masked_input_gram = self.gram:forward(masked_input_features):clone()
    if l_content_mean > 0 then 
      masked_input_gram:div(input:nElement() * l_content_mean)
    end

    local loss_j = self.crit:forward(masked_input_gram, self.target_grams[j])
    loss_j = loss_j * self.strength * l_content_mean
    self.loss = self.loss + loss_j

    local dG = self.crit:backward(masked_input_gram, self.target_grams[j])

    dG:div(input:nElement())

    local gradient = self.gram:backward(masked_input_features, dG) 

    if self.normalize then 
      gradient:div(torch.norm(gradient, 1) + 1e-8)
    end

    self.gradInput:add(gradient)
  end   

  self.gradInput:mul(self.strength)
  self.gradInput:add(gradOutput)
  return self.gradInput
end 


local TVLoss, parent = torch.class('nn.TVLoss', 'nn.Module')

function TVLoss:__init(strength)
  parent.__init(self)
  self.strength = strength
  self.x_diff = torch.Tensor()
  self.y_diff = torch.Tensor()
end

function TVLoss:updateOutput(input)
  self.output = input
  return self.output
end

-- TV loss backward pass inspired by kaishengtai/neuralart
function TVLoss:updateGradInput(input, gradOutput)
  self.gradInput:resizeAs(input):zero()
  local C, H, W = input:size(1), input:size(2), input:size(3)
  self.x_diff:resize(3, H - 1, W - 1)
  self.y_diff:resize(3, H - 1, W - 1)
  self.x_diff:copy(input[{{}, {1, -2}, {1, -2}}])
  self.x_diff:add(-1, input[{{}, {1, -2}, {2, -1}}])
  self.y_diff:copy(input[{{}, {1, -2}, {1, -2}}])
  self.y_diff:add(-1, input[{{}, {2, -1}, {1, -2}}])
  self.gradInput[{{}, {1, -2}, {1, -2}}]:add(self.x_diff):add(self.y_diff)
  self.gradInput[{{}, {1, -2}, {2, -1}}]:add(-1, self.x_diff)
  self.gradInput[{{}, {2, -1}, {1, -2}}]:add(-1, self.y_diff)
  self.gradInput:mul(self.strength)
  self.gradInput:add(gradOutput)
  return self.gradInput
end

function TVGradient(input, gradOutput, strength)
  local C, H, W = input:size(1), input:size(2), input:size(3)
  local gradInput = torch.CudaTensor(C, H, W):zero()
  local x_diff = torch.CudaTensor()
  local y_diff = torch.CudaTensor()
  x_diff:resize(3, H - 1, W - 1)
  y_diff:resize(3, H - 1, W - 1)
  x_diff:copy(input[{{}, {1, -2}, {1, -2}}])
  x_diff:add(-1, input[{{}, {1, -2}, {2, -1}}])
  y_diff:copy(input[{{}, {1, -2}, {1, -2}}])
  y_diff:add(-1, input[{{}, {2, -1}, {1, -2}}])
  gradInput[{{}, {1, -2}, {1, -2}}]:add(x_diff):add(y_diff)
  gradInput[{{}, {1, -2}, {2, -1}}]:add(-1, x_diff)
  gradInput[{{}, {2, -1}, {1, -2}}]:add(-1, y_diff)
  gradInput:mul(strength)
  gradInput:add(gradOutput)
  return gradInput
end 

--cmd:parse(arg) parses the input arguments;
-- the CmdLine method [table] parse(arg) parses a given table; arg is the default argument table created by Lua;
-- it returns a table of option values.
local params = cmd:parse(arg)
main(params)
