sci rgan

c2a2o2

Article link: https://blog.csdn.net/c2a2o2/article/details/79106485


Abstract
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors?
The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function.
Recent work has largely focused on minimizing the mean squared reconstruction error.
The resulting estimates have high peak signal-to-noise ratios, but they often lack high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution.
In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR).
To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4× upscaling factors.
To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss.
The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images.
In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space.
Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks.
An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN.
The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
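The perceptual loss described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not something stated in this excerpt: the function names, the use of flat lists as stand-ins for feature maps, the `-log D(G(x))` form of the adversarial term, and the default weight of `1e-3`.

```python
import math

def content_loss(features_sr, features_hr):
    """Mean squared error between two feature vectors (not raw pixels).

    The vectors stand in for feature maps of some pretrained network;
    how they are extracted is outside this sketch.
    """
    return sum((a - b) ** 2 for a, b in zip(features_sr, features_hr)) / len(features_sr)

def adversarial_loss(d_prob_sr):
    """Generator-side adversarial term, assumed here to be -log D(G(x)).

    d_prob_sr is the discriminator's probability, in (0, 1], that the
    super-resolved image is a real high-resolution image.
    """
    return -math.log(d_prob_sr)

def perceptual_loss(features_sr, features_hr, d_prob_sr, adv_weight=1e-3):
    """Weighted sum of content loss and adversarial loss (weight is assumed)."""
    return content_loss(features_sr, features_hr) + adv_weight * adversarial_loss(d_prob_sr)
```

With this shape, the loss falls as the super-resolved features approach the reference features and as the discriminator becomes more convinced that the output is a real image.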

Introduction
The highly challenging task of estimating a high-resolution (HR) image from its low-resolution (LR) counterpart is referred to as super-resolution (SR).
SR has received substantial attention from within the computer vision research community and has a wide range of applications [63, 71, 43].
The ill-posed nature of the underdetermined SR problem is particularly pronounced for high upscaling factors, for which texture detail in the reconstructed SR images is typically absent.
The optimization target of supervised SR algorithms is commonly the minimization of the mean squared error (MSE) between the recovered HR image and the ground truth.
This is convenient as minimizing MSE also maximizes the peak signal-to-noise ratio (PSNR), which is a common measure used to evaluate and compare SR algorithms [61].
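The link between MSE and PSNR follows from the standard definition PSNR = 10 · log10(MAX² / MSE): for a fixed peak value, PSNR is a decreasing function of MSE. A minimal sketch, assuming 8-bit images (peak value 255) stored as flat lists of pixel values; the function names are illustrative:

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because the mapping is monotonic, any reconstruction that lowers MSE raises PSNR, which is why the two are optimized together.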
However, the ability of MSE (and PSNR) to capture perceptually relevant differences, such as high texture detail, is very limited as they are defined based on pixel-wise image differences [60, 58, 26].
This is illustrated in Figure 2, where the highest PSNR does not necessarily reflect the perceptually better SR result.
The perceptual difference between the super-resolved and original image means that the recovered image is not photo-realistic as defined by Ferwerda [16].
In this work we propose a super-resolution generative adversarial network (SRGAN) for which we employ a deep residual network (ResNet) with skip-connection and diverge from MSE as the sole optimization target.
Different from previous works, we define a novel perceptual loss using high-level feature maps of the VGG network [49, 33, 5] combined with a discriminator that encourages solutions perceptually hard to distinguish from the HR reference images.
An example photo-realistic image that was super-resolved with a 4× upscaling factor is shown in Figure 1.
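The skip connection of a residual network, mentioned above, can be sketched in a few lines. This is a toy illustration under stated assumptions: activations are a flat list of numbers, and `branch` is a hypothetical stand-in for the learned layers of one block, not the architecture used in the paper.

```python
def residual_block(x, branch):
    """Return x + branch(x) element-wise.

    The identity path (x) is the skip connection; the block only has
    to learn the residual correction produced by branch.
    """
    fx = branch(x)
    return [xi + fi for xi, fi in zip(x, fx)]
```

If the branch outputs zeros, the block reduces to the identity mapping, which is one common explanation for why deep stacks of such blocks are easier to train.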

Related work
1.1.1 Image super-resolution
Recent overview articles on image SR include Nasrollahi and Moeslund [43] or Yang et al. [61].
Here we will focus on single image super-resolution (SISR) and will not further discuss approaches that recover HR images from multiple images [4, 15].
Prediction-based methods were among the first methods to tackle SISR.
While these filtering approaches, e.g. linear, bicubic or Lanczos [14] filtering, can be very fast, they oversimplify the SISR problem and usually yield solutions with overly smooth textures.
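The smoothing effect of such filters can be seen in a minimal 1-D sketch of linear interpolation. The function name, the 1-D simplification and the half-pixel-centre sampling convention are assumptions made for illustration; real implementations work in 2-D and differ in their alignment conventions.

```python
def upscale_linear(signal, factor):
    """Upscale a 1-D signal by an integer factor with linear interpolation."""
    n = len(signal)
    out = []
    for i in range(n * factor):
        # Map the output sample centre back into input coordinates.
        src = (i + 0.5) / factor - 0.5
        # Clamp to the valid range so border samples repeat the edge value.
        src = min(max(src, 0.0), float(n - 1))
        lo = int(src)                # floor, since src >= 0
        hi = min(lo + 1, n - 1)
        t = src - lo                 # fractional distance from lo to hi
        out.append((1.0 - t) * signal[lo] + t * signal[hi])
    return out
```

Every output value is a weighted average of two neighbouring inputs, so a sharp step in the input becomes a ramp in the output, and no detail beyond what the input already contains can appear.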
Methods that focus particularly on edge preservation have been proposed [1, 39].
More powerful approaches aim to establish a complex mapping between low- and high-resolution image information and usually rely on training data.
Many methods that are based on example-pairs rely on LR training patches for which the corresponding HR counterparts are known.
Early work was presented by Freeman et al. [18, 17]. Related approaches to the SR problem originate in compressed sensing [62, 12, 69].
In Glasner et al. [21] the authors exploit patch redundancies across scales within the image to drive the SR.
This paradigm of self-similarity is also employed in Huang et al. [31], where self dictionaries are extended by further allowing for small transformations and shape variations.
Gu et al. [25] proposed a convolutional sparse coding approach that improves consistency by processing the whole image rather than overlapping patches.
To reconstruct realistic texture detail while avoiding edge artifacts, Tai et al. [52] combine an edge-directed SR algorithm based on a gradient profile prior [50] with the benefits of learning-based detail synthesis.
Zhang et al. [70] propose a multi-scale dictionary to capture redundancies of similar image patches at different scales.
To super-resolve landmark images, Yue et al. [67] retrieve correlating HR images with similar content from the web and propose a structure-aware matching criterion for alignment.
Neighborhood embedding approaches upsample an LR image patch by finding similar LR training patches in a low dimensional manifold and combining their corresponding HR patches for reconstruction [54, 55].
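The neighborhood embedding idea can be sketched as a nearest-neighbour lookup followed by a weighted combination. This is a simplified illustration, not the method of the cited works: it uses normalized inverse-distance weights where the original methods solve for reconstruction weights, it searches in raw patch space rather than a learned manifold, and all names and the `eps` constant are assumptions.

```python
def neighbor_embedding_upsample(lr_patch, lr_train, hr_train, k=2, eps=1e-8):
    """Estimate an HR patch from the k nearest LR training patches.

    lr_train[i] and hr_train[i] are a known LR/HR patch pair, each
    stored as a flat list of pixel values.
    """
    # Squared Euclidean distance from the query to every LR training patch.
    dists = [sum((a - b) ** 2 for a, b in zip(lr_patch, p)) for p in lr_train]
    nearest = sorted(range(len(lr_train)), key=lambda i: dists[i])[:k]
    # Closer neighbours get larger weights; weights are normalized to sum to 1.
    raw = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(raw)
    weights = [w / total for w in raw]
    hr_len = len(hr_train[0])
    # Combine the HR counterparts of the selected LR neighbours.
    return [
        sum(w * hr_train[i][j] for w, i in zip(weights, nearest))
        for j in range(hr_len)
    ]
```

When the query matches a training patch almost exactly, the estimate collapses onto that patch's HR counterpart, which also hints at the overfitting tendency of such approaches.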
In Kim and Kwon [35] the authors emphasize the tendency of neighborhood approaches to overfit and formulate a more general map of example pairs using kernel ridge regression.
The regression problem can also be solved with Gaussian process regression [27], trees [46] or Random Forests [47].
In Dai et al. [6] a multitude of patch-specific regressors is learned and the most appropriate regressors are selected during testing.
Recently, convolutional neural network (CNN) based SR algorithms have shown excellent performance.
In Wang et al. [59] the authors encode a sparse representation prior into their feed-forward network architecture based on the learned iterative shrinkage and thresholding algorithm (LISTA) [23].
Dong et al. [9, 10] used bicubic interpolation to upscale an input image and trained a three-layer deep fully convolutional network end-to-end to achieve state-of-the-art SR performance.
Subsequently, it was shown that enabling the network to learn the upscaling filters directly can further increase performance both in terms of accuracy and speed [11, 48, 57].
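One way a network can learn its own upscaling, instead of receiving a bicubic-upscaled input, is a sub-pixel rearrangement ("pixel shuffle") applied to the output of a learned convolution. The text does not say which mechanism each cited work uses, so this sketch is an assumed example rather than a description of those methods; it shows only the rearrangement step for a single output channel, with hypothetical names.

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) image.

    channels is a list of r*r maps, each a list of H rows of W values.
    Map number c supplies the pixel at offset (c // r, c % r) inside
    every r x r output cell.
    """
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, fmap in enumerate(channels):
        dy, dx = divmod(c, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out
```

Because the convolutions before this step run at low resolution, the expensive computation happens on fewer pixels, which is consistent with the speed gains mentioned above.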
With their deeply-recursive convolutional network (DRCN), Kim et al. [34] presented a highly performant architecture that allows for long-range pixel dependencies while keeping the number of model parameters small.
Of particular relevance for our paper are the works by Johnson et al. [33] and Bruna et al. [5], who rely on a loss function closer to perceptual similarity to recover visually more convincing HR images.
