(2) Paper Reading: Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks

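The paper introduces stochastic frequency masking (SFM), a technique that randomly masks frequency components of the training images so that super-resolution and denoising networks do not overfit. With two masking modes designed from a frequency-domain analysis, SFM improves existing methods in several settings and helps them generalize to high noise levels.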
Title: Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks
Paper: https://arxiv.org/abs/2003.07119
Code: https://github.com/majedelhelou/SFM

Quick version

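The paper proposes Stochastic Frequency Masking (SFM): during training, frequency components of the images are randomly masked to regularize the network and counter overfitting. SFM improves state-of-the-art methods on several tasks. Put simply: apply the DCT to each channel of the image to move to the frequency domain, pick a mask according to one of two modes and multiply by it, apply the inverse DCT, and then train the SR or denoising network on the result as usual.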
[Fig. 1 of the paper]
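The DCT, mask, inverse-DCT pipeline described above can be sketched in a few lines. This is a minimal illustration and not the authors' code: `dct_matrix` and `mask_frequencies` are hypothetical names, and the orthonormal DCT-II is built as an explicit matrix with NumPy so that the block has no other dependency. The mask is taken as given here; how it is sampled is a separate question.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D: D @ x is the DCT of x, and D.T inverts it."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)     # rescale the DC row so that D is orthogonal
    return d

def mask_frequencies(image, mask):
    """DCT each channel, multiply by `mask`, then transform back.

    image: float array of shape (H, W, C); mask: array of shape (H, W).
    """
    height, width, channels = image.shape
    d_h, d_w = dct_matrix(height), dct_matrix(width)
    out = np.empty_like(image, dtype=np.float64)
    for ch in range(channels):
        coeffs = d_h @ image[:, :, ch] @ d_w.T         # 2-D DCT-II
        out[:, :, ch] = d_h.T @ (coeffs * mask) @ d_w  # inverse 2-D DCT
    return out
```

With an all-ones mask the image comes back unchanged, and a mask that keeps only the coefficient at index (0, 0) reduces each channel to its mean value, which is a quick sanity check of the transform pair.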

Excerpts from the paper, with restatements

Abstract

Super-resolution and denoising are ill-posed yet fundamental image restoration tasks. In blind settings, the degradation kernel or the noise level are unknown. This makes restoration even more challenging, notably for learning-based methods, as they tend to overfit to the degradation seen during training. We present an analysis, in the frequency domain, of degradation-kernel overfitting in super-resolution and introduce a conditional learning perspective that extends to both super-resolution and denoising. Building on our formulation, we propose a stochastic frequency masking of images used in training to regularize the networks and address the overfitting problem. Our technique improves state-of-the-art methods on blind super-resolution with different synthetic kernels, real super-resolution, blind Gaussian denoising, and real-image denoising.

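In other words: blind SR and blind denoising are hard because the blur kernel or noise level is unknown, and learned models overfit to whatever degradation they saw in training. The paper analyzes this kernel overfitting in the frequency domain, frames both tasks as conditional learning, and uses that framing to motivate randomly masking frequencies of the training images as a regularizer. The reported gains cover blind SR with several synthetic kernels, real SR, blind Gaussian denoising, and real-image denoising.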

Intro

Our contributions are summarized as follows. We present a frequency-domain analysis of the degradation-kernel overfitting of SR networks, and highlight the implicit conditional learning that, as we also show, extends to denoising. We present a novel technique, SFM, that regularizes the learning of SR and denoising networks by only filtering the training data. It allows the networks to better restore frequency components and avoid overfitting. We empirically show that SFM improves the results of state-of-the-art learning methods on blind SR with different synthetic degradations, real-image SR, blind Gaussian denoising, and real-image denoising on high noise levels.

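The three contributions, restated: (1) a frequency-domain analysis of degradation-kernel overfitting in SR networks, together with the observation that the implicit conditional learning it reveals also applies to denoising; (2) SFM itself, which regularizes SR and denoising networks purely by filtering the training data, so that they restore frequency components better and overfit less; (3) experiments showing that SFM improves state-of-the-art learned methods on blind SR with different synthetic degradations, real-image SR, blind Gaussian denoising, and real-image denoising at high noise levels.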

Conclusion

We analyze the degradation-kernel overfitting of SR networks in the frequency domain. Our frequency-domain insights reveal an implicit conditional learning that also extends to denoising, especially on high noise levels. Building on our analysis, we present SFM, a technique to improve SR and denoising networks, without increasing the size of the training set or any cost at test time. We conduct extensive experiments on state-of-the-art networks for both restoration tasks. We evaluate SR with synthetic degradations, real-image SR, Gaussian denoising and real-image Poisson-Gaussian denoising, showing improved performance, notably on generalization, when using SFM.

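Takeaway: the frequency-domain analysis of kernel overfitting in SR exposes an implicit conditional learning that carries over to denoising, particularly at high noise levels. SFM builds on it and improves SR and denoising networks without enlarging the training set and without any extra cost at test time. The experiments cover SR with synthetic degradations, real-image SR, Gaussian denoising, and real-image Poisson-Gaussian denoising, and the gains are most visible in generalization.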

Stochastic Frequency Masking (SFM)

Motivation and implementation

The objective of SFM is to improve the networks’ prediction of high frequencies given lower ones, whether for SR or denoising. We achieve this by stochastically masking high-frequency bands from some of the training images in the learning phase, to encourage the conditional learning of the network. Our masking is carried out by transforming an image to the frequency domain using the Discrete Cosine Transform (DCT) type II [3,47], multiplying channel-wise by our stochastic mask, and lastly transforming the image back (Fig. 1). See Supplementary Material for the implementation details of the DCT type we use. We define frequency bands in the DCT domain over quarter-annulus areas, to cluster together similar-magnitude frequency content. Therefore, the SFM mask is delimited with a quarter-annulus area by setting the values of its inner and outer radii. We define two masking modes, the central mode and the targeted mode.

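Restated: for both SR and denoising, SFM aims to make the network better at predicting high frequencies from lower ones. During training, high-frequency bands of some training images are randomly masked: the image is taken to the frequency domain with the type-II DCT, multiplied channel by channel with a random mask, and transformed back (Fig. 1). Frequency bands are defined as quarter-annulus regions of the DCT domain, which groups frequency content of similar magnitude, so a mask is fully specified by its inner and outer radii. There are two masking modes: central and targeted.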
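As a rough illustration of the quarter-annulus band, the sketch below builds a binary mask from an inner and an outer radius. The function name `quarter_annulus_mask` is hypothetical, and two details are assumptions rather than something the excerpt states: the radius is measured from index (0, 0) of the DCT array (the lowest-frequency coefficient), and the band boundaries are treated as strict inequalities.

```python
import numpy as np

def quarter_annulus_mask(height, width, r_inner, r_outer):
    """Return an (H, W) mask that is 0 where r_inner < radius < r_outer, else 1.

    The radius is the distance from index (0, 0), the DC coefficient of the
    DCT, so the zeroed region is a quarter of an annulus.
    """
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    radius = np.sqrt(rows ** 2 + cols ** 2)
    mask = np.ones((height, width))
    mask[(radius > r_inner) & (radius < r_outer)] = 0.0
    return mask

# Example: zero out the band between radius 2 and radius 5 of an 8x8 DCT array.
example_mask = quarter_annulus_mask(8, 8, r_inner=2.0, r_outer=5.0)
```

Because the mask depends only on the radius, coefficients with similar horizontal-plus-vertical frequency magnitude are kept or dropped together, which matches the "similar-magnitude frequency content" grouping described above.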

In the central mode, the inner and outer radius limits $r_I$ and $r_O$ of the quarter-annulus are selected uniformly at random from $[0, r_M]$, where $r_M = \sqrt{a^2 + b^2}$ is the maximum radius, with $(a, b)$ being the dimensions of the image. We ensure that $r_I < r_O$ by permuting the values if $r_I > r_O$. With this mode, the resulting probability of a given frequency band $r_\omega$ to be masked is
$$P(r_I < r_\omega < r_O) = 2\left(\frac{r_\omega}{r_M} - \left(\frac{r_\omega}{r_M}\right)^2\right),$$
which means the central bands are the more likely ones to be masked, with the likelihood slowly decreasing the higher or the lower the frequencies are.
In the targeted mode, the frequency $r_C$ is always masked, and the frequencies away from $r_C$ are less and less likely to be masked, with a normal distribution decay.

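Restated: in the central mode, the inner and outer radii $r_I$ and $r_O$ of the quarter-annulus are drawn uniformly at random from $[0, r_M]$, where $r_M = \sqrt{a^2 + b^2}$ is the maximum radius and $(a, b)$ are the image dimensions. (Note: the DCT output has the same size as the input image, so $a$ and $b$ are simply the height and width of the original image, and $r_M$ is the length of its diagonal.) If the draw gives $r_I > r_O$, the two values are swapped so that $r_I < r_O$. The probability that a given frequency band $r_\omega$ is masked then follows Eq. (3) of the paper: it peaks for the central bands and falls off gradually toward both the lowest and the highest frequencies.
In the targeted mode, the frequency $r_C$ is always masked, and frequencies farther from $r_C$ are masked with a probability that decays like a normal distribution.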
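The two sampling rules can be checked numerically. The sketch below draws the radii for each mode and estimates how often a given frequency falls inside the masked band. The central-mode sampler follows the description above directly; for two independent uniform draws, the chance that a point at relative radius $x = r_\omega / r_M$ lies between them is $1 - x^2 - (1 - x)^2 = 2(x - x^2)$, which is what the estimate is compared against. The targeted-mode sampler is only one plausible reading of "normal distribution decay": placing each band edge a half-normal distance from $r_C$ is an assumption, as are the function names, the image size, and the values of `r_center` and `sigma`.

```python
import math
import random

def sample_central(r_max, rng):
    """Central mode: two uniform draws on [0, r_max], ordered so r_inner <= r_outer."""
    r_a, r_b = rng.uniform(0.0, r_max), rng.uniform(0.0, r_max)
    return min(r_a, r_b), max(r_a, r_b)

def sample_targeted(r_center, sigma, r_max, rng):
    """Targeted mode (assumed form): the band always contains r_center, and each
    edge lies a half-normal distance |N(0, sigma^2)| away from it."""
    r_inner = max(0.0, r_center - abs(rng.gauss(0.0, sigma)))
    r_outer = min(r_max, r_center + abs(rng.gauss(0.0, sigma)))
    return r_inner, r_outer

def masked_fraction(sampler, r_query, trials=100_000):
    """Monte Carlo estimate of P(r_inner <= r_query <= r_outer)."""
    hits = 0
    for _ in range(trials):
        r_inner, r_outer = sampler()
        hits += r_inner <= r_query <= r_outer
    return hits / trials

rng = random.Random(0)
r_max = math.hypot(64, 64)  # diagonal of a hypothetical 64x64 image

for frac in (0.1, 0.5, 0.9):
    estimate = masked_fraction(lambda: sample_central(r_max, rng), frac * r_max)
    closed_form = 2 * (frac - frac ** 2)
    print(f"central  r/r_M={frac:.1f}  estimate={estimate:.3f}  closed form={closed_form:.3f}")

r_center, sigma = 0.85 * r_max, 0.15 * r_max  # hypothetical targeted-mode settings
for frac in (0.85, 0.70, 0.40):
    estimate = masked_fraction(
        lambda: sample_targeted(r_center, sigma, r_max, rng), frac * r_max
    )
    print(f"targeted r/r_M={frac:.2f}  estimate={estimate:.3f}")
```

The central-mode estimates should track the closed form closely, with the maximum at the middle of the radius range, while the targeted-mode estimates equal one at `r_center` and drop quickly away from it.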

We use the central mode for SR networks, and the targeted mode with a high target $r_C$ for denoisers (Fig. 1). The former has a slow concave probability decay that allows to cover wider bands, while the latter has an exponential decay adapted for targeting very specific narrow bands. In both settings, the highest frequencies are most likely masked, and lower ones are masked with decaying probability. The central mode indeed masks the highest frequencies in SR, because the central-band frequencies are the highest ones remaining in the HR image after the anti-aliasing filter is applied. It is also worth noting for SR that SFM actually simulates the effect of different blur kernels by stochastically masking different frequency bands.

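Restated: SR networks use the central mode, and denoisers use the targeted mode with a high target $r_C$ (Fig. 1). The central mode's slow, concave decay lets it cover wide bands, while the targeted mode's exponential decay suits very specific narrow bands. In both cases the highest frequencies are the most likely to be masked, and lower ones are masked with decaying probability. For SR, the central mode does hit the highest frequencies, because the central-band frequencies are the highest ones remaining in the HR image after the anti-aliasing filter is applied. For SR it is also worth noting that randomly masking different frequency bands effectively simulates different blur kernels.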
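To make the per-task choice concrete, here is a small sketch of how a training loop might decide, image by image, whether to mask and which band to draw. It is an illustration only: the function name `plan_sfm_for_batch`, the `rate` parameter (the fraction of images that get masked), and the targeted-mode centre and spread are all hypothetical, and the targeted sampler uses an assumed half-normal form.

```python
import random

def plan_sfm_for_batch(batch_size, task, r_max, rate=0.5, rng=None):
    """Return one entry per image: None (leave untouched) or (r_inner, r_outer).

    task: "sr" uses the central mode, "denoise" uses the targeted mode.
    `rate` and the targeted-mode centre and spread are hypothetical values.
    """
    rng = rng or random.Random()
    plan = []
    for _ in range(batch_size):
        if rng.random() >= rate:
            plan.append(None)  # this image is trained on unmasked
            continue
        if task == "sr":
            # Central mode: two uniform radii, ordered.
            r_a, r_b = rng.uniform(0.0, r_max), rng.uniform(0.0, r_max)
            plan.append((min(r_a, r_b), max(r_a, r_b)))
        elif task == "denoise":
            # Targeted mode (assumed form): narrow band around a high r_center.
            r_center, sigma = 0.85 * r_max, 0.15 * r_max
            r_inner = max(0.0, r_center - abs(rng.gauss(0.0, sigma)))
            r_outer = min(r_max, r_center + abs(rng.gauss(0.0, sigma)))
            plan.append((r_inner, r_outer))
        else:
            raise ValueError(f"unknown task: {task!r}")
    return plan
```

Each non-`None` entry would then be turned into a quarter-annulus mask and applied in the DCT domain before the image is fed to the network. Since only the training inputs change, nothing is added at test time.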

Learning SR with SFM

[Image]

Learning denoising with SFM

[Image]
