Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising


Abstract—Discriminative model learning for image denoising has recently been attracting considerable attention due to its favorable denoising performance. In this paper, we take one step forward by investigating the construction of feed-forward denoising convolutional neural networks (DnCNNs) to embrace the progress in very deep architecture, learning algorithm, and regularization method into image denoising. Specifically, residual learning and batch normalization are utilized to speed up the training process as well as boost the denoising performance. Different from the existing discriminative denoising models, which usually train a specific model for additive white Gaussian noise (AWGN) at a certain noise level, our DnCNN model is able to handle Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising). With the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers. This property motivates us to train a single DnCNN model to tackle several general image denoising tasks, such as Gaussian denoising, single image super-resolution and JPEG image deblocking. Our extensive experiments demonstrate that our DnCNN model not only exhibits high effectiveness in several general image denoising tasks, but can also be efficiently implemented by benefiting from GPU computing.

Index Terms—Image Denoising, Convolutional Neural Networks, Residual Learning, Batch Normalization

I. INTRODUCTION

Image denoising is a classical yet still active topic in low-level vision, since it is an indispensable step in many practical applications. The goal of image denoising is to recover a clean image x from a noisy observation y which follows an image degradation model y = x + v. One common assumption is that v is additive white Gaussian noise (AWGN) with standard deviation σ. From a Bayesian viewpoint, when the likelihood is known, image prior modeling plays a central role in image denoising. Over the past few decades, various models have been exploited for modeling image priors, including non-local self-similarity (NSS) models [1], [2], [3], [4], sparse models [4], [5], [6], gradient models [7], [8], [9] and Markov random field (MRF) models [10], [11], [12]. In particular, the NSS models are popular in state-of-the-art methods such as BM3D [2], LSSC [4], NCSR [6] and WNNM [13]. Despite their high denoising quality, most of the image prior-based methods typically suffer from two major drawbacks. First, those methods generally involve a complex optimization problem in the testing stage, making the denoising process time-consuming [6], [13]; thus, most of the prior-based methods can hardly achieve high performance without sacrificing computational efficiency. Second, the models are in general non-convex and involve several manually chosen parameters, leaving some leeway to boost denoising performance.
To overcome the limitations of prior-based approaches, several discriminative learning methods have recently been developed to learn image prior models in the context of a truncated inference procedure. The resulting models are able to get rid of the iterative optimization procedure in the test phase. Schmidt and Roth [14] proposed a cascade of shrinkage fields (CSF) method that unifies the random field-based model and the unrolled half-quadratic optimization algorithm into a single learning framework. Chen et al. [15], [16] proposed a trainable nonlinear reaction diffusion (TNRD) model which learns a modified fields of experts [12] image prior by unfolding a fixed number of gradient descent inference steps. Some of the other related work can be found in [17], [18]. Although CSF and TNRD have shown promising results toward bridging the gap between computational efficiency and denoising quality, their performance is inherently restricted to the specified forms of prior. To be specific, the priors adopted in CSF and TNRD are based on the analysis model, which is limited in capturing the full characteristics of image structures. In addition, the parameters are learned by stage-wise greedy training plus joint fine-tuning among all stages, and many handcrafted parameters are involved. Another non-negligible drawback is that they train a specific model for a certain noise level, and are limited in blind image denoising.
In this paper, instead of learning a discriminative model with an explicit image prior, we treat image denoising as a plain discriminative learning problem, i.e., separating the noise from a noisy image by feed-forward convolutional neural networks (CNNs). The reasons for using CNNs are three-fold. First, a CNN with very deep architecture [19] is effective in increasing the capacity and flexibility for exploiting image characteristics. Second, considerable advances have been achieved on regularization and learning methods for training CNNs, including Rectifier Linear Unit (ReLU) [20], batch normalization [21] and residual learning [22]. These methods can be adopted in CNNs to speed up the training process and improve the denoising performance. Third, CNNs are well suited for parallel computation on modern powerful GPUs, which can be exploited to improve the run time performance. We refer to the proposed denoising convolutional neural network as DnCNN. Rather than directly outputting the denoised image x̂, the proposed DnCNN is designed to predict the residual image v̂, i.e., the difference between the noisy observation and the latent clean image. In other words, the proposed DnCNN implicitly removes the latent clean image with the operations in the hidden layers. The batch normalization technique is further introduced to stabilize and enhance the training performance of DnCNN. It turns out that residual learning and batch normalization can benefit from each other, and their integration is effective in speeding up the training and boosting the denoising performance.

While this paper aims to design a more effective Gaussian denoiser, we observe that when v is the difference between the ground truth high-resolution image and the bicubic upsampling of the low-resolution image, the image degradation model for Gaussian denoising can be converted to a single image super-resolution (SISR) problem; analogously, the JPEG image deblocking problem can be modeled by the same image degradation model by taking v as the difference between the original image and the compressed image. In this sense, SISR and JPEG image deblocking can be treated as two special cases of a "general" image denoising problem, though in SISR and JPEG deblocking the noises v are much different from AWGN. It is natural to ask whether it is possible to train a CNN model to handle such a general image denoising problem. By analyzing the connection between DnCNN and TNRD [16], we propose to extend DnCNN for handling several general image denoising tasks, including Gaussian denoising, SISR and JPEG image deblocking. Extensive experiments show that our DnCNN trained with a certain noise level can yield better Gaussian denoising results than state-of-the-art methods such as BM3D [2], WNNM [13] and TNRD [16]. For Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising), DnCNN with a single model can still outperform BM3D [2] and TNRD [16] trained for a specific noise level. The DnCNN can also obtain promising results when extended to several general image denoising tasks. Moreover, we show the effectiveness of training only a single DnCNN model for three general image denoising tasks, i.e., blind Gaussian denoising, SISR with multiple upscaling factors, and JPEG deblocking with different quality factors.
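To make this unifying view concrete, the residual v for each task is simply y − x under the degradation model y = x + v. Below is a minimal sketch with NumPy and Pillow (the helper names are ours, not from the paper, which generates these inputs in MATLAB): AWGN for Gaussian denoising, the bicubic down/up-sampling difference for SISR, and the compression difference for JPEG deblocking.

```python
import io
import numpy as np
from PIL import Image

def residual_awgn(x, sigma=25):
    """Gaussian denoising: v is AWGN with standard deviation sigma."""
    v = np.random.normal(0.0, sigma, x.shape)
    return x + v, v                                  # observation y, residual v

def residual_sisr(x_img, scale=3):
    """SISR: v = bicubic-upsampled low-resolution image minus the ground truth."""
    w, h = x_img.size
    low = x_img.resize((w // scale, h // scale), Image.BICUBIC)
    y = np.asarray(low.resize((w, h), Image.BICUBIC), dtype=np.float64)
    return y, y - np.asarray(x_img, dtype=np.float64)

def residual_jpeg(x_img, quality=10):
    """JPEG deblocking: v = compressed image minus the original (x_img in L or RGB mode)."""
    buf = io.BytesIO()
    x_img.save(buf, format='JPEG', quality=quality)
    y = np.asarray(Image.open(buf), dtype=np.float64)
    return y, y - np.asarray(x_img, dtype=np.float64)
```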
The contributions of this work are summarized as follows:

  1. We propose an end-to-end trainable deep CNN for Gaussian denoising. In contrast to the existing deep neural network-based methods which directly estimate the latent clean image, the network adopts the residual learning strategy to remove the latent clean image from the noisy observation.
  2. We find that residual learning and batch normalization can greatly benefit the CNN learning as they can not only speed up the training but also boost the denoising performance. For Gaussian denoising with a certain noise level, DnCNN outperforms state-of-the-art methods in terms of both quantitative metrics and visual quality.
  3. Our DnCNN can be easily extended to handle general image denoising tasks. We can train a single DnCNN model for blind Gaussian denoising, and achieve better performance than the competing methods trained for a specific noise level. Moreover, it is promising to solve three general image denoising tasks, i.e., blind Gaussian denoising, SISR, and JPEG deblocking, with only a single DnCNN model.
The remainder of the paper is organized as follows. Section II provides a brief survey of related work. Section III first presents the proposed DnCNN model, and then extends it to general image denoising. In Section IV, extensive experiments are conducted to evaluate DnCNNs. Finally, several concluding remarks are given in Section V.

II. RELATED WORK


A. Deep Neural Networks for Image Denoising

There have been several attempts to handle the denoising problem by deep neural networks. In [23], Jain and Seung proposed to use convolutional neural networks (CNNs) for image denoising and claimed that CNNs have similar or even better representation power than the MRF model. In [24], the multi-layer perceptron (MLP) was successfully applied for image denoising. In [25], the stacked sparse denoising autoencoder method was adopted to handle Gaussian noise removal and achieved results comparable to K-SVD [5]. In [16], a trainable nonlinear reaction diffusion (TNRD) model was proposed, which can be expressed as a feed-forward deep network by unfolding a fixed number of gradient descent inference steps. Among the above deep neural network based methods, MLP and TNRD can achieve promising performance and are able to compete with BM3D. However, for MLP [24] and TNRD [16], a specific model is trained for a certain noise level. To the best of our knowledge, it remains uninvestigated to develop CNNs for general image denoising.

B. Residual Learning and Batch Normalization

Recently, driven by the easy access to large-scale datasets and the advances in deep learning methods, convolutional neural networks have shown great success in handling various vision tasks. The representative achievements in training CNN models include Rectified Linear Unit (ReLU) [20], tradeoff between depth and width [19], [26], parameter initialization [27], gradient-based optimization algorithms [28], [29], [30], batch normalization [21] and residual learning [22]. Other factors, such as the efficient training implementation on modern powerful GPUs, also contribute to the success of CNNs. For Gaussian denoising, it is easy to generate sufficient training data from a set of high quality images. This work focuses on the design and learning of CNNs for image denoising. In the following, we briefly review two methods related to our DnCNN, i.e., residual learning and batch normalization.

  1. Residual Learning: Residual learning [22] of CNN was originally proposed to solve the performance degradation problem, i.e., even the training accuracy begins to degrade along with the increase of network depth. By assuming that the residual mapping is much easier to learn than the original unreferenced mapping, a residual network explicitly learns a residual mapping for a few stacked layers. With such a residual learning strategy, extremely deep CNNs can be easily trained, and improved accuracy has been achieved for image classification and object detection [22]. The proposed DnCNN model also adopts the residual learning formulation. Unlike the residual network [22], which uses many residual units (i.e., identity shortcuts), our DnCNN employs a single residual unit to predict the residual image. We further explain the rationale of the residual learning formulation by analyzing its connection with TNRD [16], and extend it to solve several general image denoising tasks. It should be noted that, prior to the residual network [22], the strategy of predicting the residual image had already been adopted in some low-level vision problems such as single image super-resolution [31] and color image demosaicking [32]. However, to the best of our knowledge, there is no work which directly predicts the residual image for denoising.
  2. Batch Normalization: Mini-batch stochastic gradient descent (SGD) has been widely used in training CNN models. Despite the simplicity and effectiveness of mini-batch SGD, its training efficiency is largely reduced by internal covariate shift [21], i.e., changes in the distributions of internal non-linearity inputs during training. Batch normalization [21] is proposed to alleviate the internal covariate shift by incorporating a normalization step and a scale-and-shift step before the nonlinearity in each layer. For batch normalization, only two parameters per activation are added, and they can be updated with back-propagation (a minimal numerical sketch of these two steps follows this list). Batch normalization enjoys several merits, such as fast training, better performance, and low sensitivity to initialization. For further details on batch normalization, please refer to [21]. By far, no work has been done on studying batch normalization for CNN-based image denoising. We empirically find that the integration of residual learning and batch normalization can result in fast and stable training and better denoising performance.
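For intuition, batch normalization at each activation reduces to exactly the two steps named above: normalization over the mini-batch, then a learned scale-and-shift with only two parameters (γ, β) per channel. A minimal NumPy sketch (our own function and names, not tied to any framework):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization of activations x with shape (N, C, H, W):
    step 1 normalizes each channel over the mini-batch,
    step 2 applies the two learned parameters (gamma, beta) per channel."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # per-channel mini-batch mean
    var = x.var(axis=(0, 2, 3), keepdims=True)     # per-channel mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)        # normalization step
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)  # scale and shift

x = np.random.randn(128, 64, 40, 40)               # e.g., a mini-batch of 128 feature maps
y = batch_norm(x, gamma=np.ones(64), beta=np.zeros(64))
```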

III. THE PROPOSED DENOISING CNN MODEL


In this section, we present the proposed denoising CNN model, i.e., DnCNN, and extend it for handling several general image denoising tasks. Generally, training a deep CNN model for a specific task involves two steps: (i) network architecture design and (ii) model learning from training data. For network architecture design, we modify the VGG network [19] to make it suitable for image denoising, and set the depth of the network based on the effective patch sizes used in state-of-the-art denoising methods. For model learning, we adopt the residual learning formulation, and incorporate it with batch normalization for fast training and improved denoising performance. Finally, we discuss the connection between DnCNN and TNRD [16], and extend DnCNN for several general image denoising tasks.

A. Network Depth

Following the principle in [19], we set the size of convolutional filters to be 3×3, but remove all pooling layers. Therefore, the receptive field of a DnCNN with depth d should be (2d+1)×(2d+1). Increasing the receptive field size can make use of the context information in a larger image region. For a better tradeoff between performance and efficiency, one important issue in architecture design is to set a proper depth for DnCNN. It has been pointed out that the receptive field size of denoising neural networks correlates with the effective patch size of denoising methods [23], [24]. Moreover, a high noise level usually requires a larger effective patch size to capture more context information for restoration [34]. Thus, by fixing the noise level to σ = 25, we analyze the effective patch size of several leading denoising methods to guide the depth design of our DnCNN. In BM3D [2], the non-local similar patches are adaptively searched in a local window of size 25×25 two times, and thus the final effective patch size is 49×49. Similar to BM3D, WNNM [13] uses a larger searching window and performs non-local searching iteratively, resulting in a quite large effective patch size (361×361). MLP [24] first uses a patch of size 39×39 to generate the predicted patch, and then adopts a filter of size 9×9 to average the output patches, so its effective patch size is 47×47. CSF [14] and TNRD [16] with five stages involve a total of ten convolutional layers with filter size 7×7, and their effective patch size is 61×61.
Table I summarizes the effective patch sizes adopted in different methods with noise level σ = 25. It can be seen that the effective patch size used in EPLL [33] is the smallest, i.e., 36×36. It is interesting to verify whether a DnCNN with receptive field size similar to EPLL can compete against the leading denoising methods. Thus, for Gaussian denoising with a certain noise level, we set the receptive field size of DnCNN to 35×35, with the corresponding depth of 17. For other general image denoising tasks, we adopt a larger receptive field and set the depth to be 20.
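The depth choice can be checked with a two-line calculation: a stack of d 3×3 convolutions, stride 1 and no pooling, has a receptive field of (2d+1)×(2d+1). A small illustrative helper (ours):

```python
def receptive_field(depth, kernel=3):
    """Receptive field of `depth` stacked k x k convolutions
    (stride 1, no pooling): each layer adds (kernel - 1) pixels."""
    return depth * (kernel - 1) + 1

print(receptive_field(17))  # 35 -> DnCNN-S, close to EPLL's 36x36 effective patch
print(receptive_field(20))  # 41 -> the deeper model used for general denoising
```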
[Table I: The effective patch sizes of different methods with noise level σ = 25 — EPLL: 36×36; MLP: 47×47; BM3D: 49×49; CSF/TNRD: 61×61; WNNM: 361×361.]

B. Network Architecture

The input of our DnCNN is a noisy observation y = x + v. Discriminative denoising models such as MLP [24] and CSF [14] aim to learn a mapping function F(y) = x to predict the latent clean image. For DnCNN, we adopt the residual learning formulation to train a residual mapping R(y) ≈ v, and then we have x = y − R(y). Formally, the averaged mean squared error between the desired residual images and the ones estimated from the noisy input,

ℓ(Θ) = 1/(2N) Σ_{i=1}^{N} ‖R(y_i; Θ) − (y_i − x_i)‖²_F,    (1)

can be adopted as the loss function to learn the trainable parameters Θ in DnCNN. Here {(y_i, x_i)}_{i=1}^{N} represents N noisy-clean training image (patch) pairs. Fig. 1 illustrates the architecture of the proposed DnCNN for learning R(y). In the following, we explain the architecture of DnCNN and the strategy for reducing boundary artifacts.

[Fig. 1: The architecture of the proposed DnCNN network.]
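Read concretely, Eqn. (1) is just the mean squared error between the predicted residual R(y; Θ) and the desired residual v = y − x. A minimal PyTorch sketch of the loss (our own function and variable names; the paper's implementation is in MatConvNet):

```python
import torch.nn.functional as F

def dncnn_loss(residual_pred, y, x):
    """Eqn. (1): MSE between the estimated residual R(y; Theta) and the
    desired residual v = y - x. The per-element mean used here differs
    from the 1/(2N) sum of squared Frobenius norms only by a constant
    factor, which does not change the optimum."""
    return 0.5 * F.mse_loss(residual_pred, y - x)
```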

  1. Deep Architecture: Given a DnCNN with depth D, there are three types of layers, shown in Fig. 1 with three different colors. (i) Conv+ReLU: for the first layer, 64 filters of size 3×3×c are used to generate 64 feature maps, and rectified linear units (ReLU, max(0, ·)) are then utilized for nonlinearity. Here c represents the number of image channels, i.e., c = 1 for gray images and c = 3 for color images. (ii) Conv+BN+ReLU: for layers 2 ∼ (D−1), 64 filters of size 3×3×64 are used, and batch normalization [21] is added between convolution and ReLU. (iii) Conv: for the last layer, c filters of size 3×3×64 are used to reconstruct the output. To sum up, our DnCNN model has two main features: the residual learning formulation is adopted to learn R(y), and batch normalization is incorporated to speed up training as well as boost the denoising performance (a code sketch of this architecture follows this list). By incorporating convolution with ReLU, DnCNN can gradually separate image structure from the noisy observation through the hidden layers. Such a mechanism is similar to the iterative noise removal strategy adopted in methods such as EPLL and WNNM, but our DnCNN is trained in an end-to-end fashion. Later we will give more discussion on the rationale of combining residual learning and batch normalization.
  2. Reducing Boundary Artifacts: In many low-level vision applications, it is usually required that the output image size keep the same as the input one, which may lead to boundary artifacts. In MLP [24], the boundary of the noisy input image is symmetrically padded in the preprocessing stage, whereas the same padding strategy is carried out before every stage in CSF [14] and TNRD [16]. Different from the above methods, we directly pad zeros before convolution to make sure that each feature map of the middle layers has the same size as the input image. We find that this simple zero padding strategy does not result in any boundary artifacts. This good property is probably attributed to the powerful ability of the DnCNN.
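The three layer types and the zero-padding strategy map directly onto a few lines of code. Below is a minimal PyTorch sketch of the architecture (not the authors' MatConvNet implementation); the depth D, the 64-channel width, the 3×3 kernels and the padding of 1 follow the description above, while names such as `DnCNN` are ours.

```python
import torch
import torch.nn as nn

class DnCNN(nn.Module):
    """Sketch of the DnCNN architecture described above: the network
    predicts the residual R(y) ~= v, and the denoised image is
    recovered as x_hat = y - R(y)."""
    def __init__(self, depth=17, channels=1, features=64):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1),   # (i) Conv+ReLU
                  nn.ReLU(inplace=True)]
        for _ in range(depth - 2):                               # (ii) Conv+BN+ReLU, layers 2..D-1
            layers += [nn.Conv2d(features, features, 3, padding=1, bias=False),
                       nn.BatchNorm2d(features),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(features, channels, 3, padding=1))  # (iii) Conv reconstructs R(y)
        self.body = nn.Sequential(*layers)

    def forward(self, y):
        residual = self.body(y)   # R(y), an estimate of the noise v
        return y - residual       # x_hat = y - R(y)
```

Note that the padding of 1 at every 3×3 convolution implements the boundary strategy of item 2: every intermediate feature map keeps the input size, so no symmetric padding or cropping is needed.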

C. Integration of Residual Learning and Batch Normalization for Image Denoising

The network shown in Fig. 1 can be used to train either the original mapping F(y) to predict x or the residual mapping R(y) to predict v. According to [22], when the original mapping is more like an identity mapping, the residual mapping will be much easier to optimize. Note that the noisy observation y is much more like the latent clean image x than the residual image v (especially when the noise level is low). Thus, F(y) would be closer to an identity mapping than R(y), and the residual learning formulation is more suitable for image denoising.
Fig. 2 shows the average PSNR values obtained using these two learning formulations with/without batch normalization under the same setting of gradient-based optimization algorithm and network architecture. Note that two gradient-based optimization algorithms are adopted: one is the stochastic gradient descent algorithm with momentum (i.e., SGD) and the other is the Adam algorithm [30]. Firstly, we can observe that the residual learning formulation results in faster and more stable convergence than the original mapping learning. In the meanwhile, without batch normalization, simple residual learning with conventional SGD cannot compete with state-of-the-art denoising methods such as TNRD (28.92dB). We consider that this insufficient performance should be attributed to the internal covariate shift [21] caused by the changes in network parameters during training; accordingly, batch normalization is adopted to address it. Secondly, we observe that, with batch normalization, learning the residual mapping (the red line) converges faster and exhibits better denoising performance than learning the original mapping (the blue line). In particular, both the SGD and Adam optimization algorithms enable the network with residual learning and batch normalization to have the best results. In other words, it is the integration of the residual learning formulation and batch normalization, rather than the optimization algorithm (SGD or Adam), that leads to the best denoising performance.
Actually, one can notice that in Gaussian denoising the residual image and batch normalization are both associated with the Gaussian distribution. It is very likely that residual learning and batch normalization can benefit from each other for Gaussian denoising. This point can be further validated by the following analyses.
• On the one hand, residual learning benefits from batch normalization. This is straightforward, because batch normalization offers some merits for CNNs, such as alleviating the internal covariate shift problem. From Fig. 2, one can see that even though residual learning without batch normalization (the green line) has fast convergence, it is inferior to residual learning with batch normalization (the red line).
• On the other hand, batch normalization benefits from residual learning. As shown in Fig. 2, without residual learning, batch normalization even has a certain adverse effect on convergence (the blue line). With residual learning, batch normalization can be utilized to speed up the training as well as boost the performance (the red line). Note that each mini-batch is a small set (e.g., 128) of images. Without residual learning, the input intensities and the convolutional features are correlated with their neighbors, and the distribution of the layer inputs also relies on the content of the images in each training mini-batch. With residual learning, DnCNN implicitly removes the latent clean image with the operations in the hidden layers. This makes the inputs of each layer Gaussian-like distributed, less correlated, and less related to the image content. Thus, residual learning can also help batch normalization in reducing internal covariate shift.
To sum up, the integration of residual learning and batch normalization can not only speed up and stabilize the training process but also boost the denoising performance.
[Fig. 2: Average PSNR of the two learning formulations, with/without batch normalization, under SGD and Adam.]
Key takeaway: integrating residual learning and batch normalization improves both denoising quality and training speed.

D. Connection with TNRD
Our DnCNN can also be explained as the generalization of one-stage TNRD [15], [16]. Typically, TNRD aims to train a discriminative solution for the following problem,

min_x Ψ(y − x) + λ Σ_{k=1}^{K} Σ_{p=1}^{N} ρ_k((f_k ∗ x)_p),    (2)

from an abundant set of degraded-clean training image pairs. Here N denotes the image size, λ is the regularization parameter, f_k ∗ x stands for the convolution of the image x with the k-th filter kernel f_k, and ρ_k(·) represents the k-th penalty function, which is adjustable in the TNRD model. For Gaussian denoising, we set Ψ(z) = ½‖z‖². The diffusion iteration of the first stage can be interpreted as performing one gradient descent inference step at starting point y, which is given by

x_1 = y − αλ Σ_{k=1}^{K} (f̄_k ∗ φ_k(f_k ∗ y)) − α ∂Ψ(z)/∂z|_{z=0},    (3)

where f̄_k is the adjoint filter of f_k (i.e., f̄_k is obtained by rotating the filter f_k by 180 degrees), α corresponds to the stepsize, and φ_k(·) = ρ′_k(·) is the influence function. For Gaussian denoising with Ψ(z) = ½‖z‖², the last term in Eqn. (3) vanishes, and we have

v_1 = y − x_1 = αλ Σ_{k=1}^{K} (f̄_k ∗ φ_k(f_k ∗ y)),    (4)

where v_1 is the estimated residual of x with respect to y. Since the influence function φ_k(·) can be regarded as a point-wise nonlinearity applied to convolutional feature maps, Eqn. (4) is actually a two-layer feed-forward CNN. As can be seen from Fig. 1, the proposed CNN architecture further generalizes one-stage TNRD in three respects: (i) replacing the influence function with ReLU to ease CNN training; (ii) increasing the CNN depth to improve the capacity in modeling image characteristics; and (iii) incorporating batch normalization to boost the performance. The connection with one-stage TNRD provides insight into the use of residual learning for CNN-based image restoration. Most of the parameters in Eqn. (4) are derived from the analysis prior term of Eqn. (2). In this sense, most of the parameters in DnCNN represent the image priors.
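Under this reading, Eqn. (4) is literally two convolutions with a pointwise nonlinearity in between. A PyTorch sketch of that first-stage residual (the influence functions φ_k are learned in TNRD; tanh below is only an illustrative stand-in, and α, λ, K and the filters here are arbitrary values of ours):

```python
import torch
import torch.nn.functional as F

K, alpha, lam = 24, 0.1, 0.05                            # illustrative values only
f = torch.randn(K, 1, 7, 7)                              # analysis filters f_k (7x7 as in TNRD)
f_adj = torch.flip(f, dims=[2, 3]).permute(1, 0, 2, 3)   # adjoint filters: f_k rotated 180 degrees

def first_stage_residual(y):
    """Eqn. (4): v1 = alpha * lam * sum_k f_bar_k * phi_k(f_k * y) --
    two convolution layers with a pointwise nonlinearity in between."""
    feat = torch.tanh(F.conv2d(y, f, padding=3))         # f_k * y, then pointwise phi_k (stand-in)
    return alpha * lam * F.conv2d(feat, f_adj, padding=3)  # convolve with f_bar_k and sum over k

y = torch.randn(1, 1, 64, 64)                            # a noisy observation
x1 = y - first_stage_residual(y)                         # Eqn. (3) with dPsi/dz at z=0 equal to 0
```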
It is interesting to point out that, even when the noise is not Gaussian distributed (or the noise level of the Gaussian is unknown), we can still utilize Eqn. (3) to obtain v_1 if we have

∂Ψ(z)/∂z|_{z=0} = 0.    (5)

Note that Eqn. (5) holds for many types of noise distributions, e.g., the generalized Gaussian distribution. It is natural to assume that it also holds for the noise caused by SISR and JPEG compression. It is thus possible to train a single CNN model for several general image denoising tasks, such as Gaussian denoising with unknown noise level, SISR with multiple upscaling factors, and JPEG deblocking with different quality factors.
Besides, Eqn. (4) can also be interpreted as the operations that remove the latent clean image x from the degraded observation y to estimate the residual image v. For these tasks, even though the noise distribution is complex, it can be expected that our DnCNN would also perform robustly in predicting the residual image by gradually removing the latent clean image in the hidden layers.
E. Extension to General Image Denoising

The existing discriminative Gaussian denoising methods, such as MLP, CSF and TNRD, all train a specific model for a fixed noise level [16], [24]. When applied to Gaussian denoising with unknown noise, one common way is to first estimate the noise level, and then use the model trained at the corresponding noise level. This makes the denoising results affected by the accuracy of noise estimation. In addition, those methods cannot be applied to cases with non-Gaussian noise distribution, e.g., SISR and JPEG deblocking.
Our analysis in Section III-D has shown the potential of DnCNN for general image denoising. To demonstrate it, we first extend our DnCNN to Gaussian denoising with unknown noise level. In the training stage, we use the noisy images from a wide range of noise levels (e.g., σ ∈ [0, 55]) to train a single DnCNN model. Given a test image whose noise level belongs to this range, the learned single DnCNN model can be utilized to denoise it without estimating its noise level. We further extend our DnCNN by learning a single model for several general image denoising tasks. We consider three specific tasks, i.e., blind Gaussian denoising, SISR, and JPEG deblocking. In the training stage, we utilize images with AWGN from a wide range of noise levels, down-sampled images with multiple upscaling factors, and JPEG images with different quality factors to train a single DnCNN model. Experimental results show that the learned single DnCNN model is able to yield excellent results for any of the three general image denoising tasks.
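A sketch of how one training batch for the blind model can be formed: each clean patch receives AWGN with its own noise level drawn uniformly from the training range, and the target is the residual itself. The function name and batch layout are ours, not from the paper:

```python
import torch

def blind_gaussian_batch(clean_patches, sigma_range=(0.0, 55.0)):
    """clean_patches: tensor of shape (N, C, H, W) with intensities in [0, 255].
    Each patch gets AWGN with its own sigma drawn from sigma_range, so a
    single model sees the whole noise-level range during training."""
    n = clean_patches.size(0)
    sigma = torch.empty(n, 1, 1, 1).uniform_(*sigma_range)   # one sigma per patch
    v = torch.randn_like(clean_patches) * sigma              # residual (training target)
    return clean_patches + v, v                              # noisy input y, target v
```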

IV. EXPERIMENTAL RESULTS


A. Experimental setting

  1. Training and Testing Data: For Gaussian denoising with either known or unknown noise level, we follow [16] to use 400 images of size 180×180 for training. We found that using a larger training dataset can only bring negligible improvement. To train DnCNN for Gaussian denoising with known noise level, we consider three noise levels, i.e., σ = 15, 25 and 50. We set the patch size to 40×40, and crop 128×1,600 patches to train the model. We refer to our DnCNN model for Gaussian denoising with known specific noise level as DnCNN-S.
    To train a single DnCNN model for blind Gaussian denoising, we set the range of the noise levels to σ ∈ [0, 55], and the patch size to 50×50. 128×3,000 patches are cropped to train the model. We refer to our single DnCNN model for the blind Gaussian denoising task as DnCNN-B.
    For the test images, we use two different test datasets for thorough evaluation: one is a test dataset containing 68 natural images from the Berkeley segmentation dataset (BSD68) [12], and the other contains 12 images as shown in Fig. 3.
    [Fig. 3: The twelve widely used test images.]
    Note that all those images are widely used for the evaluation of Gaussian denoising methods and they are not included in the training dataset. In addition to gray image denoising, we also train a blind color image denoising model, referred to as CDnCNN-B. We use the color version of the BSD68 dataset for testing, and the remaining 432 color images from the Berkeley segmentation dataset are adopted as the training images. The noise levels are also set to the range [0, 55], and 128×3,000 patches of size 50×50 are cropped to train the model.
    To learn a single model for the three general image denoising tasks, as in [35], we use a dataset which consists of 91 images from [36] and 200 training images from the Berkeley segmentation dataset. The noisy image is generated by adding Gaussian noise with a certain noise level from the range [0, 55]. The SISR input is generated by first bicubic downsampling and then bicubic upsampling the high-resolution image with downscaling factors 2, 3 and 4. The JPEG deblocking input is generated by compressing the image with a quality factor ranging from 5 to 99 using the MATLAB JPEG encoder. All these images are treated as inputs to a single DnCNN model. In total, we generate 128×8,000 image patch pairs (of size 50×50) for training. Rotation/flip based operations on the patch pairs are used during mini-batch learning. The parameters are initialized with DnCNN-B. We refer to our single DnCNN model for these three general image denoising tasks as DnCNN-3. To test DnCNN-3, we adopt a different test set for each task; the detailed description will be given in Section IV-E.
  2. Parameter Setting and Network Training: In order to capture enough spatial information for denoising, we set the network depth to 17 for DnCNN-S and to 20 for DnCNN-B and DnCNN-3. The loss function in Eqn. (1) is adopted to learn the residual mapping R(y) for predicting the residual v. We initialize the weights by the method in [27] and use SGD with a weight decay of 0.0001, a momentum of 0.9 and a mini-batch size of 128. We train 50 epochs for our DnCNN models; the learning rate is decayed exponentially from 1e−1 to 1e−4 over the 50 epochs (see the sketch after this list). We use the MatConvNet package [37] to train the proposed DnCNN models. Unless otherwise specified, all the experiments are carried out in the Matlab (R2015b) environment running on a PC with an Intel Core i7-5820K CPU at 3.30GHz and an Nvidia Titan X GPU. It takes about six hours, one day and three days to train DnCNN-S, DnCNN-B/CDnCNN-B and DnCNN-3 on GPU, respectively.
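These optimization settings translate directly into a standard setup. A hedged PyTorch re-expression (the paper trained with MatConvNet); the per-epoch decay factor is derived so that the rate falls from 1e−1 to 1e−4 over 50 epochs:

```python
import torch

model = torch.nn.Conv2d(1, 1, 3, padding=1)   # stand-in; use the DnCNN sketch from Sec. III-B
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1,
                            momentum=0.9, weight_decay=1e-4)
# per-epoch factor for an exponential decay from 1e-1 to 1e-4 over 50 epochs
gamma = (1e-4 / 1e-1) ** (1.0 / 50)           # about 0.871
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

for epoch in range(50):
    # ... train one epoch on mini-batches of 128 patch pairs with the Eqn. (1) loss ...
    scheduler.step()
```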

B. Compared Methods
B.比较方法

We compare the proposed DnCNN method with several state-of-the-art
我们将提出的DnCNN方法和一些最先进的去噪方法进行比较,包括基于非局部相似的方法(即,BM3D和WNNM),一种生成方法(即,EPLL),三种基于辨别训练的方法(即,MLP,CSF和TNRD)。
denoising methods, including two non-local similarity based methods (i.e., BM3D [2] and WNNM [13]),one generative method (i.e., EPLL [33]), three discriminative training based methods (i.e., MLP [24], CSF [14] andTNRD [16]). Note that CSF and TNRD are highly efficient by GPU implementation while offering good
注意到CSF和TNRD通过GPU实现非常高效,同时提供良好的图像质量。实施代码从作者的网站中下载,并且默认设置被应用到我们的实验中。我们的DnCNN模型的测试代码可以在https://github.com/cszn/DnCNN.中下载。
image quality.The implementation codes are downloaded from the authors’ websites and the default parameter settings are used in our experiments. The testing code of our DnCNN models can be downloaded at https://github.com/cszn/DnCNN.
C. Quantitative and Qualitative Evaluation

The average PSNR results of different methods on the BSD68 dataset are shown in Table II. As one can see, both DnCNN-S and DnCNN-B achieve better PSNR results than the competing methods. Compared to the benchmark BM3D, the methods MLP and TNRD have a notable PSNR gain of about 0.35dB. According to [34], [38], few methods can outperform BM3D by more than 0.3dB on average. In contrast, our DnCNN-S model outperforms BM3D by 0.6dB at all three noise levels. Particularly, even with a single model without known noise level, our DnCNN-B can still outperform the competing methods trained for a known specific noise level. It should be noted that both DnCNN-S and DnCNN-B outperform BM3D by about 0.6dB when σ = 50, which is very close to the estimated PSNR bound over BM3D (0.7dB) in [38].
Table III lists the PSNR results of different methods on the 12 test images in Fig. 3. The best PSNR result for each image at each noise level is highlighted in bold. It can be seen that the proposed DnCNN-S yields the highest PSNR on most of the images. Specifically, DnCNN-S outperforms the competing methods by 0.2dB to 0.6dB on most of the images, and fails to achieve the best results on only two images, "House" and "Barbara", which are dominated by repetitive structures. This result is consistent with the findings in [39]: non-local means based methods are usually better on images with regular and repetitive structures, whereas discriminative training based methods generally produce better results on images with irregular textures. Actually, this is intuitively reasonable, because images with regular and repetitive structures match well with the non-local similarity prior; conversely, images with irregular textures weaken the advantages of such a specific prior, thus leading to poorer results.
Figs. 4-5 illustrate the visual results of different methods. It can be seen that BM3D, WNNM, EPLL and MLP tend to produce over-smoothed edges and textures. While preserving sharp edges and fine details, TNRD is likely to generate artifacts in smooth regions. In contrast, DnCNN-S and DnCNN-B not only recover sharp edges and fine details but also yield visually pleasant results in smooth regions.
For color image denoising, the visual comparisons between CDnCNN-B and the benchmark CBM3D are shown in Figs. 6-7. One can see that CBM3D generates false color artifacts in some regions, whereas CDnCNN-B recovers images with more natural color. In addition, CDnCNN-B can generate images with more details and sharper edges than CBM3D.
Fig. 8 shows the average PSNR improvement over BM3D/CBM3D achieved by the DnCNN-B/CDnCNN-B model at different noise levels. It can be seen that our DnCNN-B/CDnCNN-B models consistently outperform BM3D/CBM3D by a large margin over a wide range of noise levels. This experimental result demonstrates the feasibility of training a single DnCNN-B model for handling blind Gaussian denoising within a wide range of noise levels.
D. Run Time

In addition to visual quality, another important aspect of an image restoration method is its testing speed. Table IV shows the run times of different methods for denoising images of size 256×256, 512×512 and 1024×1024 at noise level 25. Since CSF, TNRD and our DnCNN methods are well suited for parallel computation on GPU, we also give the corresponding run times on GPU. We use the Nvidia cuDNN-v5 deep learning library to accelerate the GPU computation of the proposed DnCNN. As in [16], we do not count the memory transfer time between CPU and GPU (see the timing sketch below). It can be seen that the proposed DnCNN has a relatively high speed on CPU, and it is faster than two discriminative models, MLP and CSF. Though it is slower than BM3D and TNRD, by taking the image quality improvement into consideration our DnCNN is still very competitive in CPU implementation. For the GPU time, the proposed DnCNN achieves very appealing computational efficiency, e.g., it can denoise an image of size 512×512 in 60ms with unknown noise level, which is a distinct advantage over TNRD.
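As a side note, GPU run times of this kind are typically measured with explicit synchronization and, as in [16], excluding host-device memory transfer. A minimal timing sketch of ours (requires a CUDA device; not the paper's benchmarking code):

```python
import time
import torch

def gpu_denoise_time(model, size=512, runs=10):
    """Average GPU forward time, excluding host<->device transfer."""
    y = torch.randn(1, 1, size, size, device='cuda')  # input already resident on the GPU
    torch.cuda.synchronize()                          # finish any pending work before timing
    start = time.time()
    with torch.no_grad():
        for _ in range(runs):
            model(y)
    torch.cuda.synchronize()                          # wait for all kernels to complete
    return (time.time() - start) / runs
```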
E. Experiments on Learning a Single Model for Three General Image Denoising Tasks

In order to further show the capacity of the proposed DnCNN model, a single DnCNN-3 model is trained for three general image denoising tasks, including blind Gaussian denoising, SISR and JPEG image deblocking. To the best of our knowledge, none of the existing methods has been reported to handle these three tasks with only a single model. Therefore, for each task, we compare DnCNN-3 with the specific state-of-the-art methods. In the following, we describe the compared methods and the test dataset for each task:
• For Gaussian denoising, we use the state-of-the-art BM3D and TNRD for comparison. The BSD68 dataset is used for testing the performance. For BM3D and TNRD, we assume that the noise level is known.
• For SISR, we consider two state-of-the-art methods, i.e., TNRD and VDSR [35]. TNRD trained a specific model for each upscaling factor, while VDSR [35] trained a single model for all three upscaling factors (i.e., 2, 3 and 4). We adopt the four testing datasets (i.e., Set5, Set14, BSD100 and Urban100 [40]) used in [35].
• For JPEG image deblocking, our DnCNN-3 is compared with two state-of-the-art methods, i.e., AR-CNN [41] and TNRD [16]. The AR-CNN method trained four specific models for the JPEG quality factors 10, 20, 30 and 40, respectively. For TNRD, three models for JPEG quality factors 10, 20 and 30 were trained. As in [41], we adopt Classic5 and LIVE1 as test datasets.
Table V lists the average PSNR and SSIM results of different methods on the different general image denoising tasks. As one can see, even though we train a single DnCNN-3 model for the three different tasks, it still outperforms the nonblind TNRD and BM3D for Gaussian denoising. For SISR, it surpasses TNRD by a large margin and is on par with VDSR. For JPEG image deblocking, DnCNN-3 outperforms AR-CNN by about 0.3dB in PSNR and has about a 0.1dB PSNR gain over TNRD on all the quality factors.
Fig. 9 and Fig. 10 show the visual comparisons of different methods for SISR. It can be seen that both DnCNN-3 and VDSR can produce sharp edges and fine details, whereas TNRD tends to generate blurred edges and distorted lines. Fig. 11 shows the JPEG deblocking results of different methods. As one can see, our DnCNN-3 can recover the straight line, whereas AR-CNN and TNRD are prone to generate distorted lines. Fig. 12 gives an additional example to show the capacity of the proposed model. We can see that DnCNN-3 produces visually pleasant results even when the input image is corrupted by several distortions with different levels in different regions.

V. CONCLUSION


In this paper, a deep convolutional neural network was proposed for image denoising, where residual learning is adopted to separate noise from noisy observations. Batch normalization and residual learning are integrated to speed up the training process as well as boost the denoising performance. Unlike traditional discriminative models which train specific models for certain noise levels, our single DnCNN model has the capacity to handle blind Gaussian denoising with unknown noise level. Moreover, we showed the feasibility of training a single DnCNN model to handle three general image denoising tasks, including Gaussian denoising with unknown noise level, single image super-resolution with multiple upscaling factors, and JPEG image deblocking with different quality factors. Extensive experimental results demonstrated that the proposed method not only produces favorable image denoising performance quantitatively and qualitatively, but also has promising run time under GPU implementation.
