论文阅读笔记(五十六):Image Super-Resolution Using Deep Convolutional Networks

Abstract—We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve tradeoffs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.

Index Terms—Super-resolution, deep convolutional neural networks, sparse coding

1 INTRODUCTION
Single image super-resolution (SR) [20], which aims at recovering a high-resolution image from a single low-resolution image, is a classical problem in computer vision. This problem is inherently ill-posed since a multiplicity of solutions exist for any given low-resolution pixel. In other words, it is an underdetermined inverse problem, of which solution is not unique. Such a problem is typically mitigated by constraining the solution space by strong prior information. To learn the prior, recent state-of-the-art methods mostly adopt the example-based [46] strategy. These methods either exploit internal similarities of the same image [5], [13], [16], [19], [47], or learn mapping functions from external low and high-resolution exemplar pairs [2], [4], [6], [15], [23], [25], [37], [41], [42], [47], [48], [50], [51]. The external example-based methods can be formulated for generic image super-resolution, or can be designed to suit domain specific tasks, i.e., face hallucination [30], [50], according to the training samples provided.

The sparse-coding-based method [49], [50] is one of the representative external example-based SR methods. This method involves several steps in its solution pipeline. First, overlapping patches are densely cropped from the input image and pre-processed (e.g.,subtracting mean and normalization). These patches are then encoded by a low-resolution dictionary. The sparse coefficients are passed into a high-resolution dictionary for reconstructing high-resolution patches. The overlapping reconstructed patches are aggregated (e.g., by weighted averaging) to produce the final output. This pipeline is shared by most external example-based methods, which pay particular attention to learning and optimizing the dictionaries [2], [49], [50] or building efficient mapping functions [25], [41], [42], [47]. However, the rest of the steps in the pipeline have been rarely optimized or considered in an unified optimization framework.

In this paper, we show that the aforementioned pipeline is equivalent to a deep convolutional neural network [27] (more details in Section 3.2). Motivated by this fact, we consider a convolutional neural network that directly learns an end-to-end mapping between low and high-resolution images. Our method differs fundamentally from existing external example-based approaches, in that ours does not explicitly learn the dictionaries [41], [49], [50] or manifolds [2], [4] for modeling the patch space. These are implicitly achieved via hidden layers. Furthermore, the patch extraction and aggregation are also formulated as convolutional layers, so are involved in the optimization. In our method, the entire SR pipeline is fully obtained through learning, with little pre/post-processing.

We name the proposed model Super-Resolution Convolutional Neural Network (SRCNN)1. The proposed SRCNN has several appealing properties. First, its structure is intentionally designed with simplicity in mind, and yet provides superior accuracy2 compared with state-of-the-art example-based methods. Figure 1 shows a comparison on an example. Second, with moderate numbers of filters and layers, our method achieves fast speed for practical on-line usage even on a CPU. Our method is faster than a number of example-based methods, because it is fully feed-forward and does not need to solve any optimization problem on usage. Third, experiments show that the restoration quality of the network can be further improved when (i) larger and more diverse datasets are available, and/or (ii) a larger and deeper model is used. On the contrary, larger datasets/models can present challenges for existing example-based methods. Furthermore, the proposed network can cope with three channels of color images simultaneously to achieve improved super-resolution performance.

Overall, the contributions of this study are mainly in three aspects:

1) We present a fully convolutional neural network for image super-resolution. The network directly learns an end-to-end mapping between low and high-resolution images, with little pre/post-processing beyond the optimization.

2) We establish a relationship between our deep learning-based SR method and the traditional sparse-coding-based SR methods. This relationship provides a guidance for the design of the network structure.

3) We demonstrate that deep learning is useful in the classical computer vision problem of super-resolution, and can achieve good quality and speed.

A preliminary version of this work was presented SC / 25.58 dB SRCNN / 27.95 dB earSClie/r25.[5181dB]. The pSrReCsNeNn/t2w7.9o5 drBk adds to the initial version in significant ways. Firstly, we improve the SRCNN by introducing larger filter size in the non-linear mapping layer, and explore deeper structures by adding non-linear mapping layers. Secondly, we extend the SRCNN to process three color channels (either in YCbCr or RGB color space) simultaneously. Experimentally, we demonstrate that performance can be improved in comparison to the single-channel network. Thirdly, considerable new analyses and intuitive explanations are added to the initial results. We also extend the original experiments from Set5 [2] and Set14 [51] test images to BSD200 [32] (200 test images). In addition, we compare with a number of recently published methods and confirm that our model still outperforms existing approaches using different evaluation metrics.

5 CONCLUSION
We have presented a novel deep learning approach for single image super-resolution (SR). We show that conventional sparse-coding-based SR methods can be reformulated into a deep convolutional neural network. The proposed approach, SRCNN, learns an end-to-end mapping between low and high-resolution images, with little extra pre/post-processing beyond the optimization. With a lightweight structure, the SRCNN has achieved superior performance than the state-of-the-art methods. We conjecture that additional performance can be further gained by exploring more filters and different training strategies. Besides, the proposed structure, with its advantages of simplicity and robustness, could be applied to other low-level vision problems, such as image deblurring or simultaneous SR+denoising. One could also investigate a network to cope with different upscaling factors.

这里写图片描述

Fig. 2. Given a low-resolution image Y, the first convolutional layer of the SRCNN extracts a set of feature maps. The second layer maps these feature maps nonlinearly to high-resolution patch representations. The last layer combines the predictions within a spatial neighbourhood to produce the final high-resolution image F (Y).

这里写图片描述

Fig. 3. An illustration of sparse-coding-based methods in the view of a convolutional neural network.

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
### 回答1: ECA-Net是一种用于深度卷积神经网络的高效通道注意力机制,可以提高模型的性能和效率。它通过对每个通道的特征图进行加权,使得网络可以更好地学习到重要的特征。ECA-Net的设计简单,易于实现,并且可以与各种深度卷积神经网络结构相结合使用。 ### 回答2: ECA-Net是一种用于深度卷积神经网络的高效通道注意力机制。 ECA-Net通过提出一种名为"Efficient Channel Attention"(ECA)的注意力机制,来增强深度卷积神经网络的性能。通道注意力是一种用于自适应调整不同通道的特征响应权重的机制,有助于网络更好地理解和利用输入数据的特征表示。 相比于以往的注意力机制,ECA-Net采用了一种高效且可扩展的方式来计算通道注意力。它不需要生成任何中间的注意力映射,而是通过利用自适应全局平均池化运算直接计算出通道注意力权重。这种方法极大地降低了计算和存储开销,使得ECA-Net在实际应用中更具实用性。 在进行通道注意力计算时,ECA-Net引入了两个重要的参数:G和K。其中,G表示每个通道注意力的计算要考虑的特征图的大小;K是用于精细控制计算量和模型性能之间平衡的超参数。 ECA-Net在各种视觉任务中的实验结果表明,在相同的模型结构和计算资源下,它能够显著提升网络的性能。ECA-Net对不同层级的特征表示都有显著的改进,能够更好地捕捉不同特征之间的关联和重要性。 总之,ECA-Net提供了一种高效并且可扩展的通道注意力机制,可以有效提升深度卷积神经网络的性能。它在计算和存储开销上的优势使得它成为一个非常有价值的工具,可在各种计算资源受限的应用中广泛应用。 ### 回答3: "eca-net: efficient channel attention for deep convolutional neural networks" 是一种用于深度卷积神经网络的高效通道注意力模块。这一模块旨在提高网络对不同通道(特征)之间的关联性的理解能力,以提升网络性能。 该方法通过引入了一个新的注意力机制来实现高效的通道注意力。传统的通道注意力机制通常是基于全局池化操作来计算通道之间的关联性,这种方法需要较高的计算成本。而ECA-Net则通过引入一个参数化的卷积核来计算通道之间的关联性,可以显著减少计算量。 具体来说,ECA-Net使用了一维自适应卷积(adaptive convolution)来计算通道注意力。自适应卷积核根据通道特征的统计信息来调整自身的权重,从而自适应地计算每个通道的注意力权重。这样就可以根据每个通道的信息贡献度来调整其权重,提高网络的泛化能力和性能。 ECA-Net在各种图像分类任务中进行了实验证明了其有效性。实验结果显示,ECA-Net在相同计算预算下,相比其他通道注意力方法,可以获得更高的分类精度。同时,ECA-Net还具有较少的额外计算成本和模型大小,使得其在实际应用中更加高效。 总结而言,"eca-net: efficient channel attention for deep convolutional neural networks" 提出了一种高效通道注意力方法,通过引入自适应卷积核来计算通道注意力,从而提高了深度卷积神经网络的性能。这一方法在实验中取得了良好的效果,并且具有较少的计算成本和模型大小。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值