[Discriminative Learning] Discriminative Transfer Learning for General Image Restoration: Translation and Analysis
This post is based on:
Xiao, Lei & Heide, Felix & Heidrich, Wolfgang & Schölkopf, Bernhard & Hirsch, Michael. (2017). Discriminative Transfer Learning for General Image Restoration. IEEE Transactions on Image Processing. PP. 10.1109/TIP.2018.2831925.
This post is long; you can skip directly to Related Work.
Abstract
Recently, several discriminative learning approaches have been proposed for effective image restoration, achieving convincing trade-off between image quality and computational efficiency. However, these methods require separate training for each restoration task (e.g., denoising, deblurring, demosaicing) and problem condition (e.g., noise level of input images). This makes it time-consuming and difficult to encompass all tasks and conditions during training. In this paper, we propose a discriminative transfer learning method that incorporates formal proximal optimization and discriminative learning for general image restoration. The method requires a single-pass training and allows for reuse across various problems and conditions while achieving an efficiency comparable to previous discriminative approaches. Furthermore, after being trained, our model can be easily transferred to new likelihood terms to solve untrained tasks, or be combined with existing priors to further improve image restoration quality.
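Paraphrase: In recent years, several discriminative learning methods have been proposed for image restoration, striking a convincing balance between image quality and computational cost. These methods, however, must be trained separately for every restoration task (e.g., denoising, deblurring, demosaicing) and every problem condition (e.g., the noise level of the input images), so covering all tasks and conditions during training is both time-consuming and difficult.
The paper proposes a discriminative transfer learning method for general image restoration that combines formal proximal optimization with discriminative learning.
The method needs only a single training pass and can be reused across problems and conditions, with efficiency comparable to earlier discriminative methods. Once trained, the model can also be transferred to new likelihood terms to handle untrained tasks, or combined with existing priors to further improve restoration quality.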
Introduction
Low-level vision problems, such as denoising, deconvolution and demosaicing, have to be addressed as part of most imaging and vision systems. Although a large body of work covers these classical problems, low-level vision is still a very active area. The reason is that, from a Bayesian perspective, solving them as statistical estimation problems does not only rely on models for the likelihood (i.e. the reconstruction task), but also on natural image priors as a key component.
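Paraphrase: Low-level vision problems such as denoising, deconvolution and demosaicing have to be solved inside most imaging and vision systems. Despite the large body of existing work, low-level vision remains very active, because from a Bayesian point of view, treating these problems as statistical estimation depends not only on a likelihood model (i.e., the reconstruction task) but also, crucially, on a natural image prior.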
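The Bayesian view mentioned above can be written out as a maximum a posteriori (MAP) estimate. The following is the standard textbook form, not an equation quoted from the paper; the symbols are assumptions made here for illustration, with x the unknown clean image and b the degraded observation.

```latex
\hat{x}
  = \arg\max_{x}\; p(x \mid b)
  = \arg\max_{x}\; p(b \mid x)\, p(x)
  = \arg\min_{x}\; \bigl[ -\log p(b \mid x) \;-\; \log p(x) \bigr]
```

The first term, -log p(b | x), is the likelihood (data) term determined by the restoration task, and the second, -log p(x), is the natural image prior.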
A variety of models for natural image statistics have been explored in the past. Traditionally, models for gradient statistics [27, 17], including total-variation, have been a popular choice. Another line of works explores patch-based image statistics, either as per-patch sparse model [11, 35] or modeling non-local similarity between patches [9, 10, 13]. These prior models are general in the sense that they can be applied for various likelihoods, with the image formation and noise setting as parameters. However, the resulting optimization problems are prohibitively expensive, rendering them impractical for many real-time tasks especially on mobile platforms.
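Paraphrase: Many models of natural image statistics have been explored. Gradient-statistics models [27, 17], including total variation, have traditionally been popular. Another line of work studies patch-based statistics, either as per-patch sparse models [11, 35] or by modeling non-local similarity between patches [9, 10, 13]. These prior models are general: they apply to various likelihoods, with the image formation model and noise setting as parameters. The resulting optimization problems, however, are prohibitively expensive, which makes them impractical for many real-time tasks, especially on mobile platforms.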
Recently, a number of works [29, 8] have addressed this issue by truncating the iterative optimization and learning discriminative image priors, tailored to a specific reconstruction task (likelihood) and optimization approach.
While these methods allow to trade-off quality with the computational budget for a given application, the learned models are highly specialized to the image formation model and noise parameters, in contrast to optimization-based approaches. Since each individual problem instantiation requires costly learning and storing of the model coefficients, current proposals for learned models are impractical for vision applications with dynamically changing (often continuous) parameters. This is a common scenario in most real-world vision settings, as well as applications in engineering and scientific imaging that rely on the ability to rapidly prototype methods.
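Paraphrase: Recently, a number of works [29, 8] have tackled this cost by truncating the iterative optimization and learning discriminative image priors tailored to one specific reconstruction task (likelihood) and optimization scheme.
These methods make it possible to trade quality against the computational budget of a given application, but unlike optimization-based approaches, the learned models are highly specialized to the image formation model and the noise parameters. Because every problem instance requires costly training and storage of its own model coefficients, current learned models are impractical for vision applications whose parameters change dynamically (often continuously). That situation is common in most real-world vision settings, and in engineering and scientific imaging applications that depend on rapid prototyping.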
In this paper, we combine discriminative learning techniques with formal proximal optimization methods to learn generic models that can be truly transferred across problem domains while achieving comparable efficiency as previous discriminative approaches. Using proximal optimization methods [12, 23, 3] allows us to decouple the likelihood and prior which is key to learn such shared models. It also means that we can rely on well-researched physically motivated models for the likelihood, while learning priors from example data. We verify our technique using the same model for a variety of diverse low-level image reconstruction tasks and problem conditions, demonstrating the effectiveness and versatility of our approach. After training, our approach benefits from the proximal splitting techniques, and can be naturally transferred to new likelihood terms for untrained restoration tasks, or it can be combined with existing state-of-the-art priors to further improve the reconstruction quality. This is impossible with previous discriminative methods. In particular, we make the following contributions:
• We propose a discriminative transfer learning technique for general image restoration. It requires a single-pass training and transfers across different restoration tasks and problem conditions.
• We show that our approach is general by demonstrating its robustness for diverse low-level problems, such as denoising, deconvolution, inpainting, and for varying noise settings.
• We show that, while being general, our method achieves comparable computational efficiency as previous discriminative approaches, making it suitable for processing high-resolution images on mobile imaging systems.
• We show that our method can naturally be combined with existing likelihood terms and priors after being trained. This allows our method to process untrained restoration tasks and take advantage of previous successful work on image priors (e.g., color and non-local similarity priors).
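Paraphrase: The paper combines discriminative learning with formal proximal optimization to learn generic models that genuinely transfer across problem domains while matching the efficiency of earlier discriminative methods. Proximal optimization [12, 23, 3] decouples the likelihood from the prior, which is the key to learning such shared models; it also lets the likelihood rely on well-studied, physically motivated models while the prior is learned from example data. The same model is validated on a variety of low-level reconstruction tasks and problem conditions, demonstrating the method's effectiveness and versatility. After training, thanks to proximal splitting, the model can be transferred to new likelihood terms for untrained restoration tasks or combined with existing state-of-the-art priors to further improve reconstruction quality, which earlier discriminative methods cannot do. The contributions are:
• A discriminative transfer learning method for general image restoration that needs a single training pass and transfers across restoration tasks and problem conditions.
• Evidence that the approach is general, shown by its robustness on diverse low-level problems such as denoising, deconvolution and inpainting, and under varying noise settings.
• Evidence that, despite being general, the method reaches computational efficiency comparable to earlier discriminative methods, which makes it suitable for high-resolution images on mobile imaging systems.
• Evidence that the trained method combines naturally with existing likelihood terms and priors, so it can handle untrained restoration tasks and build on earlier successful image priors (e.g., color and non-local similarity priors).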
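To make the "decouple the likelihood and prior" idea concrete, here is a minimal sketch of a half-quadratic-splitting style alternation on a 1D inpainting toy problem. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the 0/1 mask forward model, and the neighbor-averaging `smooth_prior_step` (a hand-written stand-in for a learned prior) are all hypothetical choices made for this post.

```python
import numpy as np


def data_step(z, b, mask, lam, rho):
    """Likelihood step for a diagonal 0/1 mask operator.

    Closed-form minimizer over x of
        lam/2 * ||mask * x - b||^2 + rho/2 * ||x - z||^2.
    Only this function knows about the restoration task.
    """
    return (lam * mask * b + rho * z) / (lam * mask**2 + rho)


def smooth_prior_step(x, strength=0.5):
    """Hypothetical stand-in for a learned prior step.

    Pulls every sample toward the mean of its two neighbors
    (edge samples reuse their own value as the missing neighbor).
    """
    padded = np.pad(x, 1, mode="edge")
    neighbor_mean = 0.5 * (padded[:-2] + padded[2:])
    return (1.0 - strength) * x + strength * neighbor_mean


def restore(b, mask, prior_step, lam=1.0, rho=1.0, iters=50):
    """Alternate a task-agnostic prior step with a task-specific data step."""
    x = b.copy()
    for _ in range(iters):
        z = prior_step(x)                    # prior: no knowledge of the task
        x = data_step(z, b, mask, lam, rho)  # likelihood: no knowledge of the prior
    return x


# Toy inpainting problem: a linear ramp with three missing samples.
signal = np.linspace(0.0, 1.0, 32)
mask = np.ones_like(signal)
mask[[5, 12, 20]] = 0.0
observed = mask * signal

restored = restore(observed, mask, smooth_prior_step)
```

Because the prior step never sees the mask, swapping in a different `data_step` (for example, one for deblurring) would leave `prior_step` untouched, which is the kind of reuse across tasks that the introduction describes.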
Related Work
Image restoration aims at computationally enhancing the quality of images by undoing the adverse effects of image degradation such as noise and blur. As a key area of image and signal processing it is an extremely well studied problem and a plethora of methods exists, see for example [22] for a recent survey. Through the successful application of machine learning and data-driven approaches, image restoration has seen revived interest and much progress in recent years. Broadly speaking, recently proposed methods can be grouped into three classes: classical approaches that make no explicit use of machine learning, generative approaches that aim at probabilistic models of undegraded natural images and discriminative approaches that try to learn a direct mapping from degraded to clean images. Unlike classical methods, methods belonging to the latter two classes depend on the availability of training data.
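Paraphrase: Image restoration aims to computationally improve image quality by undoing degradations such as noise and blur. As a core area of image and signal processing it has been studied extensively and a great many methods exist; see [22] for a recent survey. With the successful application of machine learning and data-driven approaches, image restoration has attracted renewed interest and made considerable progress in recent years. Broadly, recent methods fall into three classes: classical methods that make no explicit use of machine learning, generative methods that model the probability distribution of undegraded natural images, and discriminative methods that try to learn a direct mapping from degraded to clean images. Unlike classical methods, the latter two classes depend on the availability of training data.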
Classical models focus on local image statistics and aim at maintaining edges. Examples include total variation [27], bilateral filtering [32] and anisotropic diffusion models [34]. More recent methods exploit the non-local statistics of images [1, 9, 21, 10, 13, 31]. In particular the