Contribution-based Low-Rank Adaptation with a Pre-trained Model for Real Image Restoration

2408.01099 (arxiv.org)

CoLoRA: Contribution-based Low-Rank Adaptation with Pre-training Model for Real Image Restoration (janeyeon.github.io)

Abstract

Recently, pre-trained models and efficient parameter tuning have achieved remarkable success in natural language processing and high-level computer vision with the aid of masked modeling and prompt tuning. In low-level computer vision, however, there have been limited investigations of pre-trained models, and even efficient fine-tuning strategies have not yet been explored, despite their importance and benefits in various real-world tasks such as alleviating the memory-inflation issue when integrating new tasks on AI edge devices.

Here, we propose a novel efficient parameter-tuning approach dubbed contribution-based low-rank adaptation (CoLoRA) for multiple image restoration tasks, along with an effective pre-training method with random-order degradations (PROD). Unlike prior arts that tune all network parameters, our CoLoRA effectively fine-tunes a small number of parameters by leveraging LoRA (low-rank adaptation) for each new vision task, with our contribution-based method adaptively determining the layer-by-layer capacity for that task, yielding performance comparable to full tuning. Furthermore, our PROD strategy extends the capability of pre-trained models with improved performance as well as robustness, bridging synthetic pre-training and real-world fine-tuning.

Our CoLoRA with PROD has demonstrated superior performance in various image restoration tasks across diverse degradation types on both synthetic and real-world datasets, for known and novel tasks. We believe that CoLoRA with PROD can be a promising solution for efficient parameter tuning in low-level computer vision tasks with pre-trained models.


Method

An overview of our proposed CoLoRA with PROD. (a) Our PROD leverages high-quality clean images and synthetic degraded low-quality images for pre-training the model. (b) Our proposed contribution-based efficient LoRA (CoLoRA) for new IR tasks. CoLoRA is configured to have a different ratio of learnable network parameters (δ) for each layer based on quantified contributions, enabling efficient fine-tuning for new tasks. (c) CoLoRA can be adjusted according to contribution.

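Conceptually, PROD builds pre-training pairs by applying several synthetic degradations to a clean image in a random order. Below is a minimal sketch of that idea in Python/NumPy; the degradation operators, their parameter ranges, and the sampling rule (`prod_sample`, `DEGRADATIONS`) are illustrative assumptions, not the paper's exact pipeline.

```python
import random
import numpy as np

# Hypothetical degradation operators (assumed for illustration);
# the paper's exact degradation set and parameters may differ.
def add_gaussian_noise(img, sigma=25.0):
    """Additive Gaussian noise on a float image in [0, 1]."""
    noisy = img + np.random.randn(*img.shape) * (sigma / 255.0)
    return np.clip(noisy, 0.0, 1.0)

def gaussian_blur(img, ksize=5, sigma=1.5):
    """Separable Gaussian blur via 1-D convolutions along each spatial axis."""
    ax = np.arange(ksize) - ksize // 2
    kernel = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, kernel, mode="same"), 0, img)
    out = np.apply_along_axis(lambda m: np.convolve(m, kernel, mode="same"), 1, out)
    return out

def downsample_upsample(img, factor=2):
    """Crude resolution degradation: strided subsample, then nearest-neighbor repeat."""
    small = img[::factor, ::factor]
    up = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return up[: img.shape[0], : img.shape[1]]

DEGRADATIONS = [add_gaussian_noise, gaussian_blur, downsample_upsample]

def prod_sample(clean_img, max_ops=3):
    """Build one (degraded, clean) pre-training pair in PROD style:
    a random subset of degradations, applied in random order."""
    ops = random.sample(DEGRADATIONS, k=random.randint(1, max_ops))  # random order
    degraded = clean_img.copy()
    for op in ops:
        degraded = op(degraded)
    return degraded, clean_img
```

Because the order and subset of degradations vary per sample, the pre-trained model sees a much wider degradation distribution than single-task synthetic data, which is the property PROD relies on to transfer to real-world fine-tuning.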

Performance comparison based on the scale of training data for 6 IR tasks. In the graph, the results of the 6 IR tasks are averaged for comparison. The x-axis represents the number of training data points, and the y-axis is the average PSNR. In the radar graph, we compare the results of the 6 IR tasks with normalized PSNR at a training data size of 128. (a) and (b) present experimental results corresponding to pre-training and fine-tuning methods, respectively. (c) and (d) present experimental results for our CoLoRA with PROD in NAFNet and Restormer. Our proposed CoLoRA (7%) has far fewer tuned network parameters than full fine-tuning (100%) of NAFNet.

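To make the parameter budget concrete: CoLoRA keeps the pre-trained weights frozen and adds a trainable low-rank update per layer, with the rank (and hence the learnable ratio δ) set from a per-layer contribution score. The PyTorch sketch below illustrates this; `rank_from_contribution` and the scaling choice are assumptions, since the paper defines its own contribution quantification.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (LoRA).

    A minimal sketch: the rank is chosen per layer from a contribution
    score, which is an assumed rule -- the paper quantifies contribution
    in its own way.
    """
    def __init__(self, base: nn.Linear, rank: int, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pre-trained weights frozen
        # Low-rank factors: B @ A has shape (out_features, in_features).
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen full-rank path plus trainable low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

def rank_from_contribution(contribution: float, max_rank: int = 16) -> int:
    """Map a normalized contribution score in [0, 1] to a LoRA rank
    (a hypothetical mapping for illustration)."""
    return max(1, round(contribution * max_rank))

# Example: wrap one layer with a rank chosen from its contribution score.
layer = LoRALinear(nn.Linear(64, 64), rank=rank_from_contribution(0.5))
```

Wrapping each layer of a pre-trained restorer (e.g., NAFNet) this way is what keeps the tuned fraction small: with ranks far below the layer widths, the trainable parameters can stay at a few percent of the full model, consistent with the 7% versus 100% comparison above.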
