[RAG Survey] Retrieval-Augmented Generation for AI-Generated Content: A Survey

Latest recommended article published 2025-04-25 15:24:01

Arachis_X

Category: nlp. Tags: artificial intelligence, nlp, natural language processing

Article link: https://blog.csdn.net/Arachis_X/article/details/136591595

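This article reviews in detail how RAG is applied in AIGC and examines how information retrieval improves accuracy and robustness. It classifies RAG foundations, outlines enhancement methods and application areas, and points out the limitations of current systems and directions for future research.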

Retrieval-Augmented Generation for AI-Generated Content: A Survey

Abstract

The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by advancements in model algorithms, scalable foundation model architectures, and the availability of ample high-quality datasets. While AIGC has achieved remarkable performance, it still faces challenges, such as the difficulty of maintaining up-to-date and long-tail knowledge, the risk of data leakage, and the high costs associated with training and inference. Retrieval-Augmented Generation (RAG) has recently emerged as a paradigm to address such challenges. In particular, RAG introduces the information retrieval process, which enhances AIGC results by retrieving relevant objects from available data stores, leading to greater accuracy and robustness. In this paper, we comprehensively review existing efforts that integrate RAG techniques into AIGC scenarios. We first classify RAG foundations according to how the retriever augments the generator. We distill the fundamental abstractions of the augmentation methodologies for various retrievers and generators. This unified perspective encompasses all RAG scenarios, illuminating advancements and pivotal technologies that help with potential future progress. We also summarize additional enhancement methods for RAG, facilitating effective engineering and implementation of RAG systems. Then, from another view, we survey practical applications of RAG across different modalities and tasks, offering valuable references for researchers and practitioners. Furthermore, we introduce the benchmarks for RAG, discuss the limitations of current RAG systems, and suggest potential directions for future research. Project: https://github.com/hymie122/RAG-Survey

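The development of Artificial Intelligence Generated Content (AIGC) has benefited from advances in model algorithms, scalable foundation model architectures, and the availability of large, high-quality datasets.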

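Although AIGC has achieved remarkable results, it still faces challenges, such as the difficulty of maintaining up-to-date and long-tail knowledge, the risk of data leakage, and the high cost of training and inference.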

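Retrieval-Augmented Generation (RAG) has recently emerged as a paradigm for addressing these challenges. Specifically, RAG adds an information retrieval step that improves AIGC output by retrieving relevant objects from available data stores, which increases accuracy and robustness.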
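
To make the retrieve-then-generate idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the survey's method: the data store is a plain list of strings, the retriever is a bag-of-words cosine ranking (real systems typically use sparse or dense retrievers), and the generator is left out, so the code stops at the augmented prompt. The names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer; a stand-in for a real tokenizer."""
    return text.lower().split()


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    """Return the k documents in the store most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        store,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Augment the query with retrieved context before it reaches a generator."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Toy data store (assumed for illustration only).
store = [
    "RAG retrieves relevant objects from a data store",
    "Diffusion models generate images from noise",
    "Retrieval improves accuracy on long-tail knowledge",
]
query = "how does RAG use a data store"
top = retrieve(query, store, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

The prompt produced here would then be passed to whatever generator is in use. Keeping retrieval and prompt construction as separate functions mirrors the retriever/generator split the survey uses to classify RAG foundations.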

In this paper, we comprehensively review existing work that integrates RAG techniques into AIGC scenarios.