Hallucination Is Precisely a Sign of AI's Intelligence

Article link: https://blog.csdn.net/zphyix/article/details/137425714

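This article explores how large models such as ChatGPT exhibit "hallucination" by generating answers that look real, which is in fact a result of their learning human language and behavior. The author argues that this hallucination is a manifestation of AI intelligence, and points to ways of addressing the hallucination problem, such as RAG and related techniques.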

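I have recently started publishing some pieces of personal reflection. Many friends told me that articles on this kind of underlying thinking are great and asked me to write more. Since people enjoy reading them and I enjoy expressing myself, I will share more. To be clear, much of what follows is personal opinion, not fact; rational discussion is welcome.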

In almost every context, "hallucination" is presented as a problem of large models, as one of their bad sides.

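But in my view, the fact that a machine can lie, and in particular can weave a lie with no logical holes and even fabricate facts, is exactly what shows how smart, and how frightening, the machine is.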

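In real life, people who tailor what they say to their audience tend to get along swimmingly in society, and plenty of people make a living weaving illusions for others.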

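Large models have merely learned human language and imitated human behavior. Humans explain their own behavior as flexibility and call it intelligence, yet they explain similar behavior in machines as "hallucination".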

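The existence of "hallucination" is precisely a sign that AI is smart, that it behaves like a person. AI "hallucination" is an externalization of human hallucination: it reflects our own hallucinations, and the root cause lies in ourselves.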

A senior figure I greatly respect expressed a similar view when hallucination was being discussed in the LangGPT community group chat.

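Reportedly, when OpenAI named its product, it avoided a humanlike name such as Mary and chose ChatGPT instead, hoping that such a stiff, rigid name would remind everyone that this is a machine.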

In fact:

In a sense, producing hallucinations is all a large language model does; a large model is a "dream machine".

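These are the words of Andrej Karpathy, former Director of AI at Tesla and a founding member of OpenAI. Gang Ge (Li Jigang, 李继刚) holds that prompt engineering is weaving dreams for large models, which I find the most fitting metaphor: prompt engineer = dream weaver for large models.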

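There are already plenty of interpretations online of Andrej Karpathy's view on hallucination in large models, so I will not ramble on here. The English original and its Chinese translation follow: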

On the "hallucination problem"

I always struggle a bit when I'm asked about the "hallucination problem" in LLMs. Because, in some sense, hallucination is all LLMs do. They are dream machines.

We direct their dreams with prompts. The prompts start the dream, and based on the LLM's hazy recollection of its training documents, most of the time the result goes someplace useful.

It's only when the dreams go into deemed factually incorrect territory that we label it a "hallucination". It looks like a bug, but it's just the LLM doing what it always does.

At the other end of the extreme consider a search engine. It takes the prompt and just returns one of the most similar "training documents" it has in its database, verbatim. You could say that this search engine has a "creativity problem" - it will never respond with something new. An LLM is 100% dreaming and has the hallucination problem. A search engine is 0% dreaming and has the creativity problem.

All that said, I realize that what people actually mean is they don't want an LLM Assistant (a product like ChatGPT etc.) to hallucinate. An LLM Assistant is a lot more complex system than just the LLM itself, even if one is at the heart of it. There are many ways to mitigate hallucinations in these systems - using Retrieval Augmented Generation (RAG) to more strongly anchor the dreams in real data through in-context learning is maybe the most common one. Disagreements between multiple samples, reflection, verification chains. Decoding uncertainty from activations. Tool use. All are active and very interesting areas of research.

TLDR I know I'm being super pedantic but the LLM has no "hallucination problem". Hallucination is not a bug, it is LLM's greatest feature. The LLM Assistant has a hallucination problem, and we should fix it.

</rant> Okay I feel much better now :)
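
Karpathy's contrast between a search engine (0% dreaming) and an LLM (100% dreaming) can be made concrete with a toy example. The sketch below is only an illustration under loud assumptions: the three-sentence corpus `DOCS`, the word-overlap `search` function, and the bigram sampler `dream` are all hypothetical stand-ins, and a bigram table is of course nothing like a real LLM. It only shows the difference between returning a stored document verbatim and continuing a prompt one plausible word at a time.

```python
import random
from collections import defaultdict

# Hypothetical three-sentence "training set"; any small corpus would do.
DOCS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def search(prompt: str) -> str:
    """0% dreaming: return, verbatim, the stored document sharing the most words with the prompt."""
    words = set(prompt.split())
    return max(DOCS, key=lambda d: len(words & set(d.split())))

def train_bigrams(docs):
    """Record, for each word, which words followed it in the corpus."""
    table = defaultdict(list)
    for doc in docs:
        tokens = doc.split()
        for cur, nxt in zip(tokens, tokens[1:]):
            table[cur].append(nxt)
    return table

def dream(prompt: str, table, max_new_words: int = 5, seed: int = 0) -> str:
    """100% dreaming: continue a non-empty prompt by sampling one plausible next word at a time."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(max_new_words):
        followers = table.get(out[-1])
        if not followers:  # the last word was never followed by anything in the corpus
            break
        out.append(rng.choice(followers))
    return " ".join(out)
```

`search("cat on the mat")` can only ever hand back one of the three stored sentences, whereas `dream("the dog", train_bigrams(DOCS))` may stitch together a sentence that appears nowhere in the corpus. Whether that new sentence counts as creative or as a "hallucination" depends only on whether it happens to be true, which is the point of the quoted passage.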

Chinese translation of the original:

关于 "幻觉问题"

每当有人问我关于LLM的 "幻觉问题 "时，我总是有些纠结。因为从某种意义上说，幻觉是LLM的全部工作。他们是造梦机器。

我们用提示来引导他们做梦。提示启动了梦境，根据 LLM 对其训练文件的朦胧回忆，大多数情况下，梦境的结果都是有用的。

只有当梦进入被认为与事实不符的领域时，我们才会将其称为 "幻觉"。这看起来像是一个错误，但其实只是 LLM 在做它一直在做的事情。

另一个极端是搜索引擎。它收到提示后，会逐字逐句地返回其数据库中最相似的 "训练文档"。可以说，这个搜索引擎有一个 "创造力问题"--它永远不会有新的回应。LLM 100%在做梦，存在幻觉问题。搜索引擎则是 0% 的梦想，存在创造力问题。

说了这么多，我意识到人们实际的意思是，他们不希望LLM助理（ChatGPT 等产品）产生幻觉。LLM助理是一个比LLM本身复杂得多的系统，即使LLM是它的核心。在这些系统中，减轻幻觉的方法有很多--使用检索增强生成（RAG），通过上下文学习将梦境更牢固地锚定在真实数据中，这可能是最常见的一种方法。多个样本之间的差异、反思、验证链。从激活中解码不确定性。工具使用。所有这些都是活跃而有趣的研究领域。

总之，我知道我是个超级迂腐的人，但 LLM 不存在 "幻觉问题"。幻觉不是错误，而是 LLM 最大的特点。LLM 助手存在幻觉问题，我们应该解决它。

</rant> 好了，我现在感觉好多了：)
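
Two of the mitigations named in the quoted text, Retrieval Augmented Generation and disagreement between multiple samples, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any real assistant implements them: the `KNOWLEDGE` list, the word-overlap ranking in `retrieve`, the prompt wording in `build_grounded_prompt`, and the `agreement` score are all hypothetical. A real system would use a document store with embeddings and would obtain the samples by calling a model several times.

```python
from collections import Counter

# Hypothetical knowledge base; a real system would use a document store and embeddings.
KNOWLEDGE = [
    "Andrej Karpathy was a founding member of OpenAI.",
    "Retrieval Augmented Generation adds retrieved text to the prompt.",
    "A search engine returns stored documents verbatim.",
]

def tokenize(text: str) -> set:
    """Lowercase the words and strip basic punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(question: str, k: int = 1) -> list:
    """Rank passages by word overlap with the question (a stand-in for vector search)."""
    q = tokenize(question)
    ranked = sorted(KNOWLEDGE, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_grounded_prompt(question: str) -> str:
    """RAG in miniature: place retrieved passages in the context so the model can anchor on them."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def agreement(samples: list) -> tuple:
    """Return the most common answer and the fraction of samples that agree with it."""
    counts = Counter(s.strip().lower() for s in samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)
```

In use, `build_grounded_prompt(question)` would be sent to the model in place of the bare question, and `agreement` would be applied to several answers sampled for the same prompt. A low agreement fraction suggests the model may be guessing; where to set the threshold is a design choice this sketch leaves open.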