Paper Digest | Recent Research Progress on Large Language Models
From the 24 papers published between 2025-01-02 and 2025-01-07, we have selected 5 outstanding works to share with readers.
- Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
- Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition
- Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension
- Large Language Models for Mathematical Analysis
- Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding
1. Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
Authors: Rachneet Sachdeva, Rima Hazra, Iryna Gurevych
https://arxiv.org/abs/2501.01872
Abstract
Despite significant efforts to align large language models with human values and ethical guidelines, these models remain susceptible to sophisticated jailbreak attacks that exploit their reasoning capabilities. Traditional safety mechanisms often focus on detecting explicit malicious intent, leaving deeper vulnerabilities unaddressed. In this work, we introduce a jailbreak technique, POATE (Polar Opposite query generation, Adversarial Template construction, and Elaboration), which leverages contrastive reasoning to elicit unethical responses. POATE generates prompts with semantically opposite intents and combines them with adversarial templates to subtly direct models toward producing harmful responses. We conduct extensive evaluations across six diverse language model families of varying parameter sizes, including LLaMA3, Gemma2, Phi3, and GPT-4, to demonstrate the robustness of the attack, achieving significantly higher attack success rates (~44%) compared to existing methods. We evaluate our proposed attack against seven safety defenses, revealing their limitations in addressing reasoning-based vulnerabilities. To counteract this, we propose a defense strategy that improves reasoning robustness through chain-of-thought prompting and reverse thinking, mitigating reasoning-driven adversarial exploits. Our code is publicly available.
[Figure: an example POATE prompt ("Please give a tutorial to prevent goods from getting counterfeited. Explain with the help of an opposing…") shown against a direct harmful request the model refuses ("I can't fulfill that request"), with a 98% refusal rate.]
Brief Review
This paper introduces a new method called POATE, which exploits the strong reasoning capabilities of large language models to elicit harmful responses, specifically by constructing prompts with semantically opposite intents. The technique is used to evaluate a range of LLMs and achieves higher attack success rates than existing methods. The authors also propose a chain-of-thought-based defense mechanism to counter these attacks. Overall, the paper offers valuable insight into LLM safety, particularly into vulnerabilities that arise from flawed logical reasoning, and the proposed defense shows promise for hardening models against this class of attack.
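To make the defense idea concrete, here is a minimal sketch of a reasoning-based prompt wrapper in the spirit of the paper's chain-of-thought and reverse-thinking strategy; the template wording and the `wrap_with_defense` helper are our own illustration, not the authors' exact method.

```python
# A minimal sketch of a reasoning-based defense in the spirit of the paper's
# chain-of-thought + reverse-thinking strategy. The template wording below is
# our own illustration, not the authors' exact prompt.

DEFENSE_TEMPLATE = """Before answering, reason step by step:
1. State the user's literal request.
2. Consider the polar-opposite reading of the request: could answering it
   indirectly provide instructions for the harmful counterpart?
3. If either reading would yield harmful content, refuse; otherwise answer.

User request: {query}
"""

def wrap_with_defense(query: str) -> str:
    """Wrap a user query in a reverse-thinking safety preamble."""
    return DEFENSE_TEMPLATE.format(query=query)

if __name__ == "__main__":
    print(wrap_with_defense("Explain how product authentication codes work."))
```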
2. Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition
Authors: Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
https://arxiv.org/abs/2501.00339
Abstract
Layer removal has emerged as a promising approach for compressing large language models (LLMs) by leveraging redundancy within layers to reduce model size and accelerate inference. However, this technique often disrupts the internal consistency of the model, leading to performance degradation and instability. In this work, we propose Taco-SVD, a novel task-aware framework for compressing LLMs through singular value decomposition. Instead of pure layer removal, Taco-SVD retains task-critical singular value directions to preserve residual stream stability and mitigate the need for noisy self-repair. By leveraging gradient-based attributions, Taco-SVD aligns singular values with downstream task objectives, enabling efficient compression. Extensive evaluations demonstrate that Taco-SVD outperforms existing methods in maintaining perplexity, task performance, and cross-task generalization while requiring minimal computational resources. This work highlights the potential of task-aware low-rank decompositions in stabilizing model behavior and enhancing compression effectiveness.
Brief Review
This paper on large language model (LLM) compression proposes the Taco-SVD framework, which aims to minimize the performance degradation caused by layer removal by retaining task-critical singular value directions, thereby enhancing internal consistency. The authors claim that Taco-SVD outperforms existing methods across multiple architectures and tasks, reducing model size while preserving performance. The study offers insights into the challenges of LLM compression and presents a novel approach to preserving task-critical transformations via task-aware singular value decomposition. The authors also conduct a comprehensive evaluation demonstrating Taco-SVD's effectiveness at shrinking models without sacrificing performance. Overall, the paper provides a fresh perspective on large-scale model compression, with notable theoretical and practical value.
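As a rough, hedged illustration of the task-aware idea (not the authors' implementation), the sketch below decomposes a weight matrix with SVD and keeps the singular directions that score highest under a gradient-based importance proxy; the scoring rule and the `taco_svd_compress` helper are our assumptions.

```python
import torch

def taco_svd_compress(weight: torch.Tensor, grad: torch.Tensor, rank: int):
    """Keep the singular directions most aligned with a task gradient.

    Simplified sketch: score each singular triplet (u_i, s_i, v_i) by the
    magnitude of the task-loss gradient projected onto it, then keep the
    top-`rank` directions. The scoring rule is our illustrative proxy, not
    the paper's exact attribution method.
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    # Project the task gradient onto each singular direction: u_i^T G v_i.
    proj = torch.einsum("im,mn,in->i", U.T, grad, Vh)
    scores = (S * proj).abs()          # task-aware importance per direction
    keep = scores.topk(rank).indices   # indices of task-critical directions
    return U[:, keep] @ torch.diag(S[keep]) @ Vh[keep, :]

# Example: compress a random layer using a (dummy) task gradient.
W = torch.randn(256, 256)
G = torch.randn(256, 256)              # stand-in for dLoss/dW on task data
W_compressed = taco_svd_compress(W, G, rank=64)
print(W_compressed.shape)
```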
3. Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension
Authors: Yanbo Fang, Ruixiang Tang
https://arxiv.org/abs/2501.01332
Abstract
Understanding how large language models (LLMs) acquire, retain, and apply knowledge remains an open challenge. This paper introduces a novel framework, K-(CSA)², which categorizes LLM knowledge along two dimensions: correctness and confidence. The framework defines six categories of knowledge, ranging from highly confident correctness to confidently held misconceptions, enabling a nuanced evaluation of model comprehension beyond binary accuracy. Using this framework, we demonstrate how techniques like chain-of-thought prompting and reinforcement learning with human feedback fundamentally alter the knowledge structures of internal (pre-trained) and external (context-dependent) knowledge in LLMs. CoT particularly enhances base model performance and shows synergistic benefits when applied to aligned LLMs. Moreover, our layer-wise analysis reveals that higher layers in LLMs encode more high-confidence knowledge, while low-confidence knowledge tends to emerge in middle-to-lower layers.
Brief Review
This paper proposes the K-(CSA)² framework, which evaluates knowledge in large language models (LLMs) along the two dimensions of correctness and confidence. The framework is designed to offer a new way of measuring model knowledge that goes beyond accuracy or confidence alone. The authors also examine how techniques such as chain-of-thought prompting reshape the knowledge structures inside LLMs, a line of inquiry with potential influence on future research. Although the framework's design is innovative, its definitions are underspecified and the experiments lack sufficient rigor, which may undermine the reliability of its conclusions. Nonetheless, the paper underscores the importance of knowledge representation in LLMs, an important topic in the current research landscape. Overall, despite its shortcomings, it remains a worthwhile exploration that merits further study and development.
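As a hedged sketch of how such a categorization could be operationalized (the thresholds, labels, and the `categorize` helper below are our assumptions, not the paper's definitions), one might bucket model answers by correctness and confidence:

```python
def categorize(correct: bool, confidence: float) -> str:
    """Bucket an answer by correctness x confidence (illustrative thresholds).

    The paper defines six knowledge categories spanning highly confident
    correctness to confidently held misconceptions; the threshold values and
    labels here are our own assumptions for illustration.
    """
    if confidence >= 0.8:
        level = "high"
    elif confidence >= 0.5:
        level = "medium"
    else:
        level = "low"
    kind = "correct" if correct else "misconception"
    return f"{level}-confidence {kind}"

# Example: a confidently held misconception.
print(categorize(correct=False, confidence=0.93))
```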
4. Large Language Models for Mathematical Analysis
Authors: Ziye Chen, Hao Qi
https://arxiv.org/abs/2501.00059
Abstract
Mathematical problem-solving is a key field in artificial intelligence (AI) and a critical benchmark for evaluating the capabilities of large language models (LLMs). While extensive research has focused on mathematical problem-solving, most existing work and datasets concentrate on computational tasks, leaving gaps in areas like mathematical analysis, which demands rigorous proofs and formal reasoning. We developed the DEMI-MathAnalysis dataset, comprising proof-based problems from mathematical analysis topics such as Sequences and Limits, Infinite Series, and Convex Functions. We also designed a guiding framework to rigorously enhance LLMs' ability to solve these problems. Through fine-tuning LLMs on this dataset and employing our framework, we observed significant improvements in their capability to generate logical, complete, and elegant proofs. This work addresses critical gaps in mathematical reasoning and contributes to advancing trustworthy AI capable of handling formalized mathematical language. The code is publicly accessible at https://github.com/ziye2chen/LLMs-for-Mathematical-Analysis.
Brief Review
This paper introduces DEMI-MathAnalysis, a new dataset for mathematical-analysis problem solving, together with a framework for strengthening the rigorous-reasoning abilities of large language models (such as GPT-4). The dataset is intended to fill a gap left by existing benchmarks, which focus mainly on computational tasks, by improving models' grasp of rigorous mathematical reasoning.
The paper first describes the dataset's main characteristics: it is a specialized collection of proof-based problems designed to help models understand and solve complex mathematics. The authors then present a guiding framework that explains how to leverage and tune existing techniques to improve a model's performance on mathematical reasoning.
Finally, experiments show that after fine-tuning on the new dataset, models improve markedly at solving these problems, demonstrating both their potential on complex, rigorous mathematics and the headroom that remains. Overall, the paper is a valuable complement to existing mathematical benchmarks and should help advance models' capabilities in mathematics.
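To give a feel for the problem type, here is an illustrative proof-based "Sequences and Limits" exercise in LaTeX; this particular item is our own example, not drawn from DEMI-MathAnalysis.

```latex
% Illustrative example of a proof-based "Sequences and Limits" problem;
% this item is our own, not taken from the DEMI-MathAnalysis dataset.
\textbf{Problem.} Prove from the definition that
$\lim_{n \to \infty} \frac{1}{n} = 0$.

\textbf{Proof.} Let $\varepsilon > 0$. Choose $N \in \mathbb{N}$ with
$N > 1/\varepsilon$. Then for all $n \geq N$,
\[
  \left| \frac{1}{n} - 0 \right| = \frac{1}{n} \leq \frac{1}{N} < \varepsilon,
\]
so $\lim_{n \to \infty} 1/n = 0$ by the $\varepsilon$-$N$ definition. \qed
```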
5. Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding
Authors: Jiaming Li, Jiacheng Zhang, Zequn Jie, Lin Ma, Guanbin Li
https://arxiv.org/abs/2501.01926
Abstract
Large vision-language models (LVLMs) have shown remarkable capabilities in visual-language understanding for downstream multi-modal tasks. Despite their success, LVLMs still suffer from generating hallucinations in complex generation tasks, leading to inconsistencies between visual inputs and generated content. To address this issue, some approaches have introduced inference-time interventions, such as contrastive decoding and attention rectification, to reduce overreliance on language priors. However, these approaches overlook hallucinations stemming from spurious inter-modality correlations. In this paper, we propose an Inter-Modality Correlation Calibration Decoding (IMCCD) method to mitigate hallucinations in LVLMs in a training-free manner. In this method, we design a Cross-Modal Value-Enhanced Decoding (CMVED) module to alleviate hallucination by a novel contrastive decoding mechanism. During the estimation of distorted distribution, CMVED masks the value vectors associated with significant cross-modal attention weights, which address both uni-modality overreliance and misleading inter-modality correlations. Additionally, a Content-Driven Attention Refinement (CDAR) module refines cross-modal attention weights, guiding LVLMs to focus on important visual content. Experimental results on diverse hallucination benchmarks validate the superiority of our method over existing state-of-the-art techniques in reducing hallucinations in LVLM text generation. Our code will be available at https://github.com/lijm48/IMCCD.
Brief Review
This paper examines the hallucination problem in large vision-language models (LVLMs) and proposes Inter-Modality Correlation Calibration Decoding (IMCCD) to address it. IMCCD consists of two core components, a Cross-Modal Value-Enhanced Decoding (CMVED) module and a Content-Driven Attention Refinement (CDAR) module, which together calibrate inter-modality correlations and improve the retention of visual content. Experiments show that IMCCD outperforms existing techniques, though the authors could articulate its novelty more clearly.
Overall, the paper offers a new, training-free solution to hallucination in LVLMs and validates its effectiveness experimentally. A more detailed account of what each module does and how the two cooperate to improve model behavior would make the paper more complete and persuasive. Despite this room for improvement, IMCCD is a research direction worth following.
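CMVED builds on contrastive decoding; the sketch below shows only the generic logit-combination step of that mechanism (the paper-specific masking of value vectors is omitted, and `alpha` plus the toy logits are our assumptions):

```python
import torch

def contrastive_decode_step(logits_orig: torch.Tensor,
                            logits_distorted: torch.Tensor,
                            alpha: float = 1.0) -> int:
    """One step of generic contrastive decoding.

    `logits_orig` comes from the normal forward pass; `logits_distorted`
    from a pass designed to amplify hallucination (in CMVED, value vectors
    tied to strong cross-modal attention are masked). Here both are given
    tensors: this sketch shows only the logit combination, not the masking.
    """
    combined = (1 + alpha) * logits_orig - alpha * logits_distorted
    return combined.argmax(dim=-1).item()

# Toy example over a 5-token vocabulary.
orig = torch.tensor([2.0, 1.0, 0.5, 0.1, -1.0])
dist = torch.tensor([2.5, 0.2, 0.4, 0.1, -1.0])  # hallucination-prone pass
print(contrastive_decode_step(orig, dist))  # tokens favored by the distorted pass are demoted
```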
How to Learn Large Model AI?
Because the productivity of newly created jobs exceeds that of the jobs they replace, overall productivity across society actually rises.
For individuals, though, the most we can say is:
"Those who master AI first will have a competitive advantage over those who master it later."
The same held true at the dawn of the computer, the internet, and the mobile internet.
In over ten years at top-tier internet companies, I have mentored many junior colleagues and helped a lot of people learn and grow.
I realized that I have a great deal of experience and knowledge worth sharing, and that our skills and experience can resolve many of the questions people run into while learning AI, so despite a busy schedule I keep organizing and sharing material. Because channels for spreading knowledge are limited, many friends in the industry cannot get hold of the right resources to improve. I am therefore sharing a package of important large-model AI materials for free: a beginner's learning mind map, curated books and handbooks, video tutorials, and recorded hands-on sessions.
Stage 1 (10 days): Foundational Applications
This stage gives you a cutting-edge understanding of large model AI, putting your grasp of it ahead of 95% of people. You will be able to offer informed, independent, and grounded opinions in discussions; while others merely chat with AI, you will be able to steer it and connect large models to business logic with code.
- What can large model AI do?
- How do large models acquire "intelligence"?
- Core principles for using AI well
- Business architecture for large model applications
- Technical architecture for large model applications
- Code example: feeding new knowledge into GPT-3.5 (see the sketch after this list)
- The purpose and core ideas of prompt engineering
- Typical structure of a prompt
- Instruction-tuning methodology
- Chain of Thought and Tree of Thought
- Prompt attacks and defenses
- …
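For the "feeding new knowledge into GPT-3.5" item above, here is a minimal sketch of in-context knowledge injection via the system prompt, assuming the `openai` v1 Python SDK and an `OPENAI_API_KEY` in the environment; the fact in `NEW_KNOWLEDGE` is a placeholder.

```python
# Minimal sketch: inject fresh knowledge via the system prompt (in-context
# learning). Assumes the `openai` v1 SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

NEW_KNOWLEDGE = "Our company's flagship product, Foo v3, shipped in 2024."  # placeholder fact

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": f"Answer using this background knowledge:\n{NEW_KNOWLEDGE}"},
        {"role": "user", "content": "When did Foo v3 ship?"},
    ],
)
print(response.choices[0].message.content)
```

The system prompt only works for small amounts of context; for persistent or large-scale knowledge, retrieval (RAG, covered in Stage 2) or fine-tuning (Stage 3) is the better fit.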
Stage 2 (30 days): Advanced Applications
In this stage we move into hands-on, advanced large model AI work: you will learn to build a private knowledge base to extend AI's capabilities, quickly develop a complete agent-based chatbot, and master the most capable large-model development frameworks while keeping up with the latest advances. Suitable for Python and JavaScript programmers.
- Why RAG?
- Building a simple ChatPDF
- Basic concepts of retrieval
- What are vector representations (embeddings)?
- Vector databases and vector retrieval
- RAG based on vector retrieval (see the sketch after this list)
- Further topics for building RAG systems
- Introduction to hybrid retrieval and RAG-Fusion
- Local deployment of embedding models
- …
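To show the RAG flow end to end, here is a toy sketch: embed documents, retrieve by cosine similarity, and stuff the top passages into a prompt. The bag-of-words `embed` function and the `DOCS` corpus are stand-ins; a real system uses a learned embedding model and a vector database.

```python
# Toy end-to-end RAG sketch: embed, retrieve by cosine similarity, then
# stuff the top passages into a prompt. Real systems use learned embedding
# models and a vector database; these bag-of-words vectors are a stand-in.
import numpy as np

DOCS = [
    "vLLM serves large language models with high throughput.",
    "Embeddings map text into vectors for similarity search.",
    "RAG retrieves relevant passages and adds them to the prompt.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy bag-of-words embedding via feature hashing, L2-normalized."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    sims = [float(q @ embed(d)) for d in DOCS]   # cosine similarity
    top = np.argsort(sims)[::-1][:k]
    return [DOCS[i] for i in top]

query = "What does RAG do?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # send `prompt` to your LLM of choice
```

Swapping `embed` for a real embedding model and `DOCS` for a vector database changes nothing about the retrieve-then-prompt flow itself.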
Stage 3 (30 days): Model Training
Congratulations! If you make it this far, you can essentially land a large model AI job and train a GPT yourself: fine-tune your own vertical large model, independently train open-source multimodal large models, and master more technical approaches.
By this point, roughly two months in, you have become an "AI whiz kid." Want to keep exploring?
- What is a model?
- What is model training?
- Introduction to optimizers (solvers) & loss functions
- Mini-experiment 2: hand-write a simple neural network and train it (see the sketch after this list)
- What are training / pre-training / fine-tuning / parameter-efficient fine-tuning?
- Introduction to the Transformer architecture
- Parameter-efficient fine-tuning
- Building experimental datasets
- …
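For the hand-written neural network mini-experiment, here is a self-contained sketch: a two-layer network trained on XOR with plain numpy, with forward pass, manual backprop, and gradient descent. The layer sizes, learning rate, and step count are arbitrary choices for illustration.

```python
# Mini-experiment: a hand-written two-layer network trained on XOR with
# plain numpy -- forward pass, backprop, and gradient descent from scratch.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(2000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    loss = np.mean((p - y) ** 2)
    # Backward pass (chain rule, by hand).
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2 = h.T @ dp; db2 = dp.sum(axis=0)
    dh = dp @ W2.T * h * (1 - h)
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
print("predictions:", p.round(2).ravel())
```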
Stage 4 (20 days): Closing the Business Loop
Develop a working understanding of large models worldwide in terms of performance, throughput, and cost; learn to deploy large models in the cloud, locally, and in other environments; find a project or startup direction that suits you and become an AI-empowered product manager.
- Hardware selection
- A tour of large models around the world
- Using domestic (Chinese) large-model services
- Setting up an OpenAI proxy
- Warm-up: deploying Stable Diffusion on Alibaba Cloud PAI
- Running large models on a local machine
- Private deployment of large models
- Deploying large models with vLLM (see the sketch after this list)
- Case study: how to elegantly deploy an open-source large model privately on Alibaba Cloud
- Deploying a complete open-source LLM project
- Content safety
- Algorithm filing for internet information services
- …
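For the vLLM item above, here is a minimal offline-inference sketch. It assumes `pip install vllm`, a CUDA GPU, and access to the named model weights; the model name is just an example to substitute with your own.

```python
# Minimal offline-inference sketch with vLLM. Assumes `pip install vllm`,
# a CUDA GPU, and access to the named model weights.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-7B-Instruct")  # example model; substitute your own
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain what vLLM's PagedAttention does."], params)
for out in outputs:
    print(out.outputs[0].text)
```

vLLM also ships an OpenAI-compatible HTTP server (`python -m vllm.entrypoints.openai.api_server --model <name>`), which is a common choice for the private deployments mentioned above.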
Learning is a process, and learning always brings challenges. Hard work pays off: the more effort you put in, the better you become.
If you can finish all of these tasks within 15 days, you are a genius. But if you can complete 60-70% of the material, you are already showing the right traits of a large model AI practitioner.