Large Models: Challenges Brought by Natural Interaction-Based Human-Machine Collaborative Tools for Software Development and Evolution

Source: Ruan Jian Xue Bao/Journal of Software

Authors: 李戈, 彭鑫, 王千祥, 谢涛, 金芝, 王戟, 马晓星, 李宣东

DOI: 10.13328/j.cnki.jos.007008

CLC number: TP311

Abstract:

Large language models (LLMs) based on the generative pre-trained transformer are setting off a wave in the field of artificial intelligence and continue to extend their influence into more fields. LLMs such as ChatGPT have demonstrated the ability and potential to provide people with a certain degree of assistance through natural language-based interaction in many software engineering tasks, and they are developing into a natural interaction-based human-machine collaborative tool for software development and evolution. From the perspective of human-machine collaborative software development and evolution, LLMs, as a software tool, present two major features. The first is natural language-based human-machine interaction, which greatly expands the human-machine collaboration workspace and improves the efficiency and flexibility of collaboration. The second is the generation of predictive content for a given software development and evolution task, based on accumulated knowledge of software development and evolution, which can provide a certain degree of support and assistance for that task. However, LLMs are essentially mathematical models built on probabilistic and statistical principles and training data, and they are unexplainable and inherently uncertain; the content they generate is predictive and lacks any judgment of trustworthiness. By contrast, the tasks humans need to complete in software development and evolution are decision-making tasks with trustworthiness guarantees. As a result, LLMs as a software tool not only assist people in human-machine collaborative software development and evolution but also bring many challenges. This study analyzes and elaborates on these challenges, such as how to construct code LLMs that are more helpful for software development and evolution, how to guide LLMs to generate predictive content that is more helpful for software development and evolution, and how to develop and evolve high-quality software systems based on the predictive content generated by LLMs.

Key words: software development and evolution; large language model (LLM); human-machine collaboration
