Microsoft AutoGen: Principles, Code (Swapping in Other LLMs), and Comparative Analysis

Getting Started | AutoGen (microsoft.github.io)

The link above is the official code documentation.

This is the AutoGen paper: 2308.08155 (arxiv.org)

1. Principles

AutoGen simplifies the development of complex applications through two main steps:

(1) defining a set of conversable agents with specific capabilities and roles;

(2) programming the interaction behavior between agents through conversation-centric computation and control.

Both steps can be carried out by fusing natural language and programming languages, in order to build applications with a wide range of conversation patterns and agent behaviors.

2. Code

First, install AutoGen:

pip install pyautogen

Below is the official sample code:

import os  # needed for os.environ below; missing from the snippet as published

import autogen
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}
assistant = AssistantAgent("assistant", llm_config=llm_config)

user_proxy = UserProxyAgent(
    "user_proxy", code_execution_config={"executor": autogen.coding.LocalCommandLineCodeExecutor(work_dir="coding")}
)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Plot a chart of NVDA and TESLA stock price change YTD.",
)

Unfortunately, if you are using a non-official OpenAI endpoint or another LLM, this approach will not work as-is, because the configuration above lacks an api_base (the URL of the LLM endpoint).

The file Python311\site-packages\autogen\oai\client.py does set api_base (at line 61 of that file).

So the adjustment needs to be made in that file.
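As an alternative to editing `client.py` directly, recent pyautogen releases (the 0.2.x line) let you pass the endpoint in the configuration itself: each entry in `config_list` may carry a `base_url` field pointing at any OpenAI-compatible server. This is a hedged sketch, not the only supported form; the model name and URL below are placeholders you would replace with your own:

```python
# Sketch: pointing AutoGen at a non-official, OpenAI-compatible endpoint.
# Assumption: pyautogen >= 0.2, where "config_list" entries accept "base_url".
# "local-llm" and the localhost URL are placeholders, not real services.
llm_config = {
    "config_list": [
        {
            "model": "local-llm",                     # placeholder model name
            "api_key": "EMPTY",                       # many local servers ignore the key
            "base_url": "http://localhost:8000/v1",   # placeholder endpoint URL
        }
    ],
    "temperature": 0,
}
# This dict is then passed exactly like the official example:
# assistant = AssistantAgent("assistant", llm_config=llm_config)
```

If this field works in your installed version, no changes to the library source are needed.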

3. Differences from Other Agent Frameworks

Below is the official team's own summary of how their framework compares:

3.1 Differences from CAMEL

We would first like to clarify that CAMEL has a different focus and positioning from AutoGen: CAMEL is primarily positioned as a framework for studying the cooperative behaviors of agents in different roles, not as infrastructure to support the development of LLM applications, as is the case with AutoGen. This difference can be better understood from the two frameworks' behavior when solving a task. For example, on the task "Design a custom game using pygame" demonstrated in CAMEL's official GitHub repo, the team compared AutoGen against CAMEL and summarized the results in an anonymized document. From the comparison, we can see that CAMEL primarily simulates a conversation between an AI agent in the role "Computer Programmer" and an agent in the role "gamer", but does not actually create a meaningful game; AutoGen, by contrast, actually creates the game with pygame and saves it to a file, so that it can be directly executed and played.

A more fundamental distinction, which poses profound technical challenges, is AutoGen's general support for multi-agent systems with more than two agents (N > 2). CAMEL's inception-prompting-based role-playing framework currently supports mainly systems of two AI agents, with potentially a critic in the loop. There is no general support for systems with more than two agents or for other conversation patterns.

Note that moving from 2 to N (N > 2) agents in a way that effectively supports LLM applications is highly non-trivial and technically challenging: when N = 2, the communication between agents is straightforward. Supporting N > 2 in general requires careful abstraction and implementation, so that the framework can (1) possess the flexibility to meet various application needs (there is hardly a one-size-fits-all pattern), and (2) support conversation patterns that make meaningful progress toward task completion. AutoGen is so far the only framework that realizes both objectives reasonably well.

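The jump from two agents to N agents can be made concrete with a minimal, framework-free sketch (this is not AutoGen's actual implementation; the agent names and the round-robin policy are illustrative assumptions): with two agents the next speaker is always "the other one", while for N > 2 the orchestrator must implement an explicit speaker-selection policy.

```python
# Toy illustration of why N > 2 agents needs a speaker-selection abstraction.
# StubAgent stands in for an LLM-backed agent; it returns a canned reply.

class StubAgent:
    """A toy agent that replies with a role-tagged message."""
    def __init__(self, name):
        self.name = name

    def reply(self, history):
        return f"{self.name} responds to message #{len(history)}"

def round_robin_chat(agents, opening, max_rounds=3):
    """One simple N > 2 policy: each agent speaks in turn, round-robin."""
    history = [("user", opening)]
    for turn in range(max_rounds * len(agents)):
        speaker = agents[turn % len(agents)]  # the next-speaker decision
        history.append((speaker.name, speaker.reply(history)))
    return history

agents = [StubAgent("writer"), StubAgent("critic"), StubAgent("executor")]
log = round_robin_chat(agents, "Design a pygame game", max_rounds=1)
# log holds the opening message plus one reply per agent
```

Round-robin is only one policy; AutoGen's group-chat machinery generalizes this decision (e.g., letting an LLM pick the next speaker), which is exactly the "careful abstraction" the quoted passage refers to.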

3.2 Differences from ReAct

For example, on ALFWorld (Figure 4.c in the paper), AutoGen's two-agent setup is comparable to a single agent using ReAct. However, a three-agent setup (adding a grounding agent) outperforms both the ReAct agent and the two-agent setup. Note that the performance of a multi-agent setting depends on the agent design; it is therefore conceivable that a badly designed multi-agent solution could underperform a single agent.

