AutoGen Learning Notes Series (5): Tutorial - Human-in-the-Loop

This post covers the Human-in-the-Loop subsection of the Tutorial chapter in the official AutoGen documentation. It explains how to put a human "in the loop" when working with Agents, along with finer-grained Team flow control, including the following three main flow-control patterns:

  1. Human intervention via UserProxyAgent
  2. Keyword-triggered termination via TextMentionTermination
  3. Handoff-triggered termination via HandoffTermination

Official docs: https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/human-in-the-loop.html


Human-in-the-Loop

The first time I heard of "Human-in-the-Loop" was MOSS in the 2023 film The Wandering Earth 2; back then I was still working on traditional algorithmic robotics and had not yet touched large language models or agent frameworks. In short, human-in-the-loop means that while an Agent is running, a human makes decisions at key points in the process, which helps keep the system safe, reliable, and adaptable, especially in high-risk or complex environments.

If your goal is to minimize human intervention, HITL can gradually evolve into Human-on-the-Loop (HOTL, where a human monitors an otherwise adaptive system) or Human-out-of-the-Loop (HOOTL, fully autonomous) setups.

AutoGen provides the UserProxyAgent class for human-in-the-loop interaction while an Agent or Team is running via run(). If you want to try the Web/UI side of this directly, you can use the links provided in the official docs (they are all GitHub repos); I'll cover how to deploy and use them in a later bonus post. The overall process is fairly simple, so feel free to try it ahead of time.


Providing Feedback During a Run

AutoGen provides a special agent, UserProxyAgent, which implements the interaction between you and the Team. Note, however, that the Team decides when to call the UserProxyAgent to ask for user feedback. This is reasonable: human-in-the-loop means a human makes decisions at key points; if every single step of the Team required a human decision, it would degenerate into pure manual work.

The figure in the official docs (omitted here) shows at which stage AutoGen triggers the UserProxyAgent.

【Note】: When the UserProxyAgent is triggered during a Team run, it blocks and waits for input; during this period the Team is in an unstable state and cannot be paused or resumed.

The instability refers to the overall Team flow: the tool is designed for you to provide short decisions, ideally explicit and mutually exclusive actions such as "approve" / "reject". Avoid handing it a brand-new task at this point; if you really need to issue a new task, use the pause/resume approach described in the previous post instead.
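To make the "short decision" idea concrete, below is a minimal sketch of my own (not part of the official docs): a hypothetical approve_or_reject input function that keeps prompting until the user gives one of two mutually exclusive answers. It follows the same single-argument signature as the built-in input that UserProxyAgent accepts.

# Sketch only: restrict the human to short, mutually exclusive decisions
def approve_or_reject(prompt: str) -> str:
    while True:
        answer = input(f"{prompt} [APPROVE/REJECT]: ").strip().upper()
        if answer in ("APPROVE", "REJECT"):
            return answer
        print("Please answer APPROVE or REJECT.")

# Hypothetical usage: UserProxyAgent("user_proxy", input_func=approve_or_reject)

The official demo below simply passes Python's built-in input: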

from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
import os, asyncio

os.environ["OPENAI_API_KEY"] = "你的OpenAI API Key"
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
assistant = AssistantAgent("assistant", model_client=model_client)
# Define a UserProxyAgent
user_proxy = UserProxyAgent("user_proxy", input_func=input)

# Create a text termination condition: end the loop when the user types APPROVE
termination = TextMentionTermination("APPROVE")

# Bind both the UserProxyAgent and the task agent into the Team
team = RoundRobinGroupChat([assistant, user_proxy], termination_condition=termination)
stream = team.run_stream(task="Write a 4-line poem about the ocean.")
asyncio.run(
    Console(stream)
)

The output is shown below. I wasn't satisfied with the first result and typed not great, which triggered another LLM turn; I then typed APPROVE, the keyword defined in the text termination condition above, to end the generation:

$ python demo.py

---------- user ----------
Write a 4-line poem about the ocean.
---------- assistant ----------
Endless waves in hues of blue,  
Whispers of a world so true,  
Secrets held in depths unknown,  
Beneath the sun’s warm, golden throne.  
TERMINATE
Enter your response: not great
---------- user_proxy ----------
not great
---------- assistant ----------
The ocean's heart beats wild and free,  
A dance of tides, a symphony.  
Its vast embrace, both fierce and deep,  
Guarding dreams in its secrets keep.  
TERMINATE
Enter your response: APPROVE
---------- user_proxy ----------
APPROVE
(LLM) ~/Desktop


As this demo shows, the criterion for judging the poem's quality is now human input, whereas the previous post used another Agent for that role. That is the difference between human-in-the-loop and fully agent-controlled flows.

The official docs also give a FastAPI-based example here; I'll fill that part in after the bonus post is done.


Providing Feedback to the Next Run

The figure in the official docs (omitted here) shows how a Team feeds the result of the interaction back into the next run.

You can end this loop in either of two ways:

  1. Set the Team's maximum number of turns with max_turns;
  2. Set a termination condition such as TextMentionTermination or HandoffTermination.

The two approaches can also be combined when creating the Team; whichever condition is met first stops the run:

team = RoundRobinGroupChat(
    [assistant, user_proxy],
    termination_condition=termination, max_turns=2)

To be honest, I haven't fully worked out what the official docs intend with this short passage. If you've already run the demo above, the figure is easy to understand, and we've already practiced how to break out of the Team loop. This subsection may mainly be there to lead into the later Managing State chapter.


Using Max Turns

This subsection simply demonstrates how to set the maximum number of turns with max_turns.

【Note】: When the Team reaches the maximum number of turns it stops running but keeps its context and state. If you want to reset all state, use the reset() method mentioned in the previous post, AutoGen Learning Notes Series (4): Tutorial - Teams.

Reset the entire Team state:

team = RoundRobinGroupChat([assistant, user_proxy], termination_condition=termination, max_turns=2)
...
await team.reset()  # reset() is a coroutine, so it must be awaited inside async code

The official demo, slightly modified:

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
import asyncio, os

os.environ["OPENAI_API_KEY"] = "你的OpenAI API Key"

model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
assistant = AssistantAgent("assistant", model_client=model_client)
# Define a Team and specify the maximum number of turns
team = RoundRobinGroupChat([assistant], max_turns=1)

task = "Write a 4-line poem about the ocean."
async def main():
    global task
    while True:
        stream = team.run_stream(task=task)
        await Console(stream)
        task = input("Enter your feedback (type 'exit' to leave): ")
        if task.lower().strip() == "exit":
            break
        
asyncio.run(main())

The run looks like this. Because the context is kept when the Team stops at max_turns, you can keep giving feedback until you get an answer you're happy with:

$ python demo.py

---------- user ----------
Write a 4-line poem about the ocean.
---------- assistant ----------
Endless waves in a dance so divine,  
Whispers of secrets in waters so fine.  
Beneath the blue, where the mysteries lie,  
The ocean’s embrace, under vast, open sky.
Enter your feedback (type 'exit' to leave): exit

Using Termination Conditions

We've already introduced and used the text termination condition TextMentionTermination. Here the official docs introduce another flow-control object, HandoffTermination. Both are termination conditions; their main differences are:

|  | HandoffTermination | TextMentionTermination |
| --- | --- | --- |
| Trigger | Terminates when an agent hands control off to another agent | Terminates when specific text is detected in the conversation |
| Suited for | Multi-agent collaboration tasks | Keyword-triggered termination, e.g. APPROVE |
| Example | Code-generation agent → test agent → execution agent | The user types "STOP" and the agent stops |
  • If your multi-agent system needs to switch control between agents, use HandoffTermination;
  • If you need to terminate based on a keyword, use TextMentionTermination.

The official example uses both flow-control objects at the same time:

  1. Define a lazy agent lazy_agent and bind a Handoff to it: if it cannot complete the task, it hands control over to the user and outputs "Transfer to user.";
  2. Define the termination condition HandoffTermination: end the loop when control is handed off to user;
  3. Define the termination condition TextMentionTermination: end the loop when the output contains TERMINATE.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.base import Handoff
from autogen_agentchat.conditions import HandoffTermination, TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
import os, asyncio

os.environ["OPENAI_API_KEY"] = "你的OpenAI API Key"
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
)

# Define the lazy agent
lazy_agent = AssistantAgent(
    "lazy_assistant",
    model_client=model_client,
    handoffs=[Handoff(target="user", message="Transfer to user.")],
    system_message="If you cannot complete the task, transfer to user. Otherwise, when finished, respond with 'TERMINATE'.",
)

# Termination condition 1: HandoffTermination
handoff_termination = HandoffTermination(target="user")
# Termination condition 2: TextMentionTermination
text_termination = TextMentionTermination("TERMINATE")

lazy_agent_team = RoundRobinGroupChat([lazy_agent], termination_condition=handoff_termination | text_termination)

task = "What is the weather in New York?"
asyncio.run(Console(lazy_agent_team.run_stream(task=task), output_stats=True))

The output is as follows:

$ python demo.py

---------- user ----------
What is the weather in New York?
---------- lazy_assistant ----------
[FunctionCall(id='call_p0qhTnM9vKVkYI8z0XM8mrZs', arguments='{}', name='transfer_to_user')]
[Prompt tokens: 69, Completion tokens: 12]
---------- lazy_assistant ----------
[FunctionExecutionResult(content='Transfer to user.', call_id='call_p0qhTnM9vKVkYI8z0XM8mrZs', is_error=False)]
---------- lazy_assistant ----------
Transfer to user.
---------- Summary ----------
Number of messages: 4
Finish reason: Handoff to user from lazy_assistant detected.
Total prompt tokens: 69
Total completion tokens: 12
Duration: 2.39 seconds
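
To continue after the handoff, you can run the same team again with the user's answer as the new task; since the team keeps its context between runs (as in the max_turns demo above), the lazy agent can then respond and finish with TERMINATE. A minimal sketch, using a made-up answer about the weather:

# Sketch: feed the user's answer back in and let the lazy agent finish
asyncio.run(
    Console(
        lazy_agent_team.run_stream(task="The weather in New York is sunny."),
        output_stats=True,
    )
)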