Hands-On AI Agent Development with AutoGen

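There has been a lot of talk about AI agents lately. Caught up in the hype, I did some reading and stumbled on AutoGen, a great library for building AI agents. But what exactly is an AI agent, and why do agents matter? Let's dig in.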

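LLMs have been around for a few years, and they are quickly evolving toward AI agents and agentic workflows. Don't get me wrong, LLMs are great, but on their own they are still not very effective at automating tasks. Having absorbed huge amounts of language data, LLMs have a kind of general intelligence, and combining them with other tools is a very effective way to put that intelligence to work. AI agents are software entities that carry out tasks autonomously and interact with their environment to make decisions. They are changing how we solve problems, from customer-service chatbots to complex data analysis. Today I'm happy to show you how to build a simple AI agent with AutoGen.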

Ready? Let's get started!


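1. Why AI agents matter

AI agents offer several benefits:

  • Automation: they handle repetitive tasks without human intervention.
  • Efficiency: they process data and carry out tasks faster than humans.
  • Scalability: they can run around the clock and provide service without downtime.
  • Intelligence: they can learn from interactions and improve over time.

Given these advantages, AI agents are being adopted across industries, from healthcare to finance.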

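2. Building a simple AI agent with AutoGen

Let's get hands-on and build a basic AI agent with AutoGen. As a demonstration, we will solve LeetCode's "Distribute Coins in Binary Tree" problem. In this problem every node of a binary tree holds some number of coins, and you need to find the minimum number of moves between adjacent nodes that leaves each node with exactly one coin. A Colab notebook accompanies the original article.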

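2.1 Installing AutoGen

First, we need to install the AutoGen library. Run the following command in your environment (the leading `!` is notebook syntax; drop it in a regular shell):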

!pip install pyautogen -q --progress-bar off

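2.2 Setting up your environment

I am using Google Colab for this demo. Here is how to set up the environment and load your OpenAI API key securely from Colab's secrets: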

import os
import autogen
from google.colab import userdata

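# Load your OpenAI API key from Colab's secret storage
api_key = userdata.get('OPENAI_API_KEY')

llm_config = {
    "config_list": [{"model": "gpt-3.5-turbo", "api_key": api_key}],
    "cache_seed": 0,   # seed for AutoGen's response cache
    "temperature": 0,  # keep the model's output as deterministic as possible
}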
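Outside Colab, `google.colab.userdata` is not available. The following is a minimal sketch of one alternative, assuming the key has been exported in an environment variable named `OPENAI_API_KEY`. The helper name `build_llm_config` is hypothetical and not part of AutoGen; it only builds a dict with the same shape as the one above.

```python
import os


def build_llm_config(model: str = "gpt-3.5-turbo", env_var: str = "OPENAI_API_KEY") -> dict:
    """Build an llm_config dict shaped like the one above (hypothetical helper)."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of a later authentication error.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "config_list": [{"model": model, "api_key": api_key}],
        "cache_seed": 0,
        "temperature": 0,
    }
```

Calling `build_llm_config()` then gives a value that could be passed as `llm_config` in the rest of this walkthrough.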

2.3 Defining the problem

We will use the medium-difficulty LeetCode problem "Distribute Coins in Binary Tree":

LEETCODE_QUESTION = """
Title : Distribute Coins in a binary tree

You are given the root of a binary tree with n nodes where each node in the tree has node.val coins. There are n coins in total throughout the whole tree.

In one move, we may choose two adjacent nodes and move one coin from one node to another. A move may be from parent to child, or from child to parent.

Return the minimum number of moves required to make every node have exactly one coin.

 

Example 1:


Input: root = [3,0,0]
Output: 2
Explanation: From the root of the tree, we move one coin to its left child, and one coin to its right child.
Example 2:


Input: root = [0,3,0]
Output: 3
Explanation: From the left child of the root, we move two coins to the root [taking two moves]. Then, we move one coin from the root of the tree to the right child.

Follow-up: Can you come up with an algorithm that has a better time complexity?
"""

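2.4 Setting up the agents

This example uses two AutoGen agent types: AssistantAgent and UserProxyAgent.

The AssistantAgent proposes code solutions, while the UserProxyAgent executes that code and reports the result back: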

SYSTEM_MESSAGE = """You are a helpful AI assistant. Solve tasks using your coding and language skills..."""

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message=SYSTEM_MESSAGE
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=4,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "coding",
        "use_docker": False,
    },
)

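human_input_mode is set to "NEVER" because we do not plan to provide any input ourselves, and max_consecutive_auto_reply caps the number of automatic replies the UserProxyAgent sends, which bounds the length of the conversation. is_termination_msg ends the chat early when a message ends with the word TERMINATE.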
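The termination check can be tried on its own, without any agents. Below is a sketch of a slightly more defensive variant: the lambda above would raise an `AttributeError` if a message carried `"content": None`, so this version falls back to an empty string. Whether AutoGen ever passes such a message depends on the version and message type, so treat that case as an assumption.

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when the message text ends with the keyword TERMINATE."""
    # "content" may be missing or None, so fall back to an empty string
    # before calling string methods on it.
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")


print(is_termination_msg({"content": "All done. TERMINATE\n"}))
print(is_termination_msg({"content": "Here is the code."}))
print(is_termination_msg({"content": None}))
```

A named function like this can be passed as `is_termination_msg=is_termination_msg` in place of the lambda. In the transcript shown later in this post, none of the assistant's messages ends with TERMINATE, so that run stops at the reply limit rather than through this check.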

2.5 Starting the conversation

We start the conversation by sending a message from the UserProxyAgent to the assistant.

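# Reuse cached LLM responses across runs; the cache must be passed to initiate_chat to take effect
with autogen.Cache.disk(cache_seed=7) as cache:
    chat_res = user_proxy.initiate_chat(
        assistant,
        message="""Solve the following LeetCode problem and also comment on its time and space complexity:\n\n""" + LEETCODE_QUESTION,
        cache=cache,
    )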

Here is an excerpt of the resulting conversation:

user_proxy (to assistant):

Solve the following LeetCode problem and also comment on its time and space complexity:


Title : Distribute Coins in a binary tree

You are given the root of a binary tree with n nodes where each node in the tree has node.val coins. There are n coins in total throughout the whole tree.

In one move, we may choose two adjacent nodes and move one coin from one node to another. A move may be from parent to child, or from child to parent.

Return the minimum number of moves required to make every node have exactly one coin.

 

Example 1:


Input: root = [3,0,0]
Output: 2
Explanation: From the root of the tree, we move one coin to its left child, and one coin to its right child.
Example 2:


Input: root = [0,3,0]
Output: 3
Explanation: From the left child of the root, we move two coins to the root [taking two moves]. Then, we move one coin from the root of the tree to the right child.

Follow-up: Can you come up with an algorithm that has a better time complexity?


--------------------------------------------------------------------------------
assistant (to user_proxy):

To solve this problem, we can perform a depth-first search (DFS) traversal of the binary tree and calculate the excess coins at each node. The excess coins at a node can be calculated as the sum of coins at the node and the excess coins from its left and right children, minus 1 (since the node itself should have 1 coin).

Here is the Python code to solve the problem:

```python
class Solution:
    def distributeCoins(self, root: TreeNode) -> int:
        self.moves = 0
        
        def dfs(node):
            if not node:
                return 0
            
            left_excess = dfs(node.left)
            right_excess = dfs(node.right)
            
            self.moves += abs(left_excess) + abs(right_excess)
            
            return node.val + left_excess + right_excess - 1
        
        dfs(root)
        
        return self.moves
```

Time complexity analysis:
- The DFS traversal visits each node once, so the time complexity is O(n), where n is the number of nodes in the binary tree.

Space complexity analysis:
- The space complexity is O(h), where h is the height of the binary tree. This is due to the recursive calls in the DFS traversal. In the worst case, the space complexity can be O(n) for a skewed tree.

To improve the time complexity, we can modify the algorithm to avoid redundant calculations and achieve a linear time complexity. This can be done by returning both the excess coins and the number of moves from the DFS function. This way, we can eliminate the need to traverse the tree twice.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 1, in <module>
    class Solution:
  File "", line 2, in Solution
    def distributeCoins(self, root: TreeNode) -> int:
NameError: name 'TreeNode' is not defined

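Notice how the UserProxyAgent hits an error when it runs the code. It sends the error output back to the assistant, which then proposes a fix: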

--------------------------------------------------------------------------------
assistant (to user_proxy):

It seems like the code snippet I provided is missing the definition of the `TreeNode` class. Here is an updated version of the code that includes the definition of the `TreeNode` class:

```python
# Definition for a binary tree node.
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def distributeCoins(self, root: TreeNode) -> int:
        self.moves = 0
        
        def dfs(node):
            if not node:
                return 0
            
            left_excess = dfs(node.left)
            right_excess = dfs(node.right)
            
            self.moves += abs(left_excess) + abs(right_excess)
            
            return node.val + left_excess + right_excess - 1
        
        dfs(root)
        
        return self.moves
```

You can now use this updated code to solve the LeetCode problem "Distribute Coins in a binary tree".

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):

exitcode: 0 (execution succeeded)
Code output: 


--------------------------------------------------------------------------------
assistant (to user_proxy):

Great! I'm glad to hear that the code executed successfully. If you have any more questions or need further assistance, feel free to ask!

--------------------------------------------------------------------------------
user_proxy (to assistant):



--------------------------------------------------------------------------------
assistant (to user_proxy):

If you have any more questions or need help with anything else, feel free to ask. I'm here to help!

--------------------------------------------------------------------------------
user_proxy (to assistant):



--------------------------------------------------------------------------------
assistant (to user_proxy):

If you have any questions in the future or need assistance, don't hesitate to ask. Have a great day!

--------------------------------------------------------------------------------
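In the transcript, the assistant wraps its code in fenced blocks and the UserProxyAgent picks those blocks out before executing them. The sketch below illustrates that idea with a regular expression. It is not AutoGen's actual extraction logic, and the function name `extract_code_blocks` is hypothetical.

```python
import re

# Three backticks, built programmatically so this example is easy to embed.
FENCE = "`" * 3

# Optional language tag after the opening fence, then the body captured lazily.
CODE_BLOCK_RE = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs for each fenced block in a message."""
    return [(lang or "unknown", code) for lang, code in CODE_BLOCK_RE.findall(text)]


message = "Here is the code:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nDone."
print(extract_code_blocks(message))
```

A message without any fenced block yields an empty list, which would correspond to the turns above where the UserProxyAgent had nothing to run and sent an empty reply.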

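3. Conclusion

Using AutoGen's conversable agents, we automatically checked that the Python code suggested by the LLM runs, and we set up a framework in which the LLM can respond to syntax or logic errors by reading the console output.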
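Note that the successful run above printed nothing: exit code 0 shows the class definitions execute, not that the answers are right. A minimal sketch of a stronger check is to run the same DFS idea against the two examples from the problem statement. The helper `build_tree` and the function name `distribute_coins` are hypothetical names introduced for this illustration, and the level-order list format is assumed to follow LeetCode's convention.

```python
from collections import deque
from typing import List, Optional


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def build_tree(values: List[Optional[int]]) -> Optional[TreeNode]:
    """Build a tree from a level-order list (None marks a missing child)."""
    if not values or values[0] is None:
        return None
    root = TreeNode(values[0])
    queue = deque([root])
    i = 1
    while queue and i < len(values):
        node = queue.popleft()
        if values[i] is not None:
            node.left = TreeNode(values[i])
            queue.append(node.left)
        i += 1
        if i < len(values) and values[i] is not None:
            node.right = TreeNode(values[i])
            queue.append(node.right)
        i += 1
    return root


def distribute_coins(root: Optional[TreeNode]) -> int:
    """Same DFS idea as the assistant's solution, written as a plain function."""
    moves = 0

    def dfs(node):
        nonlocal moves
        if node is None:
            return 0
        left = dfs(node.left)
        right = dfs(node.right)
        # Every surplus or missing coin in a subtree crosses the edge to its parent once.
        moves += abs(left) + abs(right)
        return node.val + left + right - 1

    dfs(root)
    return moves


# The two examples from the problem statement.
assert distribute_coins(build_tree([3, 0, 0])) == 2
assert distribute_coins(build_tree([0, 3, 0])) == 3
print("both examples pass")
```

Asking the assistant to include assertions like these in its answer would let the UserProxyAgent's exit code reflect correctness on the examples as well as the absence of runtime errors.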


Original article: AutoGen实战AI智能体 - BimAnt
