Using AI Agents and Claude 3.7 to Automatically Generate and Run Code for Complex Tasks

Article link: https://blog.csdn.net/m0_66628975/article/details/146303047

Introducing Code Interpreter, the Amazon Bedrock Agents feature on AWS that generates and executes code for complex tasks (source code included)

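A long-standing limitation of today's AI and large language models (LLMs) is their weakness at complex mathematical operations and data analysis. LLMs excel at understanding and generating natural language, but they often fall short when asked to perform precise calculations or process large datasets. This article shows how Code Interpreter, the code generation and execution feature of Amazon Bedrock Agents, can help work around this limitation.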

Code execution capabilities of AI agents

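LLMs are general-purpose by nature and are not designed specifically for complex mathematical computation or data analysis. They are, however, very good at writing code. Combining an LLM's code generation with a secure code execution environment yields an AI agent with advanced analytical capabilities. The rest of this article looks at how to set this up with Amazon Bedrock Agents.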
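To make the "generate code, then run it" idea concrete, here is a minimal local sketch. It only illustrates the pattern: the function name `run_generated_code` is hypothetical, the code string stands in for what a model might write, and a plain subprocess is not a security sandbox like the one Bedrock provides.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 10.0) -> str:
    """Run a snippet of (model-generated) Python in a separate interpreter process.

    Note: a subprocess gives process isolation and a timeout only.
    It is NOT a real sandbox and should not be used for untrusted code.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        timeout=timeout_s,    # stop runaway code
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Stand-in for code an LLM might generate for "multiply these two numbers".
generated = "print(123456789 * 987654321)"
print(run_generated_code(generated))
```

The point of the pattern is that the arithmetic is done by the interpreter, which is exact, rather than by the model predicting digits.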

Setting up an agent with Code Interpreter

First we create an agent with Code Interpreter enabled. The steps below configure the agent in the AWS console; the example code is analyzed in detail afterwards:

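Configure the agent: in the Amazon Bedrock console, create a new agent and select Claude 3.7 Sonnet as the model. This is the model used throughout this walkthrough.
Set the agent instructions: give the agent detailed instructions (the prompt) telling it to use the Code Interpreter tool for complex tasks such as solving math problems and analyzing data. (In the code below, this text is passed as the instruction parameter.)
Enable Code Interpreter: turn on Code Interpreter in the advanced settings of the agent configuration. Once enabled, a sandboxed Python execution environment is attached to the agent.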

Source code for using Code Interpreter programmatically

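The agent can be set up through the console. To integrate this capability into an AI application, the same steps can be done programmatically with Python and the AWS SDK (boto3), which makes it easier to embed in your own applications.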

Step 1: Import the required libraries

import boto3 # pip install boto3==1.34.144

# The following imports are used only for this demo.
import json
import time
import random 
import uuid
import string
import matplotlib.pyplot as plt
import io

Step 2: Create and configure the Bedrock agent

region_name = 'us-east-1'
bedrock_agent = boto3.client(service_name='bedrock-agent', region_name=region_name)
iam = boto3.client('iam')

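# Create IAM role and policy (code omitted for brevity).
# agentName, randomSuffix, foundationModel, instruction, and roleArn
# are assumed to be defined by that omitted setup.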

response = bedrock_agent.create_agent(
    agentName=f"{agentName}-{randomSuffix}",
    foundationModel=foundationModel,
    instruction=instruction,
    agentResourceRoleArn=roleArn,
)

agentId = response['agent']['agentId']
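The create_agent call above relies on several names whose definitions were omitted. The following is a hedged sketch of what that setup might look like, not the author's actual code. The agent name, instruction text, role name, and policy name are hypothetical. The model ID is believed to be the Claude 3.7 Sonnet identifier, but confirm it in the Bedrock console, since some Regions may require an inference profile ID and extra permissions instead.

```python
import json
import random
import string


def random_suffix(length: int = 5) -> str:
    """Return a short random suffix so repeated runs do not collide on names."""
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


def build_trust_policy() -> dict:
    """Trust policy that lets the Bedrock service assume the agent role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def build_model_policy(region: str, model_id: str) -> dict:
    """Permission policy that lets the role invoke one foundation model."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": f"arn:aws:bedrock:{region}::foundation-model/{model_id}",
        }],
    }


def create_agent_role(iam_client, role_name: str, region: str, model_id: str) -> str:
    """Create the role with an inline policy and return its ARN.

    iam_client is expected to be boto3.client('iam').
    """
    role = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
    )
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=f"{role_name}-invoke-model",  # hypothetical policy name
        PolicyDocument=json.dumps(build_model_policy(region, model_id)),
    )
    return role["Role"]["Arn"]


# Hypothetical values for the names used in create_agent.
agentName = "code-interpreter-agent"
randomSuffix = random_suffix()
foundationModel = "anthropic.claude-3-7-sonnet-20250219-v1:0"  # assumed ID, verify before use
instruction = (
    "You are a data analysis assistant. Use the code interpreter tool to "
    "write and run Python code for calculations, data analysis, and charts."
)
# With real credentials, the role ARN would then be obtained like this:
# roleArn = create_agent_role(iam, f"bedrock-agent-role-{randomSuffix}", region_name, foundationModel)
```

A newly created IAM role can take a few seconds to propagate, so a short wait between creating the role and calling create_agent may be needed.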

Step 3: Add Code Interpreter to the agent as an action group

response = bedrock_agent.create_agent_action_group(
    actionGroupName='CodeInterpreterAction',
    actionGroupState='ENABLED',
    agentId=agentId,
    agentVersion='DRAFT',
    parentActionGroupSignature='AMAZON.CodeInterpreter'
)

actionGroupId = response['agentActionGroup']['actionGroupId']

Step 4: Prepare the agent and create an alias for application integration

bedrock_agent.prepare_agent(agentId=agentId)

response = bedrock_agent.create_agent_alias(
    agentAliasName='test',
    agentId=agentId
)

agentAliasId = response['agentAlias']['agentAliasId']
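Agent creation and preparation complete asynchronously, so calling the next API immediately can fail while the agent is still in a transitional state. One way to handle this is to poll get_agent until the expected status appears. This is a minimal sketch: the helper name, polling interval, and timeout are assumptions, and the client is passed in so the helper can be tested without AWS access.

```python
import time


def wait_for_agent_status(client, agent_id, target="PREPARED",
                          poll_s=2.0, timeout_s=120.0, sleep=time.sleep):
    """Poll get_agent until the agent reaches `target`.

    Raises RuntimeError if the agent enters FAILED, and TimeoutError if the
    target status is not reached within timeout_s seconds.
    `client` is expected to be boto3.client('bedrock-agent').
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_agent(agentId=agent_id)["agent"]["agentStatus"]
        if status == target:
            return status
        if status == "FAILED":
            raise RuntimeError(f"Agent {agent_id} entered FAILED state")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Agent {agent_id} still in state {status} after {timeout_s}s"
            )
        sleep(poll_s)
```

In the flow above, this could be called as wait_for_agent_status(bedrock_agent, agentId) between prepare_agent and create_agent_alias, and with target="NOT_PREPARED" after create_agent, before adding the action group.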

Interacting with the Code Interpreter agent

Now that the agent has been created and configured, let's write some code to interact with it:

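# Assumed setup (not shown in the original): a runtime client and a session ID.
bedrock_agent_runtime = boto3.client(service_name='bedrock-agent-runtime', region_name=region_name)
sessionId = str(uuid.uuid4())  # reuse the same ID to keep conversation context

def invoke(inputText, showTrace=False, endSession=False):
    # Invoke the Bedrock agent with the given input text
    response = bedrock_agent_runtime.invoke_agent(
        agentAliasId=agentAliasId,
        agentId=agentId,
        sessionId=sessionId,
        inputText=inputText,
        endSession=endSession,
        enableTrace=showTrace,  # request trace events only when asked for
    )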
    
    # Get the event stream from the response
    event_stream = response["completion"]
    
    # Process each event in the stream
    for event in event_stream:
        # Handle text chunks
        if 'chunk' in event:
            chunk = event['chunk']
            if 'bytes' in chunk:
                text = chunk['bytes'].decode('utf-8')
                print(f"Chunk: {text}")
        
        # Handle file outputs
        if 'files' in event:
            files = event['files']['files']
            for file in files:
                name = file['name']
                file_type = file['type']  # avoid shadowing the built-in type()
                bytes_data = file['bytes']

                # Display PNG images using matplotlib
                if file_type == 'image/png':
                    img = plt.imread(io.BytesIO(bytes_data))
                    plt.figure(figsize=(10, 10))
                    plt.imshow(img)
                    plt.axis('off')
                    plt.title(name)
                    plt.show()
                    plt.close()
                # Save other file types to disk
                else:
                    with open(name, 'wb') as f:
                        f.write(bytes_data)
                    print(f"File '{name}' saved to disk.")

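This code sends a request to the agent, processes the response, and handles any files the agent generates while working on a complex task (such as charts).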
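The stream-handling logic can be exercised without calling AWS by separating collection from display. The sketch below is an illustration under assumptions: `collect_response` is a hypothetical helper, and the fake events only mimic the `chunk` and `files` keys that the invoke function above reads.

```python
def collect_response(event_stream):
    """Split an agent event stream into the full text and a list of files.

    Returns (text, files), where files is a list of (name, type, bytes) tuples.
    """
    text_parts = []
    files = []
    for event in event_stream:
        # Text arrives as UTF-8 encoded chunks.
        chunk = event.get("chunk", {})
        if "bytes" in chunk:
            text_parts.append(chunk["bytes"].decode("utf-8"))
        # Generated files (charts, data files) arrive in a nested "files" list.
        for f in event.get("files", {}).get("files", []):
            files.append((f["name"], f["type"], f["bytes"]))
    return "".join(text_parts), files


# Hand-written events standing in for a real response["completion"] stream.
fake_stream = [
    {"chunk": {"bytes": b"Total spend "}},
    {"files": {"files": [
        {"name": "chart.png", "type": "image/png", "bytes": b"\x89PNG"},
    ]}},
    {"chunk": {"bytes": b"by category."}},
]

text, files = collect_response(fake_stream)
print(text)
print([name for name, _, _ in files])
```

Returning the data instead of printing or plotting inside the loop makes it easier to plug the agent into a web application or a test suite.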

Testing the agent on a complex data analysis task

Next we test the Code Interpreter agent on a complex data analysis task. For example, we can ask it to analyze some billing data and generate a chart:

invoke("""Using the billing data provided below, create a bar 
graph that shows the total spend by product category (cat). 
The graph should have the category names on the x-axis and 
the total spend amount on the y-axis.

Billing Data: 
date,cat,id,product_name,cost 
2023-01-15,Electronics,E001,Smartphone,799.99 
2023-01-20,Home,H001,Vacuum Cleaner,199.99 
2023-02-03,Electronics,E002,Laptop,1299.99 
2023-02-10,Clothing,C001,Winter Jacket,129.99 
2023-02-25,Home,H002,Coffee Maker,89.99 
2023-03-05,Electronics,E003,Wireless Earbuds,159.99 
2023-03-12,Clothing,C002,Running Shoes,99.99 
2023-03-30,Home,H003,Blender,79.99 
2023-04-08,Electronics,E004,Smart Watch,299.99 
2023-04-15,Clothing,C003,Jeans,59.99 
2023-04-28,Home,H004,Toaster Oven,129.99 
2023-05-10,Electronics,E005,Tablet,499.99 
2023-05-18,Clothing,C004,Dress Shirt,49.99 
2023-05-25,Home,H005,Air Purifier,199.99 
2023-06-02,Electronics,E006,Gaming Console,399.99

Ensure that the graph is clearly labeled and easy to read. 
After generating the graph, provide a brief interpretation 
of the results, highlighting which category has the highest 
total spend and any other notable observations.""")

The agent carries out the task as follows:

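Parse the data provided by the user
Write Python code to analyze the data and create a bar chart
Execute the code in the secure sandbox environment
Generate an image file as output
Return the data analysis results together with the image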

The output contains the generated chart together with a written analysis of spend by product category, as requested in the task.

The agent's text output for the analysis is as follows:

The bar chart shows the total spend amount for each product category based on the provided billing data. The x-axis displays the category names (Electronics, Home, and Clothing), and the y-axis represents the total spend amount in dollars.

From the chart, we can observe that the Electronics category has the highest total spend at around $4,360. This is significantly higher than the other two categories, Home and Clothing, which have total spends of around $700 and $340, respectively.

One notable observation is the large difference in spend between the Electronics category and the other two categories. This could indicate that the customer or business has a higher demand or preference for electronic products compared to home and clothing items during the given time period.

Overall, the chart provides a clear visual representation of the spending patterns across different product categories, highlighting the dominance of the Electronics category in terms of total spend.
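The figures in the agent's output can be cross-checked by reproducing the aggregation locally. The following is a sketch of the kind of code the agent might generate for the parsing and aggregation steps, not the code it actually ran. Plotting is left out, and `total_spend_by_category` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# The billing data from the prompt above.
BILLING_CSV = """date,cat,id,product_name,cost
2023-01-15,Electronics,E001,Smartphone,799.99
2023-01-20,Home,H001,Vacuum Cleaner,199.99
2023-02-03,Electronics,E002,Laptop,1299.99
2023-02-10,Clothing,C001,Winter Jacket,129.99
2023-02-25,Home,H002,Coffee Maker,89.99
2023-03-05,Electronics,E003,Wireless Earbuds,159.99
2023-03-12,Clothing,C002,Running Shoes,99.99
2023-03-30,Home,H003,Blender,79.99
2023-04-08,Electronics,E004,Smart Watch,299.99
2023-04-15,Clothing,C003,Jeans,59.99
2023-04-28,Home,H004,Toaster Oven,129.99
2023-05-10,Electronics,E005,Tablet,499.99
2023-05-18,Clothing,C004,Dress Shirt,49.99
2023-05-25,Home,H005,Air Purifier,199.99
2023-06-02,Electronics,E006,Gaming Console,399.99
"""


def total_spend_by_category(csv_text: str) -> dict:
    """Sum the cost column per category, using Decimal for exact cents."""
    totals = defaultdict(Decimal)  # Decimal() defaults to 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["cat"].strip()] += Decimal(row["cost"].strip())
    return dict(totals)


totals = total_spend_by_category(BILLING_CSV)
# Print categories from highest to lowest total spend.
for cat, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat}: {total}")
```

Summing the rows directly gives 3459.94 for Electronics, 699.95 for Home, and 339.96 for Clothing. The Home and Clothing totals agree with the agent's rounded figures, while the Electronics total does not match the roughly $4,360 in the quoted text, so it is worth checking model-written summaries against the underlying data.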

Other use cases

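Code Interpreter is not limited to creating charts. In our other experiments we used it for a variety of data processing tasks, including training machine learning (ML) models and running ML inference. For example, we can ask it to format data as JSON: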

invoke("Format the sales data into a JSON format. And save to a file.")

This command creates a JSON file containing the structured data, which can then be used in later processing steps or integrated into other applications.

Conclusion

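The addition of Code Interpreter to Amazon Bedrock Agents is a significant step forward in the basic capabilities of AI agents. By combining an LLM's natural language understanding with the ability to generate and execute code, we can build powerful AI applications for data analysis, visualization, and complex problem solving.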

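This feature is a substantial convenience for businesses and developers. From financial analysis to scientific computing, integrating code execution with natural language interaction can streamline workflows and surface new insights from data.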

When implementing Code Interpreter in a project, keep the following points in mind:

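Security: although the execution environment is sandboxed, sensitive data still needs to be handled carefully to keep it safe.
Performance: complex computations can take a long time, so plan the application architecture accordingly to keep response times acceptable.
User experience: consider how to present the results of code execution to end users in a way that is easy to understand and intuitive.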

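Code Interpreter in Amazon Bedrock Agents bridges the gap between natural-language AI models and the complex computation they lack. As AI technology continues to be explored and developed, we are moving toward truly intelligent and versatile AI models.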