AI Agents and AI Pipelines: A Practical Guide to Building LLM Applications

果冻人工智能

Published 2024-10-10 10:04:41

Article link: https://blog.csdn.net/JellyAI/article/details/142813367

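Here we use CrewAI to build applications and show how to choose the right architecture for your LLM (large language model) application.
You can think of an AI agent as an LLM that can use external tools. It runs in a loop and, on each iteration, decides what to do next and which tool to use to solve the problem. This lets an agent handle more complex problems than a traditional LLM application. (I have also explored how to build such an agent from scratch in a separate article.)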

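AI agents are powerful and go well beyond traditional chat applications, but they are not always the best solution.
Sometimes a more conventional approach, a series of functions executed in order, is a better fit. Think of this kind of application as a pipeline: the problem is solved step by step by passing the output of one function as the input to the next.
In this article we look at how to use AI agents and pipelines and which kinds of application each one suits. We'll use CrewAI, an open-source framework, to build the LLM logic, and put together a simple front end in Streamlit for an online app.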

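Applications suited to agents vs. applications suited to pipelines
Let's start by thinking through a few different use cases.
Suppose a customer logs on to a manufacturer's website to sort out a problem with their washing machine. An AI agent greets them and asks some questions to help: What make and model is the machine? What is the problem? Are the clothes still dirty? Is there water on the floor? Has the machine stopped turning?
To pin down the problem, the agent may ask a series of questions, with each later question depending on the answers to the earlier ones.
In this situation an AI agent is a good solution. It runs in a loop, gathering information until it can offer a fix or, if it really cannot solve the problem, hands it over to a human operator.
The figure below shows the flow of operations for this kind of agent.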

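Here is another scenario.
The sales director of a small company that runs coffee shops in London wants to produce a report on the performance of several branches. She often needs to compare the shops in the City, the financial district, with those in the West End, which is dominated by tourism and retail. She has spreadsheets of sales data for each branch and wants to merge them and write an easy-to-read report for the board, with charts showing the relative performance of each shop.
Here the director can use an LLM application to merge each month's sales data into a single dataset, write a commentary on how the different shops performed, produce the corresponding charts, and then assemble these elements into a complete report. The process is the same every month; the only variable is the data in the spreadsheets. So although the content of each report differs, the overall process is fixed.
Unlike the agent example above, this is a simple sequential flow.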

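An AI agent is well suited to the first problem.
Given information about how the washing machine works, its possible failure modes, and the symptoms of those failures, the agent can search intelligently for relevant information and ask new questions based on the customer's answers until it finds a solution. For this case an AI agent offers a flexible, intelligent approach.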
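
To make the loop concrete, here is a minimal sketch of the troubleshooting flow described above. It is only an illustration: the fault table, the questions, and the `troubleshoot` function are all hypothetical, and a simple rule-based filter stands in for the LLM's reasoning.

```python
from typing import Callable, Dict, List

# Hypothetical knowledge base: each fault and the symptoms that identify it.
FAULTS: Dict[str, Dict[str, bool]] = {
    "blocked drain pump": {"water_on_floor": True, "drum_turns": True},
    "worn drive belt": {"water_on_floor": False, "drum_turns": False},
    "faulty motor": {"water_on_floor": False, "drum_turns": False},
}

# Hypothetical questions, keyed by the symptom they establish.
QUESTIONS: Dict[str, str] = {
    "water_on_floor": "Is there water on the floor?",
    "drum_turns": "Does the drum still turn?",
}


def troubleshoot(ask: Callable[[str], bool], max_steps: int = 5) -> str:
    """Loop: narrow down the faults, ask another question, or escalate."""
    answers: Dict[str, bool] = {}
    for _ in range(max_steps):
        # Keep only the faults consistent with every answer so far.
        candidates: List[str] = [
            fault for fault, symptoms in FAULTS.items()
            if all(symptoms[key] == value for key, value in answers.items())
        ]
        if len(candidates) == 1:
            return f"Likely fault: {candidates[0]}"
        unasked = [key for key in QUESTIONS if key not in answers]
        if not candidates or not unasked:
            break  # nothing fits, or nothing left to ask
        key = unasked[0]
        answers[key] = ask(QUESTIONS[key])
    return "Escalating to a human operator"


if __name__ == "__main__":
    # Scripted customer replies instead of real user input.
    replies = {"Is there water on the floor?": True, "Does the drum still turn?": True}
    print(troubleshoot(lambda question: replies[question]))
```

The point of the sketch is the shape of the control flow: the next step depends on what has been learned so far, which is what distinguishes an agent from a fixed sequence of steps.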

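The sales report is a different matter.
The sales data changes, and the report may uncover performance differences caused by the season or by the number of tourists or office workers. So although intelligent analysis of the changing data matters, the process for producing the report is fixed: merge the data, analyze the sales results, create the charts, and finally write the full report.
You could, of course, write a prompt that tells an agent to follow these steps, but designing a series of tasks in which each task's output is the next task's input is likely to give more consistent results.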
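
The fixed flow above can be sketched as a chain of plain functions, each consuming the previous one's output. This is an illustrative sketch only: the function names, the text "chart", and the sales figures are made up, and in a real application some steps (the commentary, for instance) would call an LLM.

```python
from typing import Callable, Dict, List


def merge_sales(branches: Dict[str, List[float]]) -> Dict[str, float]:
    """Step 1: merge each branch's monthly figures into one total."""
    return {name: sum(values) for name, values in branches.items()}


def analyse(totals: Dict[str, float]) -> dict:
    """Step 2: a trivial analysis that picks the best-performing branch."""
    best = max(totals, key=totals.get)
    return {"totals": totals, "best": best}


def make_chart(analysis: dict) -> dict:
    """Step 3: add a text bar chart, one '#' per 1,000 in sales."""
    bars = [
        f"{name:<10}{'#' * int(total // 1000)}"
        for name, total in analysis["totals"].items()
    ]
    return {**analysis, "chart": "\n".join(bars)}


def write_report(data: dict) -> str:
    """Step 4: assemble the final report text."""
    return f"Best branch: {data['best']}\n{data['chart']}"


PIPELINE: List[Callable] = [merge_sales, analyse, make_chart, write_report]


def run_pipeline(data, steps: List[Callable] = PIPELINE):
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        data = step(data)
    return data


if __name__ == "__main__":
    # Invented example figures, two months per branch.
    sales = {"City": [4000.0, 5000.0], "West End": [3000.0, 3500.0]}
    print(run_pipeline(sales))
```

Because the order of the steps lives in code rather than in a prompt, it is the same on every run; only the data changes.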

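CrewAI
Next we'll use CrewAI to build two example applications. CrewAI gives us a simple way to create agents, define tasks, and connect them. It is not the only open-source framework of its kind, but it suits the purpose of this article well.
We'll start in a Jupyter Notebook and later convert the code into standalone applications.
Any AI chat API would do. In my earlier article I used Claude 3.5 to build a ReAct agent, but CrewAI uses OpenAI by default, so we'll stick with OpenAI here.
Note that if you want to code along with this article you need an OpenAI API key, and you may be charged for usage. Simple API calls like the ones here cost only a few cents, but costs rise with the number of calls, so be sure to monitor your usage on the OpenAI dashboard.
CrewAI comes in an enterprise edition and an open-source edition. We use the open-source version here, so CrewAI itself requires no API key and incurs no fees.
Installing CrewAI is much like installing any other library, but because we'll be using external tools we also need to install the tools package.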

pip install crewai crewai-tools

We're going to build a CrewAI tool on top of the Wikipedia library from PyPI, so that library needs to be installed as well.

pip install wikipedia

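The first application we'll build is called 'Intelligent Wikipedia'. It takes a query and uses the Wikipedia tool to find the answer. That involves reasoning about the query, working out what information to look up, running a suitable search, analyzing the result, and finally giving an answer. It may sound complicated, but it isn't difficult.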

Intelligent Wikipedia
As I said, we'll start in a Jupyter Notebook, so I'll present the code as individual notebook cells.
The first cell imports CrewAI and sets up OpenAI.

from crewai import Agent, Task, Crew 

# Omit the next two lines if OPENAI_API_KEY is already set 
import os 
os.environ["OPENAI_API_KEY"] = "your key here" 
llm = "gpt-4o"

CrewAI reads the OpenAI key from the OPENAI_API_KEY environment variable, so we need to set it here. If you have already set it, you can omit that code.
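
If you would rather not paste the key into a notebook, one option is to set the variable in your shell and have the notebook check that it is present. The helper below is a small sketch of that idea; `require_api_key` is a hypothetical name, not part of CrewAI or OpenAI.

```python
import os
from typing import Mapping


def require_api_key(name: str = "OPENAI_API_KEY",
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the key stored under `name`, or fail early with a clear message."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook")
    return key


if __name__ == "__main__":
    # A made-up mapping is used here so the example runs without a real key.
    print(require_api_key(env={"OPENAI_API_KEY": "sk-example"}))
```

Calling `require_api_key()` at the top of the first cell makes a missing key show up immediately rather than partway through a crew run.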
You can choose any model you like. This article uses OpenAI's GPT models, and 'gpt-4o' looked like the best choice.

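Tools
Next we specify the Wikipedia tool. A tool is a function that a CrewAI agent can call when it needs to: it performs some task and returns a result. The simplest way to define a tool is with a decorator, so the code for the next cell looks like this: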


from crewai_tools import tool
import wikipedia

@tool("wikipedia_lookup")
def wikipedia_lookup(q: str) -> str:
    """Look up a query in Wikipedia and return the result"""
    return wikipedia.page(q).summary

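After importing the libraries, we write a simple function that takes a string parameter, sends it to Wikipedia, and returns the summary of the matching page. Note that we also give the function a description (its docstring) and type hints for its parameter and return value, which help CrewAI work out how to use the tool.
Now for the main part of the application: we define an agent that carries out the work for us, then create and run a 'crew'. A 'crew' combines agents and tasks into something that can be executed. In this first application we have just one agent and one task, but we'll look at a more complex 'crew' later.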

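Agents
Let's look at the agent first. The code below creates an agent variable called researcher_agent. It specifies the agent's role, goal, and backstory, the list of tools the agent can use, and the LLM to use. Here is the code for the next cell: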


# Define the agent
researcher_agent = Agent(
        role="Researcher",
        goal="You research topics using Wikipedia and report on the results",
        backstory="You are an experienced writer and editor",
        tools=[wikipedia_lookup],
        llm=llm
    )

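An agent can take more attributes, but these are the essential ones. The LLM and the tools are obvious things to have to specify. Role, goal, and backstory, though, may sound more like magic incantations than programming requirements. As you may have guessed, these strings are used when CrewAI builds the prompt for the LLM.
So an agent is some information that tells the LLM its purpose, a list of tools, and an LLM.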
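
To give a feel for how such strings could end up in a prompt, here is a small sketch. It is not CrewAI's actual template, which this article does not show; `AgentSpec` and `build_system_prompt` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentSpec:
    """Hypothetical container for the strings passed to an agent."""
    role: str
    goal: str
    backstory: str
    tool_names: List[str] = field(default_factory=list)


def build_system_prompt(agent: AgentSpec) -> str:
    """Join the agent's strings into one block of instructions for the LLM."""
    lines = [
        f"You are a {agent.role}.",
        agent.backstory,
        f"Your goal: {agent.goal}",
    ]
    if agent.tool_names:
        lines.append("Tools you may call: " + ", ".join(agent.tool_names))
    return "\n".join(lines)


if __name__ == "__main__":
    researcher = AgentSpec(
        role="Researcher",
        goal="You research topics using Wikipedia and report on the results",
        backstory="You are an experienced writer and editor",
        tool_names=["wikipedia_lookup"],
    )
    print(build_system_prompt(researcher))
```

Seen this way, role, goal, and backstory are less like incantations and more like configuration for the text the model reads before it sees the task.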

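Tasks
Next come tasks. A task tells the agent what it needs to do. For our application, that means responding to a query. When we build the real application we'll prompt the user for the query, but this is a prototype in a Jupyter Notebook, so to keep things simple we hard-code it.

query = "What is the EU's largest city"

The answer to this question cannot be read directly off a Wikipedia page; it takes some reasoning.
Below is the specification of the task that contains the query.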

task1 = Task(
    description=query,
    expected_output='A short text based on the tool output.',
    agent=researcher_agent,
    tools=[wikipedia_lookup]
)

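The task's description is the query itself. We explain what output is expected (another 'magic' string that ends up in the prompt), associate the task with the agent, and list the tools it may use.
Now we put it all together with a 'crew'.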

Crews
A 'crew' tells us which agents carry out which tasks. Here is the 'crew' configuration for this application.

# Define the crew
crew = Crew(
    agents=[researcher_agent],
    tasks=[task1],
    verbose=True
)

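It consists of a list of agents and a list of tasks; in this example, one task and one agent. With the verbose flag set, CrewAI displays its progress as it runs, as we'll see shortly.
Now all that remains is to run the whole thing and see what happens.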

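Executing the 'crew'
If that sounds alarming, like some kind of severe measure, don't worry: I just mean running the code that the 'crew' represents (and however bad the result, there will be no harsh punishment).
In CrewAI terms, we kick off the 'crew' and get the response in the return value. Here is the code for the next cell.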

result = crew.kickoff()

# Accessing the task output
task_output = task1.output

print(f"Task Description: {task_output.description}")
print(f"Task Summary: {task_output.summary}")
print(f"Raw Output: {task_output.raw}")

As you can see, we call the kickoff method and then print the various outputs it produced.
If we now run the complete notebook, the print statements produce:

Task Description: What is the EU's largest city
Task Summary: What is the EU's largest city...
Raw Output: Berlin is the largest city in the European Union in terms of
population within city administrative boundaries.

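This is a good, correct answer. What may be more interesting is how the 'crew' arrived at it. Remember that we set the verbose attribute to True, which means the agent's 'thinking' is logged and displayed as the notebook runs. You can see it below.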

# Agent: Researcher
## Task: What is the EU's largest city

# Agent: Researcher
## Thought: I need to look up information about the largest city in the European Union (EU).
## Using tool: wikipedia_lookup
## Tool Input: 
{
  "q": "Largest city in the European Union"
}
## Tool Output: 
This is a list of the largest cities in the European Union according to 
the population within their city boundary. The cities listed all have 
populations over 300,000. The list deals exclusively with the areas 
within city administrative boundaries as opposed to urban areas or 
larger urban zones (metropolitan areas), which are generally larger in 
terms of population than the main city (although they can also be 
smaller, in some of the cases presented).
As some cities have a very narrow boundary and others a very wide one, 
the list may not give an accurate view of the comparative magnitude of 
entire urban areas, and thus the figures in the list should be treated 
with caution. For example Paris is the most populous urban area in the 
European Union; however, the city proper of the French capital has a 
lower population than top-ranked Berlin and a couple of other cities, 
as shown in the table. Likewise the City of Brussels is only one of a 
total of 19 municipalities making up the greater Brussels Capital Region, 
and by itself does not reach the population threshold to be listed here. 
On the other hand, the Municipality of Sintra, listed on the table as the 
second most populous Portuguese city includes in it the cities of 
Agualva-Cacém and Queluz, in addition to the town of Sintra and other 
urban centers.

# Agent: Researcher
## Final Answer: 
Berlin is the largest city in the European Union in terms of population 
within city administrative boundaries.

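As you can see, CrewAI follows a very agent-like process: it reasons about how to respond, gathers the appropriate information, and then gives a suitable answer.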
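
The trace above (thought, tool call, tool output, final answer) can be mimicked with a few lines of plain Python. The sketch below is not how CrewAI is implemented; it only illustrates the general pattern. `scripted_model` is a hypothetical stand-in for the LLM, and `fake_lookup` replaces the Wikipedia tool so the example runs offline.

```python
import json
from typing import Callable, Dict, List


def fake_lookup(q: str) -> str:
    """Offline stand-in for the Wikipedia tool."""
    return f"(summary of the page for '{q}')"


# Registry mapping tool names to the functions that implement them.
TOOLS: Dict[str, Callable[..., str]] = {"wikipedia_lookup": fake_lookup}


def scripted_model(history: List[str]) -> dict:
    """Stand-in for the LLM: ask for one lookup, then give a final answer."""
    if not any(line.startswith("Tool Output:") for line in history):
        tool_input = json.dumps({"q": "Largest city in the European Union"})
        return {"tool": "wikipedia_lookup", "input": tool_input}
    return {"final": "Answer based on: " + history[-1]}


def run_agent(task: str,
              model: Callable[[List[str]], dict] = scripted_model,
              max_steps: int = 5) -> str:
    """Loop until the model returns a final answer or the step limit is hit."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = model(history)
        if "final" in step:
            return step["final"]
        tool = TOOLS[step["tool"]]                 # look up the requested tool
        output = tool(**json.loads(step["input"])) # call it with the JSON input
        history.append(f"Tool Output: {output}")   # feed the result back
    raise RuntimeError("No final answer within the step limit")


if __name__ == "__main__":
    print(run_agent("What is the EU's largest city"))
```

The step limit matters in practice: without it, a model that never produces a final answer would keep calling tools indefinitely.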

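Now let's make an app
Turning this into an application is straightforward. We can add a simple Python front end, or use Streamlit or Taipy to create a nicer GUI. Here is the simple Python version.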

from crewai import Agent, Task, Crew

import os
os.environ["OPENAI_API_KEY"] = "your api key"
llm = "gpt-4o"

# Create the wikipedia tool
from crewai_tools import tool
import wikipedia

@tool("wikipedia_lookup")
def wikipedia_lookup(q: str) -> str:
    "Look up a query in Wikipedia and return a summary"
    return wikipedia.page(q).summary

# Define the agent
intelligent_wikipedia = Agent(
        role="Researcher",
        goal="You research topics using Wikipedia and report on the results",
        backstory="You are an experienced writer and editor",
        tools=[wikipedia_lookup],
        llm=llm
    )

# The run function sets the task and executes the crew
# and returns the result
def run(s: str):
    task = Task(
        description=s,
        expected_output='A short text based on the tool output',
        agent=intelligent_wikipedia,
        tools=[wikipedia_lookup]
    )

    # Define the crew
    crew = Crew(
        agents=[intelligent_wikipedia],
        tasks=[task],
        verbose=True
    )

    result = crew.kickoff()
    task_output = task.output
    return task_output.raw

###############################

print("Intelligent Wikipedia")

q = input("Enter your question: ") 
answer = run(q)
print(answer)

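All of the notebook cells have been merged, and the task, the 'crew', and the kickoff call have been wrapped in a single run function.
Below the comment line we print a title, prompt the user for a query, call run, and print the response.
If you'd rather build a more sophisticated GUI with Streamlit or Taipy, replace the code below the comment line with one of the following examples: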

Streamlit:

import streamlit as st

st.header("Intelligent Wikipedia")

if q := st.text_input("Enter your question"):
    answer = run(q)
    st.markdown( answer)

Taipy:

from taipy.gui import Gui
import taipy.gui.builder as tgb

user_input = ""
result = ""

def input_change(state):
    answer = run(state.user_input)
    state.result = answer

with tgb.Page() as page:
    tgb.text("# Intelligent Wikipedia", mode='md')

    with tgb.layout(columns="1 1"):
        with tgb.part():
            tgb.text("Enter your question in the box")
            tgb.input("{user_input}", multiline=False, class_name="fullwidth")
            tgb.button("Submit", on_action=input_change)
        with tgb.part():
            tgb.text("Intelligent Wikipedia's response will appear below:")
            tgb.text("{result}")


Gui(page=page).run(dark_mode=True)

Here is a screenshot of the Streamlit version.

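In the GUI versions of the app you won't see the reasoning process in the page, because it is sent to the console. You should be able to see it in the terminal where you launched the app. If you don't want to see it, set verbose to False in the 'crew' definition.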
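
If you did want to show that console output inside the GUI, one possible approach is to capture standard output while the function runs. The sketch below uses only the Python standard library. It assumes the output is written through Python's `sys.stdout`, which I have not verified for CrewAI's verbose logging; `noisy_run` is a hypothetical stand-in for the run function above.

```python
import contextlib
import io
from typing import Callable, Tuple


def run_and_capture(func: Callable[[str], str], query: str) -> Tuple[str, str]:
    """Call func(query) and return (result, everything it printed to stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(query)
    return result, buffer.getvalue()


def noisy_run(query: str) -> str:
    """Stand-in for run(): prints a 'thought', then returns an answer."""
    print(f"Thought: I need to look up {query}")
    return f"Answer about {query}"


if __name__ == "__main__":
    answer, log = run_and_capture(noisy_run, "Paris")
    print("ANSWER:", answer)
    print("LOG:", log, end="")
```

In a Streamlit page the captured text could then be passed to something like `st.text`, for example inside an expander so that it stays out of the way.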

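Wikipedia for kids: a pipeline application
To create a pipeline application we need to define more agents and tasks and then create a 'crew' that connects them sensibly.
The new application builds on the previous one and is called 'Wikipedia for kids'. It also looks things up in Wikipedia, but it rewrites what it finds in language suitable for young readers and cuts the length to a few hundred words.
Let's look at the code for the agents.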

researcher_agent = Agent(
        role="Researcher",
        goal="You research topics using Wikipedia and report on the results",
        backstory="You are an experienced writer and editor",
        tools=[wikipedia_lookup],
        llm=llm
    )
writer_agent = Agent(
        role="Writer",
        goal="You re-write articles so that they are suitable for young readers",
        backstory="You are an experienced writer and editor",
        llm=llm
    )
editor_agent = Agent(
        role="Editor",
        goal="You ensure that the text you are given is grammatically correct and the correct length",
        backstory="You are an experienced writer and editor",
        llm=llm
    )

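As you can see, we now have three agents; the two new ones are writer_agent and editor_agent. The goal strings explain what they are for. They also use the LLM, but they don't need the Wikipedia tool.
Agents alone are not enough; we also have to specify the tasks they perform. Here is the code cell for the tasks.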

task1 = Task(
    description=query,
    expected_output='A short text based on the tool output',
    agent=researcher_agent,
    tools=[wikipedia_lookup]
)
task2 = Task(
    description="Rework the text to be suitable for a 10-year-old reader",
    expected_output='A short text based on the tool output',
    agent=writer_agent,
)
task3 = Task(
    description="Edit the text to ensure that it is grammatically correct and no more than 500 words",
    expected_output='A short text based on the tool output',
    agent=editor_agent,
)

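The first task is the same as before, while the second and third ask the agents to rewrite the text for a 10-year-old reader and to ensure it is grammatically correct and no longer than 500 words. Note that each task names its agent, and that the Wikipedia tool is needed only in the first task.
Now we need to create a 'crew' and run it. Here is the code.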

# Define the crew
crew = Crew(
    agents=[researcher_agent, writer_agent, editor_agent],
    tasks=[task1, task2, task3],
    verbose=True
)

result = crew.kickoff()

print(f"Raw Output: {result.raw}")

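To define the 'crew' we simply list the agents and tasks in the order they should run. Note one small change in how we retrieve the result: instead of using the output of a particular task, we use the final result.
This time the prompt was just 'Paris' and, as you can see, our agent pipeline did a good job of introducing Paris to children in a few hundred words. Below is a screenshot of the Streamlit version of the app.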

Here is the complete code for the Streamlit app.

from crewai import Agent, Task, Crew

import os
# OpenAI
os.environ["OPENAI_API_KEY"] = "your API key"
llm = "gpt-4o"

# Create the wikipedia tool
from crewai_tools import tool
import wikipedia

@tool("wikipedia_lookup")
def wikipedia_lookup(q: str) -> str:
    "Look up a query in Wikipedia and return a summary"
    return wikipedia.page(q).summary

# Define the agents
researcher_agent = Agent(
        role="Researcher",
        goal="You research topics using Wikipedia and report on the results",
        backstory="You are an experienced writer and editor",
        tools=[wikipedia_lookup],
        llm=llm
    )
writer_agent = Agent(
        role="Writer",
        goal="You re-write articles so that they are suitable for young readers",
        backstory="You are an experienced writer and editor",
        llm=llm

    )
editor_agent = Agent(
        role="Editor",
        goal="You ensure that the text you are given is grammatically correct and the correct length",
        backstory="You are an experienced writer and editor",
        llm=llm
    )

# The run function sets the task and executes the crew
# and returns the result
def run(s: str):
    task1 = Task(
        description=s,  # use the function argument rather than the global q
        expected_output='A short text based on the tool output',
        agent=researcher_agent,
        tools=[wikipedia_lookup]
    )
    task2 = Task(
        description="Rework the text to be suitable for a 10-year-old reader",
        expected_output='A short text based on the tool output',
        agent=writer_agent,
    )
    task3 = Task(
        description="Edit the text to ensure that it is grammatically correct and no more than 500 words",
        expected_output='A short text based on the tool output',
        agent=editor_agent,
    )

    # Define the crew
    crew = Crew(
        agents=[researcher_agent, writer_agent, editor_agent],
        tasks=[task1, task2, task3],
        verbose=True
    )
    result = crew.kickoff()
    return result.raw    
###############################

import streamlit as st

st.header("Wikipedia for kids")

if q := st.text_input("Enter your question"):
    answer = run(q)
    print(answer)
    st.markdown( answer)

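Those are the two example applications. They do similar jobs and use the same Wikipedia lookup tool. The second one, however, adds a couple of extra steps to the pipeline that make the text more suitable for young readers and limit its length.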
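
An LLM may not always honor a length limit given in a prompt, so a pipeline can add a deterministic check as a final step. The sketch below is one way to do that; `within_word_limit` and `truncate_words` are hypothetical helpers and are not part of the applications above.

```python
def within_word_limit(text: str, limit: int = 500) -> bool:
    """Return True if the text has at most `limit` whitespace-separated words."""
    return len(text.split()) <= limit


def truncate_words(text: str, limit: int = 500) -> str:
    """Cut the text to `limit` words, marking the cut with an ellipsis."""
    words = text.split()
    if len(words) <= limit:
        return text
    return " ".join(words[:limit]) + "..."


if __name__ == "__main__":
    draft = "Paris is the capital of France and a very old city"
    print(within_word_limit(draft, limit=5))
    print(truncate_words(draft, limit=5))
```

A check like this could run on the final result before it is displayed, either truncating the text or sending it back to the editor step for another pass.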

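Summary: how AI agents and AI pipelines differ in LLM applications
The two approaches differ in how they handle tasks, make decisions, and interact.
**Agents** are autonomous, make decisions, and are interactive. In this approach the LLM plays the role of an 'agent' that can understand complex user prompts and adapt its behavior as needed to achieve a specific goal. Conversational virtual assistants and autonomous customer-service agents like the one we saw are examples.
Agents can handle open-ended tasks and maintain context over long periods, which makes them well suited to complex, highly interactive applications.
**AI pipelines** are linear processes with a fixed sequence of predefined stages, as in data processing. Pipelines are well structured and suit repetitive applications with clearly defined tasks, such as text classification, document processing, or data transformation.
AI pipelines are modular and scalable because they are usually made up of well-defined tasks, and they are more predictable than the open-ended behavior of agents.
Although an agent can mimic a pipeline if its prompt is structured as a series of tasks, the results may be less consistent or predictable. So wherever a pipeline can be used, it is probably the better choice.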