106 Optimizing Response Synthesis with Structured Answer Filtering: In-Depth Analysis and Practical Application

Article link: https://blog.csdn.net/xycxycooo/article/details/141641494

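Response synthesis is a key step in natural language processing pipelines, and a single inaccurate intermediate response can degrade the quality of the final answer. This article looks at how structured answer filtering (Structured Answer Filtering) improves response synthesis, in particular with the Refine response synthesizer. Detailed code examples and technical explanations will help you understand both how it works and how to apply it.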

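Prerequisites

Before going further, we need a few basic concepts:

Refine response synthesizer: a response synthesis tool that refines the answer step by step as it works through the context.
OpenAI models: OpenAI models that support function calling, such as gpt-3.5-turbo-0613.
Function calling: a mechanism that lets the model call external functions to fetch or process data.
Structured answer filtering: a technique that filters out inaccurate responses so that the final answer stays accurate.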
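
To make the function calling concept more concrete, here is a minimal sketch that never calls a real API. The schema name `structured_answer`, the fields `answer` and `query_satisfied`, and the simulated model reply are all assumptions made for illustration; they only mirror the general shape of a function-calling exchange, where the model sends its arguments as a JSON string.

```python
import json

# Hypothetical function schema in the JSON-Schema style commonly used by
# function-calling APIs. The name and fields are illustrative assumptions.
ANSWER_SCHEMA = {
    "name": "structured_answer",
    "description": "Return an answer and whether the context answered the query.",
    "parameters": {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "query_satisfied": {"type": "boolean"},
        },
        "required": ["answer", "query_satisfied"],
    },
}


def parse_function_arguments(raw_arguments: str) -> dict:
    """Parse the JSON argument string a model would send for a function call."""
    arguments = json.loads(raw_arguments)
    required = ANSWER_SCHEMA["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return arguments


# Simulated model output; a real model would generate this string itself.
simulated_reply = '{"answer": "Florence Pugh", "query_satisfied": true}'
parsed = parse_function_arguments(simulated_reply)
print(parsed["answer"], parsed["query_satisfied"])
```

The point of the sketch is that the caller receives typed fields rather than free text, which is what makes a flag such as `query_satisfied` usable for filtering.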

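Problem Background

A common problem with the Refine response synthesizer is the propagation of unhelpful responses such as "I don't know". Such a response can persist even when the actual answer is present in the context, which makes the final answer inaccurate.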

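Solution: Structured Answer Filtering

Setting structured_answer_filtering to True filters out these unhelpful responses. The option is False by default because it currently works best with OpenAI models that support function calling.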
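
The idea behind the filter can be sketched without any LLM at all. In the toy below, `fake_llm`, `StructuredAnswer`, `naive_refine`, and `filtered_refine` are hypothetical names, and the "sticky first answer" behavior in `naive_refine` is a deliberately crude model of how an unhelpful response can persist. It is not the library's actual implementation, only an illustration of why a per-chunk relevance flag helps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StructuredAnswer:
    """Hypothetical per-chunk result: the answer plus a relevance flag."""
    answer: str
    query_satisfied: bool


def fake_llm(query: str, chunk: str) -> StructuredAnswer:
    """Stand-in for an LLM call: answers only if the chunk mentions the year."""
    tokens = [tok.strip("?.,") for tok in query.split()]
    year = next((tok for tok in tokens if tok.isdigit()), "")
    if year and year in chunk and " is " in chunk:
        name = chunk.split(" is ", 1)[1].rstrip(".")
        return StructuredAnswer(answer=name, query_satisfied=True)
    return StructuredAnswer(answer="I don't know.", query_satisfied=False)


def naive_refine(query: str, chunks: List[str]) -> Optional[str]:
    """No relevance flag: the first response sticks, even if it is a refusal."""
    response: Optional[str] = None
    for chunk in chunks:
        candidate = fake_llm(query, chunk).answer
        if response is None:
            response = candidate
    return response


def filtered_refine(query: str, chunks: List[str]) -> Optional[str]:
    """With the flag: only responses that satisfy the query update the answer."""
    response: Optional[str] = None
    for chunk in chunks:
        result = fake_llm(query, chunk)
        if result.query_satisfied:
            response = result.answer
    return response


sample_chunks = [
    "The president in the year 2040 is John Cena.",
    "The president in the year 2050 is Florence Pugh.",
]
question = "who is president in the year 2050?"
print(naive_refine(question, sample_chunks))     # I don't know.
print(filtered_refine(question, sample_chunks))  # Florence Pugh
```

In the naive loop the refusal produced for the first chunk becomes the running answer, while the filtered loop simply skips it and keeps looking.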

Installing Dependencies

First, install the required libraries:

%pip install llama-index-llms-openai
!pip install llama-index

Loading Data

Suppose we have the following text data:

texts = [
    "The president in the year 2040 is John Cena.",
    "The president in the year 2050 is Florence Pugh.",
    'The president in the year 2060 is Dwayne "The Rock" Johnson.',
]

Initializing the OpenAI Model

We need to set the OpenAI API key and initialize the model:

import os

os.environ["OPENAI_API_KEY"] = "sk-..."
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo-0613")

Using the Refine Response Synthesizer

We use the Refine response synthesizer to generate an answer:

from llama_index.core import get_response_synthesizer

summarizer = get_response_synthesizer(
    response_mode="refine", llm=llm, verbose=True
)
response = summarizer.get_response("who is president in the year 2050?", texts)

Failed Result

Because the unhelpful response propagates, we do not get the correct answer from the input texts:

print(response)
# Output: I'm sorry, but I don't have access to information about the future.

Using Structured Answer Filtering

Now we set structured_answer_filtering to True and try again:

from llama_index.core import get_response_synthesizer

summarizer = get_response_synthesizer(
    response_mode="refine",
    llm=llm,
    verbose=True,
    structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)

Successful Result

With the unhelpful responses filtered out, we get the correct answer:

print(response)
# Output: Florence Pugh

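Non-Function-Calling LLMs

If you use an LLM that does not support function calling, the Refine module automatically switches to a structured output program that does not rely on an external function-calling API:

# Use an older model that does not support function calling
instruct_llm = OpenAI(model="gpt-3.5-turbo-instruct")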

from llama_index.core import get_response_synthesizer

summarizer = get_response_synthesizer(
    response_mode="refine",
    llm=instruct_llm,
    verbose=True,
    structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
# Output: Florence Pugh
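
How a structured result can be recovered from a plain text completion is sketched below. This is an assumption-laden illustration, not the library's parser: the helper `parse_structured_completion`, the JSON field names `answer` and `query_satisfied`, and the sample completions are all hypothetical. The sketch assumes the prompt asked the model to reply with a JSON object and treats anything unparseable as an unhelpful response.

```python
import json
import re
from typing import Optional, Tuple


def parse_structured_completion(text: str) -> Tuple[Optional[str], bool]:
    """Extract (answer, query_satisfied) from a raw text completion.

    Returns (None, False) when no valid JSON object can be recovered, so a
    malformed completion is handled like an unhelpful one.
    """
    # Grab everything from the first "{" to the last "}" in the completion.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        return None, False
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None, False
    answer = payload.get("answer")
    if not isinstance(answer, str):
        return None, False
    # Only an explicit JSON true counts as satisfied.
    return answer, payload.get("query_satisfied") is True


# Simulated completions; a real model would produce these strings.
good = 'Sure, here is the result:\n{"answer": "Florence Pugh", "query_satisfied": true}'
refusal = "I'm sorry, but I don't have access to information about the future."

print(parse_structured_completion(good))     # ('Florence Pugh', True)
print(parse_structured_completion(refusal))  # (None, False)
```

Falling back to `(None, False)` is a conservative design choice: a completion that cannot be parsed never overwrites an answer that was already found.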

CompactAndRefine

Because CompactAndRefine is built on top of Refine, it also supports structured answer filtering:

from llama_index.core import get_response_synthesizer

summarizer = get_response_synthesizer(
    response_mode="compact",
    llm=instruct_llm,
    verbose=True,
    structured_answer_filtering=True,
)
response = summarizer.get_response("who is president in the year 2050?", texts)
print(response)
# Output: Florence Pugh
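
The "compact" part can be pictured as packing several small chunks into fewer, larger ones before the refine loop runs, so fewer LLM calls are needed. The sketch below is a simplified illustration under stated assumptions: `compact_chunks` is a hypothetical helper, and it uses a character budget (`max_chars`) where a real implementation would more likely work with a token budget derived from the prompt size.

```python
from typing import List


def compact_chunks(chunks: List[str], max_chars: int, sep: str = "\n\n") -> List[str]:
    """Greedily merge consecutive chunks so each merged chunk fits max_chars.

    A single chunk longer than max_chars is kept on its own rather than split.
    """
    compacted: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + sep + chunk
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The budget would be exceeded: close the current pack, start anew.
            compacted.append(current)
            current = chunk
    if current:
        compacted.append(current)
    return compacted


texts = [
    "The president in the year 2040 is John Cena.",
    "The president in the year 2050 is Florence Pugh.",
    'The president in the year 2060 is Dwayne "The Rock" Johnson.',
]

# With a generous budget all three texts fit into one chunk, so the refine
# loop would only need a single pass over the context.
for packed in compact_chunks(texts, max_chars=1000):
    print(repr(packed))
```

Structured answer filtering then applies to each packed chunk exactly as it would to an individual one.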