LLM Study Notes 13: Workflows

谢白羽

Last modified 2024-07-23 19:08:06

Category: Large Language Models. Tags: study notes

First published 2024-07-23 00:43:57

Article link: https://blog.csdn.net/weixin_43679037/article/details/140617605

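I. Why workflows help large language models

1) What affects how well an LLM application performs

① Model capability:

    - General knowledge comprehension and generalization
    - Ability to understand, reason over, plan with, and act on the input
    - Ability to learn supplementary knowledge from the input
    - Style of generated text

② Model output control

    - 1) Single-request control
        - Prompt wording optimization
        - Reasoning control methods, typified by chain-of-thought (CoT)
        - Output format control (text format syntax, structured data output for engineering use, ...)
    - 2) Multi-request control
        - Multi-round self-reflective optimization, typified by ReAct (a Thought-Action-Observation loop)
        - Orchestration and management of how a complex task is executed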

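2) Limitations of a single request

① Context window and output length limits (the motivation behind the early LangChain long-text Summarize chains)
② Applying CoT control directly (especially CoT expressed in natural language) makes the model output its reasoning process, which we often do not want the user to see
③ Information that emerges as the work progresses, or that depends on the timing and ordering of tasks, cannot always be supplied all at once in a single request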
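
To make limitation ① concrete, here is a minimal sketch of working around a length limit by splitting a long text into chunks, summarizing each chunk in its own request, and then merging the partial summaries. The names `split_into_chunks`, `summarize_long_text`, and the `summarize` callable are hypothetical; a character budget stands in for a real token limit, and a trivial stub replaces the model call.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_chars.

    A single word longer than max_chars still becomes its own (oversized) chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_long_text(
    text: str, summarize: Callable[[str], str], max_chars: int
) -> str:
    """Summarize each chunk separately, then summarize the joined partial summaries."""
    partial_summaries = [summarize(chunk) for chunk in split_into_chunks(text, max_chars)]
    if len(partial_summaries) == 1:
        return partial_summaries[0]
    # One extra request merges the partial summaries into a final one.
    return summarize(" ".join(partial_summaries))


def stub_summarize(chunk: str) -> str:
    """Stand-in for a model request: keeps only the first word, upper-cased."""
    return chunk.split()[0].upper()


chunks = split_into_chunks("aa bb cc dd", max_chars=5)
summary = summarize_long_text("aa bb cc dd", stub_summarize, max_chars=5)
print(chunks, summary)
```

Each call to `summarize` sees only a bounded amount of text, which is the point of moving from one request to several.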

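3) Advantages of a workflow (splitting the work into multiple nodes)

① A task can be split into multiple work nodes
② A single model request can be treated as one work node
③ Other code logic can be flexibly written into work nodes as well
④ Work nodes can be orchestrated
⑤ Data can be passed between work nodes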
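
The five points above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how any particular framework implements it: `run_workflow`, `fake_model_node`, and `word_count_node` are hypothetical names, and the "model" node is a stub that makes no real request.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def run_workflow(nodes: List[Node], initial_state: State) -> State:
    """Run nodes in order; each node returns a partial update merged into the shared state."""
    state = dict(initial_state)
    for node in nodes:
        state.update(node(state))
    return state


def fake_model_node(state: State) -> State:
    """Stands in for a single model request treated as one work node."""
    return {"draft": state["text"].upper()}


def word_count_node(state: State) -> State:
    """Ordinary code logic can be a work node too; it reads data produced by the previous node."""
    return {"word_count": len(state["draft"].split())}


result = run_workflow([fake_model_node, word_count_node], {"text": "hello workflow"})
print(result)
```

The order of the list is the orchestration, and the shared dictionary is the data passed between nodes.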

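II. Reproducing Andrew Ng's open-source translation workflow project

1) Project requirements and how it works

Project description:
Translates text
Project URL:
https://github.com/andrewyng/translation-agent
How it works:
After the model completes a first-pass translation, a reflect-then-revise workflow refines the result to improve the quality of the final translation
Key source file:
https://github.com/andrewyng/translation-agent/blob/main/src/translation_agent/utils.py
Key steps:

1) Step 1:
Input: source language (source_lang), target language (target_lang), and source text (source_text)
Role: a linguist whose task is to translate the text
Output: based on all inputs, **the first-round translation (translation_1)** of the source text (source_text)

2) Step 2:
Input: source language (source_lang), target language (target_lang), source text (source_text), and the first-round translation (translation_1)
Role: a linguist whose task is to read the source text and its translation and suggest improvements
Output: based on all inputs, the improvement suggestions (reflection) for the first-round translation (translation_1)

3) Step 3:
Input: source language (source_lang), target language (target_lang), source text (source_text), the first-round translation (translation_1), and the improvement suggestions (reflection)
Role: a linguist whose task is to translate the text (same as step 1)
Output: based on all inputs, the second-round, improved translation (translation_2)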
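
The three steps can be sketched as one function that makes three model requests in sequence, each one receiving the outputs of the earlier steps. This is a simplified illustration, not the project's code: the prompts are shortened, `translate_with_reflection` and `make_fake_llm` are hypothetical names, and the `llm` argument is assumed to be any callable that takes a system message and a prompt and returns text.

```python
from typing import Callable, Dict, List, Tuple

LLM = Callable[[str, str], str]  # (system_message, prompt) -> reply text


def translate_with_reflection(
    llm: LLM, source_lang: str, target_lang: str, source_text: str
) -> Dict[str, str]:
    translator_role = (
        f"You are an expert linguist, specializing in translation "
        f"from {source_lang} to {target_lang}."
    )
    reviewer_role = translator_role + " Your goal is to improve a given translation."

    # Step 1: first-round translation.
    translation_1 = llm(
        translator_role,
        f"Translate this {source_lang} text into {target_lang}:\n{source_text}",
    )
    # Step 2: reflection on the first-round translation.
    reflection = llm(
        reviewer_role,
        f"Source text:\n{source_text}\n\nTranslation:\n{translation_1}\n\n"
        "List specific suggestions to improve the translation.",
    )
    # Step 3: improved translation based on the suggestions.
    translation_2 = llm(
        translator_role,
        f"Source text:\n{source_text}\n\nTranslation:\n{translation_1}\n\n"
        f"Suggestions:\n{reflection}\n\nRewrite the translation, applying the suggestions.",
    )
    return {
        "translation_1": translation_1,
        "reflection": reflection,
        "translation_2": translation_2,
    }


def make_fake_llm() -> Tuple[LLM, List[Tuple[str, str]]]:
    """Build a stub model that records its calls and returns numbered replies."""
    calls: List[Tuple[str, str]] = []

    def fake_llm(system_message: str, prompt: str) -> str:
        calls.append((system_message, prompt))
        return f"reply-{len(calls)}"

    return fake_llm, calls


fake_llm, calls = make_fake_llm()
result = translate_with_reflection(fake_llm, "English", "Chinese", "Hello")
print(result)
```

Swapping `fake_llm` for a function that calls a real model would leave the control flow unchanged.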

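Code walkthrough (main function one_chunk_translate_text)
1) First, one_chunk_initial_translation produces the first-round translation
2) Then one_chunk_reflect_on_translation produces the reflection with improvement suggestions
3) Finally, one_chunk_improve_translation revises the first-round translation according to those suggestions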

def one_chunk_translate_text(
    source_lang: str, target_lang: str, source_text: str, country: str = ""
) -> str:
    """
    Translate a single chunk of text from the source language to the target language.

    This function performs a two-step translation process:
    1. Get an initial translation of the source text.
    2. Reflect on the initial translation and generate an improved translation.

    Args:
        source_lang (str): The source language of the text.
        target_lang (str): The target language for the translation.
        source_text (str): The text to be translated.
        country (str): Country specified for target language.
    Returns:
        str: The improved translation of the source text.
    """
    translation_1 = one_chunk_initial_translation(
        source_lang, target_lang, source_text
    )

    reflection = one_chunk_reflect_on_translation(
        source_lang, target_lang, source_text, translation_1, country
    )
    translation_2 = one_chunk_improve_translation(
        source_lang, target_lang, source_text, translation_1, reflection
    )

    return translation_2

2) Reproducing this workflow with LangGraph

Code walkthrough

import json
import openai
from ENV import deep_seek_url, deep_seek_api_key, deep_seek_default_model
from langgraph.graph import StateGraph, START, END
import os

# Set up the model client
client = openai.OpenAI(
    api_key=deep_seek_api_key,
    base_url=deep_seek_url,
)
default_model = deep_seek_default_model

def get_completion(
    prompt: str,
    system_message: str = "You are a helpful assistant.",
    model: str = default_model,
    temperature: float = 0.3,
    json_mode: bool = False,
):
    response = client.chat.completions.create(
        model=model,
        temperature=temperature,
        top_p=1,
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

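# Define the structure of the state passed between nodes
from typing import TypedDict, Optional

class State(TypedDict):
    # Required inputs
    source_lang: str
    target_lang: str
    source_text: str
    # Optional input. TypedDict fields cannot carry default values, so the
    # nodes read these with state.get(), which returns None for an absent key.
    country: Optional[str]
    # Filled in by the workflow nodes
    translation_1: Optional[str]
    reflection: Optional[str]
    translation_2: Optional[str]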

# Create a workflow object
workflow = StateGraph(State)

# Define the three work nodes
"""
Read from the state: state.get("key_name")
Update the state: return { "key_name": new_value }
"""
def initial_translation(state):
    source_lang = state.get("source_lang")
    target_lang = state.get("target_lang")
    source_text = state.get("source_text")

    system_message = f"You are an expert linguist, specializing in translation from {source_lang} to {target_lang}."

    prompt = f"""This is an {source_lang} to {target_lang} translation, please provide the {target_lang} translation for this text. \
Do not provide any explanations or text apart from the translation.
{source_lang}: {source_text}

{target_lang}:"""

    translation = get_completion(prompt, system_message=system_message)

    print("[Initial translation]: \n", translation)

    return {"translation_1": translation}

def reflect_on_translation(state):
    source_lang = state.get("source_lang")
    target_lang = state.get("target_lang")
    source_text = state.get("source_text")
    country = state.get("country") or ""
    translation_1 = state.get("translation_1")
    
    system_message = f"You are an expert linguist specializing in translation from {source_lang} to {target_lang}. \
You will be provided with a source text and its translation and your goal is to improve the translation."

    additional_rule = (
        f"The final style and tone of the translation should match the style of {target_lang} colloquially spoken in {country}."
        if country != ""
        else ""
    )

    prompt = f"""Your task is to carefully read a source text and a translation from {source_lang} to {target_lang}, and then give constructive criticism and helpful suggestions to improve the translation. \
{additional_rule}

The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows:

<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>

<TRANSLATION>
{translation_1}
</TRANSLATION>

When writing suggestions, pay attention to whether there are ways to improve the translation's \n\
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),\n\
(ii) fluency (by applying {target_lang} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions),\n\
(iii) style (by ensuring the translations reflect the style of the source text and takes into account any cultural context),\n\
(iv) terminology (by ensuring terminology use is consistent and reflects the source text domain; and by only ensuring you use equivalent idioms {target_lang}).\n\

Write a list of specific, helpful and constructive suggestions for improving the translation.
Each suggestion should address one specific part of the translation.
Output only the suggestions and nothing else."""

    reflection = get_completion(prompt, system_message=system_message)

    print("[Reflection]: \n", reflection)

    return {"reflection": reflection}

def improve_translation(state):
    source_lang = state.get<
