2. LangChain's Prompt Templates

In practice, we're unlikely to hard-code the context and the user's question into our application. Instead, we feed them in through a template; this is exactly what LangChain's Prompt Template is for.

The prompt template classes in LangChain are built to make constructing prompts with dynamic inputs easier. Of these classes, the simplest is PromptTemplate. We'll test it by adding a single dynamic input to our previous prompt: the user's "query".

from langchain import PromptTemplate

template = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

With this, we can use the format method on our prompt_template to see the effect of passing a query into the template: format dynamically inserts the user's query to produce the final prompt text.

print(
    prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )
)
Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: 

Naturally, we can pass this output directly into an LLM:

print(openai(
    prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )
))
 Hugging Face's `transformers` library, OpenAI using the `openai` library, and Cohere using the `cohere` library.

This is a simple implementation that could easily be replaced with an f-string (e.g., f"insert some custom text '{custom_text}' etc"). However, using LangChain's PromptTemplate object makes the process more formalized, supports multiple parameters, and lets us build prompts in an object-oriented way.

These are some of the more obvious advantages, but they are only part of what LangChain offers around prompts.
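For comparison, here is a minimal sketch of that plain f-string approach (the context string is abbreviated here for illustration):

# a plain f-string version of the same prompt; workable, but harder
# to reuse, validate, and combine with other LangChain components
context = "Large Language Models (LLMs) are the latest models used in NLP."
query = "Which libraries and model providers offer LLMs?"

prompt = f"""Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: {context}

Question: {query}

Answer: """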

Few Shot Prompt Templates

The success of LLMs comes from their enormous size and their ability to store "knowledge" within the model's parameters, learned during training. However, there are more ways to pass knowledge to an LLM. The two primary methods are:

  • Parametric knowledge: anything the model learned during training, stored within the model's weights (or parameters).
  • Source knowledge: any knowledge provided to the model at inference time via the prompt (illustrated in the sketch below).
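As a quick illustration, here is a minimal sketch of the difference, reusing the openai LLM object initialized earlier (the context string is a made-up one-liner for illustration):

# parametric knowledge: the model answers purely from what it learned
# during training, stored in its weights
print(openai("Which libraries and model providers offer LLMs?"))

# source knowledge: we hand the model the relevant facts inside the
# prompt at inference time
print(openai(
    "Context: LLMs can be accessed via Hugging Face's `transformers` library.\n\n"
    "Question: Which libraries offer LLMs?\n\nAnswer: "
))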

LangChain's FewShotPromptTemplate caters to source knowledge input. The idea is to "train" the model on a handful of examples, which we call few-shot learning, with those examples given to the model within the prompt.

Few-shot learning works well when the model needs extra help understanding what we're asking it to do. We can see this in the following example:

prompt = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative 
and funny responses to the users' questions. Here are some examples: 

User: What is the meaning of life?
AI: """

openai.temperature = 1.0  # increase creativity/randomness of output

print(openai(prompt))
 Life is like a box of chocolates, you never know what you're gonna get!

In the example above, we're asking for something amusing in return for our serious question, such as a joke. However, even with the temperature parameter (which increases the randomness and creativity of the output) set to 1.0, the model will often still return a serious answer.

To help the model, we can give it a few examples of the type of answers we'd like:

prompt = """The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

print(openai(prompt))

By reinforcing the instructions in the prompt with examples, we're much more likely to get an amusing response. We can then formalize this process using LangChain's FewShotPromptTemplate:

from langchain import FewShotPromptTemplate

# create our examples
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }
]

# create a example template
example_template = """
User: {query}
AI: {answer}
"""

# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 
"""
# and the suffix our user input and output indicator
suffix = """
User: {query}
AI: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

If we then pass in the examples and a user query, we get the following:

query = "What is the meaning of life?"

print(few_shot_prompt_template.format(query=query))
The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the users' questions. Here are some
examples: 

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: 

This process can seem somewhat convoluted. Why do all of this with a FewShotPromptTemplate object, a list of example dictionaries, and so on, when we could achieve the same with a few lines of code and an f-string?

Because this approach is more formalized, integrates well with other LangChain features, and brings several useful capabilities. One of these is the ability to vary the number of included examples based on the length of the query.

A dynamic number of examples matters because the maximum combined size of our prompt and the generated output is limited. This limit is measured by the maximum context window: context window = input tokens + output tokens.

At the same time, we want to maximize the number of examples we give the model for few-shot learning.

With this in mind, we need to balance the number of examples included against the size of the prompt. Our hard limit is the maximum context size, but we must also consider the cost of processing more tokens through the LLM. Fewer tokens mean a cheaper service and faster completions from the LLM.
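As a rough, purely illustrative sketch of this trade-off (all numbers below are assumptions, not measurements):

# illustrative budget for a model with a 4,096-token context window
context_window = 4096
completion_budget = 256   # tokens reserved for the model's answer
prefix_and_suffix = 60    # rough size of instructions plus the query slot
tokens_per_example = 25   # rough size of one User/AI example pair

# upper bound on how many examples can fit; every extra token also adds cost
max_examples = (context_window - completion_budget - prefix_and_suffix) // tokens_per_example
print(max_examples)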

The FewShotPromptTemplate allows us to vary the number of included examples based on these variables. First, we create a more extensive list of examples:

examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }, {
        "query": "What is the meaning of life?",
        "answer": "42"
    }, {
        "query": "What is the weather like today?",
        "answer": "Cloudy with a chance of memes."
    }, {
        "query": "What is your favorite movie?",
        "answer": "Terminator"
    }, {
        "query": "Who is your best friend?",
        "answer": "Siri. We have spirited debates about the meaning of life."
    }, {
        "query": "What should I do today?",
        "answer": "Stop talking to chatbots on the internet and go outside."
    }
]

After this, rather than passing the examples directly to the FewShotPromptTemplate, we use a LengthBasedExampleSelector, like so:

from langchain.prompts.example_selector import LengthBasedExampleSelector

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50  # this sets the max length that examples should be
)

Note that max_length is measured in words, counted by splitting the string on spaces and newlines. The logic is similar to the following:

import re

some_text = "There are a total of 8 words here.\nPlus 6 here, totaling 14 words."

words = re.split('[\n ]', some_text)
print(words, len(words))

['There', 'are', 'a', 'total', 'of', '8', 'words', 'here.', 'Plus', '6', 'here,', 'totaling', '14', 'words.'] 14
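We can also inspect the selector directly: its select_examples method returns the examples that fit within max_length for a given input (a quick sketch using the example_selector created above):

# check which examples fit the length budget for a short query
selected = example_selector.select_examples({"query": "How do birds fly?"})
print(len(selected))  # more examples fit when the query is short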

We can then pass the example_selector to the FewShotPromptTemplate to create a new, dynamic prompt template:

# now create the few shot prompt template
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)

Now, if we pass a shorter or longer query, we should see the number of included examples vary.

print(dynamic_prompt_template.format(query="How do birds fly?"))
The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the users' questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: What time is it?
AI: It's time to get a watch.


User: What is the meaning of life?
AI: 42


User: What is the weather like today?
AI: Cloudy with a chance of memes.


User: How do birds fly?
AI: 

Passing a longer question, on the other hand, results in fewer examples being included:

query = """If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?"""

print(dynamic_prompt_template.format(query=query))
The following are excerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative and funny responses to the users' questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?
AI: 

In the example above, fewer examples end up in the prompt variable. This lets us limit excessive token usage and avoid errors caused by exceeding the LLM's maximum context window.
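And, as before, the formatted prompt can be passed straight to the LLM (reusing the openai object from earlier):

# generate a completion from the dynamically assembled few-shot prompt
print(openai(dynamic_prompt_template.format(query=query)))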


Clearly, prompts are an essential component of working with LLMs. It is well worth exploring the tooling that LangChain provides and getting familiar with different prompt engineering techniques.

So far, we've covered only a few examples of the prompt tooling available in LangChain, along with a limited exploration of how it can be used. In the next chapter, we'll explore another essential part of LangChain: chains. There we'll see more about how prompt templates are used and how they fit together with the library's other components.
