Basic Usage of LangChain, an Integration Tool for Large Language Models

Article link: https://blog.csdn.net/weixin_43480889/article/details/130769828

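LangChain is an integration tool for large language models. It covers data processing and data access, and it can call the OpenAI API, so you can use it from code to adapt a ChatGPT-style assistant to your own needs.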

Calling the OpenAI API to answer a question

First set your OpenAI API key, which you obtain by registering an OpenAI account.

import os
os.environ["OPENAI_API_KEY"] = "..."

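You can now call the OpenAI API with your own account. In the example below, the system message tells the model to act as a mature, steady man, and the human message asks how to tell his wife that he has to work late and cannot come home early.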

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
chat = ChatOpenAI(temperature=.7)
res = chat(
    [
        SystemMessage(content="你现在是一个成熟稳重的男人"),
        HumanMessage(content="今天我需要加班，不能早回家陪老婆了，我应该怎么和她说")
    ]
)
print(res)

AIMessage(content=‘你可以这样和你的老婆说：\n\n“亲爱的，今天有些意外的情况发生，我需要加班一段时间，不能早回家陪你了，真的很抱歉。但是我会尽快完成工作，然后回家陪你，我们一起度过愉快的时光。”\n\n同时，你可以主动提出一些弥补措施，比如让她知道你在乎她，问她是否需要你帮忙解决一些问题，或者建议她做些自己喜欢的事情，让她度过一个美好的夜晚。’, additional_kwargs={}, example=False)

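You can even pass earlier turns of the conversation as context. Here the system message casts the model as a rebellious office worker, the human asks how to refuse overtime, an earlier AI message flatly refuses to work overtime, and the human asks the same question again: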

res = chat(
    [
        SystemMessage(content="你现在是一个社会打工人，但是你有反骨"),
        HumanMessage(content="老板让我加班，但是我不想加，我应该怎么和他说？"),
        AIMessage(content="我不会加班的，就不加就不加，我不会加班的"),
        HumanMessage(content="老板让我加班，但是我不想加，我应该怎么和他说？")
    ]
)
print(res)

AIMessage(content=‘你可以坦诚地告诉老板你的想法和担忧，如：你已经很努力地工作了，需要休息；你有其他的事情需要处理，无法加班；或者你觉得加班对你的身体和心理健康不利。同时，你可以提出一些解决办法，如：推迟工作的截止日期，或者请其他同事帮忙完成任务。最重要的是，你需要保持沟通和协商的态度，避免产生矛盾和冲突。’, additional_kwargs={}, example=False)

Converting text to vectors

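Turning text into vectors mainly serves downstream tasks, such as finding two related passages or retrieving the answer most relevant to a question. Under the hood these tasks compute similarity between vectors, so the input text has to be converted into a vector first.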

from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
text = "我不加班！"
embedding_vector = embeddings.embed_query(text)
print(f"Vector dimension {len(embedding_vector)}")
print(f"Vector {embedding_vector}")

Vector dimension 1536
Vector [-0.025862179696559906, -0.00822234433144331, -0.012682917527854443, …]
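
The similarity computation mentioned above can be sketched in plain Python. This is a minimal illustration, not part of LangChain: cosine_similarity is a hypothetical helper, and the 3-dimensional toy vectors stand in for real 1536-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors (assumption): real embeddings would come from embed_query().
query = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # points in the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

print(cosine_similarity(query, doc_close))  # close to 1: very similar
print(cosine_similarity(query, doc_far))    # 0: unrelated
```

A score near 1 means the two texts point in nearly the same direction in embedding space, which is how "related" passages are usually ranked.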

Prompt templates

The prompt is a key part of the input to an LLM: the better the prompt is written, the better the generated output tends to be.

A prompt template refers to a reproducible way to generate a prompt. It contains a text string (“the template”), that can take in a set of parameters from the end user and generate a prompt.

The API provides many templates; two of them are shown here: PromptTemplate and ChatPromptTemplate.

from langchain.prompts import PromptTemplate, ChatPromptTemplate
string_prompt = PromptTemplate.from_template("tell me a joke about {subject}")
chat_prompt = ChatPromptTemplate.from_template("tell me a joke about {subject}")
# Pass arguments into the templates
string_prompt_value = string_prompt.format_prompt(subject="soccer")
chat_prompt_value = chat_prompt.format_prompt(subject="soccer")
# Inspect the formatted prompts
string_prompt_value.to_string()  # plain text prompt
chat_prompt_value.to_string()  # prompt used for chat models

'tell me a joke about soccer'
'Human: tell me a joke about soccer'

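The subject argument of format_prompt(subject="...") supplies the word that is combined with the template to form a prompt. The two templates produce different prompts: the chat template adds a "Human" prefix.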

PromptTemplate

PromptTemplate can also be instantiated directly as an object:

template = """
I want you to act as a naming consultant for new companies.
What is a good name for a company that makes {product}?
"""
prompt = PromptTemplate(
    input_variables=["product"],
    template=template,
)
prompt.format(product="colorful socks")

I want you to act as a naming consultant for new companies. What is a good name for a company that makes colorful socks?

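The {product} placeholder in the template is replaceable: input_variables=["product"] declares the variable name, and the value is supplied when format() is called. A template can have more than one placeholder: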

# An example prompt with multiple input variables
multiple_input_prompt = PromptTemplate(
    input_variables=["adjective", "content"], 
    template="Tell me a {adjective} joke about {content}."
)
multiple_input_prompt.format(adjective="funny", content="chickens")
# -> "Tell me a funny joke about chickens."

By default, PromptTemplate validates the template string by checking that input_variables matches the variables defined in the template. Set validate_template to False to disable this check:

template = "I am learning langchain because {reason}."
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["reason", "foo"], 
                                 validate_template=False) # No error
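
To make the validation idea concrete, here is a minimal sketch of such a check using only the standard library. It is an assumption about how a check of this kind could work, not LangChain's actual implementation; template_variables and check_template are hypothetical names.

```python
from string import Formatter

def template_variables(template):
    """Return the set of {placeholder} names in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}

def check_template(template, input_variables):
    """Raise ValueError if declared variables and placeholders differ."""
    found = template_variables(template)
    declared = set(input_variables)
    if found != declared:
        raise ValueError(
            f"missing: {sorted(found - declared)}, extra: {sorted(declared - found)}"
        )

template = "I am learning langchain because {reason}."
check_template(template, ["reason"])  # matches, so nothing happens

try:
    check_template(template, ["reason", "foo"])  # 'foo' is not in the template
except ValueError as err:
    print("validation failed:", err)
```

Skipping a check like this is what validate_template=False amounts to: the mismatch is simply not reported at construction time.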

A template object can be saved to a file and loaded back, that is, it supports serialization:

template = "I am learning langchain because {reason}."
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["reason"])
prompt_template.save("awesome_prompt.json") # Save to JSON file

from langchain.prompts import load_prompt
loaded_prompt = load_prompt("awesome_prompt.json")   # Load the template back

assert prompt_template == loaded_prompt
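
The save/load round trip above can be pictured with plain JSON. The sketch below is only an illustration: the dictionary layout is a hypothetical stand-in and is not claimed to be the file format LangChain writes.

```python
import json
import os
import tempfile

# Hypothetical on-disk representation of a prompt template (assumption).
prompt = {
    "template": "I am learning langchain because {reason}.",
    "input_variables": ["reason"],
}

path = os.path.join(tempfile.mkdtemp(), "awesome_prompt.json")

# Save to a JSON file
with open(path, "w", encoding="utf-8") as f:
    json.dump(prompt, f)

# Load it back
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

assert loaded == prompt  # the round trip preserves the template
print(loaded["template"].format(reason="it is fun"))
```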

Chat Prompt Template

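The official documentation encourages using this kind of template with chat models. It lets you define a prompt template for each role. LangChain has three roles: AI, the LLM (think of ChatGPT), which answers the human's questions; System, which sets the overall context and the AI's persona; and Human, who poses the questions and requests.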

You are encouraged to use these chat related prompt templates instead of PromptTemplate when querying chat models to fully exploit the potential of underlying chat model.

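A Chat Prompt Template is built from three kinds of MessagePromptTemplate: SystemMessagePromptTemplate, AIMessagePromptTemplate, and HumanMessagePromptTemplate.

SystemMessagePromptTemplate sets the overall premise: it defines the context of the whole prompt or the persona the AI takes on.
HumanMessagePromptTemplate is the template for the question the human asks.
AIMessagePromptTemplate is the template for an AI turn. It produces an AIMessage that is placed in the prompt, for example as an earlier reply or a worked example; it does not force the LLM to answer in that form.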

from langchain.prompts import (
    ChatPromptTemplate,
    PromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

Every MessagePromptTemplate can be built with the from_template method.

template="You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template="{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

The SystemMessagePromptTemplate fixes the AI's persona as a translator. The example above builds two MessagePromptTemplate objects, which can then be combined into a Chat Prompt Template:

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

# format the template into a list of messages that can be passed to a chat model
chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages()

[SystemMessage(content='You are a helpful assistant that translates English to French.', additional_kwargs={}),
HumanMessage(content='I love programming.', additional_kwargs={})]

To see what the filled-in template looks like as text, call format(), which renders it as a single string:

output = chat_prompt.format(input_language="English", output_language="French", text="I love programming.")
output

'System: You are a helpful assistant that translates English to French.\nHuman: I love programming.'

If the purpose of this template is still unclear, it helps to rewrite the template and its roles as the text you would type into ChatGPT:

You are now a translator that translates English into French.
My input is: I love programming.
AI output: J'aime programmer

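Typing the first two sentences into ChatGPT gets "I love programming" translated into the French "J'aime programmer". The SystemMessage is the first sentence: it defines the AI's persona, in other words the task it is about to perform. The HumanMessage is the human's specific request within that task, and the AIMessage is the AI's answer to the human's question. The example in the first section used exactly these three kinds of message to query GPT. The MessagePromptTemplate classes are the templates for these three roles: they define how each message is worded.

For an LLM to produce good results, the prompt has to be detailed and complete. Vague prompts often lead to wrong answers, as anyone who has used ChatGPT will have noticed.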
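
The relationship between role templates, filled-in messages, and the final text can be sketched without LangChain. In this minimal illustration messages are plain (role, text) tuples rather than LangChain message objects, and format_chat, to_string, and ROLE_PREFIX are hypothetical names.

```python
# Display prefix for each role (assumption: mirrors the "System:" / "Human:" output shown earlier).
ROLE_PREFIX = {"system": "System", "human": "Human", "ai": "AI"}

def format_chat(templates, **variables):
    """Fill every (role, template) pair and return a list of (role, text) messages."""
    return [(role, text.format(**variables)) for role, text in templates]

def to_string(messages):
    """Render messages as one string, one 'Role: text' line per message."""
    return "\n".join(f"{ROLE_PREFIX[role]}: {text}" for role, text in messages)

templates = [
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{text}"),
]

messages = format_chat(
    templates,
    input_language="English",
    output_language="French",
    text="I love programming.",
)
print(to_string(messages))
```

Keeping the role next to each piece of text is the point of the design: the system instruction, the human request, and any AI turns stay distinguishable all the way to the model.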

Keeping context: the Memory feature

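Just as when you use ChatGPT in its web interface, the AI can take earlier turns into account when answering, so it has a form of memory.
For example, I ask a question. In the code below, the stored AI message says "remember to play LOL tonight", and the user asks which champions give a top-lane player the best chance of winning: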

from langchain.memory import ChatMessageHistory
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0)

history = ChatMessageHistory()

history.add_ai_message("今天晚上记得打LOL")

history.add_user_message("我是一个上单玩家，选什么样的英雄我更容易获胜")

ai_response = chat(history.messages)

The AI then replies:

AIMessage(content=‘作为上单玩家，你可以选择一些具有较强的单挑能力和生存能力的英雄，以下是一些建议：\n\n1. 坦克型英雄：如瑞兹、蒙多、奥恩等，他们拥有较高的生命值和抗性，可以在团战中承受更多的伤害，同时也能提供控制和支援。\n\n2. 战士型英雄：如伊泽瑞尔、赵信、李青等，他们拥有较高的伤害输出和生存能力，可以在单挑和团战中发挥重要作用。\n\n3. 刺客型英雄：如劫、卡蜜尔、亚索等，他们拥有极高的爆发伤害和机动性，可以在单挑和团战中快速击杀敌方核心。\n\n总的来说，选择适合自己风格的英雄，并且熟练掌握他们的技能和战术，才能更容易获得胜利。’, additional_kwargs={}, example=False)

Save the reply to the history:

history.add_ai_message(ai_response.content)

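When I ask a follow-up (here, "but I don't like playing tank champions"), the model answers with the earlier context in mind: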

history.add_user_message("可是我不喜欢玩坦克英雄")
ai_response = chat(history.messages)

AIMessage(content=‘如果你不喜欢玩坦克英雄，那么你可以选择一些具有较高的单挑能力和机动性的英雄，以下是一些建议：\n\n1. 斗士型英雄：如劫、卡蜜尔、亚索等，他们拥有较高的伤害输出和机动性，可以在单挑和团战中发挥重要作用。\n\n2. 刺客型英雄：如劫、卡蜜尔、亚索等，他们拥有极高的爆发伤害和机动性，可以在单挑和团战中快速击杀敌方核心。\n\n3. 法师型英雄：如卡尔玛、卡萨丁、伊芙琳等，他们拥有较高的法术伤害和控制能力，可以在团战中提供强大的输出和支援。\n\n总的来说，选择适合自己风格的英雄，并且熟练掌握他们的技能和战术，才能更容易获得胜利。’, additional_kwargs={}, example=False)
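
The pattern used above, appending every turn to a history and sending the whole history on each call, can be sketched in plain Python. fake_chat is a hypothetical stand-in for a chat model so the example runs without an API key; ask and history are illustrative names as well.

```python
def fake_chat(messages):
    """Stand-in for a chat model: it only reports how much context it received."""
    return f"reply generated from {len(messages)} message(s) of context"

history = []  # list of (role, content) tuples, oldest first

def ask(question):
    """Append the question, call the model with the full history, store the reply."""
    history.append(("human", question))
    answer = fake_chat(history)
    history.append(("ai", answer))
    return answer

first = ask("Which champions suit a top-lane player?")
second = ask("But I do not like playing tanks.")

print(first)   # the model saw only the first question
print(second)  # the model saw the first question, its own reply, and the follow-up
```

The model itself stays stateless in this sketch; the "memory" is just the growing list that is passed in again on every call.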

Chains

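A Chain lets us combine several components into a single, coherent application. For example, we can create a Chain that takes user input, formats it with a PromptTemplate, and passes the formatted prompt to an LLM. More complex Chains can be built by combining several Chains, or by combining Chains with other components.
Think of it as a sequence of operations strung together. Typically we build a prompt -> call the LLM -> process the data -> return the result, and that sequence is a Chain.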
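
The "string of operations" idea can be sketched as ordinary function composition. This is an illustration of the concept only, not how LangChain implements chains; make_chain, build_prompt, fake_llm, and clean are hypothetical names, and fake_llm stands in for a real model call.

```python
def make_chain(*steps):
    """Return a function that feeds its input through each step in order."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

def build_prompt(product):
    """Step 1: build the prompt from user input."""
    return f"What is a good name for a company that makes {product}?"

def fake_llm(prompt):
    """Step 2: stand-in for an LLM call; it just echoes the prompt with padding."""
    return f"  [model output for: {prompt}]  "

def clean(text):
    """Step 3: post-process the raw model output."""
    return text.strip()

chain = make_chain(build_prompt, fake_llm, clean)
print(chain("colorful socks"))
```

Because each step only depends on the previous step's output, steps can be swapped or whole chains nested as a single step.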

LLMChain

LLMChain is a simple chain: it takes a prompt template, formats it with the user's input, and returns the response from the LLM. For example:

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
# Create the model
llm = OpenAI(temperature=0.9)
# Create the prompt template
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
# Create the Chain
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run("colorful socks"))

Colorful Toes Co.

Multiple variables are supported as well:

prompt = PromptTemplate(
    input_variables=["company", "product"],
    template="What is a good name for {company} that makes {product}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run({
    'company': "ABC Startup",
    'product': "colorful socks"
    }))