LangChain Expression Language: How-to (Part 2)

Table of Contents

Introduction to LangChain Expression Language

A Tour of LangChain Expression Language Examples

Create a runnable with the `@chain` decorator: chain chain

Add fallbacks: how to fall back when things go wrong

Stream custom generator functions: give the answer a streaming feel

Inspect your runnables: what your chain looks like

Add message history (memory): a chain chain with history


Introduction to LangChain Expression Language

Ta-da, I'm back! First, to keep introducing LangChain's star feature: the LangChain Expression Language, LCEL for short. As far as I can tell, it exists to save code and let programmers build LLM-based applications more easily: LangChain added new syntax for composing prompt + LLM chains. Here's the official page: LangChain Expression Language (LCEL) | 🦜️🔗 Langchain
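To give a feel for what that looks like, here's a minimal sketch of an LCEL chain; the prompt text is my own placeholder, not from the docs:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# prompt + LLM + output parser, composed with the `|` operator
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | ChatOpenAI() | StrOutputParser()
chain.invoke({"topic": "bears"})  # returns a plain string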

The examples in this post come mainly from the official How-to guides (How to | 🦜️🔗 Langchain). I'm currently between jobs and studying at home (I do work in NLP, after all), so I went through the code with my own understanding; if anything is wrong, feel free to comment. This is Part 2; where there's a two there must be a one, though there may never be a three. To keep things readable, yes, I split the material into two posts: "LangChain Expression Language: How-to (Part 1)" and "LangChain Expression Language: How-to (Part 2)". Now let's take a quick look at the examples in this one.

A Tour of LangChain Expression Language Examples

Create a runnable with the `@chain` decorator: chain chain

You can turn a plain function into a chain with the `@chain` decorator. Here is the official snippet, with the imports and the two prompts it relies on filled in so it actually runs:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import chain
from langchain_openai import ChatOpenAI

prompt1 = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
prompt2 = ChatPromptTemplate.from_template("What is the subject of this joke: {joke}")

@chain
def custom_chain(text):
    prompt_val1 = prompt1.invoke({"topic": text})
    output1 = ChatOpenAI().invoke(prompt_val1)
    parsed_output1 = StrOutputParser().invoke(output1)
    chain2 = prompt2 | ChatOpenAI() | StrOutputParser()
    return chain2.invoke({"joke": parsed_output1})

Add fallbacks: how to fall back when things go wrong

Since LLM calls can hit API errors, this section's examples show how to fall back when an error occurs.

For example, if openai_llm fails because of an API rate limit, with_fallbacks lets you fall back to anthropic_llm.

from unittest.mock import patch

import httpx
from openai import RateLimitError

# Model classes used below (on newer versions ChatAnthropic lives in langchain_anthropic)
from langchain_community.chat_models import ChatAnthropic
from langchain_openai import ChatOpenAI

request = httpx.Request("GET", "/")
response = httpx.Response(200, request=request)
error = RateLimitError("rate limit", response=response, body="")

# Note that we set max_retries = 0 to avoid retrying on RateLimits, etc
openai_llm = ChatOpenAI(max_retries=0)
anthropic_llm = ChatAnthropic()
llm = openai_llm.with_fallbacks([anthropic_llm])

# Let's use just the OpenAI LLM first, to show that we run into an error
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(openai_llm.invoke("Why did the chicken cross the road?"))
    except RateLimitError:
        print("Hit error")
# Now let's try with fallbacks to Anthropic
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(llm.invoke("Why did the chicken cross the road?"))
    except RateLimitError:
        print("Hit error")

# The same try/except works when invoking a chain built on the fallback LLM

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're a nice assistant who always includes a compliment in your response",
        ),
        ("human", "Why did the {animal} cross the road"),
    ]
)
chain = prompt | llm
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(chain.invoke({"animal": "kangaroo"}))
    except RateLimitError:
        print("Hit error")

Fallbacks also work at the chain level: if one chain fails, you can fall back to another chain.

# First let's create a chain with a ChatModel
# We add in a string output parser here so the outputs between the two are the same type
from langchain_core.output_parsers import StrOutputParser
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI

chat_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're a nice assistant who always includes a compliment in your response",
        ),
        ("human", "Why did the {animal} cross the road"),
    ]
)
# Here we're going to use a bad model name to easily create a chain that will error
chat_model = ChatOpenAI(model_name="gpt-fake")
bad_chain = chat_prompt | chat_model | StrOutputParser()

# Now let's create a chain with the normal OpenAI model



prompt_template = """Instructions: You should always include a compliment in your response.

Question: Why did the {animal} cross the road?"""
prompt = PromptTemplate.from_template(prompt_template)
llm = OpenAI()
good_chain = prompt | llm

# The final chain: with_fallbacks makes it fall back to good_chain when bad_chain errors.
# We can now create a final chain which combines the two
chain = bad_chain.with_fallbacks([good_chain])
chain.invoke({"animal": "turtle"})

Stream custom generator functions: give the answer a streaming feel

As the title says, this section's example is about streaming with custom generator functions. The output here is a single string, but it contains several answers; a custom parser that splits the string on the separator turns it into a list of strings as chunks arrive, which gives it that streaming flavor. The docs give a sync version and an async version:

from typing import Iterator, List
from langchain.prompts.chat import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Write a comma-separated list of 5 animals similar to: {animal}"
)
model = ChatOpenAI(temperature=0.0)

str_chain = prompt | model | StrOutputParser()
str_chain.invoke({"animal": "bear"})
# Result: 'lion, tiger, wolf, gorilla, panda'


# Sync version
# This is a custom parser that splits an iterator of llm tokens
# into a list of strings separated by commas
def split_into_list(input: Iterator[str]) -> Iterator[List[str]]:
    # hold partial input until we get a comma
    buffer = ""
    for chunk in input:
        # add current chunk to buffer
        buffer += chunk
        # while there are commas in the buffer
        while "," in buffer:
            # split buffer on comma
            comma_index = buffer.index(",")
            # yield everything before the comma
            yield [buffer[:comma_index].strip()]
            # save the rest for the next iteration
            buffer = buffer[comma_index + 1 :]
    # yield the last chunk
    yield [buffer.strip()]

list_chain = str_chain | split_into_list
list_chain.invoke({"animal": "bear"})
# And we get a list: ['lion', 'tiger', 'wolf', 'gorilla', 'panda']
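invoke returns the whole list in one go; the point of a generator parser is that the chain can also stream. A quick sketch using the standard Runnable stream method:

# Each comma-separated item is yielded as soon as it is complete
for chunk in list_chain.stream({"animal": "bear"}):
    print(chunk, flush=True)
# prints ['lion'], ['tiger'], ... one at a time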

# Async version
from typing import AsyncIterator


async def asplit_into_list(
    input: AsyncIterator[str],
) -> AsyncIterator[List[str]]:  # async def
    buffer = ""
    async for (
        chunk
    ) in input:  # `input` is a `async_generator` object, so use `async for`
        buffer += chunk
        while "," in buffer:
            comma_index = buffer.index(",")
            yield [buffer[:comma_index].strip()]
            buffer = buffer[comma_index + 1 :]
    yield [buffer.strip()]


list_chain = str_chain | asplit_into_list
await list_chain.ainvoke({"animal": "bear"})
# Same list as before: ['lion', 'tiger', 'wolf', 'gorilla', 'panda']
# To learn more about async, look up Python's async and await
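The async version streams too, via astream. A sketch, wrapped in asyncio.run so it works as a plain script (in a notebook you can `async for` at the top level, just like the `await` above):

import asyncio

async def main():
    async for chunk in list_chain.astream({"animal": "bear"}):
        print(chunk, flush=True)  # one-element lists again, e.g. ['lion']

asyncio.run(main())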

Inspect your runnables: what your chain looks like

A few methods let you inspect a chain you have built: chain.get_graph() returns the chain's structure as a graph, chain.get_graph().print_ascii() prints that graph, and chain.get_prompts() returns the prompts used in the chain. For example:
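A minimal sketch on a toy chain of my own (the prompt and model are placeholders; print_ascii needs the grandalf package installed):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | ChatOpenAI() | StrOutputParser()

chain.get_graph()                # the chain's structure as a graph of nodes and edges
chain.get_graph().print_ascii()  # draw that graph in ASCII (pip install grandalf)
chain.get_prompts()              # the prompt templates used anywhere in the chain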

Add message history (memory): a chain chain with history

Many real LLM applications are multi-turn conversations, or need to look back at earlier context, and here the docs show how to add message history. First, quoting the docs, the input can be any of these three kinds:

  • a sequence of BaseMessage
  • a dict with a key that takes a sequence of BaseMessage
  • a dict with a key that takes the latest message(s) as a string or sequence of BaseMessage, and a separate key that takes historical messages

The output is likewise one of three kinds:

  • a string that can be treated as the contents of an AIMessage
  • a sequence of BaseMessage
  • a dict with a key that contains a sequence of BaseMessage

The docs construct a get_session_history function that stores history in a dict, with each session stored and looked up by its session ID.

# Define the chain, named `runnable`: an assistant bot
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai.chat_models import ChatOpenAI

model = ChatOpenAI()

# Note the prompt is built with from_messages here
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're an assistant who's good at {ability}. Respond in 20 words or fewer",
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)
runnable = prompt | model


from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

# Store the session history, then return it
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(
    runnable,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

# When invoking with_message_history,
# the configurable session_id tells it which history record to use and update
with_message_history.invoke(
    {"ability": "math", "input": "What does cosine mean?"},
    config={"configurable": {"session_id": "abc123"}},
)
# Returns AIMessage(content='Cosine is a trigonometric function that calculates the ratio of the adjacent side to the hypotenuse of a right triangle.')

# The chain remembering within the same session
# Remembers
with_message_history.invoke(
    {"ability": "math", "input": "What?"},
    config={"configurable": {"session_id": "abc123"}},
)
# Returns AIMessage(content='Cosine is a mathematical function used to calculate the length of a side in a right triangle.')

# Switch to a new session and it forgets
# New session_id --> does not remember.
with_message_history.invoke(
    {"ability": "math", "input": "What?"},
    config={"configurable": {"session_id": "def234"}},
)
# Returns AIMessage(content='I can help with math problems. What do you need assistance with?')

Besides keying history on a session ID, you can also key it on a user ID plus a conversation ID. That's what history_factory_config is for.

from langchain_core.runnables import ConfigurableFieldSpec

store = {}


def get_session_history(user_id: str, conversation_id: str) -> BaseChatMessageHistory:
    if (user_id, conversation_id) not in store:
        store[(user_id, conversation_id)] = ChatMessageHistory()
    return store[(user_id, conversation_id)]


with_message_history = RunnableWithMessageHistory(
    runnable,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="conversation_id",
            annotation=str,
            name="Conversation ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
)

# At invoke time, pass both IDs through config like this
with_message_history.invoke(
    {"ability": "math", "input": "Hello"},
    config={"configurable": {"user_id": "123", "conversation_id": "1"}},
)

The docs then walk through several other input/output combinations: messages input, dict output; messages input, messages output; and a dict with a single key for all messages input, messages output (go read those online; a sketch of the messages-in, messages-out case follows below). Since in practice you usually do want to persist conversation records, the docs also link a GitHub example showing how to store chat history locally: https://github.com/langchain-ai/langserve/blob/main/examples/chat_with_persistence_and_user/server.py. At the end of this example, LangChain plugs its own LangSmith once more: tracing the message-history chain in LangSmith shows that when the prompt is built, the history entry holds a list of two messages, the first input and the first output.
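As a taste of those variants, here is a rough sketch of the messages input, messages output case: wrap a bare chat model (no prompt, no input/history keys), so the input is a list of BaseMessage and the output is an AIMessage. Note that get_session_history takes a single session_id again here, unlike the two-key version above:

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.messages import HumanMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# No input_messages_key / history_messages_key: raw messages in, a message out
with_message_history = RunnableWithMessageHistory(ChatOpenAI(), get_session_history)

with_message_history.invoke(
    [HumanMessage(content="Hi, I'm Bob")],
    config={"configurable": {"session_id": "msgs1"}},
)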

Last of all: if running the code throws an error saying an API key is required, add the API key parameter wherever the model is defined and loaded, and remember to apply for the keys.

✅ GPT series: needs the openai_api_key parameter; apply at https://platform.openai.com/api-keys

✅ Anthropic, i.e. the Claude series released just the other day: needs the anthropic_api_key parameter; apply via the Anthropic console (https://console.anthropic.com/)

After you Create new secret key, save the key somewhere else right away, because it will never be shown to you again.
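A sketch of both ways to supply the keys (the key strings are placeholders):

import os

from langchain_openai import ChatOpenAI

# Option 1: pass the key explicitly when constructing the model
llm = ChatOpenAI(openai_api_key="sk-...")  # "sk-..." is a placeholder
# ChatAnthropic accepts anthropic_api_key="..." the same way

# Option 2: set environment variables and let the constructors read them
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."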
