LangChain Cookbook - 1
This document is based on the LangChain Conceptual Documentation.
Its goal is to introduce LangChain's components and use cases.
What is LangChain?
LangChain is a framework for developing applications powered by language models.
LangChain makes the complicated parts of working and building with AI models easier. It helps do this in two ways:
- Integration - Bring external data (such as your files, other applications, and API data) to your LLMs
- Agency - Allow your LLMs to interact with their environment via decision making. Use an LLM to help decide which action to take next.
Why LangChain?
- Components - LangChain makes it easy to swap out the abstractions and components necessary to work with language models.
- Customized Chains - LangChain provides out-of-the-box support for using and customizing "chains" - a series of actions strung together.
- Speed - The team ships quickly and stays up to date with the latest LLM features.
- Community - Great community support.
# Load environment variables (e.g. your OPENAI_API_KEY) from a .env file
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
LangChain Components
Schema - Nuts and Bolts of working with Large Language Models (LLMs)
Text
The natural-language way to interact with LLMs.
# You'll be working with simple strings (that'll soon grow in complexity!)
my_text = "What day comes after Friday?"
my_text
'What day comes after Friday?'
Chat Messages
Like text, but tagged with a message type (System, Human, AI)
- System - Helpful background context that tells the AI what to do
- Human - Messages that represent the user
- AI - Messages that show what the AI responded with
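Under the hood, these three message types correspond to the `role` field of the OpenAI chat format (system / user / assistant). A plain-Python sketch of the same conversation shape, before any LangChain classes are involved:

```python
# The same conversation expressed as role-tagged dicts, which is what
# the chat API ultimately receives.
messages = [
    {"role": "system", "content": "You are a nice AI bot that helps a user figure out what to eat"},
    {"role": "user", "content": "I like tomatoes, what should I eat?"},
]

roles = [m["role"] for m in messages]
print(roles)  # ['system', 'user']
```

LangChain's `SystemMessage`, `HumanMessage`, and `AIMessage` wrap these roles so you can stay in typed Python objects.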
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
# This is the language model we'll use. We'll talk about what we're doing in the next section
chat = ChatOpenAI(temperature=.7)
Now let's create a few messages that simulate a chat experience with a bot
chat.invoke(
[
SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
HumanMessage(content="I like tomatoes, what should I eat?")
]
).content
'You might enjoy a caprese salad with fresh tomatoes, mozzarella, basil, and balsamic glaze.'
You can also pass in more history of your chat with the AI
chat.invoke(
[
SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
HumanMessage(content="I like the beaches where should I go?"),
AIMessage(content="You should go to Nice, France"),
HumanMessage(content="What else should I do when I'm there?")
]
).content
'Explore the charming Old Town and enjoy the vibrant local markets in Nice, France.'
You can also exclude the system message if you want to
chat.invoke(
[
HumanMessage(content="What day comes after Thursday?")
]
).content
'Friday.'
Documents
An object that holds a piece of text and metadata (more information about that text)
from langchain.schema import Document
Document(page_content="This is my document. It is full of text that I've gathered from other places",
metadata={
'my_document_id' : 234234,
'my_document_source' : "The LangChain Papers",
'my_document_create_time' : 1680013019
})
Document(metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019}, page_content="This is my document. It is full of text that I've gathered from other places")
But you don't have to include metadata if you don't want to
Document(page_content="This is my document. It is full of text that I've gathered from other places")
Document(page_content="This is my document. It is full of text that I've gathered from other places")
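Conceptually, a Document is nothing more than a piece of text plus a metadata dict. A plain-Python sketch of the same shape (`SimpleDocument` is a stand-in here, not LangChain's actual class):

```python
from dataclasses import dataclass, field

@dataclass
class SimpleDocument:
    # Mirrors the two fields LangChain's Document exposes
    page_content: str
    metadata: dict = field(default_factory=dict)

doc = SimpleDocument(
    page_content="This is my document. It is full of text that I've gathered from other places",
    metadata={"my_document_source": "The LangChain Papers"},
)
print(doc.metadata["my_document_source"])  # The LangChain Papers
```

Later components (splitters, retrievers, vector stores) all pass these text-plus-metadata pairs around.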
Models - The interface to the AI brains
Language Model
A model that takes text in ➡️ and outputs text!
Chat Model
A model that takes a series of messages and returns a message output
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
chat = ChatOpenAI(temperature=1)
chat.invoke(
[
SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
HumanMessage(content="I would like to go to New York, how should I do this?")
]
)
AIMessage(content='Why did the scarecrow win an award? Because he was outstanding in his field!', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 43, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-eaef1b5e-de25-4df3-ab6a-a52fe3bca608-0')
Function Calling Model
Function calling models are similar to chat models but slightly different: they are fine-tuned to give structured-data outputs.
This comes in handy when you're making an API call to an external service or doing extraction.
chat = ChatOpenAI(model='gpt-3.5-turbo-0613', temperature=1)
output = chat.invoke(
[
SystemMessage(content="You are a helpful AI bot"),
HumanMessage(content="What’s the weather like in Boston right now?")
],
functions=[{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
]
)
output
AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{\n "location": "Boston"\n}', 'name': 'get_current_weather'}}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 91, 'total_tokens': 107}, 'model_name': 'gpt-3.5-turbo-0613', 'system_fingerprint': None, 'finish_reason': 'function_call', 'logprobs': None}, id='run-4ff96b3c-ec67-4d49-86f1-512046eaf4f4-0')
See the extra additional_kwargs that are passed back to us? We can pass those to an external API to get data. It saves the hassle of doing output parsing.
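To act on that result, you would typically `json.loads` the arguments string and route it to your own function. A minimal sketch using the additional_kwargs shown above, with a hypothetical `get_current_weather` stub standing in for a real weather API:

```python
import json

# The function_call payload exactly as it came back in additional_kwargs above
additional_kwargs = {
    "function_call": {
        "name": "get_current_weather",
        "arguments": '{\n  "location": "Boston"\n}',
    }
}

def get_current_weather(location, unit="fahrenheit"):
    # Stub: a real version would call a weather service here
    return {"location": location, "temperature": 72, "unit": unit}

call = additional_kwargs["function_call"]
args = json.loads(call["arguments"])  # {'location': 'Boston'}
result = get_current_weather(**args)
print(result["location"])  # Boston
```

The model chose the function name and filled in the arguments; your code only has to parse the JSON and dispatch.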
Text Embedding Model
Turn your text into a vector (a series of numbers that hold the semantic "meaning" of your text). Mainly used when comparing two pieces of text.
BTW: Semantic means "relating to meaning in language or logic."
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
text = "Hi! It's time for the beach"
text_embedding = embeddings.embed_query(text)
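Since embeddings are mainly used for comparing texts, the usual metric is cosine similarity. A stdlib-only sketch on short made-up vectors (real OpenAI embeddings have on the order of 1,500 dimensions; these three-number lists are just stand-ins for `embed_query` outputs):

```python
import math

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their lengths
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

beach = [0.9, 0.1, 0.3]    # made-up stand-ins for embedding vectors
ocean = [0.8, 0.2, 0.3]
winter = [-0.5, 0.9, 0.1]

# Semantically closer texts should score higher
print(cosine_similarity(beach, ocean) > cosine_similarity(beach, winter))  # True
```

Vector stores (covered later in LangChain) do essentially this comparison at scale over many stored embeddings.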