Having Fun with LangChain - 1

Latest recommended article published on 2024-10-06 21:45:11

新加坡内哥谈技术


Article tags: langchain, artificial intelligence

Article link: https://blog.csdn.net/2301_79342058/article/details/136611240



Having Fun with LangChain: A New Chapter in Cutting-Edge AI

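In this age of information overload, an open-source framework called LangChain is quietly changing how we interact with AI. With just a few lines of code, you can build a chatbot backed by a broad knowledge base, one that can follow complex context and give precise replies. Sounds like something out of a science fiction novel, doesn't it?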

So let's dive into the world of LangChain and use Google Colab to build a simple RAG (retrieval-augmented generation) application.

First, you need a Google Colab account (https://colab.research.google.com/). Then prepare your data file. For example, I prepared a file containing the following line:

Nedved yang likes to eat chicken rice

Next comes a round of setup: importing the necessary libraries, setting the OpenAI API key, and placing the data file in Google Drive.

# Import necessary libraries and modules from langchain and other packages
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.vectorstores import DocArrayInMemorySearch, FAISS
from langchain.embeddings import OpenAIEmbeddings, HuggingFaceInstructEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.indexes import VectorstoreIndexCreator
from langchain_experimental.agents.agent_toolkits.csv.base import create_csv_agent
from langchain.agents.agent_types import AgentType
import openai

# For Google Colab users, mount Google Drive to access files
from google.colab import drive
drive.mount('/content/drive/')

import os

# Request the OpenAI API key without echoing it on screen, then expose it via the environment
from getpass import getpass
api_key = getpass("OpenAI API key: ")
os.environ["OPENAI_API_KEY"] = api_key
print("OPENAI_API_KEY has been successfully configured.")

# Display utilities from IPython for enhanced output formatting
from IPython.display import display, Markdown

# Note: This code snippet assumes you're working in a Google Colab environment and requires an OpenAI API key.
# It includes mounting Google Drive for accessing files and setting up environment variables for OpenAI API access.

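Next, we load the data into a vector database: split the text into chunks, create embeddings for those chunks, and end up with a store that can be searched efficiently. The key point here is that LangChain's tools and abstractions handle each of these steps, which simplifies building applications such as chatbots or virtual agents.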

# Split the 'data.txt' file into chunks and create embeddings from those chunks.
# Ensure to check your OpenAI API quota before proceeding.

from langchain.text_splitter import CharacterTextSplitter
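# TextLoader lives in document_loaders (newer releases expose it from langchain_community.document_loaders)
from langchain.document_loaders import TextLoader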
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Define the path to your text file
text_file_path = '/content/drive/MyDrive/LCTest/data.txt'

# Load the text data from the specified file path
text_data_loader = TextLoader(file_path=text_file_path, encoding="utf-8")
text_data = text_data_loader.load()

# Initialize the text splitter with specific chunk size and overlap
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

# Split the loaded text data into chunks
chunked_data = splitter.split_documents(text_data)

# Initialize the embeddings and vector store for the chunked data
embedder = OpenAIEmbeddings()
vector_store = FAISS.from_documents(chunked_data, embedding=embedder)
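
To make the `chunk_size` and `chunk_overlap` parameters above more concrete, here is a minimal pure-Python sketch of overlapping chunking. It is a simplification, not LangChain's implementation: `CharacterTextSplitter` splits on a separator first and then merges pieces, while the hypothetical `chunk_text` helper below simply slides a fixed-size character window. Note that a one-sentence file like the one used here is far shorter than 1000 characters, so it ends up as a single chunk.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; it is not part of LangChain.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=4)
print(chunks)

# The last 4 characters of each chunk reappear at the start of the next one.
print(chunks[0][-4:], chunks[1][:4])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.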

With a handful of parameters, we can build a conversational chain that handles a dynamic, multi-turn dialogue.

# Initialize a conversational chain with a language model for dynamic conversation handling.
from langchain.chat_models import ChatOpenAI  # chat models live in chat_models, not llms
from langchain.memory import ConversationBufferMemory  # the module name is "memory" (singular)
from langchain.chains import ConversationalRetrievalChain

# Set up the language model with specific parameters for conversation generation.
language_model = ChatOpenAI(temperature=0.1, model_name="gpt-3.5-turbo")

# Configure a memory buffer to store and retrieve conversation history.
conversation_memory = ConversationBufferMemory(
    memory_key='chat_history',  # Key to identify conversation history in memory.
    return_messages=True        # Option to return previous messages in the conversation.
)

# Create a conversational retrieval chain that leverages the language model,
# a specified retriever for information retrieval, and a memory buffer for context.
conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=language_model,
    chain_type="stuff",  # "stuff" puts all retrieved chunks into one prompt; "custom" is not a supported value.
    retriever=vector_store.as_retriever(),  # Use the previously created vector store as the information retriever.
    memory=conversation_memory  # Include the conversation memory for context-aware conversations.
)
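
Conceptually, the retriever passed to the chain ranks stored chunks by how similar their vectors are to the query vector. The sketch below illustrates that idea with a toy bag-of-words vector in place of a real embedding model. Everything in it is an assumption for illustration: `embed`, `cosine`, and `retrieve` are hypothetical helpers, the second document is made-up sample data, and FAISS with OpenAI embeddings works on dense vectors rather than word counts.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


documents = [
    "Nedved yang likes to eat chicken rice",      # the line from data.txt
    "The weather in Singapore is warm all year",  # made-up distractor
]
top = retrieve("What does Nedved like to eat?", documents, k=1)
print(top)
```

A real embedding model captures meaning rather than exact word matches, which is why the actual chain can match "favorite food" to "likes to eat" even though the words differ.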

Once everything is in place, we can run a query.

# Formulate a query to find out Nedved Yang's favorite food using the conversational chain.
query_text = "What is the favorite food for Nedved Yang?"

# Execute the query through the conversation chain to obtain a response.
query_response = conversation_chain({"question": query_text})  # inputs are passed as a dict, not a query= keyword

# Extract the answer from the query response.
favorite_food_answer = query_response["answer"]

# Display the obtained answer.
favorite_food_answer

The output is:

Nedved Yang likes to eat chicken rice.

Let's try another question.

# Request suggestions for places in Singapore where Nedved Yang can make purchases.
purchase_query = "Can you suggest places in Singapore for Nedved Yang to buy?"

# Submit the query to the conversational chain and capture the response.
purchase_response = conversation_chain({"question": purchase_query})

# Extract the suggested places from the response.
suggested_places = purchase_response["answer"]

# Output the list of suggested places.
suggested_places

The output is:

Nedved Yang can buy chicken rice, his favorite food, at various hawker centers and food courts in Singapore. Some popular places to try chicken rice include Maxwell Food Centre, Tian Tian Hainanese Chicken Rice at Maxwell Road, and Chinatown Complex Food Centre.

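The memory was preserved perfectly, and the prompt containing the information retrieved from data.txt was successfully delivered to GPT-3.5 Turbo. The second question never mentions chicken rice, yet the answer does, so both the conversation history and the retrieved context clearly reached the model. It looks great.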
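
To show roughly how the history and the retrieved text end up in front of the model, here is a minimal sketch of the two prompts a conversational retrieval flow builds: one that condenses the chat history plus the follow-up into a standalone question, and one that "stuffs" the retrieved chunks into the answering prompt. The template wording and the helper names (`format_history`, `build_condense_prompt`, `build_answer_prompt`) are assumptions for illustration and are not LangChain's actual prompt templates.

```python
def format_history(history: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a plain-text transcript."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in history)


def build_condense_prompt(history: list[tuple[str, str]], follow_up: str) -> str:
    """Step 1: ask the model to rewrite the follow-up as a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up as a standalone question.\n\n"
        f"Chat history:\n{format_history(history)}\n\n"
        f"Follow-up: {follow_up}\n"
        "Standalone question:"
    )


def build_answer_prompt(context_docs: list[str], question: str) -> str:
    """Step 2: place every retrieved chunk into a single answering prompt."""
    context = "\n\n".join(context_docs)
    return (
        "Answer the question using the context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


history = [
    ("What is the favorite food for Nedved Yang?", "Nedved Yang likes to eat chicken rice."),
]
condense_prompt = build_condense_prompt(
    history, "Can you suggest places in Singapore for Nedved Yang to buy?"
)
answer_prompt = build_answer_prompt(
    ["Nedved yang likes to eat chicken rice"],
    "<standalone question produced by the model>",  # placeholder, not real model output
)
print(condense_prompt)
print(answer_prompt)
```

Because the first turn is part of the condensing prompt, the model can work out that "to buy" refers to chicken rice before any retrieval happens.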

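This experiment shows that LangChain is not only a powerful tool but also one that opens up a whole new range of possibilities. Whether you are an AI researcher or a dialogue-system enthusiast, LangChain can make your projects a lot more fun. Now is the best time to explore and experiment, so roll up your sleeves and see what you can create with LangChain!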