Step-by-Step: Building an Intelligent Question-Answering System with a Vector Database + LangChain (Zilliz Cloud, Milvus)
Question Answering over Documents with Zilliz Cloud and LangChain
The code in this post mainly follows the second article above, with the first video as a supplement.
If you run into package-installation problems, try the following commands:
pip install langchain
pip install -U langchain-community
pip install openai
pip install pymilvus
pip install -U langchain-openai
The following code is run in a Jupyter notebook:
!python -m pip install --upgrade pymilvus langchain openai tiktoken
I'm using a gpt-4 relay endpoint for the embedding model here; the gpt-3.5 relay provider I asked doesn't support embeddings.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Zilliz
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate
import os
# 1. Set up the name of the collection to be created.
COLLECTION_NAME = 'doc_qa_db'
# 2. Set up the dimension of the embeddings.
# (OpenAI's default text-embedding-ada-002 model produces 1536-dimensional vectors.)
DIMENSION = 1536
# 3. Set up the OpenAI API key.
OPENAI_API_KEY = "sk-sy…………0a2e"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
# 4. Set up the connection parameters for your Zilliz Cloud cluster.
URI = 'https://in05-6……3.serverless.ali-cn-hangzhou.cloud.zilliz.com.cn'
# 5. Set up the token for your Zilliz Cloud cluster.
# You can either use an API key or a set of cluster username and password joined by a colon.
TOKEN = 'e……cda'
# Use the WebBaseLoader to load specified web pages into documents
loader = WebBaseLoader([
'https://milvus.io/docs/overview.md',
])
docs = loader.load()
# Split the documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=0)
all_splits = text_splitter.split_documents(docs)
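To build intuition for what chunk_size and chunk_overlap control, here is a minimal pure-Python sketch of a character-window splitter. This is not the actual RecursiveCharacterTextSplitter, which additionally tries to break on separators ("\n\n", "\n", " ") so chunks end at natural boundaries; it only illustrates the windowing arithmetic.

```python
def split_text(text, chunk_size=1024, chunk_overlap=0):
    """Naive character-window splitter illustrating chunk_size / chunk_overlap.

    Each chunk holds at most chunk_size characters; consecutive chunks
    share chunk_overlap characters of context.
    """
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("a" * 2500, chunk_size=1024, chunk_overlap=0)
print([len(c) for c in chunks])  # [1024, 1024, 452]
```

With chunk_overlap > 0, adjacent chunks repeat some text so that a sentence cut at a chunk boundary still appears whole in at least one chunk.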
I'm going through a relay endpoint, so I use the base_url parameter.
from langchain_openai import OpenAIEmbeddings
# Your custom base URL
custom_base_url = 'https://api.xiaoai.plus/v1'
# If the OpenAIEmbeddings class accepts a base_url parameter
embeddings = OpenAIEmbeddings(base_url=custom_base_url)
connection_args = { 'uri': URI, 'token': TOKEN }
vector_store = Zilliz.from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
    drop_old=True,   # drop any existing collection with this name
    auto_id=True     # let Milvus auto-generate primary keys (add this line)
)
query = "What are the main components of Milvus?"
docs = vector_store.similarity_search(query)
print(len(docs))
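Under the hood, similarity_search embeds the query and ranks the stored vectors by how close they are to it. A toy sketch of cosine-similarity ranking, with made-up 3-dimensional "embeddings" (real OpenAI embeddings have 1536 dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical tiny embeddings for three stored chunks and one query.
store = {
    "milvus components": [0.9, 0.1, 0.0],
    "pricing page":      [0.1, 0.9, 0.0],
    "release notes":     [0.2, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

ranked = sorted(store, key=lambda k: cosine(store[k], query_vec), reverse=True)
print(ranked[0])  # "milvus components" is the closest chunk
```

The vector database does the same ranking, but over millions of vectors with an approximate index instead of a brute-force loop.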
Here I switch back to gpt-3.5.
# Note: OPENAI_API_BASE_URL must already be set in your environment (e.g. the relay URL above).
llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0,
    openai_api_key=os.getenv("OPENAI_API_KEY"),
    openai_api_base=os.getenv("OPENAI_API_BASE_URL"),
)
retriever = vector_store.as_retriever()
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
rag_prompt = PromptTemplate.from_template(template)
rag_chain = (
{"context": retriever, "question": RunnablePassthrough()}
| rag_prompt
| llm
)
print(rag_chain.invoke("Explain IVF_FLAT in Milvus.").content)
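One refinement worth knowing about (not in the chain above): the retriever returns a list of Document objects, and the prompt will stringify that whole list, metadata and all, into {context}. Joining just the page_content fields usually gives the LLM cleaner context. A sketch using a minimal stand-in Document class with made-up contents:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for LangChain's Document, for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs):
    """Join retrieved chunks into one plain-text context block."""
    return "\n\n".join(doc.page_content for doc in docs)

docs = [
    Document("Milvus separates compute from storage."),
    Document("IVF_FLAT partitions vectors into clusters."),
]
print(format_docs(docs))
```

In the chain it would plug in as `{"context": retriever | format_docs, "question": RunnablePassthrough()}`.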