Question Answering over Documents with Zilliz Cloud and LangChain

A step-by-step guide to building an intelligent question-answering system with a vector database + LangChain (Zilliz Cloud, Milvus).
The code in this post is mainly based on the second article, with the first video as a supplement.
If you run into missing-package errors while running the code, try the following commands:

pip install langchain
pip install -U langchain-community
pip install openai
pip install pymilvus
pip install -U langchain-openai

The code below is run in a Jupyter notebook:

!python -m pip install --upgrade pymilvus langchain openai tiktoken

I am using a relayed (proxy) GPT-4 endpoint for the embedding model; the GPT-3.5 relay provider I asked could not serve embeddings.

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Zilliz
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate

import os

# 1. Set up the name of the collection to be created.
COLLECTION_NAME = 'doc_qa_db'

# 2. Set up the dimension of the embeddings.
# OpenAI's text-embedding-ada-002 returns 1536-dimensional vectors; adjust to match your model.
DIMENSION = 1536

# 3. Set up the OpenAI API key.
OPENAI_API_KEY = "sk-sy…………0a2e"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

# 4. Set up the connection parameters for your Zilliz Cloud cluster.
URI = 'https://in05-6……3.serverless.ali-cn-hangzhou.cloud.zilliz.com.cn'

# 5. Set up the token for your Zilliz Cloud cluster.
# You can either use an API key or a set of cluster username and password joined by a colon.
TOKEN = 'e……cda'
# Use the WebBaseLoader to load specified web pages into documents
loader = WebBaseLoader([
    'https://milvus.io/docs/overview.md',
])

docs = loader.load()

# Split the documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=0)
all_splits = text_splitter.split_documents(docs)
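RecursiveCharacterTextSplitter tries a list of separators from coarse to fine (paragraph breaks, then newlines, then spaces), recursing with finer separators only on pieces that are still too large. The core idea can be sketched in plain Python (a simplified illustration, not LangChain's actual implementation, and ignoring chunk_overlap):

```python
def recursive_split(text, separators=("\n\n", "\n", " ", ""), chunk_size=1024):
    """Simplified sketch of recursive character splitting: try the coarsest
    separator first; recurse with finer ones on pieces that are too large."""
    if len(text) <= chunk_size:
        return [text]
    sep, *rest = separators
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, tuple(rest), chunk_size))
    return chunks

parts = recursive_split("para one\n\n" + "x" * 40, chunk_size=16)
print(parts[0])  # → para one
```

The real splitter also stitches small neighbouring pieces back together up to chunk_size and carries `chunk_overlap` characters between chunks, but the separator-priority recursion above is the heart of it.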

Because I go through a relay endpoint, I pass the base_url parameter:

from langchain_openai import OpenAIEmbeddings

# Your custom base URL (the relay endpoint)
custom_base_url = 'https://api.xiaoai.plus/v1'

# The OpenAIEmbeddings class accepts a base_url parameter (alias of openai_api_base)
embeddings = OpenAIEmbeddings(base_url=custom_base_url)

connection_args = { 'uri': URI, 'token': TOKEN }

# Zilliz.from_documents is a classmethod: it embeds the chunks and builds the
# collection in one call, so there is no need to instantiate Zilliz first.
vector_store = Zilliz.from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
    drop_old=True,
    auto_id=True,  # let Milvus generate primary keys automatically
)
query = "What are the main components of Milvus?"
docs = vector_store.similarity_search(query)

print(len(docs))
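Under the hood, similarity_search embeds the query with the same embedding model and returns the stored chunks whose vectors are nearest to it. A toy illustration of that nearest-neighbour step, using made-up 3-dimensional vectors and cosine similarity (Milvus itself uses approximate indexes such as IVF_FLAT over the real 1536-d vectors):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical (chunk id -> vector) pairs standing in for the embedded splits.
store = {
    "chunk_architecture": [0.9, 0.1, 0.0],
    "chunk_indexes":      [0.0, 1.0, 0.2],
    "chunk_quickstart":   [0.2, 0.1, 0.9],
}
query_vec = [1.0, 0.0, 0.1]  # pretend embedding of the query

# Rank chunks by similarity to the query; the top hit plays the role of docs[0].
top = sorted(store, key=lambda k: cosine(store[k], query_vec), reverse=True)
print(top[0])  # → chunk_architecture
```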

Here I switch back to GPT-3.5:

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0,
    openai_api_key=os.getenv("OPENAI_API_KEY"),
    openai_api_base=os.getenv("OPENAI_API_BASE_URL"),
)
retriever = vector_store.as_retriever()

template = """Use the following pieces of context to answer the question at the end. 
If you don't know the answer, just say that you don't know, don't try to make up an answer. 
Use three sentences maximum and keep the answer as concise as possible. 
Always say "thanks for asking!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
rag_prompt = PromptTemplate.from_template(template)

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
)

# ChatOpenAI returns an AIMessage; .content holds the answer text.
print(rag_chain.invoke("Explain IVF_FLAT in Milvus.").content)