In this article, we will walk through building a multimodal retrieval system with Chroma and LlamaIndex. Chroma is an AI-native, open-source vector database designed for developer productivity and happiness; this tutorial shows how to create a multimodal index in it and run retrieval against it.
Installing the Required Libraries
First, install the required packages:
pip install chromadb
pip install llama-index
pip install llama-index-vector-stores-qdrant
pip install llama-index-embeddings-huggingface
pip install llama-index-vector-stores-chroma
pip install sentence-transformers
pip install pydantic==1.10.11
pip install open-clip-torch
Creating a Chroma Index
First, let's set up the imports and configuration:
# Import the necessary libraries
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from IPython.display import Markdown, display
import chromadb
# Set the OpenAI API key
import os
import openai
OPENAI_API_KEY = "your-openai-api-key"
openai.api_key = OPENAI_API_KEY
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
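Note that the embeddings in this tutorial are computed locally by OpenCLIP, so the OpenAI key is only needed if you later plug in an OpenAI LLM for response synthesis. As a sketch of a safer pattern than hard-coding the key, you can read it from the environment instead:

```python
# Read the API key from the environment rather than hard-coding it.
# Retrieval with local OpenCLIP embeddings works even without a key.
import os

api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key:
    print("OPENAI_API_KEY not set; retrieval-only usage still works.")
```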
Downloading Wikipedia Text and Images
We will download some text and image data from Wikipedia:
import requests
from pathlib import Path
import urllib.request
def get_wikipedia_images(title):
    """Return the URLs of .jpg/.png images on the given Wikipedia page."""
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "imageinfo",
            "iiprop": "url|dimensions|mime",
            "generator": "images",
            "gimlimit": "50",
        },
    ).json()
    image_urls = []
    for page in response["query"]["pages"].values():
        # Some pages come back without an "imageinfo" entry; skip them
        # instead of raising a KeyError.
        imageinfo = page.get("imageinfo")
        if not imageinfo:
            continue
        url = imageinfo[0]["url"]
        if url.endswith(".jpg") or url.endswith(".png"):
            image_urls.append(url)
    return image_urls
image_uuid = 0
MAX_IMAGES_PER_WIKI = 20
wiki_titles = {
    "Tesla Model X",
    "Pablo Picasso",
    "Rivian",
    "The Lord of the Rings",
    "The Matrix",
    "The Simpsons",
}
data_path = Path("mixed_wiki")
if not data_path.exists():
    data_path.mkdir()
for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]
    with open(data_path / f"{title}.txt", "w", encoding="utf-8") as fp:
        fp.write(wiki_text)

    images_per_wiki = 0
    try:
        list_img_urls = get_wikipedia_images(title)
        for url in list_img_urls:
            if url.endswith(".jpg") or url.endswith(".png"):
                image_uuid += 1
                urllib.request.urlretrieve(url, data_path / f"{image_uuid}.jpg")
                images_per_wiki += 1
                # Stop once the per-page cap is reached (>= avoids an
                # off-by-one that would download one extra image).
                if images_per_wiki >= MAX_IMAGES_PER_WIKI:
                    break
    except Exception:
        print(f"No images found for Wikipedia page: {title}")
        continue
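The URL-filtering step inside get_wikipedia_images can be exercised offline against a mocked API response. The helper name filter_image_urls, the page IDs, and the sample URLs below are made up for illustration; the real function issues a live request:

```python
# A minimal, offline sketch of the URL-filtering step above, run against
# a mocked MediaWiki "imageinfo" response. Page IDs and URLs are invented.
def filter_image_urls(response):
    """Extract .jpg/.png image URLs from a MediaWiki imageinfo response."""
    image_urls = []
    for page in response["query"]["pages"].values():
        imageinfo = page.get("imageinfo")
        if not imageinfo:  # some pages have no imageinfo entry
            continue
        url = imageinfo[0]["url"]
        if url.endswith((".jpg", ".png")):
            image_urls.append(url)
    return image_urls

mock_response = {
    "query": {
        "pages": {
            "1": {"imageinfo": [{"url": "https://upload.wikimedia.org/a.jpg"}]},
            "2": {"imageinfo": [{"url": "https://upload.wikimedia.org/b.svg"}]},
            "3": {},  # page with no imageinfo at all
        }
    }
}
print(filter_image_urls(mock_response))  # only the .jpg survives
```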
Setting Up the Embedding Model
We need to set up the default text and image embedding function:
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction
embedding_function = OpenCLIPEmbeddingFunction()
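The reason one embedding function can serve both text and images is that CLIP maps both modalities into the same vector space, so retrieval can rank images against a text query by cosine similarity. The toy 3-dimensional vectors below are invented for illustration; real OpenCLIP embeddings are typically 512- or 768-dimensional:

```python
# Toy illustration of similarity ranking in a shared embedding space.
# The 3-d vectors are made up; real CLIP embeddings are much larger.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

text_vec = [0.9, 0.1, 0.2]       # pretend embedding of "Picasso painting"
image_vec = [0.8, 0.2, 0.3]      # pretend embedding of a cubist painting
unrelated_vec = [0.1, 0.9, 0.1]  # pretend embedding of a car photo

print(cosine_similarity(text_vec, image_vec))      # high: related concept
print(cosine_similarity(text_vec, unrelated_vec))  # low: unrelated concept
```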
Building the Chroma Multimodal Index
Build the Chroma multimodal index through LlamaIndex:
from chromadb.utils.data_loaders import ImageLoader
image_loader = ImageLoader()
# Create a client and a new collection
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection(
    "multimodal_collection",
    embedding_function=embedding_function,
    data_loader=image_loader,
)
# Load the documents
documents = SimpleDirectoryReader("./mixed_wiki/").load_data()
# Set up the ChromaVectorStore and load the data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)
Retrieving Results from the Multimodal Index
Finally, we retrieve results from the multimodal index:
retriever = index.as_retriever(similarity_top_k=50)
retrieval_results = retriever.retrieve("Picasso famous paintings")
from llama_index.core.schema import ImageNode
from llama_index.core.response.notebook_utils import (
    display_source_node,
    display_image_uris,
)
image_results = []
MAX_RES = 5
cnt = 0
# Split the results: collect image paths, display the first text nodes
for r in retrieval_results:
    if isinstance(r.node, ImageNode):
        image_results.append(r.node.metadata["file_path"])
    else:
        if cnt < MAX_RES:
            display_source_node(r)
        cnt += 1
display_image_uris(image_results, [3, 3], top_k=2)
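The loop above simply partitions retrieval results by node type. As a stand-alone sketch that runs without LlamaIndex, the stand-in classes TextStub and ImageStub below are hypothetical; in the real code the check is isinstance(r.node, ImageNode):

```python
# Offline sketch of partitioning retrieval results by node type.
# TextStub/ImageStub are invented stand-ins for LlamaIndex node classes.
class TextStub:
    def __init__(self, text):
        self.text = text

class ImageStub:
    def __init__(self, file_path):
        self.metadata = {"file_path": file_path}

def partition_results(nodes, max_text=5):
    """Collect all image file paths; keep at most max_text text nodes."""
    image_paths, text_nodes = [], []
    for node in nodes:
        if isinstance(node, ImageStub):
            image_paths.append(node.metadata["file_path"])
        elif len(text_nodes) < max_text:
            text_nodes.append(node)
    return image_paths, text_nodes

results = [ImageStub("1.jpg"), TextStub("Picasso ..."), ImageStub("2.jpg")]
image_paths, text_nodes = partition_results(results)
print(image_paths)  # ['1.jpg', '2.jpg']
```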
Possible Errors
- Missing dependencies: make sure all the required Python libraries are installed.
- API key errors: make sure your OpenAI API key is set correctly.
- Network issues: downloads from Wikipedia may fail because of network problems.
- File path issues: make sure the specified file paths exist and are readable and writable.
If you found this article helpful, please give it a like and follow my blog. Thanks!