RAG + Agent in Practice: Building RAG and an Agent in a Local Environment with llama-index + ollama

被玩弄的小猫咪

Last modified: 2024-10-07 23:20:09

First published: 2024-10-05 19:48:20

Article link: https://blog.csdn.net/yierbubu1212/article/details/142718139

RAG + Agent in Practice

1. Environment setup

1.1 llama-index environment

pip install llama-index

# Packages needed for ollama and chroma
pip install llama-index-llms-ollama
pip install llama-index-embeddings-ollama
pip install llama_index-vector_stores-chroma
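
Before moving on, it can help to confirm that the packages above were actually installed into the active environment. The sketch below is only an illustration using the standard library; the helper names `installed_version` and `report` are hypothetical, and the distribution names are the normalized forms of the pip commands above.

```python
from importlib import metadata
from typing import Optional

# Distribution names taken from the pip commands above (normalized with hyphens).
REQUIRED = [
    "llama-index",
    "llama-index-llms-ollama",
    "llama-index-embeddings-ollama",
    "llama-index-vector-stores-chroma",
]


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(names: list) -> dict:
    """Map each distribution name to its installed version (None if absent)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any entry reported as NOT INSTALLED points to a pip command that needs to be rerun.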

1.2 ollama environment

git clone https://www.modelscope.cn/modelscope/ollama-linux.git

cd ollama-linux
sudo chmod 777 ./ollama-modelscope-install.sh
./ollama-modelscope-install.sh

1.3 Starting ollama

ollama serve
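
Once `ollama serve` is running, a quick reachability check from Python can save debugging time later. This is a minimal sketch that assumes the server listens on ollama's default address, `http://localhost:11434`; the function name `ollama_is_up` is hypothetical.

```python
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("ollama reachable:", ollama_is_up())
```

If this prints False, check that `ollama serve` is still running in another terminal and that the port was not changed.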

2. Downloading models (ollama)

2.1 Downloading the embedding model

ollama pull yxl/m3e

2.2 Downloading the LLM

This test uses the qwen2.5-7b model.

ollama pull qwen2.5:7b

2.3 Listing the models

ollama list
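
The same check can be done programmatically through ollama's HTTP API, which exposes the local model list at `GET /api/tags`. The sketch below assumes the default server address and that the two models are stored under the names used later in this post; `local_model_names`, `missing_models` and `fetch_tags` are hypothetical helper names.

```python
import json
import urllib.request

# Model names as they are referenced in the code later in this post.
REQUIRED_MODELS = ["yxl/m3e:latest", "qwen2.5:7b"]


def local_model_names(tags_payload: dict) -> set:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return {m["name"] for m in tags_payload.get("models", [])}


def missing_models(required: list, tags_payload: dict) -> list:
    """Return the required models that are not present locally, in order."""
    present = local_model_names(tags_payload)
    return [name for name in required if name not in present]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the local model list from a running ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print("missing:", missing_models(REQUIRED_MODELS, fetch_tags()))
```

An empty list means both models are available; anything else needs another `ollama pull`.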

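2.4 Using your own fine-tuned model

This post does not cover this in detail; for reference, see:

ollama in Practice (2): Deploying and Converting GGUF-Format Models (llamacpp) - CSDN blog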

3. Installing the vector database

Chroma, an open-source vector store

pip install chromadb

4. Building RAG

Things you need to change yourself:

Model path

Document directory path

4.1 Building the vector store

import chromadb
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, get_response_synthesizer, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext


# Set the embedding model and the language model
Settings.embed_model = OllamaEmbedding(model_name="yxl/m3e:latest")
Settings.llm = Ollama(model="qwen2.5:7b", request_timeout=360)

# Load the documents
documents = SimpleDirectoryReader("docs").load_data()

# Initialize the Chroma client, storing data in the chroma_db folder under the current directory
db = chromadb.PersistentClient(path="./chroma_db")

# Get the collection named "quickstart", creating it if it does not exist
chroma_collection = db.get_or_create_collection("quickstart")

# Wrap the collection in a ChromaVectorStore so that llama_index can interact with it
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a storage context that uses the ChromaVectorStore as its vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)


# Build the index
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, transformations=[SentenceSplitter(chunk_size=256)]
)
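
To get a feel for what the `SentenceSplitter(chunk_size=256)` transformation does to the documents, here is a deliberately simplified stand-in. It is not the library's algorithm: the real splitter measures `chunk_size` in tokens and supports overlap between chunks, whereas this sketch counts characters and has no overlap. The function names are hypothetical.

```python
import re


def split_sentences(text: str) -> list:
    """Split after Chinese or Western sentence-ending punctuation."""
    parts = re.split(r"(?<=[。！？.!?])\s*", text)
    return [p for p in parts if p]


def chunk_text(text: str, chunk_size: int = 256) -> list:
    """Greedily pack whole sentences into chunks of at most chunk_size characters.

    A single sentence longer than chunk_size is kept as its own oversized chunk.
    """
    chunks = []
    current = ""
    for sentence in split_sentences(text):
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > chunk_size:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks give more precise retrieval hits but less context per hit, which is the trade-off behind choosing a value such as 256.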

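Note: when you need to embed more content into the same collection later, simply add the new content from its own folder; the old files do not need to be embedded again.

Test: embedding a 26 MB PDF file took roughly ten minutes.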
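
One way to avoid re-embedding old files is to keep a small manifest of what has already been processed. The sketch below is an assumption of mine rather than something the original setup does: it stores SHA-256 digests in a JSON file (kept outside the documents folder) and returns only files that are new or changed. All names here are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used to detect new or changed files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_manifest(manifest_path: Path) -> set:
    """Read the set of already-embedded digests (empty if no manifest yet)."""
    if manifest_path.exists():
        return set(json.loads(manifest_path.read_text()))
    return set()


def files_to_embed(folder: Path, manifest_path: Path) -> list:
    """Return files under folder whose digest is not yet in the manifest."""
    seen = load_manifest(manifest_path)
    return [p for p in sorted(folder.rglob("*"))
            if p.is_file() and file_digest(p) not in seen]


def mark_embedded(paths: list, manifest_path: Path) -> None:
    """Record the digests of files that have just been embedded."""
    seen = load_manifest(manifest_path)
    seen.update(file_digest(p) for p in paths)
    manifest_path.write_text(json.dumps(sorted(seen)))
```

The returned paths could then be passed to `SimpleDirectoryReader(input_files=...)` and indexed with the same storage context as above, followed by a call to `mark_embedded`.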

4.2 Custom query

import chromadb
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, get_response_synthesizer, Settings
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

# Set the embedding model and the language model
Settings.embed_model = OllamaEmbedding(model_name="yxl/m3e:latest")  # use the specified embedding model
Settings.llm = Ollama(model="qwen2.5:7b", request_timeout=360)  # use the specified language model

# Initialize the Chroma client, storing data in the chroma_db folder under the current directory
db = chromadb.PersistentClient(path="./chroma_db")

# Get the collection named "quickstart", creating it if it does not exist
chroma_collection = db.get_or_create_collection("quickstart")

# Wrap the collection in a ChromaVectorStore so that llama_index can interact with it
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a storage context that uses the ChromaVectorStore as its vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load the index from the stored vectors
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context
)

# Configure the retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=5,  # return the top n most similar document chunks
)

# Configure the response synthesizer
response_synthesizer = get_response_synthesizer()

# Assemble the query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

# Run the query
response = query_engine.query("Analyze the provisions on interest in loan contracts, in particular how the case is handled when interest is deducted from the principal in advance, and state which article of the Civil Code this is")
print(response)  # print the query result
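
Conceptually, the retriever embeds the query and returns the `similarity_top_k` stored chunks whose vectors are closest to it. The toy sketch below illustrates that idea with cosine similarity over plain Python lists. Cosine similarity is used here only as an illustrative metric; the metric Chroma actually applies depends on how the collection is configured, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list, chunks: dict, k: int = 5) -> list:
    """Return (chunk_id, score) pairs for the k chunks most similar to the query, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    toy_chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], toy_chunks, k=2))
```

A larger k gives the LLM more context to synthesize from, at the cost of a longer prompt and more irrelevant material.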

5. Results

[Image: screenshot of the query result]

6. Agent (simple tool use)

The database is not written to again here; see Section 4 for that step.


from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.core import VectorStoreIndex, get_response_synthesizer, Settings
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
import chromadb


# Set the embedding model and the language model
Settings.embed_model = OllamaEmbedding(model_name="yxl/m3e:latest")  # use the specified embedding model
Settings.llm = Ollama(model="qwen2.5:7b", request_timeout=360)  # use the specified language model

# Initialize the Chroma client, storing data in the chroma_db folder under the current directory
db = chromadb.PersistentClient(path="./chroma_db")

# Get the collection named "quickstart", creating it if it does not exist
chroma_collection = db.get_or_create_collection("quickstart")

# Wrap the collection in a ChromaVectorStore so that llama_index can interact with it
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a storage context that uses the ChromaVectorStore as its vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load the index from the stored vectors
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context
)

# Configure the retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,  # return the top n most similar document chunks
)

# Configure the response synthesizer
response_synthesizer = get_response_synthesizer()

# Assemble the query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

# Define the query tool
budget_tool = QueryEngineTool.from_defaults(
    query_engine,
    name="rag",
    description="Tool for looking up specific information",
)

# Define the math tools
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and returns the product"""
    return a * b

multiply_tool = FunctionTool.from_defaults(fn=multiply)

def add(a: float, b: float) -> float:
    """Add two numbers and returns the sum"""
    return a + b

add_tool = FunctionTool.from_defaults(fn=add)

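# Instantiate the ReActAgent
agent = ReActAgent.from_tools(
    [multiply_tool, add_tool, budget_tool], verbose=True
)

# Test the agent
response = agent.chat("When does the 霸王茶姬 Hong Kong store open? What is the store manager's salary? And what would it be if the manager's salary were increased by 20,000? Use the tools to calculate.")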
print(response)

Log output

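> Running step e8624d8e-364b-40ed-8b7a-65aeaf0b6978. Step input: When does the 霸王茶姬 Hong Kong store open? What is the store manager's salary? And what would it be if the manager's salary were increased by 20,000? Use the tools to calculate.
Thought: I need to look up the opening date of the 霸王茶姬 Hong Kong store and the store manager's salary separately. Then, based on the information provided, I will use the multiply and add functions to calculate the manager's new salary.
Action: rag
Action Input: {'input': 'When does the 霸王茶姬 Hong Kong store open?'}
Observation: The first 霸王茶姬 store in Hong Kong, located at K11 in Tsim Sha Tsui, is expected to officially open in August.
> Running step 5c6e5d6c-6673-44ae-aefa-155da2c6627e. Step input: None
Thought: According to the query result, the first 霸王茶姬 store in Hong Kong will open in August. Next I will look up the store manager's salary.
Action: rag
Action Input: {'input': 'What is the monthly salary of the store manager at the 霸王茶姬 Hong Kong store?'}
Observation: 霸王茶姬 is offering HK$18,000-25,000 for the store manager position in Hong Kong.
> Running step 2b46ef25-d17d-4804-8c6a-0e4c46c0d634. Step input: None
Thought: Now I know that the monthly salary range for the manager of the first Hong Kong store is HK$18,000 to HK$25,000. Next I need to increase this salary by 20,000.
Action: add
Action Input: {'a': 20000, 'b': 18000}
Observation: 38000
> Running step 9e0e94a0-1068-4a4b-9b89-c69166552f53. Step input: None
Thought: Now I know the base salary; next I need to add this value to the original monthly salary range.
Action: add
Action Input: {'a': 25000, 'b': 20000}
Observation: 45000
> Running step a6b47d3b-db4a-40ec-b09e-f0ade0f788f0. Step input: None
Thought: I can answer now. According to the queries, the first 霸王茶姬 store in Hong Kong is expected to open in August, and the store manager's base monthly salary ranges from HK$18,000 to HK$25,000. If the manager's salary is increased by 20,000, the new salary range will be HK$38,000 to HK$45,000.
Answer: The first 霸王茶姬 store in Hong Kong is expected to officially open in August. If the store manager's base monthly salary is increased by 20,000, the new monthly salary range will be HK$38,000 to HK$45,000.
The first 霸王茶姬 store in Hong Kong is expected to officially open in August. If the store manager's base monthly salary is increased by 20,000, the new monthly salary range will be HK$38,000 to HK$45,000.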
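
The log follows a repeating Thought / Action / Action Input / Observation cycle: the model names a tool, supplies its arguments as a dict, and the tool's return value is fed back as the observation. The sketch below mimics only the dispatch step for the two math tools. It is not how `ReActAgent` is implemented internally; `TOOLS` and `run_action` are hypothetical names introduced for illustration.

```python
import ast
from typing import Callable


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


def add(a: float, b: float) -> float:
    """Add two numbers and return the sum."""
    return a + b


# Registry mapping the tool name in an "Action:" line to a Python callable.
TOOLS: dict = {"multiply": multiply, "add": add}


def run_action(action: str, action_input: str) -> float:
    """Call the tool named by `action` with the keyword arguments given as a
    dict literal in `action_input`, and return the result (the observation)."""
    if action not in TOOLS:
        raise KeyError(f"unknown tool: {action}")
    kwargs = ast.literal_eval(action_input)  # parses the dict literal without executing code
    tool: Callable[..., float] = TOOLS[action]
    return tool(**kwargs)


if __name__ == "__main__":
    # The two add steps from the log above.
    print(run_action("add", "{'a': 20000, 'b': 18000}"))
    print(run_action("add", "{'a': 25000, 'b': 20000}"))
```

Clear function names, type hints and docstrings matter here because they are what the model sees when deciding which tool to call.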

RAG appendix

Reading from web pages

pip install llama_index-readers-web

from llama_index.readers.web import SimpleWebPageReader



# Extract the page content
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://baike.baidu.com/item/%E5%A4%A7%E5%87%89%E5%B1%B1/7427289#:~:text=%E5%A4%A7%E5%87%89%E5%B1%B1%EF%BC%88Dali"]
)
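
The `html_to_text=True` option asks the reader to turn raw HTML into plain text before indexing. To show roughly what such a conversion involves, here is a standard-library stand-in; it is not the reader's own implementation, and `TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._parts)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting text can be chunked and embedded exactly like the local documents in Section 4.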