GraphRAG + Ollama Local Deployment (Complete, Very Detailed, Step-by-Step Tutorial)

高子熠

Last modified 2024-11-26 15:34:32


First published 2024-07-23 17:40:11

Article link: https://blog.csdn.net/gaotianhao123/article/details/140640415


GraphRAG + Ollama Local Deployment

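My previous article covered how to deploy GraphRAG by calling an LLM API. If you need that, see:
A Tutorial on Using Microsoft's Open-Source GraphRAG (Complete, Very Detailed)
That approach has one key problem: GraphRAG consumes so many tokens that the cost can be hard to accept ~~(PS: I've heard that using GPT-4o may cost about 46 dollars)~~
To save money while still using GraphRAG, you can call an open-source LLM deployed locally with Ollama. This requires modifying part of the source code. I have already done it, so this guide should help you avoid the pitfalls and deploy quickly. If you find it useful, likes and bookmarks are welcome!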

Prerequisites

Before following this tutorial, make sure Ollama, anaconda, and the other basic tools are already installed on your computer.

1. Create a new virtual environment (this assumes anaconda is already installed). Python 3.10 is recommended here.

conda create -n graphrag-ollama-local python=3.10
conda activate graphrag-ollama-local

2. Install the Ollama Python package

pip install ollama

3. Use Ollama to download the open-source models

(1) You need to choose one LLM and one embedding model. Here I chose mistral (4.1GB) and nomic-embed-text (278MB).

You can also pick different models; browse the Ollama website to see what is available.

ollama pull mistral  #llm
ollama pull nomic-embed-text  #embedding

(2) Check that the downloads succeeded

ollama list

(3) Start the service

ollama serve

The service is now running. Make sure Ollama's default port, http://localhost:11434, is not occupied by another program.
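
To confirm that the server is reachable and that both models from step 3 are present, a small check like the following may help. It is a minimal sketch that assumes the server answers on the default address above and that Ollama's `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` such as `mistral:latest`. The helper names `missing_models` and `fetch_tags` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address
REQUIRED = ["mistral", "nomic-embed-text"]  # the two models pulled in step 3


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """Return the required models that are absent from an /api/tags payload."""
    # Names carry a tag suffix such as "mistral:latest"; compare on the base name.
    installed = {m["name"].split(":")[0] for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> dict:
    """GET /api/tags from a running Ollama server and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        gaps = missing_models(fetch_tags(), REQUIRED)
        print("missing models:", gaps or "none")
    except (OSError, ValueError) as exc:  # connection refused, timeout, or non-JSON reply
        print(f"Could not query Ollama at {OLLAMA_URL}: {exc}")
```

If the script reports a connection error, either `ollama serve` is not running or something else is holding port 11434.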

4. Download the source code

git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
cd graphrag-local-ollama/

Be sure to use git for this download. If you download the source as an archive instead, later steps may fail.

5. Install the dependencies (very important!!!)

pip install -e .

If this step fails, make sure the source code was obtained with git clone, then run the command again.
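
One quick way to see whether the editable install took effect is to ask Python whether it can locate the `graphrag` package, which is the module the later `python -m graphrag.index` command relies on. This is only a sketch, and `is_importable` is a name invented for this example.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this inside the graphrag-ollama-local environment.
    print("graphrag importable:", is_importable("graphrag"))
```

A `False` result usually means the command was run outside the activated conda environment or that `pip install -e .` did not finish.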

6. Create the GraphRAG directory, along with an input/ directory to hold the raw documents

mkdir -p ./ragtest/input

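7. Put the raw documents into ./ragtest/input (only txt files are supported; multiple files are allowed)

cp input/* ./ragtest/input  # adjust to your own needs

This copies the sample files. You can also place your own files directly into ./ragtest/input/.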
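
Because only txt files are supported, it can be worth listing anything else that ended up in the input directory before indexing. The sketch below assumes the `./ragtest/input` path used in this tutorial; `split_inputs` is a hypothetical helper, not part of GraphRAG.

```python
from pathlib import Path


def split_inputs(input_dir):
    """Return (txt_files, other_files) as sorted lists of file names."""
    files = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    txt = [p.name for p in files if p.suffix.lower() == ".txt"]
    other = [p.name for p in files if p.suffix.lower() != ".txt"]
    return txt, other


if __name__ == "__main__":
    input_dir = Path("./ragtest/input")
    if input_dir.is_dir():
        txt, other = split_inputs(input_dir)
        print("txt files:", txt)
        print("unsupported files:", other)
    else:
        print(f"{input_dir} does not exist yet")
```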

8. Initialize the project

python -m graphrag.index --init --root ./ragtest

At this point the ragtest directory contains five directories and files: output, input, settings.yaml, prompts, and .env (hidden by default).
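
Since .env is hidden by default, a plain directory listing can make it look as if initialization missed something. The sketch below checks for the five entries named above. The list is taken from this article rather than from GraphRAG's documentation, and `missing_entries` is a name invented for the example.

```python
from pathlib import Path

# The five entries this tutorial says should exist after initialization.
EXPECTED = ["output", "input", "settings.yaml", "prompts", ".env"]


def missing_entries(root, expected=EXPECTED):
    """Return the expected names that do not exist directly under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    gaps = missing_entries("./ragtest")
    print("missing:", gaps or "none")
```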

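9. Move the repository's settings.yaml into ./ragtest, overwriting the one generated in the previous step. This is the main predefined configuration file, already set up for local Ollama models: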

mv settings.yaml ./ragtest

10. Edit the configuration file

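Note that because we chose mistral (4.1GB) and nomic-embed-text (278MB) in step 3, the llm model in settings.yaml must be set to mistral and the embedding model to nomic-embed-text.
The api_base values are http://localhost:11434/v1 for the LLM and http://localhost:11434/api for embeddings. These are Ollama's default API addresses and can be left unchanged unless you have special requirements.
The edited file looks like this: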

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mistral
  model_supports_json: false # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/api
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
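
A typo in a model name or api_base tends to surface only once indexing is under way, so it can help to compare the four edited values against what this tutorial expects. The sketch below works on an already parsed settings dictionary. `lookup` and `settings_mismatches` are hypothetical helpers, and the expected values are simply the ones chosen in this article.

```python
# Values this tutorial sets in settings.yaml, keyed by their nested path.
EXPECTED = {
    ("llm", "model"): "mistral",
    ("llm", "api_base"): "http://localhost:11434/v1",
    ("embeddings", "llm", "model"): "nomic-embed-text",
    ("embeddings", "llm", "api_base"): "http://localhost:11434/api",
}


def lookup(cfg, path):
    """Walk a nested dict along path; return None if any key is missing."""
    node = cfg
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def settings_mismatches(cfg, expected=EXPECTED):
    """Return one message per setting that differs from the expected value."""
    problems = []
    for path, want in expected.items():
        got = lookup(cfg, path)
        if got != want:
            problems.append(f"{'.'.join(path)}: expected {want!r}, found {got!r}")
    return problems
```

Assuming PyYAML is available in the environment, the dictionary can be obtained with `yaml.safe_load(open("./ragtest/settings.yaml"))` and passed to `settings_mismatches`. An empty list means the four values match.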
