Local Deployment of GraphRAG + Ollama

Reference: "GraphRAG + Ollama Local Deployment (complete, very detailed, hands-on tutorial)", CSDN blog.
1. Create the environment
mkdir graphragollama
cd graphragollama
conda create -n graphrag-ollama python=3.10
conda activate graphrag-ollama
2. Install Ollama
pip install ollama  # installs the Python client; the ollama CLI/server itself is a separate install (see the fix below)
3. Download the open-source models with Ollama

1) You need to pick one LLM and one embedding model. Here I chose mistral (4.1 GB) and nomic-embed-text (278 MB).

You can also pick other models; browse the Ollama website for the full list.

ollama pull mistral  # LLM
ollama pull nomic-embed-text  # embedding model

A possible error at this point:

Command 'ollama' not found, but can be installed with:

snap install ollama
Please ask your administrator.
Fix:

sudo snap install ollama

2) Check that the downloads succeeded

ollama list
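
Since the pip package from step 2 is the Ollama Python client, the same check can also be done programmatically. A minimal sketch (assumes the Ollama service is already reachable on its default port):

import ollama  # the Python client installed in step 2

# Equivalent to `ollama list`: prints the locally available models.
print(ollama.list())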

3) Start the service

ollama serve

If the service starts successfully, Ollama listens on its default port at http://localhost:11434; make sure nothing else is occupying it.

If the port is occupied, and you have confirmed the occupying process can safely be stopped, find and terminate it:

  1. Find the process occupying the port:

    sudo lsof -i :11434 

    Example output:

    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    someprocess 1234 user   10u  IPv4 1234567      0t0  TCP localhost:11434 (LISTEN)
    

  2. Terminate the process:

    sudo kill -9 1234 

    Replace 1234 with the actual process ID.
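
As an alternative to lsof, here is a small Python sketch (my own helper, not part of Ollama) to check whether anything is listening on port 11434:

import socket

# Try to connect to Ollama's default port; success means something is listening.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(1)
    if s.connect_ex(("localhost", 11434)) == 0:
        print("Port 11434 is in use (possibly by Ollama itself).")
    else:
        print("Port 11434 is free.")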

4. Download the source code
git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
cd graphrag-local-ollama/
5. Install the dependencies (very important!)
pip install -e .  # installs all required packages from the source tree
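
To confirm the editable install actually resolved to the local fork rather than a PyPI copy of graphrag, a quick sketch:

import graphrag

# Should print a path inside graphrag-local-ollama/, not site-packages/.
print(graphrag.__file__)
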
6. Create the GraphRAG working directory, including an input/ directory for the source documents
mkdir -p ./ragtest/input
7. Put the source documents into ./ragtest/input (only .txt files are supported; multiple files are allowed)
cp input/* ./ragtest/input  # adjust the source path to your needs
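
Because the pipeline only reads UTF-8 .txt files (see file_encoding and file_pattern in the settings.yaml below), it can save a failed run to validate the input directory first. A small sketch (my own helper, not part of GraphRAG):

from pathlib import Path

# Flag files the indexer would skip or choke on.
for path in Path("./ragtest/input").iterdir():
    if path.suffix != ".txt":
        print(f"skipped by file_pattern: {path.name}")
        continue
    try:
        path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        print(f"not valid UTF-8: {path.name}")
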
8. Initialize the project
python -m graphrag.index --init --root ./ragtest

At this point the ragtest directory contains five items: output/, input/, settings.yaml, prompts/, and .env (hidden by default).

9. Move the settings.yaml shipped with the repo, the main predefined configuration file for local Ollama models, into ragtest (it replaces the one generated by init):
mv settings.yaml ./ragtest
10. Edit the configuration file

Note: since we chose mistral (4.1 GB) and nomic-embed-text (278 MB) in step 3, the llm model in settings.yaml must be set to mistral and the embeddings model to nomic-embed-text.
The api_base is http://localhost:11434/v1, Ollama's default OpenAI-compatible API endpoint; leave it unchanged unless you have special requirements.
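
Before editing, it is worth confirming that this endpoint actually answers with the model pulled in step 3. A minimal sketch using only the standard library (Ollama ignores the API key, so none is sent):

import json
import urllib.request

# One-message chat against Ollama's OpenAI-compatible endpoint.
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps({
        "model": "mistral",
        "messages": [{"role": "user", "content": "Say hello."}],
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])

The full settings.yaml follows: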
 

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mistral
  model_supports_json: false # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base:  http://localhost:11434/api
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
  


chunks:
  size: 200
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents
    
input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: yes
  raw_entities: yes
  top_level_nodes: yes

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
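
Note that the embeddings section above points at http://localhost:11434/api rather than /v1, because this fork calls Ollama's native embeddings API. A quick sketch to confirm nomic-embed-text responds there:

import json
import urllib.request

# Request one embedding from Ollama's native /api/embeddings endpoint.
req = urllib.request.Request(
    "http://localhost:11434/api/embeddings",
    data=json.dumps({"model": "nomic-embed-text", "prompt": "hello"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(len(json.load(resp)["embedding"]), "dimensions")
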
11. Run indexing to build the graph
conda activate graphrag-ollama
python -m graphrag.index --root ./ragtest

Error: ./graphrag-local-ollama/ragtest/output/20240801-140716/reports/logs.json contains the following:

{"type": "error", "data": "Error executing verb \"cluster_graph\" in create_base_entity_graph: Columns must be same length as key", "stack": "Traceback (most recent call last):\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n    result = node.verb.func(**verb_args)\n  File \"/home/modeltest02/graphragollama/graphrag-local-ollama/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n    output_df[[level_to, to]] = pd.DataFrame(\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n    self._setitem_array(key, value)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n    check_key_length(self.columns, key, value)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n    raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n  File \"/home/modeltest02/graphragollama/graphrag-local-ollama/graphrag/index/run.py\", line 323, in run_pipeline\n    result = await workflow.run(context, callbacks)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 369, in run\n    timing = await self._execute_verb(node, context, callbacks)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n    result = node.verb.func(**verb_args)\n  File \"/home/modeltest02/graphragollama/graphrag-local-ollama/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n    output_df[[level_to, to]] = pd.DataFrame(\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n    self._setitem_array(key, value)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n    check_key_length(self.columns, key, value)\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n    raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}

The cause: settings.yaml had not been fully updated. After replacing it wholesale with the configuration shown above, indexing started running, though very slowly.

{"type": "error", "data": "Error Invoking LLM", "stack": "Traceback (most recent call last):\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/httpx/_transports/default.py\", line 69, in map_httpcore_exceptions\n    yield\n  File \"/home/modeltest02/.conda/envs/graphrag-ollama/lib/python3.10/site-packages/httpx/_transports/default.py\", line 373, in handle_async_request\n    resp = await self._pool.handle_async_request(req)\n  

This is probably an API issue.

Switch the embedding model to a different one.

Install Xinference (see the Xinference installation docs):

pip install "xinference[all]"

Start the service in the background

    Default port: 9997

xinference-local --host 0.0.0.0 --port 9997

 

Copying the printed address into a local browser as-is will not work; replace 0.0.0.0 with the server's actual IP address.
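
Once the server is reachable, an embedding model can be launched and called through the Xinference Python client. A rough sketch (bge-base-en is just an example model name; replace the host with your server address):

from xinference.client import Client

# Connect to the Xinference server started above.
client = Client("http://<server-ip>:9997")

# Launch an embedding model; bge-base-en is an example choice.
model_uid = client.launch_model(model_name="bge-base-en", model_type="embedding")
model = client.get_model(model_uid)
print(model.create_embedding("hello world"))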

When "All workflows completed successfully." appears, the indexing run has succeeded.

12. Run a global query (the client logs below are from a later run where the LLM had been switched to glm-4-0520 served via the bigmodel.cn API)
python -m graphrag.query --root ./ragtest --method global "What is Neural Networks?"

INFO: Reading settings from ragtest/settings.yaml
creating llm client with {'api_key': 'REDACTED,len=49', 'type': "openai_chat", 'model': 'glm-4-0520', 'max_tokens': 128000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': 'https://open.bigmodel.cn/api/paas/v4/', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
Warning: All map responses have score 0 (i.e., no relevant information found from the dataset), returning a canned 'I do not know' answer. You can try enabling `allow_general_knowledge` to encourage the LLM to incorporate relevant general knowledge, at the risk of increasing hallucinations.

SUCCESS: Global Search Response: I am sorry but I am unable to answer this question given the provided data.

13. Run a local query
python -m graphrag.query --root ./ragtest --method local "What is Neural Networks?"

(graphrag-ollama) modeltest02@ubun:~/graphragollama/grapragglm/graphrag-main$ python -m graphrag.query --root ./ragtest --method local "What is Neural Networks?"


INFO: Reading settings from ragtest/settings.yaml

INFO: Vector Store Args: {}
creating llm client with {'api_key': 'REDACTED,len=49', 'type': "openai_chat", 'model': 'glm-4-0520', 'max_tokens': 128000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': 'https://open.bigmodel.cn/api/paas/v4/', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
creating embedding llm client with {'api_key': 'REDACTED,len=49', 'type': "openai_embedding", 'model': 'embedding-2', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'n': 1, 'request_timeout': 180.0, 'api_base': 'https://open.bigmodel.cn/api/paas/v4/', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Local Search Response: Neural networks are a class of machine learning models inspired by the structure and function of the human brain. They consist of interconnected nodes or neurons that process information in a layered architecture. Each neuron takes inputs, applies a mathematical function to them, and produces an output that can be passed on to other neurons. Neural networks are designed to recognize patterns and features in data, making them powerful tools for tasks such as image and speech recognition, natural language processing, and various other applications in artificial intelligence.

There are several types of neural networks, including:

1. **Feedforward Neural Networks (FNNs):** These are the simplest form of neural networks where information flows in a single direction from the input layer to the output layer, without cycles or loops.
   
2. **Convolutional Neural Networks (CNNs):** Specialized for processing structured grid data like images, CNNs use convolutional layers to automatically and adaptively learn spatial hierarchies of features.

3. **Recurrent Neural Networks (RNNs):** Designed to handle sequential data, RNNs have feedback connections that allow them to retain information from previous inputs, making them suitable for tasks involving time series or sequences.

4. **Transformer Neural Networks:** A more recent innovation, transformers rely on self-attention mechanisms and are primarily used for natural language processing tasks. They have become the backbone of numerous state-of-the-art models due to their ability to handle long-range dependencies and parallelize training processes.

Neural networks have been at the forefront of advancing machine learning and AI technologies, with continuous research aimed at improving their architectures, training algorithms, and overall performance.

[Data: Entities (27, 25, 23, 0); Relationships (5, 4); Sources (1, 0)] 

Please note that the above explanation is a general overview of neural networks, and the specific data references pertain to the various types of neural networks and their descriptions mentioned in the provided data tables.

The local search results are fairly good.
