RAG in Practice: Deploying RAGFlow + Ollama Locally (Linux)

Latest recommended article published on 2025-04-17 20:59:48

ling913

Article link: https://blog.csdn.net/ling913/article/details/144918527

1. Deploying RAGFlow

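1.1 Installing and configuring Docker

RAGFlow depends on a number of third-party services such as Elasticsearch, MySQL, and Redis, so running it with Docker is the simplest approach.

To install Docker, see the tutorial "Linux安装Docker完整教程" (a complete guide to installing Docker on Linux). After installing, edit the Docker configuration as follows: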

vim /etc/docker/daemon.json
{
  "builder": {
    "gc": {
      "defaultKeepStorage": "20GB",
      "enabled": true
    }
  },
  "experimental": false,
  "features": {
    "buildkit": true
  },
  "live-restore": true,
  "registry-mirrors": [
    "https://docker.211678.top",
    "https://docker.1panel.live",
    "https://hub.rat.dev",
    "https://docker.m.daocloud.io",
    "https://do.nark.eu.org",
    "https://dockerpull.com",
    "https://dockerproxy.cn",
    "https://docker.awsl9527.cn/"
  ]
}
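
A malformed daemon.json prevents the Docker daemon from starting, so it can be worth checking the file before restarting. The following is a minimal sketch of such a check; the helper name load_mirrors is made up for illustration, and the inline sample stands in for the real file contents.

```python
import json


def load_mirrors(config_text: str) -> list:
    """Parse daemon.json text and return its registry mirrors.

    Raises json.JSONDecodeError if the text is not valid JSON.
    """
    config = json.loads(config_text)
    return config.get("registry-mirrors", [])


# Inline sample standing in for the real file contents.
sample = '{"registry-mirrors": ["https://docker.m.daocloud.io"]}'
print(load_mirrors(sample))
```

To check the real file, pass the function the contents of /etc/docker/daemon.json (reading it may need root). A json.JSONDecodeError means the file should be fixed before the restart.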

After editing, reload the configuration and restart the Docker service:

systemctl daemon-reload && systemctl restart docker

1.2 Configuring RAGFlow


git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker
docker compose -f docker-compose.yml up -d

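While the Docker environment was being built, the Elasticsearch image failed to download. The fix was to change this line in docker-compose-base.yml:
image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
to the following, so that the image is pulled from Docker Hub, where the registry mirrors configured above apply:
image: elasticsearch:${STACK_VERSION}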
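
The edit above is a one-line text substitution, so it can be scripted when the deployment is repeated on several machines. A minimal sketch, assuming docker-compose-base.yml contains the image line exactly as quoted; swap_es_image is a hypothetical helper name.

```python
OLD_IMAGE = "docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}"
NEW_IMAGE = "elasticsearch:${STACK_VERSION}"


def swap_es_image(compose_text: str) -> str:
    """Point the Elasticsearch service at the Docker Hub image name."""
    return compose_text.replace(OLD_IMAGE, NEW_IMAGE)


# Inline sample line; in practice the text comes from docker-compose-base.yml.
sample = "    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}\n"
print(swap_es_image(sample), end="")
```

Text that does not contain the old image name is returned unchanged, so running the helper twice is harmless.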

Once the environment is built, check the server status:

docker logs --tail 100 -f ragflow-server

Output like the following indicates that the server started successfully:

     ____   ___    ______ ______ __               
    / __ \ /   |  / ____// ____// /____  _      __
   / /_/ // /| | / / __ / /_   / // __ \| | /| / /
  / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ / 
 /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/  

 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://x.x.x.x:9380
 INFO:werkzeug:Press CTRL+C to quit
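
When the deployment is scripted, the same success check can be done by scanning the log text for the "Running on" lines. A minimal sketch, assuming the log format shown above; listening_urls is a hypothetical helper, and in practice the text would come from docker logs ragflow-server.

```python
import re


def listening_urls(log_text: str) -> list:
    """Return the URLs from 'Running on http://...' lines in the log."""
    return re.findall(r"Running on (https?://\S+)", log_text)


# Sample copied from the startup output shown in this post.
sample_log = """\
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://x.x.x.x:9380
"""
print(listening_urls(sample_log))
```

An empty list means the server has not reported a listening address yet.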

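At this point, running docker ps lists the running containers.

To stop the services: docker stop $(docker ps -q). Note that this stops every running container on the host, not only the RAGFlow ones.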

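1.3 Logging in to the RAGFlow page

Enter your server's IP address in your browser to open RAGFlow: just go to http://IP_OF_YOUR_MACHINE. If you have not changed the configuration, no port is needed (the default HTTP port is 80). To use a different port, change the port number in front of 80 under ports in docker-compose.yml.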
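
To make the port rule concrete, here is a small sketch that derives the browser URL from a Docker port mapping. It assumes the two-part "HOST_PORT:CONTAINER_PORT" form (for example 80:80); ragflow_url is a hypothetical helper and 192.0.2.10 is a placeholder address.

```python
def ragflow_url(ip: str, port_mapping: str = "80:80") -> str:
    """Build the browser URL from a 'HOST_PORT:CONTAINER_PORT' mapping."""
    host_port = port_mapping.split(":")[0]
    # Browsers imply port 80 for http, so it can be left out.
    if host_port == "80":
        return f"http://{ip}"
    return f"http://{ip}:{host_port}"


print(ragflow_url("192.0.2.10"))             # default mapping
print(ragflow_url("192.0.2.10", "8080:80"))  # host port changed to 8080
```

Only the number on the left of the colon changes what you type in the browser; the number on the right is the port inside the container.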

The RAGFlow login page then appears in your browser. On first use you need to register an account; any email address will do.

2. Deploying Ollama

2.1 Downloading Ollama

# Two ways to install:
# Option 1: the official install script
curl -fsSL https://ollama.com/install.sh | sh

# Option 2: download the archive and unpack it manually
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
sudo tar -C /usr -xzf ollama-linux-amd64.tgz

2.2 Starting Ollama

ollama serve
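
ollama serve keeps running in the foreground, so the check below is meant to be run from a second terminal. It is a minimal sketch that only tests whether something accepts TCP connections on Ollama's default port 11434 (the port used by the API calls later in this post); port_open is a hypothetical helper name.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("Ollama reachable:", port_open("127.0.0.1", 11434))
```

A True result only shows that the port is open; it does not prove that the listener is Ollama.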

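2.3 Downloading a model

This post uses qwen2-7b as the example; other models can be found at https://ollama.com/library.

Option 1: ollama run qwen2:7b (downloads the model on first run, then opens an interactive chat)

Model files are large. If the download above fails because of an unstable network, use Option 2 below.

Option 2:

① Download a model file in GGUF format from https://huggingface.co/models?library=gguf. Pick whichever suits your needs, for example Qwen2-7B-Instruct.Q4_K_M.gguf.

② Create a build file named qwen2-7b.modelfile (any name works) whose content is a FROM line with the path of the model file you downloaded, for example: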

FROM ./Qwen2-7B-Instruct.Q4_K_M.gguf

③ Build the model

ollama create qwen2-7b -f qwen2-7b.modelfile
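
Steps ② and ③ can be combined in a short script. The sketch below writes the one-line build file and returns the matching ollama create command as an argument list; write_modelfile is a hypothetical helper, and the demo writes into a temporary directory rather than your working directory.

```python
import tempfile
from pathlib import Path


def write_modelfile(gguf_path: str, out_path: Path) -> list:
    """Write a one-line Modelfile and return the matching create command."""
    out_path.write_text(f"FROM {gguf_path}\n", encoding="utf-8")
    model_name = out_path.stem  # e.g. qwen2-7b.modelfile -> qwen2-7b
    return ["ollama", "create", model_name, "-f", str(out_path)]


# Demo in a temporary directory so nothing in the working tree is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "qwen2-7b.modelfile"
    cmd = write_modelfile("./Qwen2-7B-Instruct.Q4_K_M.gguf", target)
    print(target.read_text(encoding="utf-8"), end="")
    print(cmd[:3])
```

In a real run, write the file next to the downloaded GGUF file and pass the returned list to subprocess.run(cmd, check=True) to perform the build.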

When the build finishes, run ollama list to see the model you created, for example:

$ ollama list
NAME                  ID              SIZE      MODIFIED      
qwen2-7b:latest       0151b69b0ffa    4.7 GB    1 weeks ago
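
The same listing is available over HTTP: Ollama serves the installed models as JSON at GET /api/tags. The sketch below only shows how to pull the names out of such a response; model_names is a hypothetical helper, and the sample body is trimmed to the one field used here.

```python
import json


def model_names(tags_response: str) -> list:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_response)
    return [m["name"] for m in data.get("models", [])]


# Trimmed sample body; a real response carries more fields per model.
sample = '{"models": [{"name": "qwen2-7b:latest"}]}'
print(model_names(sample))
```

With Ollama running, the body can be fetched with urllib.request.urlopen("http://localhost:11434/api/tags").read().decode("utf-8").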

Test:

ollama run qwen2-7b "你是谁？"

2.4 Two other ways to call the model

Calling the HTTP API with curl:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2-7b",
  "messages": [
    { "role": "user", "content": "你是谁？" }
  ]
}'

Calling from Python:

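import json

import requests


def send_message_to_ollama(message, port=11434):
    """Send one user message to Ollama's /api/chat and return the full reply."""
    url = f"http://localhost:{port}/api/chat"
    payload = {
        # Model name as built in section 2.3; use "qwen2:7b" if it was pulled with Option 1.
        "model": "qwen2-7b",
        "messages": [{"role": "user", "content": message}],
    }
    # /api/chat streams one JSON object per line by default, so read it as a stream.
    response = requests.post(url, json=payload, stream=True)
    if response.status_code != 200:
        return f"Error: {response.status_code} - {response.text}"
    response_content = ""
    for line in response.iter_lines():
        if line:
            # Each line carries the next fragment of the assistant's message.
            response_content += json.loads(line).get("message", {}).get("content", "")
    return response_content


if __name__ == "__main__":
    user_input = "why is the sky blue?"
    response = send_message_to_ollama(user_input)
    print("Ollama's response:")
    print(response)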
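
If you do not need the reply fragment by fragment, /api/chat also accepts "stream": false and then returns a single JSON object. The sketch below is one way to build such a request with only the standard library; build_request and extract_reply are hypothetical helper names, and the model name assumes the qwen2-7b model built in section 2.3.

```python
import json
import urllib.request


def build_request(message: str, model: str = "qwen2-7b",
                  base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/chat."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": message}],
        "stream": False,  # ask for one JSON object instead of many lines
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(body: str) -> str:
    """Pull the assistant text out of a non-streaming /api/chat response."""
    return json.loads(body)["message"]["content"]


req = build_request("why is the sky blue?")
print(req.full_url, req.get_method())
```

With Ollama running, send the request with urllib.request.urlopen(req) and pass the decoded response body to extract_reply.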