deepseek大模型本地私有化部署开发最全文档

liuyunshengsir

已于 2025-03-29 21:26:03 修改

阅读量815

点赞数 14

分类专栏：大模型文章标签：语言模型 windows linux

于 2025-02-09 21:45:00 首次发布

本文链接：https://blog.csdn.net/liuyunshengsir/article/details/145528131

版权

大模型专栏收录该内容

27 篇文章

订阅专栏

简介

DeepSeek-V3 在推理速度上相较历史模型有了大幅提升。

在目前大模型主流榜单中，DeepSeek-V3 在开源模型中位列榜首，与世界上最先进的闭源模型不分伯仲。

CUDA和cuDNN 安装

https://developer.nvidia.com/cuda-downloads?target_os=Linux

https://developer.nvidia.com/cudnn-downloads?target_os=Linux

Ollama安装

Ollama 是一个可以在本地部署和管理开源大语言模型的框架，由于它极大的简化了开源大语言模型的安装和配置细节，一经推出就广受好评。

https://ollama.com/

运行deepseek

支持部署的模型参数
1.5b
7b
8b
14b
32b
70b
671b

ollama run deepseek-r1:671b

在这里插入图片描述

openwebui 部署

Open web是一个可扩展、功能丰富、用户友好的自托管AI平台，旨在完全离线运行。它支持各种LLM运行程序，如Ollama和openai兼容的api，并为RAG内置推理引擎，使其成为强大的AI部署解决方案。https://docs.openwebui.com/
在这里插入图片描述

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

或者
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

配置API（Python）开发使用

安装依赖

pip install ollama

代码样例

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='deepseek-r1:671b', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='deepseek-r1:671b', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])