Before You Start
- OS: Windows 11
- Ollama version: 0.5.4
- VSCode version: 1.96.2
- Continue extension version: 0.8.66
Installing Ollama
- Visit the official website and click Download to install.
- By default it installs to the C: drive, which is annoying, but you can specify an install path:
OllamaSetup.exe /DIR="d:\some\location"
- After installation, confirm it succeeded:
```
D:\ollama>ollama -v
ollama version is 0.5.4
```
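- Downloaded models also land on the C: drive by default (under your user profile). If you want them elsewhere as well, Ollama reads the OLLAMA_MODELS environment variable; a minimal sketch (the path is just an example, and Ollama needs to be restarted afterwards):
```
setx OLLAMA_MODELS "D:\ollama\models"
```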
Installing qwen2.5-coder
- Search for qwen2.5-coder on the official site
- Copy the install command and run it:
```
D:\ollama>ollama run qwen2.5-coder
pulling manifest
pulling 60e05f210007...   2% ▕          ▏  79 MB/4.7 GB   9.9 MB/s   7m43s
```
- Once the pull finishes you can chat with the model:
>>> Write a Fibonacci sequence program for me in Go
Sure! Here is an example Go program that generates the Fibonacci sequence:

```go
package main

import (
	"fmt"
)

// Fibonacci returns the n-th Fibonacci number
func Fibonacci(n int) int {
	if n <= 1 {
		return n
	}
	a, b := 0, 1
	for i := 2; i <= n; i++ {
		a, b = b, a+b
	}
	return b
}

// PrintFibonacci prints the first n Fibonacci numbers
func PrintFibonacci(n int) {
	for i := 0; i < n; i++ {
		fmt.Println(Fibonacci(i))
	}
}

func main() {
	n := 10 // change this value to print a different number of terms
	PrintFibonacci(n)
}
```

This program defines two functions:
1. `Fibonacci` computes the n-th Fibonacci number.
2. `PrintFibonacci` prints the first n Fibonacci numbers.

In `main` we call `PrintFibonacci(n)` to print the first 10 Fibonacci numbers; change `n` to print more or fewer terms.

>>> Send a message (/? for help)
- Keep this window open after running it.
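- Continue talks to Ollama over its local HTTP API (port 11434 by default), so it is worth checking that the endpoint answers before wiring up the editor. A quick sanity check from another terminal (the prompt text is just an example); a JSON reply with a response field means the API side is working:
```
curl http://localhost:11434/api/generate -d "{\"model\": \"qwen2.5-coder\", \"prompt\": \"write hello world in Go\", \"stream\": false}"
```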
Installing the Continue extension
- Search for continue in VSCode's Extensions view
- Click Install
- Click the Continue icon in the sidebar and choose Add Chat Model
- Select Ollama as the provider and click Connect; you should then see Autodetect - qwen2.5-coder
- You can now chat with the model
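- Under the hood this writes an entry into Continue's configuration file, which on Windows typically lives at %USERPROFILE%\.continue\config.json; that is the same file edited in the next section.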
Configuring Continue autocomplete
- Click the Continue button in the bottom-right corner of VSCode
- Choose Configure autocomplete options
- Change the configuration file to the following:
{ "models": [ { "model": "AUTODETECT", "title": "Autodetect", "provider": "ollama" } ], "tabAutocompleteModel": { "title": "qwen", "provider": "ollama", "model": "qwen2.5-coder:latest", "apiKey": "" }, "contextProviders": [ { "name": "code", "params": {} }, { "name": "docs", "params": {} }, { "name": "diff", "params": {} }, { "name": "terminal", "params": {} }, { "name": "problems", "params": {} }, { "name": "folder", "params": {} }, { "name": "codebase", "params": {} } ], "slashCommands": [ { "name": "share", "description": "Export the current chat session to markdown" }, { "name": "cmd", "description": "Generate a shell command" }, { "name": "commit", "description": "Generate a git commit message" } ] }
- Tab autocomplete should now work happily as you type.
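- For what it's worth, tab autocomplete relies on the model's fill-in-the-middle support: Continue sends the code before and after the cursor and the model fills the gap, which qwen2.5-coder's template turns into <|fim_prefix|>/<|fim_suffix|>/<|fim_middle|> tokens (visible in the Modelfile later in this post). A rough sketch of such a request against Ollama's API, assuming the suffix field of /api/generate (the code fragments are just examples):
```
curl http://localhost:11434/api/generate -d "{\"model\": \"qwen2.5-coder:latest\", \"prompt\": \"func add(a, b int) int {\", \"suffix\": \"}\", \"stream\": false}"
```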
Deploying the model in an offline environment
- The steps above assume an internet connection. In an offline (intranet) environment the process is similar; installing Ollama and the Continue extension is straightforward, so I won't repeat it here.
- The main thing to cover is deploying the model.
- Ollama currently supports two ways to import a model:
  - Build from a Safetensors model
    FROM <model directory>
    The model directory should contain the Safetensors weights for a supported architecture. Currently supported model architectures:
    - Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
    - Mistral (including Mistral 1, Mistral 2, and Mixtral)
    - Gemma (including Gemma 1 and Gemma 2)
    - Phi3
  - Build from a GGUF file
    FROM ./ollama-model.gguf
    The GGUF file location should be specified as an absolute path or relative to the Modelfile location.
- According to the docs, qwen2.5-coder can only be imported from a GGUF file (Qwen is not among the supported Safetensors architectures).
- Download the GGUF file from Hugging Face or ModelScope
- I went with the 7B model: Qwen2.5-Coder-7B-Instruct-GGUF
- On the model's files page, click the GGUF file for the quantization you want and download it; you can also fetch it from the command line, as sketched below.
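- If a browser download is awkward, the huggingface_hub CLI can fetch the file too; a minimal sketch, assuming the q6_k quantization used later in this post (adjust the file name to whatever quantization the repo actually offers):
```
pip install huggingface_hub
huggingface-cli download Qwen/Qwen2.5-Coder-7B-Instruct-GGUF qwen2.5-coder-7b-instruct-q6_k.gguf --local-dir .
```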
- Then, in our ollama chat window, run the /show modelfile command
- Save the output to a file named Modelfile and change the content after FROM, for example:
```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM qwen2.5-coder:latest

FROM ./qwen2.5-coder-7b-instruct-q6_k.gguf
TEMPLATE """{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|>
{{- else if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
```
- Put this file in the same folder as the downloaded GGUF file
- Then run the ollama create command:
ollama create qwen2.5-coder -f ./Modelfile
- The model is now installed.
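- As a quick check (these are just the standard ollama commands, nothing specific to this setup), you can confirm the model was registered and give it a short prompt:
```
ollama list
ollama run qwen2.5-coder
```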
Issues
- After installing the model offline, the following error showed up when using Continue:
Error calling Ollama /api/generate endpoint: Error: Error: HTTP 404 Not Found
- After digging through documentation for a while, I fixed it by changing "model": "AUTODETECT" in the config file to the actual model name:
```json
{
  "models": [
    {
      "model": "qwen2.5-coder:latest",
      "title": "Autodetect",
      "provider": "ollama"
    }
  ],
  "tabAutocompleteModel": {
    "title": "qwen",
    "provider": "ollama",
    "model": "qwen2.5-coder:latest",
    "apiKey": ""
  },
```
(the rest of the config stays the same as shown earlier)
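- The model name has to match what Ollama actually has locally, tag included. If you are unsure, list the installed names with ollama list, or over the HTTP API:
```
curl http://localhost:11434/api/tags
```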