Before You Start
- OS: Windows 11
- Ollama version: 0.5.4
- VSCode version: 1.96.2
- Continue extension version: 0.8.66
Installing Ollama
- Visit the official website and click Download to install.
- By default it installs to the C: drive, which is annoying, but you can specify an install path:
OllamaSetup.exe /DIR="d:\some\location"
- After installation, confirm it succeeded:
```
D:\ollama>ollama -v
ollama version is 0.5.4
```
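- Downloaded models also land on the C: drive by default (under your user profile). If you want them elsewhere as well, Ollama reads the OLLAMA_MODELS environment variable; a minimal sketch (the path is just an example, and Ollama needs to be restarted afterwards):
```
setx OLLAMA_MODELS "D:\ollama\models"
```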
Installing qwen2.5-coder
- Search for qwen2.5-coder on the official site
- Copy the install command and run it:
```
D:\ollama>ollama run qwen2.5-coder
pulling manifest
pulling 60e05f210007...   2% ▕          ▏  79 MB/4.7 GB   9.9 MB/s   7m43s
```
- Once the pull finishes you can chat with the model:
>>> Write a Fibonacci sequence program for me in Go
Sure! Here is an example Go program that generates the Fibonacci sequence:

```go
package main

import (
	"fmt"
)

// Fibonacci returns the n-th Fibonacci number
func Fibonacci(n int) int {
	if n <= 1 {
		return n
	}
	a, b := 0, 1
	for i := 2; i <= n; i++ {
		a, b = b, a+b
	}
	return b
}

// PrintFibonacci prints the first n Fibonacci numbers
func PrintFibonacci(n int) {
	for i := 0; i < n; i++ {
		fmt.Println(Fibonacci(i))
	}
}

func main() {
	n := 10 // change this value to print a different number of terms
	PrintFibonacci(n)
}
```

This program defines two functions:
1. `Fibonacci` computes the n-th Fibonacci number.
2. `PrintFibonacci` prints the first n Fibonacci numbers.

In `main` we call `PrintFibonacci(n)` to print the first 10 Fibonacci numbers; change `n` to print more or fewer terms.

>>> Send a message (/? for help)
- Keep this window open after running it.
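- Continue talks to Ollama over its local HTTP API (port 11434 by default), so it is worth checking that the endpoint answers before wiring up the editor. A quick sanity check from another terminal (the prompt text is just an example); a JSON reply with a response field means the API side is working:
```
curl http://localhost:11434/api/generate -d "{\"model\": \"qwen2.5-coder\", \"prompt\": \"write hello world in Go\", \"stream\": false}"
```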
Installing the Continue extension
- Search for continue in VSCode's Extensions view
- Click Install
- Click the Continue icon in the sidebar and choose Add Chat Model
- Select Ollama as the provider and click Connect; you should then see Autodetect - qwen2.5-coder
- You can now chat with the model
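- Under the hood this writes an entry into Continue's configuration file, which on Windows typically lives at %USERPROFILE%\.continue\config.json; that is the same file edited in the next section.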
Configuring Continue autocomplete
- Click the Continue button in the bottom-right corner of VSCode
- Choose Configure autocomplete options
- Change the configuration file to the following:
{ "models": [ { "model": "AUTODETECT", "title": "Autodetect", "provider": "ollama" } ], "tabAutocompleteModel": { "title": "qwen", "provider": "ollama", "model": "qwen2.5-coder:latest", "apiKey": "" }, "contextProviders": [ { "name": "code", "params": {} }, { "name": "docs", "params": {} }, { "name": "diff", "params": {} }, { "name": "terminal", "params": {} }, { "name": "problems", "params": {} }, { "name": "folder", "params": {} }, { "name": "codebase", "params": {} } ], "slashCommands": [ { "name": "share", "description": "Export the current chat session to markdown" }, { "name": "cmd", "description": "Generate a shell command" }, { "name": "commit", "description": "Generate a git commit message" } ] }
- Tab autocomplete should now work happily as you type.
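- For what it's worth, tab autocomplete relies on the model's fill-in-the-middle support: Continue sends the code before and after the cursor and the model fills the gap, which qwen2.5-coder's template turns into <|fim_prefix|>/<|fim_suffix|>/<|fim_middle|> tokens (visible in the Modelfile later in this post). A rough sketch of such a request against Ollama's API, assuming the suffix field of /api/generate (the code fragments are just examples):
```
curl http://localhost:11434/api/generate -d "{\"model\": \"qwen2.5-coder:latest\", \"prompt\": \"func add(a, b int) int {\", \"suffix\": \"}\", \"stream\": false}"
```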
Deploying the model in an offline environment
- The steps above assume an internet connection. In an offline (intranet) environment the process is similar; installing Ollama and the Continue extension is straightforward, so I won't repeat it here.
- The main thing to cover is deploying the model.
- Ollama currently supports two ways to import a model:
  - Build from a Safetensors model
    FROM <model directory>
    The model directory should contain the Safetensors weights for a supported architecture. Currently supported model architectures:
    - Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
    - Mistral (including Mistral 1, Mistral 2, and Mixtral)
    - Gemma (including Gemma 1 and Gemma 2)
    - Phi3
  - Build from a GGUF file
    FROM ./ollama-model.gguf
    The GGUF file location should be specified as an absolute path or relative to the Modelfile location.
- According to the docs, qwen2.5-coder can only be imported from a GGUF file (Qwen is not among the supported Safetensors architectures).
- Download the GGUF file from Hugging Face or ModelScope
- I went with the 7B model: Qwen2.5-Coder-7B-Instruct-GGUF
- On the model's files page, click the GGUF file for the quantization you want and download it; you can also fetch it from the command line, as sketched below.
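- If a browser download is awkward, the huggingface_hub CLI can fetch the file too; a minimal sketch, assuming the q6_k quantization used later in this post (adjust the file name to whatever quantization the repo actually offers):
```
pip install huggingface_hub
huggingface-cli download Qwen/Qwen2.5-Coder-7B-Instruct-GGUF qwen2.5-coder-7b-instruct-q6_k.gguf --local-dir .
```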
- Then, in our ollama chat window, run the /show modelfile command
- Save the output to a file named Modelfile and change the content after FROM, for example:
```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM qwen2.5-coder:latest

FROM ./qwen2.5-coder-7b-instruct-q6_k.gguf
TEMPLATE """{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|>
{{- else if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
```
- Put this file in the same folder as the downloaded GGUF file
- Then run the ollama create command:
ollama create qwen2.5-coder -f ./Modelfile
- The model is now installed.
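- As a quick check (these are just the standard ollama commands, nothing specific to this setup), you can confirm the model was registered and give it a short prompt:
```
ollama list
ollama run qwen2.5-coder
```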
Issues
- After installing the model offline, the following error showed up when using Continue:
Error calling Ollama /api/generate endpoint: Error: Error: HTTP 404 Not Found
- After digging through documentation for a while, I fixed it by changing "model": "AUTODETECT" in the config file to the actual model name:
```json
{
  "models": [
    {
      "model": "qwen2.5-coder:latest",
      "title": "Autodetect",
      "provider": "ollama"
    }
  ],
  "tabAutocompleteModel": {
    "title": "qwen",
    "provider": "ollama",
    "model": "qwen2.5-coder:latest",
    "apiKey": ""
  },
```
(the rest of the config stays the same as shown earlier)
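- The model name has to match what Ollama actually has locally, tag included. If you are unsure, list the installed names with ollama list, or over the HTTP API:
```
curl http://localhost:11434/api/tags
```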