Deploying a Large-Model API with Ollama

Author: 不会写代码的大模型

Last modified: 2024-12-01 14:20:18

First published: 2024-10-29 11:26:35

Article link: https://blog.csdn.net/xuptyjs/article/details/143323421

Ollama exposes a locally running large model over HTTP in two ways: its native API and an OpenAI-compatible API.

1. Native API

First check that Ollama is installed on the local machine. On Windows, press Win+R, run cmd to open a terminal, and enter:

ollama -v

Pull and start the model (ollama run downloads the model first if it is not already present):

ollama run qwen:0.5b

1.1. First request method

Open Postman, set the method to POST, and enter the following URL:

http://localhost:11434/api/generate

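Send the following JSON body. This endpoint expects the complete prompt as a single string; the prompt here, "你是谁", means "Who are you?":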

{
    "model":"qwen:0.5b",
    "prompt":"你是谁"
}

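The response is streamed by default: one JSON object per generated fragment, with the final object ("done": true) carrying the statistics. Joined together, the fragments read "我是来自阿里云的超大规模语言模型，我叫通义千问。" ("I am a very-large-scale language model from Alibaba Cloud; my name is Tongyi Qianwen."):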

{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.2227341Z",
    "response": "我是",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.2489328Z",
    "response": "来自",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.2728088Z",
    "response": "阿里",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.2970255Z",
    "response": "云",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.3208497Z",
    "response": "的",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.3444397Z",
    "response": "超",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.390067Z",
    "response": "大规模",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.4145786Z",
    "response": "语言",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.4396235Z",
    "response": "模型",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.4643017Z",
    "response": "，",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.4891521Z",
    "response": "我",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.515477Z",
    "response": "叫",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.5407572Z",
    "response": "通",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.5746175Z",
    "response": "义",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.6038589Z",
    "response": "千",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.6293497Z",
    "response": "问",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.6552632Z",
    "response": "。",
    "done": false
}
{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:12:55.6825481Z",
    "response": "",
    "done": true,
    "done_reason": "stop",
    "context": [
        151644,
        872,
        198,
        105043,
        100165,
        151645,
        198,
        151644,
        77091,
        198,
        104198,
        101919,
        102661,
        99718,
        9370,
        71304,
        105483,
        102064,
        104949,
        3837,
        35946,
        99882,
        31935,
        64559,
        99320,
        56007,
        1773
    ],
    "total_duration": 558287200,
    "load_duration": 36553100,
    "prompt_eval_count": 10,
    "prompt_eval_duration": 59275000,
    "eval_count": 18,
    "eval_duration": 459349000
}
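
Because /api/generate streams by default, a client has to read the body line by line and join the "response" fragments itself. The following is a minimal Python sketch using only the standard library. It assumes the server is listening at http://localhost:11434 as above and that each streamed object arrives on its own line (newline-delimited JSON). The helper names stream_generate and join_stream are made up for illustration.

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:11434/api/generate"  # same URL as in the Postman example


def join_stream(lines):
    """Concatenate the "response" fragments from an iterable of JSON lines."""
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between chunks
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk carries the statistics and an empty response
    return "".join(parts)


def stream_generate(model, prompt, url=GENERATE_URL):
    """POST the prompt and return the joined text (needs a running Ollama server)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # Iterating over the HTTP response yields one line (bytes) at a time.
        return join_stream(response)
```

With the server running, stream_generate("qwen:0.5b", "你是谁") should return the whole sentence as one string. Adding "stream": false to the request body is the alternative when incremental output is not needed.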

1.2. Second request method

Enter the request URL (again as a POST request):

http://localhost:11434/api/chat

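Send the request. This endpoint takes the conversation as a list of messages, each needing only the role and content keys; Ollama builds the complete prompt from the model's template automatically. Note that the native API reads sampling settings from an "options" object (for example "options": {"temperature": 0.01, "num_predict": 1024}), so the top-level temperature and max_tokens fields below, which are OpenAI-style names, are most likely ignored by this endpoint: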

{
    "messages": [
        {
            "role": "user",
            "content": "你是谁"
        }
    ],
    "model": "qwen:0.5b",
    "stream": false,
    "temperature": 0.01,
    "max_tokens": 1024
}
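
The same request can be built from code. The sketch below is a hypothetical helper, not part of Ollama: build_chat_payload and chat are illustrative names, and it assumes sampling settings belong under "options" (temperature, num_predict) for the native endpoint, with values mirroring the JSON above.

```python
import json
import urllib.request

CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(model, user_text, temperature=0.01, num_predict=1024):
    """Build a non-streaming /api/chat request body with one user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
        # Assumption: the native API takes sampling settings under "options".
        "options": {"temperature": temperature, "num_predict": num_predict},
    }


def chat(model, user_text, url=CHAT_URL):
    """Send the payload and return the assistant's reply text (needs a running server)."""
    body = json.dumps(build_chat_payload(model, user_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

For a multi-turn conversation, earlier user and assistant messages would be appended to the messages list in order before the new user message.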

Response (with "stream": false the whole reply arrives as a single JSON object):

{
    "model": "qwen:0.5b",
    "created_at": "2024-10-29T03:17:59.3797606Z",
    "message": {
        "role": "assistant",
        "content": "我是来自阿里云的大规模语言模型，我叫通义千问。"
    },
    "done_reason": "stop",
    "done": true,
    "total_duration": 567748800,
    "load_duration": 29021200,
    "prompt_eval_count": 10,
    "prompt_eval_duration": 113331000,
    "eval_count": 17,
    "eval_duration": 423288000
}
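
The duration fields in the native responses are reported in nanoseconds, so generation speed can be estimated as eval_count divided by eval_duration, scaled to seconds. A small sketch using the numbers from the response above (the function name tokens_per_second is only illustrative):

```python
def tokens_per_second(eval_count, eval_duration_ns):
    """Generation throughput, given a token count and a duration in nanoseconds."""
    return eval_count / eval_duration_ns * 1e9


# Values copied from the /api/chat response above.
stats = {"eval_count": 17, "eval_duration": 423288000, "total_duration": 567748800}

speed = tokens_per_second(stats["eval_count"], stats["eval_duration"])
total_seconds = stats["total_duration"] / 1e9

print(f"{speed:.1f} tokens/s, {total_seconds:.2f} s total")
# prints: 40.2 tokens/s, 0.57 s total
```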

2. OpenAI-compatible API

Enter the request URL (POST):

http://localhost:11434/v1/chat/completions

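The request body is the same as in 1.2. On this endpoint the top-level temperature and max_tokens fields are standard OpenAI parameters.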

Response:

{
    "id": "chatcmpl-289",
    "object": "chat.completion",
    "created": 1730172131,
    "model": "qwen:0.5b",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "我是来自阿里云的超大规模语言模型，我叫通义千问。"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 18,
        "total_tokens": 28
    }
}
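
Since this endpoint follows the OpenAI chat-completions shape, the reply text sits at choices[0].message.content and the token counts under usage. A minimal standard-library sketch follows; extract_reply and openai_style_chat are hypothetical helper names, and the request fields mirror the body from 1.2.

```python
import json
import urllib.request

OPENAI_COMPAT_URL = "http://localhost:11434/v1/chat/completions"


def extract_reply(completion):
    """Return (text, total_tokens) from an OpenAI-style chat completion dict."""
    text = completion["choices"][0]["message"]["content"]
    total = completion.get("usage", {}).get("total_tokens")
    return text, total


def openai_style_chat(model, user_text, url=OPENAI_COMPAT_URL):
    """POST a single-turn chat request and return (text, total_tokens); needs a running server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
        "temperature": 0.01,
        "max_tokens": 1024,
    }).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

Existing OpenAI client libraries can usually be pointed at the same server by setting their base URL to http://localhost:11434/v1, which is the main reason to prefer this endpoint over the native one.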