Quick notes: LLM function calling and API calls in Xinference

小东西啊

Published 2024-08-28 10:57:09


Category: Daily notes. Tags: python, AI programming

Article link: https://blog.csdn.net/qq_39067449/article/details/141631255


Function calling

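LLMs are among the most prominent technologies in AI, and their data-processing and learning capabilities have produced notable results in many fields. An LLM, however, is more like a "brain": it can compute and reason, but it lacks "hands and feet", meaning it cannot carry out real-world actions or fetch real-time information on its own.

So can we give it those abilities? An AI Agent is an LLM equipped with external interface tools, which makes it an agent with working "hands and feet" and therefore closer to a "person". Function calling is one way to build an AI Agent. For example, if you ask an LLM what today's weather is like, it cannot answer accurately because it has no network access; if you pass it information from an external weather app through function calling, the problem is solved.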

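Xinference's tools feature:

The tools feature lets a model use external tools.

As with OpenAI's Function calling API, you can define functions with parameters and let the model dynamically choose which function to call and which arguments to pass to it.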

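The general flow of a function call:

Submit a query that describes the available functions in detail, including their parameters and descriptions.
The LLM decides whether to call a function. If it chooses not to, it replies in natural language, either answering from its own knowledge or asking for more details about the query and tool usage. If it decides to use a tool, it suggests the appropriate API and the arguments to call it with, in JSON format.
The application then performs the API call and sends the returned response back to the LLM, which analyzes the result and continues with the next step.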
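
The three steps above can be sketched without a running server. This is a minimal sketch, not Xinference's implementation: `get_weather`, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical names, and the tool call is a hand-written dict in the OpenAI-style shape (a function name plus a JSON string of arguments).

```python
import json

def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}

# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one model-requested tool call and build the follow-up message."""
    name = tool_call["function"]["name"]
    # The arguments arrive as a JSON string, not as a dict.
    arguments = json.loads(tool_call["function"]["arguments"])
    if name not in TOOL_REGISTRY:
        result = {"error": f"unknown tool: {name}"}
    else:
        result = TOOL_REGISTRY[name](**arguments)
    # A "tool" role message carries the result back to the model.
    return {
        "role": "tool",
        "tool_call_id": tool_call.get("id", ""),
        "content": json.dumps(result, ensure_ascii=False),
    }

# Hand-written stand-in for what the model might return in step 2.
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
}
tool_message = run_tool_call(example_call)
print(tool_message)
```

Appending `tool_message` to `messages` and calling the Chat API again would correspond to step 3. Whether a given model accepts a `tool` role message should be checked against that model's chat template.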

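There is currently no dedicated API endpoint for the tools feature. It must be used together with the Chat API.

Models on Xinference that support the tools feature: qwen-chat, chatglm3, gorilla-openfunctions-v1

Example code from the official documentation: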

import openai

client = openai.Client(
    api_key="cannot be empty",
    base_url="http://<XINFERENCE_HOST>:<XINFERENCE_PORT>/v1"
)
response = client.chat.completions.create(
    model="<MODEL_UID>",
    messages=[{
        "role": "user",
        "content": "Call me an Uber ride type 'Plus' in Berkeley at zipcode 94704 in 10 minutes"
    }],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "uber_ride",
                "description": "Find suitable ride for customers given the location, "
                "type of ride, and the amount of time the customer is "
                "willing to wait as parameters",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "loc": {
                            "type": "int",
                            "description": "Location of the starting place of the Uber ride",
                        },
                        "type": {
                            "type": "string",
                            "enum": ["plus", "comfort", "black"],
                            "description": "Types of Uber ride user is ordering",
                        },
                        "time": {
                            "type": "int",
                            "description": "The amount of time in minutes the customer is willing to wait",
                        },
                    },
                },
            },
        }
    ],
)
print(response.choices[0].message)
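
When the model decides to use a tool, the message printed above is expected to carry a `tool_calls` list in the OpenAI-compatible shape instead of plain text. The sketch below shows one way to read it. `read_tool_calls` is a hypothetical helper, and the `SimpleNamespace` object is a hand-built stand-in for a real response message, so its values are illustrative only.

```python
import json
from types import SimpleNamespace

def read_tool_calls(message) -> list:
    """Return (function_name, arguments_dict) pairs from a chat message."""
    calls = []
    # tool_calls is None (or missing) when the model answers in plain text.
    for call in getattr(message, "tool_calls", None) or []:
        calls.append((call.function.name, json.loads(call.function.arguments)))
    return calls

# Hand-built stand-in for response.choices[0].message (illustrative values).
fake_message = SimpleNamespace(
    content=None,
    tool_calls=[
        SimpleNamespace(
            id="call_0",
            type="function",
            function=SimpleNamespace(
                name="uber_ride",
                arguments='{"loc": 94704, "type": "plus", "time": 10}',
            ),
        )
    ],
)
parsed_calls = read_tool_calls(fake_message)
print(parsed_calls)
```

An empty list from `read_tool_calls` means the model replied in natural language and `message.content` should be used instead.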

Xinference official documentation:

https://inference.readthedocs.io/zh-cn/latest/models/model_abilities/tools.html#

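API calls:

Requirement: query a domain-article knowledge base and turn the LLM's output into structured data (JSON), similar to information extraction, then wrap it in an interface that returns the data for other callers. The approach is simply to give the model the required information and the output format through the prompt.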
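
A rough sketch of that prompt-based approach follows. It is an illustration under assumptions, not the code used in this post: `build_extraction_prompt` and `parse_json_reply` are hypothetical helpers, and the model reply is a hand-written string so the example runs without a server.

```python
import json

def build_extraction_prompt(text: str, fields: list) -> str:
    """Ask the model to return only a JSON object with the given keys."""
    keys = ", ".join(f'"{name}"' for name in fields)
    return (
        "Extract the following fields from the text and reply with a single "
        f"JSON object containing exactly these keys: {keys}. "
        "Do not add any explanation.\n\n"
        f"Text: {text}"
    )

def parse_json_reply(reply: str) -> dict:
    """Parse the span from the first '{' to the last '}' as JSON."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start : end + 1])

prompt = build_extraction_prompt(
    "The blue whale is the largest animal.", ["animal", "trait"]
)
# Hand-written stand-in for a model reply that adds text around the JSON.
fake_reply = 'Here is the result: {"animal": "blue whale", "trait": "largest"}'
record = parse_json_reply(fake_reply)
print(record)
```

Models do not always follow the requested format, so the parser tolerates surrounding text and raises `ValueError` when no JSON object is present.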

List the models currently in use:

from xinference.client import Client

client = Client("http://0.0.0.0:9997")
print(client.list_models())

Example call:

import openai

client = openai.Client(
    api_key="cannot be empty",
    base_url="http://<XINFERENCE_HOST>:<XINFERENCE_PORT>/v1"
)
response = client.chat.completions.create(
    model="<MODEL_UID>",
    messages=[
        {
            "content": "What is the largest animal? + requirements + return format",
            "role": "user",
        }
    ],
    max_tokens=512,
    temperature=0.7
)

Because the result is a ChatCompletion object, it needs to be converted to JSON:

import json

# Serialize the ChatCompletion object to a JSON string, then load it as a dict.
xinfer = response.model_dump_json()
xinferout = json.loads(xinfer)
print("json Text:", xinferout)

After that, the interface can be wrapped.
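
One possible shape for that wrapper is sketched below. It assumes the dict produced by the JSON conversion above follows the usual ChatCompletion layout (`choices[0].message.content`). `wrap_completion` and the payload keys `ok`, `answer`, and `error` are hypothetical choices, and the input is a hand-written stand-in.

```python
import json

def wrap_completion(completion: dict) -> dict:
    """Reduce a dumped ChatCompletion dict to a small payload for callers."""
    choices = completion.get("choices") or []
    if not choices:
        return {"ok": False, "answer": None, "error": "empty choices"}
    message = choices[0].get("message") or {}
    return {"ok": True, "answer": message.get("content"), "error": None}

# Hand-written stand-in for json.loads(response.model_dump_json()).
fake_completion = {
    "id": "chatcmpl-0",
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "The blue whale."}}
    ],
}
payload = wrap_completion(fake_completion)
print(json.dumps(payload, ensure_ascii=False))
```

A web framework handler could return `payload` directly; keeping the reduction in a plain function makes it easy to test without a server.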
