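Goal: build a general-purpose chat completion interface for a Java framework, so that different large language models can be called in a uniform way.
Constraint: only the chat (text) interface is wrapped for now; other interfaces are left for later planning.
OpenAI chat completion interface analysis
==> Official documentation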
curl http://chat.xxxxxx.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
Response
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-3.5-turbo-0125",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
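As a minimal sketch of how a caller might read this response, the snippet below pulls out the three pieces a wrapper usually needs: the reply text, the finish reason, and the total token count. The helper name `extract_reply` is made up for illustration, and the JSON is a trimmed copy of the sample above.

```python
import json
from typing import Tuple

# Trimmed copy of the sample non-streaming response shown above.
RAW = '''{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant",
                "content": "\\n\\nHello there, how may I assist you today?"},
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}'''


def extract_reply(response: dict) -> Tuple[str, str, int]:
    """Hypothetical helper: (content, finish_reason, total_tokens) of choice 0."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


content, finish_reason, total_tokens = extract_reply(json.loads(RAW))
print(content.strip(), finish_reason, total_tokens)
```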
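from openai import OpenAI

# client = OpenAI()  # default endpoint
client = OpenAI(
    # The SDK appends /chat/completions to base_url, so /v1 is included
    # here to hit the same path as the curl request above.
    base_url='http://chat.xxxxxx.com/v1'
)
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    stream=True  # return the answer as a sequence of chunks
)
# Each chunk carries an incremental delta instead of a full message.
for chunk in completion:
    print(chunk.choices[0].delta)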
Response (streamed chunks)
{
"id": "chatcmpl-123",
"object": "chat.completion.chunk",
"created": 1694268190,
"model": "gpt-3.5-turbo-0125",
"system_fingerprint": "fp_44709d6fcb",
"choices": [
{
"index": 0,
"delta": {
"role": "assistant",
"content": ""
},
"logprobs": null,
"finish_reason": null
}
]
}
{
"id": "chatcmpl-123",
"object": "chat.completion.chunk",
"created": 1694268190,
"model": "gpt-3.5-turbo-0125",
"system_fingerprint": "fp_44709d6fcb",
"choices": [
{
"index": 0,
"delta": {
"content": "Hello"
},
"logprobs": null,
"finish_reason": null
}
]
}
....
{
"id": "chatcmpl-123",
"object": "chat.completion.chunk",
"created": 1694268190,
"model": "gpt-3.5-turbo-0125",
"system_fingerprint": "fp_44709d6fcb",
"choices": [
{
"index": 0,
"delta": {
},
"logprobs": null,
"finish_reason": "stop"
}
]
}
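The chunks above can be reassembled into one reply roughly as follows. This is a sketch that assumes the raw HTTP stream arrives as server-sent event lines of the form `data: {json}` and ends with `data: [DONE]`; the function names `iter_sse_chunks` and `collect_stream` are illustrative, and the sample lines are shortened versions of the chunks shown above.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line until `data: [DONE]`."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and other SSE fields are skipped
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Concatenate delta.content of choice 0; return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        parts.append(choice["delta"].get("content") or "")
        finish_reason = choice["finish_reason"] or finish_reason
    return "".join(parts), finish_reason


sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}',
    '',
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}]}',
    '',
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    '',
    'data: [DONE]',
]
text, reason = collect_stream(iter_sse_chunks(sample_lines))
print(text, reason)
```

Keeping the line parser separate from the accumulator means the same accumulator can be fed chunks from an SDK iterator or from a hand-rolled HTTP client.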
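Request parameter analysis
- model: optional, defaults to gpt-3.5-turbo (the official OpenAI API lists model as required)
- messages: required
[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello!"}
]
- stream: optional, defaults to false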
If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message
- temperature: optional, defaults to 1
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
- top_p: optional, defaults to 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
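To make the two sampling knobs more concrete, here is a simplified illustration of the idea behind them. It is not OpenAI's implementation: the toy logits, the function names, and the renormalization step are all assumptions made for the example.

```python
import math
from typing import Dict


def apply_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Softmax over logits / temperature (temperature must be > 0)."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def nucleus(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most probable tokens until their mass reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    mass = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}


toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}
sharp = apply_temperature(toy_logits, 0.2)   # low temperature: mass piles onto "a"
flat = apply_temperature(toy_logits, 1.0)    # default temperature
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(nucleus(toy_probs, 0.1))   # → {'a': 1.0}
print(sorted(nucleus(toy_probs, 0.75)))  # → ['a', 'b']
```

Both knobs narrow or widen the set of likely tokens, which is one way to read the advice to adjust only one of them at a time.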
Other request parameters that may be needed
- max_tokens: optional, no default
The maximum number of tokens that can be generated in the chat completion.
- n (integer or null): optional, defaults to 1
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
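The parameters analysed above could be gathered into a single request object along these lines. This is only a sketch: the class name `ChatRequest` and the `to_payload` method are hypothetical, the defaults simply mirror the values listed in this section, and unset optional fields are dropped so the server can apply its own defaults.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class ChatRequest:
    """Hypothetical request model; defaults follow the parameter list above."""
    messages: List[Dict[str, str]]      # required
    model: str = "gpt-3.5-turbo"
    stream: bool = False
    temperature: float = 1.0
    top_p: float = 1.0
    n: int = 1
    max_tokens: Optional[int] = None    # no default: omitted unless set

    def to_payload(self) -> dict:
        if not self.messages:
            raise ValueError("messages is required")
        if not 0 <= self.temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        # Drop fields left as None so they are not sent at all.
        return {k: v for k, v in asdict(self).items() if v is not None}


payload = ChatRequest(messages=[{"role": "user", "content": "Hello!"}]).to_payload()
print(payload)
```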
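Response comparison (default vs. streaming results)
- Streaming results have no usage field, so the token count has to be computed separately; the API response itself provides no method for this.
Official guidance on counting tokens: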
Another small drawback of streaming responses is that the response no longer includes the usage field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using tiktoken.
How to use the Python library tiktoken
- In streaming results, the message field inside choices of a normal response is replaced by a delta field.
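A unified wrapper can hide the message/delta difference behind one accessor. The sketch below assumes responses have already been parsed into dictionaries; `choice_text` and `usage_or_none` are illustrative names, not part of any SDK.

```python
from typing import Optional


def choice_text(choice: dict) -> str:
    """Return the text of a choice from either a normal or a streaming response."""
    body = choice.get("message") or choice.get("delta") or {}
    return body.get("content") or ""


def usage_or_none(response: dict) -> Optional[dict]:
    """Normal responses carry usage; streamed chunks as shown above do not."""
    return response.get("usage")


normal = {"choices": [{"message": {"role": "assistant", "content": "Hi"}}],
          "usage": {"total_tokens": 21}}
chunk = {"choices": [{"delta": {"content": "Hel"}}]}
last_chunk = {"choices": [{"delta": {}, "finish_reason": "stop"}]}

print(choice_text(normal["choices"][0]), usage_or_none(normal))
print(choice_text(chunk["choices"][0]), usage_or_none(chunk))
```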
灵医Bot Chat interface analysis
Endpoint path: /api/01bot/sse-gateway/stream
For server-side calls to 灵医Bot Chat, refer to "Server Sent Events server and client for Golang".
Header parameters
Name | Example | Type | Required | Description |
---|---|---|---|---|
Content-Type | application/json | string | Yes | Content type |
X-IHU-Authorization-V2 | See the authentication document (《鉴权认证文档》) | string | Yes | Signature string |
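Non-streaming request example (single turn): parameters
{
  "model": "test-model",
  "stream": false,
  "messages": [
    {
      "version": "api-v2",
      "created": 1683944235,
      "role": "user",
      "content": [{
        "type": "text",
        "body": "The patient developed facial swelling with multiple red rashes 3 days ago and reports itching. What disease is this?"
      }]
    }
  ]
}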
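Differences from OpenAI
Each entry in messages carries two extra fields, version and created.
The documentation lists a required default parameter on each message (a fallback switch for the fallback strategy: 0 = no fallback, 1 = fall back to 一言), but the example given does not include it.
The content structure differs: it changes from a plain string to structured objects with type and body fields.
Non-streaming response parameters (single turn)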
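Based on the differences listed above, an adapter from OpenAI-style messages to the 灵医Bot shape might look like the sketch below. Several points are assumptions: the function names are invented, `created` is taken to be a Unix timestamp in seconds (the sample value looks like one), `"api-v2"` is copied from the sample request, and the `default` fallback switch is left out because the sample request does not show where it goes.

```python
import time
from typing import List, Optional


def to_bot_message(msg: dict, created: Optional[int] = None) -> dict:
    """Convert one {"role", "content"} message into the 灵医Bot message shape."""
    return {
        "version": "api-v2",  # copied from the sample request
        # Assumption: created is a Unix timestamp in seconds.
        "created": created if created is not None else int(time.time()),
        "role": msg["role"],
        # A plain string becomes a list of typed content objects.
        "content": [{"type": "text", "body": msg["content"]}],
    }


def to_bot_request(model: str, messages: List[dict], stream: bool = False) -> dict:
    """Build the full request body from OpenAI-style messages."""
    return {
        "model": model,
        "stream": stream,
        "messages": [to_bot_message(m) for m in messages],
    }


converted = to_bot_message({"role": "user", "content": "Hello!"}, created=1683944235)
request_body = to_bot_request("test-model", [{"role": "user", "content": "Hello!"}])
print(converted)
```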
{