A Tour of the Cohere API
0. Background
About two months ago, the following press release came out:
Oracle plans to partner with Cohere, a provider of an AI platform for enterprises, to offer new generative AI services on OCI.
Note that the free trial keys come with strict limits on both the request rate and the total number of calls.
1. Models and Functions Provided by Cohere
- Command
  - Provides text generation models
  - Trained to follow user commands
- Embeddings
  - Provides embedding models
  - Supports multiple languages
The available functions are described in the API reference; functions that require signing up for a waitlist are excluded from the table below.
| API endpoint | Overview |
| --- | --- |
| /generate | Generates realistic text for a given input and returns it |
| /embed | Embeds the input texts and returns lists of floats |
| /classify | Determines which label best matches the input text and returns the result |
| /tokenize | Splits the input text into tokens and returns them |
| /detokenize | Takes BPE (byte-pair encoding) tokens and returns their text representation |
| /detect-language | Identifies which language the input text is written in and returns the result |
| /summarize | Creates an English summary of the input text and returns it |
| /rerank | Takes a query and a list of texts, and returns an ordered array in which each text is assigned a relevance score |
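As a quick look at two of the endpoints not covered in detail later in this article, here is a minimal sketch of /tokenize and /detokenize through the Python client. This is only a sketch: the method names co.tokenize / co.detokenize and the response fields (tokens, token_strings, text) are taken from the v4 Python SDK used in this article and may differ in other SDK versions.
# Assumes the `co` client created in section 2 below.
res_tokens = co.tokenize(text='Hello, world!')
print(res_tokens.tokens)         # BPE token ids
print(res_tokens.token_strings)  # the corresponding token strings

res_detok = co.detokenize(tokens=res_tokens.tokens)
print(res_detok.text)            # should reconstruct the original text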
2. The Cohere Client
Cohere also provides an asynchronous client, but since we are only testing the functionality here, we will use the synchronous client.
Sample code:
import cohere

api_key = '<your-api-key>'
co = cohere.Client(api_key)
# Verify that the API key is valid
co.check_api_key()
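For reference, here is a minimal sketch of the asynchronous client mentioned above. It assumes cohere.AsyncClient from the same v4 Python SDK, whose methods mirror the synchronous client as coroutines; check the SDK version you have installed.
import asyncio
import cohere

async def main():
    # AsyncClient exposes the same API as Client, but as awaitable coroutines
    aco = cohere.AsyncClient('<your-api-key>')
    res = await aco.generate(prompt='Hello, how')
    print(res)
    await aco.close()  # close the underlying HTTP session

asyncio.run(main())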
3. Co.Generate API
Let's try calling it with only the minimum required parameters.
Sample code:
res_generate = co.generate(
prompt='Hello, how'
)
print(res_generate)
Output:
[cohere.Generation {
id: 905bc1fa-cb5a-4082-b1d8-2704d426956b
prompt: Hello, how
text: How can I help you?
likelihood: None
finish_reason: None
token_likelihoods: None
}]
It generated the text "How can I help you?".
Other parameters let you choose the model (model), the number of generations to return, the maximum number of tokens per generation, and so on. Below are a few examples I tried.
Example: generate 3 texts of up to 50 tokens each with the nightly model
Sample code:
res_generate = co.generate(
prompt='Hello, how',
model='command-nightly',
num_generations=3,
max_tokens=50
)
print(res_generate)
Output:
[cohere.Generation {
id: 8eca3fb0-9a9b-4df2-997d-e5fba22ebfbb
prompt: Hello, how
text: can I help you?
likelihood: None
finish_reason: None
token_likelihoods: None
}, cohere.Generation {
id: 2b5d7382-64cb-48a8-8bb0-33c574565def
prompt: Hello, how
text: Hello! I'm a chatbot trained to be helpful and harmless. How can I assist you today?
likelihood: None
finish_reason: None
token_likelihoods: None
}, cohere.Generation {
id: 060903ac-241f-4a9f-bd46-32de661a0799
prompt: Hello, how
text: are you doing today?
likelihood: None
finish_reason: None
token_likelihoods: None
}]
Example: generate while considering only the 5 most likely tokens (k=5) and with the randomness of generation maximized (temperature=5.0)
Sample code:
res_generate = co.generate(
prompt='Hello, how',
model='command-nightly',
num_generations=3,
max_tokens=50,
k=5,
temperature=5.0
)
print(res_generate)
Output:
[cohere.Generation {
id: 55f7482c-70ef-4846-be70-34af13df268a
prompt: Hello, how
text: are you ?
likelihood: None
finish_reason: None
token_likelihoods: None
}, cohere.Generation {
id: d9286b01-bd12-4563-839b-6cf867f69873
prompt: Hello, how
text: I’d like to thank the user. Is this how they would prefer me to address them? If so I can change that to their username.
How may I be of service? If I can help with any tasks I
likelihood: None
finish_reason: None
token_likelihoods: None
}, cohere.Generation {
id: 6a2472a8-2663-4b83-a988-f164d1254707
prompt: Hello, how
text: can I help you with something, or provide any assistance?
likelihood: None
finish_reason: None
token_likelihoods: None
}]
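Incidentally, the likelihood fields in the outputs above come back as None by default. The sketch below assumes the return_likelihoods parameter of the Generate API ('GENERATION' requests log-likelihoods for the generated tokens; 'ALL' and the default 'NONE' are the other documented values); treat the exact field names as SDK-version dependent.
res_generate = co.generate(
    prompt='Hello, how',
    return_likelihoods='GENERATION'  # request log-likelihoods for the generated tokens
)
# Each Generation should now carry a likelihood and per-token token_likelihoods
print(res_generate[0].likelihood)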
Example: receive the generated result as a JSON stream (this seems useful for UIs that render the response incrementally)
Sample code:
res_generate_streaming = co.generate(
prompt='Hello, how',
stream=True
)
for index, token in enumerate(res_generate_streaming):
print(f"{index}: {token}")
res_generate_streaming.texts  # the full generated text(s), available once the stream has been consumed
Output:
0: StreamingText(index=0, text=' Hello', is_finished=False)
1: StreamingText(index=0, text='!', is_finished=False)
2: StreamingText(index=0, text=' How', is_finished=False)
3: StreamingText(index=0, text=' can', is_finished=False)
4: StreamingText(index=0, text=' I', is_finished=False)
5: StreamingText(index=0, text=' help', is_finished=False)
6: StreamingText(index=0, text=' you', is_finished=False)
7: StreamingText(index=0, text=' today', is_finished=False)
8: StreamingText(index=0, text='?', is_finished=False)
[' Hello! How can I help you today?']
For more details, see the API Reference - Co.Generate.
4. Co.Embed API
Let's try calling it with only the minimum required parameters.
Sample code:
res_embed = co.embed(
texts=['hello', 'world']
)
print(res_embed)
Output (partially omitted):
cohere.Embeddings {
embeddings: [[1.6142578, 0.24841309, 0.5385742, -1.6630859, -0.27783203, 0.35888672, 1.3378906, -1.8261719, 0.89404297, 1.0791016, 1.0566406, 1.0664062, 0.20983887, ... , -2.4160156, 0.22875977, -0.21594238]]
compressed_embeddings: []
meta: {'api_version': {'version': '1'}}
}
The embedding results are returned. The embed function also takes a parameter specifying which model to use.
Example: try the light model (model=embed-english-light-v2.0)
Sample code:
res_embed = co.embed(
texts=['hello', 'world'],
model='embed-english-light-v2.0'
)
print(res_embed)
Output (partially omitted):
cohere.Embeddings {
embeddings: [[-0.16577148, -1.2109375, 0.54003906, -1.7148438, -1.5869141, 0.60839844, 0.6328125, 1.3974609, -0.49658203, -0.73046875, -1.796875, 1.5410156, 0.66064453, 0.9448242, -0.53515625, 0.24914551, 0.53222656, 0.23425293, 0.52685547, -1.3935547, 0.04095459, 0.8569336, -0.5620117, -0.42211914, 0.55371094, 3.5820312, 7.2890625, -1.2539062, 1.3583984, 0.12988281, -1.1660156, 0.124816895, ... ,0.87060547, 1.0205078, 0.5854492, -2.734375, -0.066589355, 1.8349609, 0.16430664, -0.26220703, -1.0625]]
compressed_embeddings: []
meta: {'api_version': {'version': '1'}}
}
Example: try the multilingual model (model=embed-multilingual-v2.0)
Sample code:
res_embed = co.embed(
texts=['こんにちは', '世界'],
model='embed-multilingual-v2.0'
)
print(res_embed)
Output (partially omitted):
cohere.Embeddings {
embeddings: [[0.2590332, 0.41308594, 0.24279785, 0.30371094, 0.04647827, 0.1361084, 0.41357422, -0.40063477, 0.2553711, 0.17749023, -0.1899414, -0.041900635, 0.20141602, 0.43017578, -0.5878906, 0.18054199, 0.42333984, 0.010749817, -0.56640625, 0.1517334, 0.14282227, 0.36767578, 0.26953125, 0.1418457, 0.28051758, 0.1661377, -0.13293457, 0.23620605, 0.08703613, 0.36914062, 0.22180176, ... ,0.027786255, -0.18530273, -0.24414062, 0.123168945, 0.6425781, 0.08831787, -0.21862793, -0.18237305, -0.031341553]]
compressed_embeddings: []
meta: {'api_version': {'version': '1'}}
}
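The returned vectors can be compared directly. For example, the sketch below computes the cosine similarity between the two embeddings from the call above using only the Python standard library (a minimal sketch; it assumes nothing beyond the res_embed object already shown).
import math

def cosine_similarity(a, b):
    # Cosine similarity = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

emb_1, emb_2 = res_embed.embeddings
print(cosine_similarity(emb_1, emb_2))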
For more details, see the API Reference - Co.Embed.
5. Co.Classify API
Let's try calling it with only the minimum required parameters.
Sample code:
from cohere.responses.classify import Example
examples=[
Example("Dermatologists don't like her!", "Spam"),
Example("'Hello, open to this?'", "Spam"),
Example("I need help please wire me $1000 right now", "Spam"),
Example("Nice to know you ;)", "Spam"),
Example("Please help me?", "Spam"),
Example("Your parcel will be delivered today", "Not spam"),
Example("Review changes to our Terms and Conditions", "Not spam"),
Example("Weekly sync notes", "Not spam"),
Example("'Re: Follow up from today's meeting'", "Not spam"),
Example("Pre-read for tom