Calculation of token pricing in OpenAI calls


Background:

I'm trying to price the tokens used in a call to OpenAI. I have a txt file with plain text that was uploaded to Qdrant. When I ask the following question:


Who is Michael Jordan?

and use the get_openai_callback function to track the number of tokens and the cost of the operation, one of the values in the output doesn't make sense to me.


Tokens Used: 85
    Prompt Tokens: 68
    Completion Tokens: 17
Successful Requests: 1
Total Cost (USD): $0.00013600000000000003
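
For context, the output above comes from wrapping the chain call in the callback, roughly as in the sketch below. This is a minimal reconstruction, assuming a stock RetrievalQA chain over an in-memory Qdrant store built from the txt file; the variable names and setup are illustrative, not my exact code.

from langchain.callbacks import get_openai_callback
from langchain.chains import RetrievalQA
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Qdrant

# In-memory stand-in for the Qdrant collection built from the txt file
text = ('Michael Jeffrey Jordan is an American businessman and former '
        'basketball player who played as a shooting guard')
store = Qdrant.from_texts([text], OpenAIEmbeddings(), location=':memory:')

llm = OpenAI(model_name='gpt-3.5-turbo-instruct')
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())

with get_openai_callback() as cb:
    qa.run('Who is Michael Jordan?')

print(cb)  # prints the Tokens Used / Prompt Tokens / Completion Tokens / Total Cost summary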

Why does the Prompt Tokens value differ from the input? The number of tokens in the input text (which is what I understand as the prompt tokens) is:


import tiktoken

query = 'Who is Michael Jordan'

encoding = tiktoken.encoding_for_model('gpt-3.5-turbo-instruct')
print(f"Tokens: {len(encoding.encode(query))}")

4

But the value reported in the response is 68. I considered the idea that Prompt Tokens might be the sum of the base tokens (from the txt file) and the question tokens, but the math doesn't add up.


Number of tokens in the txt file: 17

txt file: 'Michael Jeffrey Jordan is an American businessman and former basketball player who played as a shooting guard'


query tokens + file tokens: 21 (4 + 17)

Could anyone help me understand the pricing calculation?


I tried searching OpenAI's own documentation, GitHub, and other forums, but the information doesn't seem easy to find, or it isn't public. I want to understand whether I'm missing something or whether this is a calculation users don't have access to.


UPDATE (for any future questions from other users):

import langchain
langchain.debug = True  # log the full prompts/payloads LangChain sends to the API

Run the call inside get_openai_callback() again and the full log appears on screen. The value of the "prompts" key is a list containing a single string: the complete instruction sent to the model, including how the response should be given. The number of tokens in that string is the value that appears as Prompt Tokens.

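For example, with the default QA ("stuff") prompt, the string in the log looks roughly like the reconstruction below, and counting it with tiktoken lands near the reported 68. The template wording here is an approximation and depends on the LangChain version.

import tiktoken

# Approximate reconstruction of the full prompt shown by langchain.debug;
# the exact template text depends on the chain type and LangChain version.
full_prompt = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n"
    "Michael Jeffrey Jordan is an American businessman and former basketball "
    "player who played as a shooting guard\n\n"
    "Question: Who is Michael Jordan?\n"
    "Helpful Answer:"
)

encoding = tiktoken.encoding_for_model('gpt-3.5-turbo-instruct')
print(f"Prompt tokens: {len(encoding.encode(full_prompt))}")  # close to 68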

Solution:

Prompt Tokens include your question and any context provided, plus additional system messages and formatting added by the API, while Completion Tokens are the tokens generated in the response.


In your example:

Visible query: "Who is Michael Jordan?" (4 tokens)
Text from the file: "Michael Jeffrey Jordan is an American businessman and former basketball player who played as a shooting guard" (17 tokens)
Expected: 4 + 17 = 21 tokens.


However, you see 68 prompt tokens because the API adds tokens for roles, instructions, and other metadata. To see the exact token count, you can log the full request payload or use OpenAI's token counting tools. This extra context explains why the prompt token count is higher than expected.

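As a rough check of the cost itself, the callback multiplies each token count by a per-1K-token price. Assuming the rates LangChain used for gpt-3.5-turbo-instruct around the time of the post ($0.0015 per 1K prompt tokens and $0.002 per 1K completion tokens, an assumption, not a quoted figure), the arithmetic reproduces the reported total:

prompt_tokens = 68
completion_tokens = 17

# Assumed per-1K-token prices for gpt-3.5-turbo-instruct; check the current
# pricing page or LangChain's cost table for up-to-date values.
prompt_cost = prompt_tokens / 1000 * 0.0015         # 0.000102
completion_cost = completion_tokens / 1000 * 0.002  # 0.000034

print(prompt_cost + completion_cost)  # ~0.000136, matching Total Cost (USD);
                                      # the long decimal tail is floating-point noise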
