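Shocking! OpenAI API Structured Outputs reach 100% schema adherence

Good news for OpenAI developers

Programmers, have you ever had a headache over inconsistent model output? Have you spent hours trying different prompts in the hope of getting the result you wanted? Now OpenAI has some exciting news: the API supports Structured Outputs.

This means you only need to define a JSON Schema and the model's output will conform to it. OpenAI reports that gpt-4o-2024-08-06 with Structured Outputs scores 100% on its evals of complex JSON schema following! Note that this guarantees the shape of the output, not that the values in it are correct.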

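How it works

OpenAI uses a technique called constrained decoding: the JSON Schema is converted into a context-free grammar (CFG), and while the model generates output, each token is restricted to those the grammar allows at that point, so the result always matches the supplied schema. According to OpenAI, the first request with a new schema takes longer while the schema is processed; later requests reuse the cached result and add no noticeable latency.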
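
To make the idea of token masking concrete, here is a toy sketch. It is not OpenAI's implementation: the vocabulary, the `fake_scores` function and the finite list `VALID_OUTPUTS` (standing in for a compiled grammar) are all made up for illustration.

```python
import json

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", ":", '"order_by"', '"asc"', '"desc"', "sure!"]

# Hypothetical stand-in for a compiled schema: the finite set of valid outputs.
# A real system would use a grammar instead of enumerating every output.
VALID_OUTPUTS = [
    ["{", '"order_by"', ":", '"asc"', "}"],
    ["{", '"order_by"', ":", '"desc"', "}"],
]

def allowed_next(prefix):
    """Return the tokens that keep `prefix` a prefix of some valid output."""
    n = len(prefix)
    return {seq[n] for seq in VALID_OUTPUTS if seq[:n] == prefix and len(seq) > n}

def fake_scores(prefix):
    """Stand-in for model logits: prefers chatty text, then "desc"."""
    scores = {tok: 0.0 for tok in VOCAB}
    scores["sure!"] = 5.0
    scores['"desc"'] = 2.0
    return scores

def constrained_decode():
    """Greedy decoding where tokens outside the allowed set are never picked."""
    out = []
    while True:
        allowed = allowed_next(out)
        if not allowed:  # nothing can follow: the output is complete
            return "".join(out)
        scores = fake_scores(out)
        # sorted() only makes tie-breaking deterministic
        out.append(max(sorted(allowed), key=lambda tok: scores[tok]))

result = constrained_decode()
print(result, json.loads(result))
```

Although the fake model scores "sure!" highest at every step, that token is never allowed, so the decoder can only emit text that parses against the toy schema.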

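How to use it

There are two ways to use Structured Outputs in the API:

  1. Function calling: set strict: true in the function definition, and the arguments the model generates will match the tool definition.

Request demo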

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}

Output demo

{
  "table_name": "orders",
  "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
  "conditions": [
    {
      "column": "status",
      "operator": "=",
      "value": "fulfilled"
    },
    {
      "column": "ordered_at",
      "operator": ">=",
      "value": "2023-05-01"
    },
    {
      "column": "ordered_at",
      "operator": "<",
      "value": "2023-06-01"
    },
    {
      "column": "delivered_at",
      "operator": ">",
      "value": {
        "column_name": "expected_delivery_date"
      }
    }
  ],
  "order_by": "asc"
}
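
On the client side, the arguments of a tool call arrive as a JSON string inside the response. A minimal sketch of pulling them out is shown here, assuming the Chat Completions response has already been converted to a plain dict; `extract_tool_calls` and the mock response (including its id) are hypothetical and trimmed to the fields used.

```python
import json

def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (function name, parsed arguments) for each tool call in the first choice."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # `arguments` is a JSON string, not a nested object, so it must be parsed.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls

# Hand-written mock of a response, not real API output.
mock_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_hypothetical_1",
                "type": "function",
                "function": {
                    "name": "query",
                    "arguments": '{"table_name": "orders", "columns": ["id", "status"], '
                                 '"conditions": [], "order_by": "asc"}',
                },
            }],
        }
    }]
}

name, args = extract_tool_calls(mock_response)[0]
print(name, args["table_name"], args["order_by"])
```

With strict: true the parsed arguments should always contain the four required keys, so the code can index them without defensive checks on the shape.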
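
  2. The response_format parameter: developers can supply a JSON Schema through the new json_schema option of response_format. This is useful when the model is not calling a tool but replying to the user in a structured way.

Request demo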

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}

Output demo

{
  "steps": [
    {
      "explanation": "Subtract 31 from both sides to isolate the term with x.",
      "output": "8x + 31 - 31 = 2 - 31"
    },
    {
      "explanation": "This simplifies to 8x = -29.",
      "output": "8x = -29"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -29 / 8"
    }
  ],
  "final_answer": "x = -29 / 8"
}
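
Both request demos set additionalProperties to false and list every property under required. OpenAI's documentation names these as requirements for strict mode, so it can help to check a schema locally before sending it. The helper below is a rough sketch that covers only these two rules; `strict_violations` is a made-up name, and the real service enforces further restrictions.

```python
def strict_violations(schema: dict, path: str = "$") -> list[str]:
    """Collect places where a schema breaks the two strict-mode rules checked here."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        for name, sub in props.items():
            problems += strict_violations(sub, f"{path}.{name}")
    if schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems += strict_violations(schema["items"], f"{path}[]")
    for i, sub in enumerate(schema.get("anyOf", [])):
        problems += strict_violations(sub, f"{path}.anyOf[{i}]")
    return problems

# The math_response schema from the request demo above.
math_schema = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "explanation": {"type": "string"},
                    "output": {"type": "string"},
                },
                "required": ["explanation", "output"],
                "additionalProperties": False,
            },
        },
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
    "additionalProperties": False,
}

# A deliberately loose schema to show what the checker reports.
loose_schema = {
    "type": "object",
    "properties": {"a": {"type": "string"}, "b": {"type": "string"}},
    "required": ["a"],
}

print(strict_violations(math_schema))
print(strict_violations(loose_schema))
```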

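Cost savings

The new model is also cheaper. By switching to gpt-4o-2024-08-06, developers save 50% on input tokens and 33% on output tokens compared with gpt-4o-2024-05-13. The saving comes from the model version rather than from Structured Outputs itself, but it is still a big draw for startups and individual developers.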
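
The percentages can be checked with a few lines of arithmetic. The prices below are the per-million-token list prices as I recall them from the launch announcement, comparing gpt-4o-2024-05-13 with gpt-4o-2024-08-06; treat them as assumptions and check the current pricing page before relying on them.

```python
# USD per 1M tokens (assumed launch list prices, not fetched from any API)
PRICES = {
    "gpt-4o-2024-05-13": {"input": 5.00, "output": 15.00},
    "gpt-4o-2024-08-06": {"input": 2.50, "output": 10.00},
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload on the given model."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def saving(kind: str) -> float:
    """Fractional saving for "input" or "output" tokens when moving to the newer model."""
    old = PRICES["gpt-4o-2024-05-13"][kind]
    new = PRICES["gpt-4o-2024-08-06"][kind]
    return 1 - new / old

print(f"input saving:  {saving('input'):.0%}")
print(f"output saving: {saving('output'):.0%}")
print(cost("gpt-4o-2024-05-13", 2_000_000, 500_000),
      cost("gpt-4o-2024-08-06", 2_000_000, 500_000))
```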

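Native SDK support

OpenAI's Python and Node SDKs have been updated with native support for Structured Outputs. Supplying a schema for a tool or a response format is as simple as passing a Pydantic or Zod object, and the SDK automatically deserializes the JSON response into a typed data structure.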
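
To show what "deserializing into a typed data structure" amounts to, here is a hand-rolled sketch using only the standard library. It is a stand-in for what the SDK does with Pydantic or Zod, not the SDK's own code; the class names `Step` and `MathResponse` and the sample `content` string are chosen for illustration and follow the math_response schema.

```python
import json
from dataclasses import dataclass

@dataclass
class Step:
    explanation: str
    output: str

@dataclass
class MathResponse:
    steps: list[Step]
    final_answer: str

def parse_math_response(content: str) -> MathResponse:
    """Turn the JSON text of a reply into typed objects."""
    data = json.loads(content)
    return MathResponse(
        # Because the schema is strict, each step has exactly these two keys.
        steps=[Step(**step) for step in data["steps"]],
        final_answer=data["final_answer"],
    )

# Sample reply text, shortened from the output demo.
content = (
    '{"steps": [{"explanation": "Subtract 31 from both sides.", '
    '"output": "8x = -29"}], "final_answer": "x = -29 / 8"}'
)
parsed = parse_math_response(content)
print(parsed.steps[0].output, "|", parsed.final_answer)
```

The benefit over working with raw dicts is that a typo such as parsed.final_anwser fails immediately instead of silently returning nothing.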

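Use cases

Structured Outputs apply to a wide range of scenarios, for example:

  • Dynamically generating user interfaces
  • Separating the final answer from supporting reasoning or additional commentary
  • Extracting structured data from unstructured data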

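Safety

Safety is always a top priority for OpenAI. Structured Outputs abide by OpenAI's existing safety policies, and the model can still refuse an unsafe request. A new refusal string value on the API response lets developers detect programmatically that the model produced a refusal instead of output matching the schema.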
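
A minimal sketch of that check follows, assuming the assistant message has been converted to a plain dict with the refusal and content fields; `read_message` and both mock messages are hypothetical.

```python
import json

def read_message(message: dict) -> dict:
    """Return parsed data, or the refusal text if the model declined."""
    refusal = message.get("refusal")
    if refusal:
        # A refusal does not follow the schema, so do not try to parse content.
        return {"ok": False, "refusal": refusal}
    return {"ok": True, "data": json.loads(message["content"])}

# Hand-written mocks, not real API output.
answered = {"role": "assistant", "content": '{"final_answer": "x = -29 / 8"}', "refusal": None}
refused = {"role": "assistant", "content": None, "refusal": "I'm sorry, I can't help with that."}

print(read_message(answered))
print(read_message(refused))
```

Checking refusal first matters because content can be empty when the model refuses, and parsing it would raise an error.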

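Conclusion

This new feature should make developers considerably more productive and lower their costs while keeping output reliable and safe. It is not only a technical step forward but also a sign that OpenAI understands what developers need. How Structured Outputs will reshape the way we interact with AI remains to be seen.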

