json-schema
Recently, I have seen several questions like “what’s the difference between JSON-LD and JSON Schema” or “can I use JSON Schema and Schema.org”. I come from a linked data background (which is close to the world of Schema.org) but have recently started using JSON Schema a lot, and I have to admit that there is no trivial answer to these questions. There is the obvious similarity in the standard names, which share words like “Schema” and “JSON”. If you compare the Schema.org page for Person to this example on the JSON Schema page, you have to admit that they kind of look alike. Add the fact that Schema.org touts JSON-LD, which by design looks very much like regular JSON, and the confusion is complete. So there definitely are enough reasons to write this article.
JSON Schema
JSON Schema is to JSON what XML Schema is to XML. It allows you to specify the structure of a JSON document. You can state that the field “email” must follow a certain regular expression or that an address has “street_name”, “number”, and “street_type” fields. Michael Droettboom’s book “Understanding JSON Schema” illustrates validation quite nicely with red & green examples.
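To make the validation idea concrete, here is a minimal sketch in Python. It is not a real JSON Schema implementation: it only understands the keywords type, required, properties, and pattern, and the schema itself (including the email regular expression and the choice of required fields) is an assumption made up for illustration. In practice you would hand this job to an existing validator library.

```python
import re

# Hypothetical schema for illustration; only the field names come from the text above.
ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "pattern": r"^[^@\s]+@[^@\s]+\.[^@\s]+$"},
        "street_name": {"type": "string"},
        "number": {"type": "string"},
        "street_type": {"type": "string"},
    },
    "required": ["street_name", "number", "street_type"],
}

# Only the two JSON types this sketch needs.
_TYPES = {"object": dict, "string": str}


def validate(instance, schema, path="$"):
    """Return a list of error messages (empty list means valid)."""
    expected = schema.get("type")
    if expected is not None and not isinstance(instance, _TYPES[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}: missing required field '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema, f"{path}.{name}"))
    if isinstance(instance, str) and "pattern" in schema:
        # JSON Schema patterns are unanchored, hence re.search rather than re.match.
        if re.search(schema["pattern"], instance) is None:
            errors.append(f"{path}: does not match pattern")
    return errors


good = {"street_name": "Main", "number": "12", "street_type": "Street",
        "email": "joe@example.org"}
bad = {"street_name": "Main", "email": "not-an-email"}

print(validate(good, ADDRESS_SCHEMA))  # no errors
for message in validate(bad, ADDRESS_SCHEMA):
    print(message)
```

The point is the division of labour: the rules live in a declarative document, and a generic component applies them to any message.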
The main use case for JSON Schema seems to be in JSON APIs where it plays two major roles:
- Clients and servers can validate request and response objects in a generic way. This makes development a lot easier, since the implementation can “outsource” these checks to a standard component. Once a message has passed validation, you can safely assume that the document adheres to the rules.
- As with any API, documentation is key when developers write code that uses it. JSON Schema is increasingly used to describe the structure of requests and responses by embedding it in an overall API description. Swagger is probably the most prominent example of this paradigm. Consider the pet-store example. Scroll all the way down on the page and you see this JSON Schema definition of “Pet”, which is a basic element in requests and responses of this API. You can find the actual JSON Schema embedded in the raw Swagger file; note that there are currently still some differences between the OpenAPI specification and JSON Schema, which will be resolved with OpenAPI 3.1.

As with all things related to code, reuse is a good idea. JSON Schema can import schemas using the $ref keyword. There are also efforts to share schemas. JSON Schema Store is one example. Its main use case is to support syntax highlighting for editors, for instance when editing a Swagger file. At the time of writing, it contains over 250 schemas including (drum roll, please; you certainly guessed it) Schema.org. These describe things like Action and Place. So the idea could be to centrally define JSON Schema building blocks that can be reused in different APIs, making it easier to consume them, maybe even to the point where intelligent software can interact with APIs automatically. But before we get carried away, let’s have a look at Schema.org.
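As a rough illustration of reuse via $ref, the following Python sketch inlines local references of the form "#/definitions/...". The schema, its property names, and the helper functions are hypothetical; remote references, array indices in pointers, and recursive schemas are deliberately not handled.

```python
def resolve_pointer(document, pointer):
    """Follow a local JSON Pointer such as '#/definitions/address'."""
    if not pointer.startswith("#"):
        raise ValueError("this sketch handles only local references")
    node = document
    for token in pointer[1:].split("/"):
        if token == "":
            continue
        # Undo the JSON Pointer escapes (~1 stands for '/', ~0 for '~').
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


def inline_refs(node, root):
    """Return a copy of node with every local $ref replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return inline_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: inline_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, root) for item in node]
    return node


# Hypothetical schema: one shared building block, used twice.
SCHEMA = {
    "definitions": {
        "address": {
            "type": "object",
            "properties": {"street_name": {"type": "string"}},
        }
    },
    "type": "object",
    "properties": {
        "billing_address": {"$ref": "#/definitions/address"},
        "shipping_address": {"$ref": "#/definitions/address"},
    },
}

expanded = inline_refs(SCHEMA, SCHEMA)
print(expanded["properties"]["billing_address"])
```

A shared schema store extends the same idea across files and organisations: the reference points to a URL instead of a location in the same document.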
Schema.org
Schema.org provides “schemas for structured data on the Internet”. Let’s assume you book a hotel and get a confirmation email. The email contains schema.org markup providing the contents of the email in machine-readable form. This allows your calendar to “understand” the email and add an entry automatically. We put “understand” in quotes because there really is no magic here. This very useful feature is made possible by the fact that the hotel’s IT system and the calendar agree on what a hotel is and also agree to represent this concept using markup like this:
{
"@context": "http://schema.org",
"@type": "Hotel",
"name": "ACME Hotel Innsbruck",
"checkinTime": "13:00:00-05:00"
…
}
In fact, it is almost like they agree on an API which happens to use email as the transport mechanism (please note that schema.org is not limited to email). It is important to note that Schema.org not only defines concepts or classes. The fields or properties are also standardized. Take checkinTime for example, which is an XML Schema (time and timezone) string defining “the earliest someone may check into a lodging establishment”.
Schema.org not only defines agreed-upon class and property definitions, it also defines a hierarchy of classes and properties (a Hotel is a LodgingBusiness), which properties are allowed for which class (checkinTime can be used for LodgingBusiness and LodgingReservation) and the type of the properties (checkinTime is a DateTime or Time and starRating is a Rating).
Both Schema.org schemas and JSON schemas describe document structures using classes and properties / types and fields. The difference is that Schema.org is based on an Ontology which is published in different formats on GitHub. An ontology:
- defines classes and properties with agreed-upon IRIs like https://schema.org/Hotel
- describes data where nodes (being instances of a class) link to other nodes (via properties), forming a linked data graph
- establishes a class and property taxonomy
- treats properties as first-class citizens which can originate from different classes (called the domain) and can have different types as well (called the range)
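The points above can be sketched with a tiny hand-written ontology fragment in Python. The facts are the ones quoted in this article (Hotel is a LodgingBusiness, checkinTime applies to LodgingBusiness and LodgingReservation, and so on); the dictionary layout and function names are assumptions for this sketch, not how Schema.org publishes its data.

```python
# Class taxonomy: child class -> parent class.
SUBCLASS_OF = {"Hotel": "LodgingBusiness"}

# Properties are defined independently of classes and list their domain and range.
DOMAIN = {"checkinTime": {"LodgingBusiness", "LodgingReservation"}}
RANGE = {"checkinTime": {"DateTime", "Time"}, "starRating": {"Rating"}}


def ancestors(cls):
    """Yield cls itself and then each of its superclasses, walking upwards."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def property_allowed(prop, cls):
    """A property applies to a class if the class or an ancestor is in its domain."""
    return any(c in DOMAIN.get(prop, set()) for c in ancestors(cls))


print(list(ancestors("Hotel")))
print(property_allowed("checkinTime", "Hotel"))
print(property_allowed("checkinTime", "Rating"))
print(sorted(RANGE["checkinTime"]))
```

Note how checkinTime is not declared inside Hotel. The hotel inherits it through the taxonomy, which is the main structural difference from a JSON Schema, where properties are listed inside each object definition.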
Now let’s get a bit more concrete and look at JSON-LD as one of the possible representations of Schema.org data.
JSON-LD
The JSON-LD motto is “Data is messy and disconnected. JSON-LD organizes and connects it, creating a better Web.” Let’s take the hotel description as an example. We’re starting with a “normal” JSON representation:
{
"name": "ACME Hotel Innsbruck",
"checkinTime": "13:00:00-05:00",
"starRating": {
"bestRating": 10,
...
}
...
}
This JSON document has the following tree structure:

We are already using the Schema.org vocabulary. However, other vocabularies may be in play, and a field like “name” is especially likely to be ambiguous. Therefore, we specify our vocabulary as follows:
"@context": "http://schema.org/"
This causes “name” to become http://schema.org/name. The actual context URL (via a content type redirect) is a simple list that defines a mapping from simple names to Schema.org URLs. Other examples define datatypes such as string, date, or link (@id).
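The effect of the context can be approximated in a few lines of Python. This is a deliberately simplified sketch: it treats the context as a single vocabulary prefix and ignores everything else the real JSON-LD expansion algorithm does (per-term mappings, datatypes, expansion of @type values, wrapping values in arrays). The function name is made up for this example.

```python
VOCAB = "http://schema.org/"


def expand_terms(node, vocab=VOCAB):
    """Prefix plain keys with the vocabulary IRI; leave @-keywords untouched."""
    if isinstance(node, dict):
        return {
            (key if key.startswith("@") else vocab + key): expand_terms(value, vocab)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [expand_terms(item, vocab) for item in node]
    return node


hotel = {
    "name": "ACME Hotel Innsbruck",
    "checkinTime": "13:00:00-05:00",
    "starRating": {"bestRating": 10},
}

for key, value in expand_terms(hotel).items():
    print(key, "=", value)
```

After expansion, “name” can no longer be confused with a “name” field from some other vocabulary, because the full IRI is globally unique.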
Our tree is already a graph, however, we are lacking the information of which hotel we mean. In other words, we do not know the ID of the top graph node. We can specify this using:
"@id": "urn:acme-hotel"
We chose a URN, but any IRI is possible. You could also use the hotel’s URL. The only prerequisite is that other participants can understand and interpret the ID.
Finally, we can add the information that this document describes a hotel:
"@type": "Hotel"
The resulting JSON-LD document is:
{
"@context": "http://schema.org/",
"@type": "Hotel",
"@id": "urn:acme-hotel",
"name": "ACME Hotel Innsbruck",
"checkinTime": "13:00:00-05:00",
"starRating": {
"bestRating": 10,
...
}
...
}
which represents this structure:

If you paste the example into the JSON-LD playground, you get this exact graph (we are choosing the table representation, where each link above becomes one table row stating a subject, a predicate, and an object).
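The table view can be imitated with a short Python sketch that flattens a nested JSON-LD-like object into subject, predicate, object rows. It is an approximation under stated assumptions: terms are not expanded to full IRIs, arrays are not handled, the "rdf:type" label and the blank node numbering are chosen for this example, and the elided parts of the hotel document are simply left out.

```python
import itertools


def to_triples(node, counter=None, triples=None):
    """Flatten a nested object into (subject, predicate, object) rows.

    Returns the subject of the node and the accumulated list of rows.
    """
    counter = counter if counter is not None else itertools.count()
    triples = triples if triples is not None else []
    # Nodes without an @id get a blank node label such as _:b0.
    subject = node.get("@id") or f"_:b{next(counter)}"
    for key, value in node.items():
        if key in ("@id", "@context"):
            continue
        predicate = "rdf:type" if key == "@type" else key
        if isinstance(value, dict):
            # A nested object becomes its own node plus a link to it.
            child_subject, _ = to_triples(value, counter, triples)
            triples.append((subject, predicate, child_subject))
        else:
            triples.append((subject, predicate, value))
    return subject, triples


hotel = {
    "@context": "http://schema.org/",
    "@type": "Hotel",
    "@id": "urn:acme-hotel",
    "name": "ACME Hotel Innsbruck",
    "checkinTime": "13:00:00-05:00",
    "starRating": {"bestRating": 10},
}

_, rows = to_triples(hotel)
for row in rows:
    print(row)
```

Each row corresponds to one edge in the graph; the nested rating object shows up as a separate node that the hotel links to.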

Note that the rating node has the ID _:b0, which is an anonymous ID. This means that the rating cannot be referenced by other documents and only exists as a child of its parent object. The hotel, though, can be referenced by other documents. For instance, a person (http://example.org/joe) can be affiliated with the hotel:
{
"@context": "http://schema.org/",
"@id": "http://example.org/joe",
"affiliation": {
"@id": "urn:acme-hotel"
}
}
Both documents can be combined, resulting in a graph with one additional link from Joe to the hotel. Adding further properties next to the @id inside affiliation would add further properties to the hotel.
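Combining documents is simple once both are expressed as rows of subject, predicate, and object. The Python sketch below uses hand-written rows for the two documents (the short predicate names and the helper function are assumptions for this example). It glosses over one detail: anonymous nodes such as _:b0 would have to be relabelled before merging so that blank nodes from different documents do not collide.

```python
hotel_triples = {
    ("urn:acme-hotel", "rdf:type", "Hotel"),
    ("urn:acme-hotel", "name", "ACME Hotel Innsbruck"),
}
joe_triples = {
    ("http://example.org/joe", "affiliation", "urn:acme-hotel"),
}

# Because node IDs are global, merging two graphs is a plain set union.
merged = hotel_triples | joe_triples


def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Follow the link from Joe to the hotel, then read the hotel's name.
hotels = objects(merged, "http://example.org/joe", "affiliation")
hotel_names = {name for h in hotels for name in objects(merged, h, "name")}
print(hotel_names)
```

Neither document alone can answer “what is the name of the hotel Joe is affiliated with”; the merged graph can.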
OK, Now What?
We looked at JSON Schema, Schema.org, and JSON-LD, and now the question is, can we combine the approaches? Let’s look at three possibilities:
Validating JSON-LD Using JSON Schema
The first idea that comes to mind would be to use JSON Schema to validate JSON-LD. As a matter of fact, JSON-LD publishes a JSON schema. However, the schema only looks at JSON-LD keywords and overall document structure and does not validate the actual ontology. It makes more sense to validate JSON-LD by converting it into an RDF graph and validating it via the underlying ontology using RDF / OWL tooling.
Generating JSON Schema from Schema.org
The next approach could be to generate JSON schemas from Schema.org. Remember Schema.org being present on the schema store website? It turns out that one of the main problems is the handling of arrays. In a “normal” JSON schema, you specify whether a property is an array or has a single value. In the linked data world, any property can be repeated. So it is perfectly legal for two sources to state check-in times for the hotel. Therefore, the graph data model will always return a list of values when you ask for a property of a given graph node.
One can work around this by defining all properties as “oneOf” single value or array of values, but this leads to complex and convoluted schemas. Some projects allow you to define the cardinality a priori as a parameter of the generation process. The schema on schemastore, for example, makes the following choices: smokingAllowed has a single value whereas amenityFeature is an array. This certainly makes sense. The telephone number also has a single value, although one can certainly make the case that it should be an array.
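The cardinality choice can be pictured as a parameter of a small schema generator. The Python sketch below is hypothetical (the function names, the "one"/"many"/"any" labels, and the value types are assumptions, and it does not reproduce what any particular generator or the schemastore schema actually emits); it only shows the three options discussed above.

```python
def property_schema(value_schema, cardinality="any"):
    """Build the JSON Schema for one property.

    cardinality: "one" (single value), "many" (array), or "any" (either form).
    """
    array_schema = {"type": "array", "items": value_schema}
    if cardinality == "one":
        return value_schema
    if cardinality == "many":
        return array_schema
    if cardinality == "any":
        return {"oneOf": [value_schema, array_schema]}
    raise ValueError(f"unknown cardinality: {cardinality}")


def as_list(value):
    """Normalise a single value or a list to a list, as a graph query would return it."""
    return value if isinstance(value, list) else [value]


hotel_schema = {
    "type": "object",
    "properties": {
        "smokingAllowed": property_schema({"type": "boolean"}, "one"),
        "amenityFeature": property_schema({"type": "object"}, "many"),
        "telephone": property_schema({"type": "string"}, "any"),
    },
}

print(hotel_schema["properties"]["telephone"])
print(as_list("+43 512 000000"))
```

The "any" variant is the faithful one from a linked data point of view, but every consumer then has to normalise values with something like as_list, which is exactly the convolution mentioned above.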
Linting JSON Schema Using Schema.org
The JSON Schema website lists a number of linter tools that check the schema itself. One approach could be to encourage the use of Schema.org vocabulary, so a linting suggestion could be to change amenity_feature to amenityFeature since it can easily be mapped to https://schema.org/amenityFeature. The benefit for developers would be that they can reuse the Schema.org documentation. Also, maybe one day their REST services can be understood and consumed by intelligent clients.
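Such a lint rule could look roughly like the following Python sketch. Everything here is hypothetical: the set of known terms is a tiny hand-picked excerpt rather than the real Schema.org vocabulary, only top-level properties are inspected, and no existing linter is claimed to work this way.

```python
import re

# Hypothetical excerpt; a real linter would load the full Schema.org vocabulary.
KNOWN_TERMS = {"amenityFeature", "checkinTime", "starRating", "name"}


def to_camel_case(name):
    """Convert snake_case or kebab-case to lowerCamelCase."""
    head, *rest = re.split(r"[_\-]+", name)
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def lint_property_names(schema):
    """Yield (name, suggestion) for properties that almost match a known term."""
    for name in schema.get("properties", {}):
        candidate = to_camel_case(name)
        if name not in KNOWN_TERMS and candidate in KNOWN_TERMS:
            yield name, candidate


schema = {
    "type": "object",
    "properties": {
        "amenity_feature": {"type": "array"},
        "name": {"type": "string"},
        "room_count": {"type": "integer"},
    },
}

for name, suggestion in lint_property_names(schema):
    print(f"consider renaming '{name}' to '{suggestion}' "
          f"(https://schema.org/{suggestion})")
```

Properties that have no counterpart in the vocabulary, like room_count here, are left alone; the linter only nudges towards existing terms.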
Summary
It is clear that Schema.org and JSON-LD on the one hand and JSON Schema on the other come from different angles and do not really fit together naturally. Nevertheless, I believe that each community can benefit from the other. JSON Schema can learn about ontologies, reuse, and semantics. The linked data community can learn from the pragmatism of JSON Schema. In fact, I think JSON-LD already learned this lesson and is a great improvement over N-Triples or RDF/XML. Likewise, approaches such as JSON API introduce linked data principles to the world of REST APIs.
Translated from: https://levelup.gitconnected.com/json-schema-schema-org-json-ld-whats-the-difference-e30d7315686a