JSON Standard ECMA-404

ECMA-404

This translation is for reference only; the original text prevails.

Introduction

JSON is a text format that facilitates structured data interchange between all programming languages. JSON is syntax of braces, brackets, colons, and commas that is useful in many contexts, profiles, and applications. JSON was inspired by the object literals of JavaScript aka ECMAScript as defined in the ECMAScript Language Specification, Third Edition [1]. It does not attempt to impose ECMAScript’s internal data representations on other programming languages. Instead, it shares a small subset of ECMAScript’s textual representations with all other programming languages.

JSON is agnostic about numbers. In any programming language, there can be a variety of number types of various capacities and complements, fixed or floating, binary or decimal. That can make interchange between different programming languages difficult. JSON instead offers only the representation of numbers that humans use: a sequence of digits. All programming languages know how to make sense of digit sequences even if they disagree on internal representations. That is enough to allow interchange.

JSON text is a sequence of Unicode code points. JSON also depends on Unicode in the hex numbers used in the \u escapement notation.

Programming languages vary widely on whether they support objects, and if so, what characteristics and constraints the objects offer. The models of object systems can be wildly divergent and are continuing to evolve. JSON instead provides a simple notation for expressing collections of name/value pairs. Most programming languages will have some feature for representing such collections, which can go by names like record, struct, dict, map, hash, or object.

JSON also provides support for ordered lists of values. All programming languages will have some feature for representing such lists, which can go by names like array, vector, or list. Because objects and arrays can nest, trees and other complex data structures can be represented. By accepting JSON’s simple convention, complex data structures can be easily interchanged between incompatible programming languages.

JSON does not support cyclic graphs, at least not directly. JSON is not indicated for applications requiring binary data.

It is expected that other standards will refer to this one, strictly adhering to the JSON text format, while imposing restrictions on various encoding details. Such standards may require specific behaviours. JSON itself specifies no behaviour.

Because it is so simple, it is not expected that the JSON grammar will ever change. This gives JSON, as a foundational notation, tremendous stability. JSON was first presented to the world at the JSON.org website in 2001. JSON stands for JavaScript Object Notation.

The JSON Data Interchange Format

1. Scope

JSON is a lightweight, text-based, language-independent data interchange format. It was derived from the ECMAScript programming language, but is programming language independent. JSON defines a small set of structuring rules for the portable representation of structured data.

2. Conformance

Conforming JSON text is a sequence of Unicode code points that strictly conforms to the JSON grammar.

3. Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 10646:2012, Information Technology – Universal Coded Character Set (UCS)

The Unicode Consortium. The Unicode Standard, Version 6.2.0, (Mountain View, CA: The Unicode Consortium, 2012. ISBN 978-1-936213-07-8)

http://www.unicode.org/versions/Unicode6.2.0/.

4. JSON Text

A JSON text is a sequence of tokens formed from Unicode code points that conforms to the JSON value grammar. The set of tokens includes six structural tokens, strings, numbers, and three literal name tokens.

The six structural tokens:

  • [ U+005B left square bracket
  • { U+007B left curly bracket
  • ] U+005D right square bracket
  • } U+007D right curly bracket
  • : U+003A colon
  • , U+002C comma

These are the three literal name tokens:

  • true U+0074 U+0072 U+0075 U+0065
  • false U+0066 U+0061 U+006C U+0073 U+0065
  • null U+006E U+0075 U+006C U+006C

Insignificant whitespace is allowed before or after any token. The whitespace characters are: character tabulation (U+0009), line feed (U+000A), carriage return (U+000D), and space (U+0020). Whitespace is not allowed within any token, except that space is allowed in strings.
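The token set and whitespace rule described above can be sketched as a small tokenizer. This is only an illustration, not part of the standard: the names `TOKEN_RE`, `WHITESPACE`, and `tokenize` are hypothetical, and the string and number patterns are simplified stand-ins for the rules in sections 8 and 9.

```python
import re

# The four whitespace characters the text allows around tokens.
WHITESPACE = "\t\n\r "

# One alternative per token class (names are illustrative assumptions).
TOKEN_RE = re.compile(
    r'''
    (?P<structural>[\[\]{}:,])
  | (?P<literal>true|false|null)
  | (?P<number>-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)
  | (?P<string>"(?:[^"\\\x00-\x1f]|\\.)*")
    ''',
    re.VERBOSE,
)


def tokenize(text):
    """Split a JSON text into (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in WHITESPACE:
            pos += 1  # insignificant whitespace between tokens
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('{"a b": [1, true]}'):
        print(token)
```

Note that this sketch only splits the text into tokens; it does not check that the tokens form a valid value, so it is not a conformance checker. Any character outside the four listed whitespace characters, such as a vertical tab (U+000B), is rejected between tokens.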

5. JSON Values

A JSON value can be an object, array, number, string, true, false, or null.
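As a rough illustration of the seven kinds of value, the sketch below uses Python's standard `json` module to parse one text of each kind and report the Python type it maps to. The helper name `value_kind` and the sample texts are assumptions made for this example; the mapping shown is Python's choice, not something the standard prescribes.

```python
import json


def value_kind(text):
    """Return the Python type that json.loads produces for a JSON text."""
    return type(json.loads(text))


# One sample text per kind of JSON value (samples are illustrative).
samples = ['{"a": 1}', "[1, 2]", "12.5", '"text"', "true", "false", "null"]

if __name__ == "__main__":
    for text in samples:
        print(text, "->", value_kind(text).__name__)
```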

6. Objects

An object structure is represented as a pair of curly bracket tokens surrounding zero or more name/value pairs. A name is a string. A single colon token follows each name, separating the name from the value. A single comma token separates a value from a following name.
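The object syntax above can be tried out with Python's standard `json` module. In this sketch the helper `name_value_pairs` is a hypothetical name; it uses the module's `object_pairs_hook` parameter to return the name/value pairs as a list in source order instead of a dictionary.

```python
import json


def name_value_pairs(text):
    """Parse a JSON object and return its name/value pairs in source order."""
    return json.loads(text, object_pairs_hook=list)


if __name__ == "__main__":
    print(name_value_pairs("{}"))
    print(name_value_pairs('{"name": "JSON", "year": 2001}'))
    # A comma must be followed by another name, so a trailing comma is rejected.
    try:
        name_value_pairs('{"name": "JSON",}')
    except json.JSONDecodeError as error:
        print("rejected:", error.msg)
```

Names must be strings, so a text such as `{name: 1}` is rejected as well.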

7. Arrays

An array structure is a pair of square bracket tokens surrounding zero or more values. The values are separated by commas. The order of the values is significant.
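A short sketch, again using Python's standard `json` module, shows that order matters and that arrays and objects can nest. The helper `depth` is a hypothetical function written for this example only.

```python
import json


def depth(value):
    """Nesting depth of a parsed JSON value (0 for numbers, strings, literals)."""
    if isinstance(value, list):
        return 1 + max((depth(item) for item in value), default=0)
    if isinstance(value, dict):
        return 1 + max((depth(item) for item in value.values()), default=0)
    return 0


if __name__ == "__main__":
    first = json.loads("[1, 2, 3]")
    second = json.loads("[3, 2, 1]")
    # Same values in a different order are different arrays.
    print(first == second)

    nested = json.loads('[[], [1, [2, 3]], {"k": [true, null]}]')
    print(depth(nested))
```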

8. Numbers

A number is represented in base 10 with no superfluous leading zero. It may have a preceding minus sign (U+002D). It may have a . (U+002E) prefixed fractional part. It may have an exponent of ten, prefixed by e (U+0065) or E (U+0045) and optionally + (U+002B) or - (U+002D). The digits are the code points U+0030 through U+0039.

Numeric values that cannot be represented as sequences of digits (such as Infinity and NaN) are not permitted.
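The number rules above translate fairly directly into a regular expression. The sketch below is one possible reading, with hypothetical names (`NUMBER_RE`, `is_json_number`, `strict_loads`). It also shows an assumption-laden workaround for Python's standard `json` module, which by default accepts `NaN` and `Infinity`: the `parse_constant` hook is used to reject them.

```python
import json
import re
from decimal import Decimal

# [0-9] is used instead of \d so that only U+0030..U+0039 count as digits.
NUMBER_RE = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text):
    """True if the whole text matches the number syntax described above."""
    return NUMBER_RE.fullmatch(text) is not None


def reject_constant(name):
    """Hook called by json.loads for NaN, Infinity and -Infinity."""
    raise ValueError(f"{name} is not a JSON number")


def strict_loads(text):
    """Like json.loads, but refuses NaN and Infinity."""
    return json.loads(text, parse_constant=reject_constant)


if __name__ == "__main__":
    for candidate in ["0", "-0", "1.5e+10", "01", "+1", ".5", "1.", "NaN"]:
        print(candidate, is_json_number(candidate))
    # The digit sequence is all JSON defines; the reader picks the number type.
    print(json.loads("0.1", parse_float=Decimal))
```

Because JSON only defines the digit sequence, the receiving program decides how to represent it; the last line picks `Decimal` instead of a binary float purely as an example.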

9. String

A string is a sequence of Unicode code points wrapped with quotation marks (U+0022). All characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F. There are two-character escape sequence representations of some characters.

  • \" U+0022 quotation mark
  • \\ U+005C reverse solidus
  • \/ U+002F solidus
  • \b U+0008 backspace
  • \f U+000C form feed
  • \n U+000A line feed
  • \r U+000D carriage return
  • \t U+0009 character tabulation

So, for example, a string containing only a single reverse solidus character may be represented as "\\".

Any code point may be represented as a hexadecimal number. The meaning of such a number is determined by ISO/IEC 10646. If the code point is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the code point. Hexadecimal digits can be digits (U+0030 through U+0039) or the hexadecimal letters A through F in uppercase (U+0041 through U+0046) or lowercase (U+0061 through U+0066). So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

The following four cases all produce the same result:

"\u002F"

"\u002f"

"\/"

"/"

To escape a code point that is not in the Basic Multilingual Plane, the character is represented as a twelve-character sequence, encoding the UTF-16 surrogate pair. So for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".
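The escape rules can be checked with Python's standard `json` module. This is a minimal sketch; the variable names are illustrative, and raw string literals (`r'...'`) are used so that the backslashes reach the JSON parser unchanged.

```python
import json

# Four spellings of a string containing only a solidus.
solidus_forms = [r'"\u002F"', r'"\u002f"', r'"\/"', '"/"']
decoded = [json.loads(form) for form in solidus_forms]

# Two spellings of a string containing only a reverse solidus.
backslash_forms = [r'"\\"', r'"\u005C"']

# G clef (U+1D11E), written as a UTF-16 surrogate pair.
clef = json.loads(r'"\uD834\uDD1E"')

if __name__ == "__main__":
    print(all(value == "/" for value in decoded))
    print(all(json.loads(form) == "\\" for form in backslash_forms))
    print(clef == "\U0001D11E", len(clef))
    # An unescaped control character inside a string is rejected.
    try:
        json.loads('"a\tb"')
    except json.JSONDecodeError as error:
        print("rejected:", error.msg)
```

The parser combines the two `\u` escapes of the surrogate pair into a single code point, which is why `clef` has length 1 in Python.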
