什么是 OOXML?
Office Open XML,也称为 OpenXML 或 OOXML,是一种基于 XML 的办公文档格式,包括文字处理文档、电子表格、演示文稿以及图表、图表、形状和其他图形材料。该规范由 Microsoft 开发,2006 年被 ECMA 国际采用为 ECMA-376。2008 年 12 月发布了第二个版本,2011 年 6 月发布了该标准的第三个版本。该规范已被 ISO 和 IEC 采纳作为 ISO/IEC 29500。
请务必记住,OOXML 与 OpenOffice XML 或 OpenOffice.org 和其他开源办公软件基础的开放文档格式 (ODF) 不同。 Office Open XML 和 Open Office XML 或 ODF 在某种意义上是办公文档的竞争 XML 标准。
尽管 Microsoft 继续支持旧的二进制格式(.doc、xls 和 .ppt),但 OOXML 现在是所有 Microsoft Office 文档(.docx、.xlsx 和 .pptx)的默认格式。
OOXML 指定了什么?
标记规范
ECMA-376 包括针对三种主要办公文档类型中的每一种的三种不同规范——用于文字处理文档的 WordprocessingML、用于电子表格文档的 SpreadsheetML 和用于演示文档的 PresentationML。它还包括一些支持标记语言,最重要的是用于绘图、形状和图表的 DrawingML。
该规范包括书面形式的 XML 模式和约束。任何符合标准的文档都必须符合 XML 模式,并采用 UTF-8 或 UTF-16 编码。该规范确实包括一些可扩展性机制,允许自定义 XML 与 OOXML 标记一起存储。
文件打包规范
除了标记语言规范之外,ECMA-376 的第 2 部分还指定了开放包装约定 (OPC)。 OPC 是一种容器文件技术,它利用通用 ZIP 格式将文件组合成一个通用包。所以 OOXML 文件是包含各种 XML 文件(部分)并组织成单个包的 ZIP 档案。这种将数据分解或分块的方式可以更轻松、更快速地访问数据,并减少数据损坏的机会。部件可以包含任何类型的数据;为了在不依赖文件扩展名的情况下跟踪每个部分的数据类型,每个部分的类型在名为 [Content_Types].xml 的包内的文件中指定。部件与包的关系以及任何部件可能具有的关系从部件中抽象出来并单独存储在关系文件中——一个用于整个包,一个用于具有关系的每个包。通过这种方式,引用只存储一次,因此可以在必要时轻松更改。
What is OOXML?
Office Open XML, also known as OpenXML or OOXML, is an XML-based format for office documents, including word processing documents, spreadsheets, presentations, as well as charts, diagrams, shapes, and other graphical material. The specification was developed by Microsoft and adopted by ECMA International as ECMA-376 in 2006. A second version was released in December, 2008, and a third version of the standard released in June, 2011. The specification has been adopted by ISO and IEC as ISO/IEC 29500.
It is important to keep in mind that OOXML is not the same as Open Office XML or the Open Document Format (ODF) that underlies the OpenOffice.org and other open source office software. Office Open XML and Open Office XML or ODF are in some sense competing XML standards for office documents.
Although the older binary formats (.doc, xls, and .ppt) continue to be supported by Microsoft, OOXML is now the default format of all Microsoft Office documents (.docx, .xlsx, and .pptx).
What does OOXML specify?
Markup specifications
ECMA-376 includes three different specifications for each of the three main office document types–WordprocessingML for word processing documents, SpreadsheetML for spreadsheet documents, and PresentationML for presentation documents. It also includes some supporting markup languages, most importantly DrawingML for drawings, shapes and charts.
The specification includes both XML schemas and constraints in written form. Any conforming document must conform to XML schemas, and be in UTF-8 or UTF-16 encoding. The specification does include some extensibility mechanisms allowing custom XML to be stored with the OOXML markup.
A file packaging specification
In addition to the markup language specifications, Part 2 of ECMA-376 specifies the Open Packaging Conventions (OPC). OPC is a container-file technology leveraging the common ZIP format for combining files into a common package. So OOXML files are ZIP archives containing various XML files (parts) and organized into single package. This breaking up or chunking of the data into pieces makes it easier and quicker to access data and reduces the chances of data corruption. The parts can contain any type of data; to keep track of the data type of each part without relying on file extensions, the type for each part is specified in a file within the package called [Content_Types].xml. The relationships of the parts to the package as well as relationships that any part may have are abstracted from the parts and stored separately in relationship files–one for the package as a whole and one for each package that has relationships. In this way references are stored only once and can therefor be easily changed when necessary.