XPath 1.0 Specification (Translation)


zhy2002


Article link: https://blog.csdn.net/zhy2002/article/details/1968610


XML Path Language (XPath)

Version 1.0

W3C Recommendation 16 November 1999

This version:

http://www.w3.org/TR/1999/REC-xpath-19991116
(available in XML or HTML)

Latest version:

http://www.w3.org/TR/xpath

Previous versions:

http://www.w3.org/TR/1999/PR-xpath-19991008
http://www.w3.org/1999/08/WD-xpath-19990813
http://www.w3.org/1999/07/WD-xpath-19990709
http://www.w3.org/TR/1999/WD-xslt-19990421

Editors:

James Clark <jjc@jclark.com>
Steve DeRose (Inso Corp. and Brown University) <Steven_DeRose@Brown.edu>

Abstract

XPath is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer.

In short: XPath is an addressing language for parts of XML documents, built to serve both XSLT and XPointer.

Status of this document

[Translation of this section omitted.]

This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from other documents. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

The list of known errors in this specification is available at http://www.w3.org/1999/11/REC-xpath-19991116-errata.

Comments on this specification may be sent to www-xpath-comments@w3.org; archives of the comments are available.

The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/Style/XSL/translations.html.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

This specification is joint work of the XSL Working Group and the XML Linking Working Group and so is part of the W3C Style activity and of the W3C XML activity.

Table of contents

1 Introduction
2 Location Paths
    2.1 Location Steps
    2.2 Axes
    2.3 Node Tests
    2.4 Predicates
    2.5 Abbreviated Syntax
3 Expressions
    3.1 Basics
    3.2 Function Calls
    3.3 Node-sets
    3.4 Booleans
    3.5 Numbers
    3.6 Strings
    3.7 Lexical Structure
4 Core Function Library
    4.1 Node Set Functions
    4.2 String Functions
    4.3 Boolean Functions
    4.4 Number Functions
5 Data Model
    5.1 Root Node
    5.2 Element Nodes
        5.2.1 Unique IDs
    5.3 Attribute Nodes
    5.4 Namespace Nodes
    5.5 Processing Instruction Nodes
    5.6 Comment Nodes
    5.7 Text Nodes
6 Conformance

Appendices

A References
A.1 Normative References
A.2 Other References
B XML Information Set Mapping (Non-Normative)

1 Introduction


XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations [XSLT] and XPointer [XPointer]. The primary purpose of XPath is to address parts of an XML [XML] document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.

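XPath was designed to give XSLT and XPointer a shared syntax and semantics for the functionality they have in common. Its main job is to address parts of an XML document; to support that, it also offers basic facilities for handling strings, numbers and booleans. The syntax is compact and deliberately not XML, so that expressions fit easily inside URIs and XML attribute values. XPath works on the logical structure of a document, not on its literal syntax. The name comes from its path notation, similar to the paths in URLs, which is used to navigate the hierarchy of an XML document.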

In addition to its use for addressing, XPath is also designed so that it has a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT.

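Besides addressing, XPath is designed to contain a natural subset that can be used for matching, that is, for testing whether a node fits a pattern. XSLT describes this use.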

XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node. Some types of nodes also have names. XPath fully supports XML Namespaces [XML Names]. Thus, the name of a node is modeled as a pair consisting of a local part and a possibly null namespace URI; this is called an expanded-name. The data model is described in detail in [5 Data Model].

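XPath treats an XML document as a tree of nodes. There are several node types, among them element, attribute and text nodes, and XPath defines how to compute a string-value for every node type. Some node types also have names. Because XPath fully supports XML namespaces, a node name is modeled as a pair: a local part plus a namespace URI that may be null. This pair is called an expanded-name. Section 5 describes the data model in detail.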

The primary syntactic construct in XPath is the expression. An expression matches the production Expr. An expression is evaluated to yield an object, which has one of the following four basic types:

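The primary syntactic construct in XPath is the expression, defined by the Expr production. Evaluating an expression yields an object of one of four basic types: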

node-set (an unordered collection of nodes without duplicates)
boolean (true or false)
number (a floating-point number)
string (a sequence of UCS characters, UCS being the Universal Character Set)

Expression evaluation occurs with respect to a context. XSLT and XPointer specify how the context is determined for XPath expressions used in XSLT and XPointer respectively. The context consists of:

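Expressions are evaluated relative to a context. XSLT and XPointer each specify how that context is determined for the expressions they use. The context consists of: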

a node (the context node)
a pair of non-zero positive integers (the context position and the context size)
a set of variable bindings
a function library
the set of namespace declarations in scope for the expression

The context position is always less than or equal to the context size.

That is, the context position never exceeds the context size.
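
The five parts of the context listed above can be pictured as a small record. The following is only a sketch: the class name XPathContext and the helper for_predicate are hypothetical and are not defined by the specification. It shows the invariant that the position never exceeds the size, and that a derived context shares variable bindings, function library and namespace declarations with its parent.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class XPathContext:
    """Hypothetical container for the parts of an XPath 1.0 evaluation context."""
    node: Any                                  # the context node
    position: int = 1                          # context position (1-based)
    size: int = 1                              # context size
    variables: Dict[str, Any] = field(default_factory=dict)
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    namespaces: Dict[str, str] = field(default_factory=dict)   # prefix -> URI

    def __post_init__(self):
        # Both integers are positive and position <= size.
        if not (1 <= self.position <= self.size):
            raise ValueError("expected 1 <= position <= size")

    def for_predicate(self, node, position, size):
        """Derive a context in which only node, position and size change."""
        return XPathContext(node, position, size,
                            self.variables, self.functions, self.namespaces)


ctx = XPathContext(node="doc", variables={"x": 1.0})
sub = ctx.for_predicate("para", 2, 3)
```

Strings stand in for real nodes here; any node representation could be stored in the node field.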

The variable bindings consist of a mapping from variable names to variable values. The value of a variable is an object, which can be of any of the types that are possible for the value of an expression, and may also be of additional types not specified here.

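Variable bindings map variable names to variable values. A value is an object of any type an expression can produce, or of an additional type not specified here.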

The function library consists of a mapping from function names to functions. Each function takes zero or more arguments and returns a single result. This document defines a core function library that all XPath implementations must support (see [4 Core Function Library]). For a function in the core function library, arguments and result are of the four basic types. Both XSLT and XPointer extend XPath by defining additional functions; some of these functions operate on the four basic types; others operate on additional data types defined by XSLT and XPointer.

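The function library maps function names to functions. Each function takes zero or more arguments and returns one result. This document defines the core function library that every XPath implementation must support (section 4); core functions take and return only the four basic types. XSLT and XPointer both extend the library: some of their functions work on the four basic types, others on additional data types that XSLT or XPointer define.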

The namespace declarations consist of a mapping from prefixes to namespace URIs.

The namespace declarations map prefixes to namespace URIs.

The variable bindings, function library and namespace declarations used to evaluate a subexpression are always the same as those used to evaluate the containing expression. The context node, context position, and context size used to evaluate a subexpression are sometimes different from those used to evaluate the containing expression. Several kinds of expressions change the context node; only predicates change the context position and context size (see [2.4 Predicates]). When the evaluation of a kind of expression is described, it will always be explicitly stated if the context node, context position, and context size change for the evaluation of subexpressions; if nothing is said about the context node, context position, and context size, they remain unchanged for the evaluation of subexpressions of that kind of expression.

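A subexpression is always evaluated with the same variable bindings, function library and namespace declarations as its containing expression. The context node, context position and context size may differ. Several kinds of expression change the context node, but only predicates change the context position and context size. Whenever the evaluation of a kind of expression is described, any such change is stated explicitly; if nothing is stated, all three stay unchanged.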

XPath expressions often occur in XML attributes. The grammar specified in this section applies to the attribute value after XML 1.0 normalization. So, for example, if the grammar uses the character <, this must not appear in the XML source as < but must be quoted according to XML 1.0 rules by, for example, entering it as &lt;. Within expressions, literal strings are delimited by single or double quotation marks, which are also used to delimit XML attributes. To avoid a quotation mark in an expression being interpreted by the XML processor as terminating the attribute value the quotation mark can be entered as a character reference (&quot; or &apos;). Alternatively, the expression can use single quotation marks if the XML attribute is delimited with double quotation marks or vice-versa.

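XPath expressions often appear in XML attributes. The grammar in this section applies to the attribute value after XML 1.0 normalization. If the grammar uses <, the XML source must escape it according to XML 1.0 rules, for example as &lt;. String literals in expressions are delimited by single or double quotation marks, which also delimit XML attributes, so a quotation mark that would otherwise end the attribute value can be written as a character reference (&quot; or &apos;). Alternatively, use single quotation marks in the expression when the attribute is delimited by double quotation marks, and double quotation marks when it is delimited by single ones. (Translator's note: the XPath described here is defined independently of XML; when an expression is written inside XML, XPath is the language being embedded and must follow XML's escaping rules.)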
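
As an illustration of the quoting rules above, the sketch below embeds an expression in an attribute with the Python standard library and reads it back. The element name rule and the attribute name select are made up for the example; quoteattr and ElementTree are real standard-library APIs.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET

# An expression that contains both '<' and a double-quoted string literal.
expr = 'child::para[position()<3][attribute::type="warning"]'

# quoteattr escapes '<' as '&lt;' and, because the value contains double
# quotes but no single quotes, delimits the attribute with single quotes.
attr = quoteattr(expr)
xml_source = f"<rule select={attr}/>"

# After the XML parser processes the attribute value, the original
# expression comes back unchanged; this is the text the grammar applies to.
recovered = ET.fromstring(xml_source).get("select")
print(xml_source)
print(recovered == expr)
```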

One important kind of expression is a location path. A location path selects a set of nodes relative to the context node. The result of evaluating an expression that is a location path is the node-set containing the nodes selected by the location path. Location paths can recursively contain expressions that are used to filter sets of nodes. A location path matches the production LocationPath.

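One important kind of expression is the location path. A location path selects a set of nodes relative to the context node, and evaluating it yields the node-set of the selected nodes. Location paths can recursively contain expressions that filter sets of nodes. They are defined by the LocationPath production.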

In the following grammar, the non-terminals QName and NCName are defined in [XML Names], and S is defined in [XML]. The grammar uses the same EBNF notation as [XML] (except that grammar symbols always have initial capital letters).

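In the grammar below, the non-terminals QName and NCName come from the XML Names specification and S comes from the XML specification. The grammar uses the same EBNF notation as the XML specification, except that grammar symbols always begin with a capital letter.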

Expressions are parsed by first dividing the character string to be parsed into tokens and then parsing the resulting sequence of tokens. Whitespace can be freely used between tokens. The tokenization process is described in [3.7 Lexical Structure].

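An expression is parsed in two stages: the character string is first divided into tokens, then the resulting token sequence is parsed. Whitespace may appear freely between tokens. Section 3.7 describes tokenization.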

2 Location Paths

Although location paths are not the most general grammatical construct in the language (a LocationPath is a special case of an Expr), they are the most important construct and will therefore be described first.

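Location paths are not the most general construct in the grammar (a LocationPath is a special case of an Expr), but they are the most important one, so they are covered first.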

Every location path can be expressed using a straightforward but rather verbose syntax. There are also a number of syntactic abbreviations that allow common cases to be expressed concisely. This section will explain the semantics of location paths using the unabbreviated syntax. The abbreviated syntax will then be explained by showing how it expands into the unabbreviated syntax (see [2.5 Abbreviated Syntax]).

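Every location path can be written in a plain but verbose syntax, and common cases also have abbreviations. This section explains the semantics using the unabbreviated syntax; section 2.5 then shows how each abbreviation expands into it.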

Here are some examples of location paths using the unabbreviated syntax:

Some location paths written in the unabbreviated syntax, each followed by a short gloss:

child::para selects the para element children of the context node
Gloss: the child elements of the context node that are named para.
child::* selects all element children of the context node
Gloss: every element child of the context node.
child::text() selects all text node children of the context node
Gloss: every text-node child of the context node.
child::node() selects all the children of the context node, whatever their node type
Gloss: every child of the context node, regardless of node type.
attribute::name selects the name attribute of the context node
Gloss: the attribute of the context node that is named name.
attribute::* selects all the attributes of the context node
Gloss: every attribute of the context node.
descendant::para selects the para element descendants of the context node
Gloss: every descendant of the context node that is a para element.
ancestor::div selects all div ancestors of the context node
Gloss: every ancestor of the context node that is named div.
ancestor-or-self::div selects the div ancestors of the context node and, if the context node is a div element, the context node as well
Gloss: every div ancestor of the context node, plus the context node itself when it is a div element.
descendant-or-self::para selects the para element descendants of the context node and, if the context node is a para element, the context node as well
Gloss: every para descendant of the context node, plus the context node itself when it is a para element.
self::para selects the context node if it is a para element, and otherwise selects nothing
Gloss: the context node when it is a para element, otherwise nothing.
child::chapter/descendant::para selects the para element descendants of the chapter element children of the context node
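Gloss: the para descendants of the chapter children of the context node. (Translator's note: / means the next step is evaluated once for each node selected so far.)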
child::*/child::para selects all para grandchildren of the context node
Gloss: every grandchild of the context node that is a para element.
/ selects the document root (which is always the parent of the document element)
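Gloss: the root node of the document, which is the parent of the document element and not the document element itself.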
/descendant::para selects all the para elements in the same document as the context node
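Gloss: every para element in the document that contains the context node. (Translator's note: each step changes the context node, which is why evaluation is described as iterative.)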
/descendant::olist/child::item selects all the item elements that have an olist parent and that are in the same document as the context node
Gloss: every item element with an olist parent in the document that contains the context node.
child::para[position()=1] selects the first para child of the context node
Gloss: the first para child of the context node.
child::para[position()=last()] selects the last para child of the context node
Gloss: the last para child of the context node.
child::para[position()=last()-1] selects the last but one para child of the context node
Gloss: the second-to-last para child of the context node.
child::para[position()>1] selects all the para children of the context node other than the first para child of the context node
Gloss: every para child of the context node except the first.
following-sibling::chapter[position()=1] selects the next chapter sibling of the context node
Gloss: the next chapter sibling of the context node.
preceding-sibling::chapter[position()=1] selects the previous chapter sibling of the context node
Gloss: the previous chapter sibling of the context node.
/descendant::figure[position()=42] selects the forty-second figure element in the document
Gloss: the 42nd figure element in the document.
/child::doc/child::chapter[position()=5]/child::section[position()=2] selects the second section of the fifth chapter of the doc document element
Gloss: the second section of the fifth chapter of the doc document element.
child::para[attribute::type="warning"] selects all para children of the context node that have a type attribute with value warning
Gloss: every para child of the context node that has a type attribute whose value is warning.
child::para[attribute::type='warning'][position()=5] selects the fifth para child of the context node that has a type attribute with value warning
Gloss: among the para children whose type attribute is warning, the fifth one.
child::para[position()=5][attribute::type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning
Gloss: the fifth para child of the context node, but only if its type attribute is warning.
child::chapter[child::title='Introduction'] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
Gloss: the chapter children of the context node that have at least one title child whose string-value is Introduction.
child::chapter[child::title] selects the chapter children of the context node that have one or more title children
Gloss: the chapter children of the context node that have at least one title child.
child::*[self::chapter or self::appendix] selects the chapter and appendix children of the context node
Gloss: every chapter child and every appendix child of the context node.
child::*[self::chapter or self::appendix][position()=last()] selects the last chapter or appendix child of the context node
Gloss: the last node among the chapter and appendix children of the context node.
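
To make a few of these paths concrete, here is a rough Python equivalent written by hand over xml.etree.ElementTree. The helper names child and descendant and the sample document are assumptions made for this sketch; ElementTree itself does not accept the unabbreviated axis syntax, so each path is spelled out as ordinary list operations.

```python
import xml.etree.ElementTree as ET

# The context node for every example below is the <chapter> element.
doc = ET.fromstring(
    "<chapter><title>Introduction</title>"
    "<para type='warning'>a</para><para>b</para>"
    "<section><para>c</para></section></chapter>"
)


def child(node, name=None):
    """child::name, or child::* when name is None (element children only)."""
    return [c for c in node if name is None or c.tag == name]


def descendant(node, name=None):
    """descendant::name: element descendants in document order, node excluded."""
    return [d for d in node.iter()
            if d is not node and (name is None or d.tag == name)]


paras = [p.text for p in child(doc, "para")]              # child::para
all_children = [c.tag for c in child(doc)]                # child::*
deep_paras = [p.text for p in descendant(doc, "para")]    # descendant::para
last_para = child(doc, "para")[-1].text                   # child::para[position()=last()]
warnings = [p.text for p in child(doc, "para")            # child::para[attribute::type="warning"]
            if p.get("type") == "warning"]
grand_paras = [p.text for c in child(doc)                 # child::*/child::para
               for p in child(c, "para")]
```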

There are two kinds of location path: relative location paths and absolute location paths.

Location paths come in two kinds: relative and absolute.

A relative location path consists of a sequence of one or more location steps separated by /. The steps in a relative location path are composed together from left to right. Each step in turn selects a set of nodes relative to a context node. An initial sequence of steps is composed together with a following step as follows. The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union. For example, child::div/child::para selects the para element children of the div element children of the context node, or, in other words, the para element grandchildren that have div parents.

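A relative location path is a sequence of one or more location steps separated by /. The steps are composed from left to right, and each step selects a set of nodes relative to a context node. Composition works like this: the initial sequence of steps selects a set of nodes relative to the context node; each node in that set then serves as the context node for the following step; the node-sets that this step yields are unioned, and the union is the result of the composed steps. For example, child::div/child::para selects the para children of the div children of the context node, in other words the para grandchildren whose parent is a div.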
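
The composition rule can be sketched in a few lines of Python. The names child_step and compose are hypothetical, and a list with duplicate removal stands in for the node-set, which the specification itself treats as unordered.

```python
import xml.etree.ElementTree as ET


def child_step(name):
    """Return a step function modelling child::name for one context node."""
    return lambda node: [c for c in node if c.tag == name]


def compose(steps, context_node):
    """Evaluate steps left to right, unioning the results for each context node."""
    current = [context_node]
    for step in steps:
        selected, seen = [], set()
        for node in current:              # each node becomes the context node
            for hit in step(node):
                if id(hit) not in seen:   # a node-set holds no duplicates
                    seen.add(id(hit))
                    selected.append(hit)
        current = selected                # the union feeds the next step
    return current


body = ET.fromstring(
    "<body><div><para>1</para></div><para>2</para>"
    "<div><para>3</para><note/></div></body>"
)
# child::div/child::para relative to <body>
result = [p.text for p in compose([child_step("div"), child_step("para")], body)]
```

The para whose text is 2 is not selected, because its parent is body and not a div.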

An absolute location path consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node.

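An absolute location path is a /, optionally followed by a relative location path. A / alone selects the root node of the document that contains the context node. With a relative location path after it, the absolute path selects whatever that relative path selects when evaluated from that root node.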
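
A rough way to picture this with xml.etree.ElementTree, under the assumption that the ElementTree wrapper object plays the part of the root node: the context node is first mapped to its document, and the rest of the path is evaluated from there. The parent map is needed because ElementTree elements carry no parent pointer; document_element is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<doc><chapter><para/></chapter></doc>"))

# Map each element to its parent so we can climb upward from any node.
parents = {child: parent for parent in tree.iter() for child in parent}


def document_element(node):
    """Climb from any element to the top-level element of its document."""
    while node in parents:
        node = parents[node]
    return node


context = tree.getroot()[0][0]          # the <para> element
top = document_element(context)

# /descendant::para evaluated from the root node; the document element is
# itself a descendant of the root node, so /descendant::doc finds <doc>.
all_paras = [e for e in tree.iter() if e.tag == "para"]
all_docs = [e for e in tree.iter() if e.tag == "doc"]
```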

Location Paths

[1]	LocationPath	::=	RelativeLocationPath
			| AbsoluteLocationPath
[2]	AbsoluteLocationPath	::=	'/' RelativeLocationPath?
			| AbbreviatedAbsoluteLocationPath
[3]	RelativeLocationPath	::=	Step
			| RelativeLocationPath '/' Step
			| AbbreviatedRelativeLocationPath
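
Productions [1] to [3] say that a path is an optional leading / followed by steps separated by /. A minimal splitter along those lines is sketched below; split_steps is a hypothetical helper that handles only the unabbreviated forms (it does not understand // or other abbreviations) and that skips any / found inside a predicate or a string literal.

```python
def split_steps(path):
    """Split an unabbreviated location path into (is_absolute, [step, ...])."""
    is_absolute = path.startswith("/")
    body = path[1:] if is_absolute else path
    steps, current, depth, quote = [], [], 0, None
    for ch in body:
        if quote:                       # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "[":                 # entering a predicate
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "/" and depth == 0:  # a step separator at top level
            steps.append("".join(current))
            current = []
            continue
        current.append(ch)
    if current:
        steps.append("".join(current))
    return is_absolute, steps


example = split_steps(
    "/child::doc/child::chapter[position()=5]/child::section[position()=2]"
)
```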

2.1 Location Steps

A location step has three parts:

The three parts of a location step are:

an axis, which specifies the tree relationship between the nodes selected by the location step and the context node,
Gloss: the axis names the tree relationship between the context node and the nodes the step selects.
a node test, which specifies the node type and expanded-name of the nodes selected by the location step, and
Gloss: the node test names the node type and expanded-name that the selected nodes must have.
zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.
Gloss: the predicates, of which there may be none, are arbitrary expressions that further refine the selected node-set.

The syntax for a location step is the axis name and node test separated by a double colon, followed by zero or more expressions each in square brackets. For example, in child::para[position()=1], child is the name of the axis, para is the node test and [position()=1] is a predicate.

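A location step is written as the axis name and the node test separated by a double colon (::), followed by zero or more expressions, each enclosed in square brackets. For example, in child::para[position()=1], child is the axis name, para is the node test and [position()=1] is a predicate.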
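
The three-part shape of a step can be pulled apart mechanically. The sketch below uses a hypothetical helper, parse_step, that covers only unabbreviated steps and does not validate the axis name or the node test.

```python
def parse_step(step):
    """Split 'axis::nodetest[pred]...' into (axis, node_test, [predicates])."""
    axis, sep, rest = step.partition("::")
    if not sep:
        raise ValueError("expected AxisName '::' NodeTest")
    bracket = rest.find("[")
    node_test = rest if bracket < 0 else rest[:bracket]
    predicates, depth, start, quote = [], 0, None, None
    for i, ch in enumerate(rest):
        if quote:                        # skip over string literals
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "[":
            if depth == 0:
                start = i + 1            # first character of a predicate body
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                predicates.append(rest[start:i])
    return axis.strip(), node_test.strip(), predicates


parsed = parse_step("child::para[position()=1]")
```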

The node-set selected by the location step is the node-set that results from generating an initial node-set from the axis and node-test, and then filtering that node-set by each of the predicates in turn.

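The node-set a location step selects is obtained by building an initial node-set from the axis and the node test, then filtering it through each predicate in turn.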

The initial node-set consists of the nodes having the relationship to the context node specified by the axis, and having the node type and expanded-name specified by the node test. For example, a location step descendant::para selects the para element descendants of the context node: descendant specifies that each node in the initial node-set must be a descendant of the context; para specifies that each node in the initial node-set must be an element named para. The available axes are described in [2.2 Axes]. The available node tests are described in [2.3 Node Tests]. The meaning of some node tests is dependent on the axis.

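The initial node-set holds the nodes that stand in the relationship named by the axis to the context node and that have the node type and expanded-name named by the node test. In the step descendant::para, for instance, descendant requires every node in the initial node-set to be a descendant of the context node, and para requires every such node to be an element named para. Sections 2.2 and 2.3 describe the available axes and node tests. What some node tests mean depends on the axis.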

The initial node-set is filtered by the first predicate to generate a new node-set; this new node-set is then filtered using the second predicate, and so on. The final node-set is the node-set selected by the location step. The axis affects how the expression in each predicate is evaluated and so the semantics of a predicate is defined with respect to an axis. See [2.4 Predicates].

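The first predicate filters the initial node-set into a new node-set, the second predicate filters that one, and so on; the last result is the node-set the step selects. Because the axis affects how each predicate expression is evaluated, the semantics of a predicate is defined with respect to an axis (see section 2.4).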

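[Translator's note: a step is evaluated against a single context node, not against a set of context nodes. It is applied to each node produced by the previous step; each application yields a node-set, and the result of the whole step is the union of those sets.]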
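
Because every predicate filters the output of the previous one, predicate order matters. The sketch below is an assumption-laden model: nodes are plain dictionaries, apply_predicates is a hypothetical helper, and positions are counted in list order, which corresponds to a forward axis such as child. It contrasts child::para[attribute::type='warning'][position()=5] with child::para[position()=5][attribute::type="warning"].

```python
def apply_predicates(nodes, predicates):
    """Filter nodes through each predicate in turn, renumbering positions."""
    for predicate in predicates:
        size = len(nodes)               # context size for this predicate
        nodes = [n for pos, n in enumerate(nodes, start=1)
                 if predicate(n, pos, size)]
    return nodes


# Seven para children; number 5 is not a warning, number 7 is the fifth warning.
types = ["warning", "note", "warning", "warning", "note", "warning", "warning"]
paras = [{"n": i, "type": t} for i, t in enumerate(types, start=1)]


def is_warning(node, pos, size):
    return node["type"] == "warning"


def is_fifth(node, pos, size):
    return pos == 5


fifth_warning = apply_predicates(paras, [is_warning, is_fifth])
fifth_if_warning = apply_predicates(paras, [is_fifth, is_warning])
```

The first call yields the seventh para, the second yields nothing, which mirrors the difference between the two example paths.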

Location Steps