Using the Java language NamespaceContext object with XPath

cuyi7076

Original article: https://www.ibm.com/developerworks/xml/library/x-nmspccontext/index.html

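Prerequisites and examples

In this article, I assume that you are familiar with the technical details described in Brett McLaughlin's "Evaluating XPaths from the Java™ platform". If you are not sure how to run Java programs that use XPath, refer to Brett's article (see Related topics for a link). The same is true for the APIs needed to load an XML file and to evaluate an XPath expression.

All of the examples use the following XML file:

Listing 1. Sample XML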

<?xml version="1.0" encoding="UTF-8"?>

<books:booklist
  xmlns:books="http://univNaSpResolver/booklist"
  xmlns="http://univNaSpResolver/book"
  xmlns:fiction="http://univNaSpResolver/fictionbook">
  <science:book xmlns:science="http://univNaSpResolver/sciencebook">
    <title>Learning XPath</title>
    <author>Michael Schmidt</author>
  </science:book>
  <fiction:book>
    <title>Faust I</title>
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
  <fiction:book>
    <title>Faust II</title>
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
</books:booklist>

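This XML sample declares three namespaces on the root element and one more on an element deeper in the structure. You will see the difference that this setup makes.

Common acronyms

API: Application programming interface
DOM: Document Object Model
URI: Uniform Resource Identifier
XHTML: Extensible Hypertext Markup Language
XML: Extensible Markup Language
XSD: XML Schema Definition
XSLT: Extensible Stylesheet Language Transformations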

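The second interesting thing about this XML sample is that the element booklist has three children, all named book. But the first child is in the namespace science, while the other two are in the namespace fiction. This means that these elements are completely different to XPath. You will see the consequences in the following examples.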
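
The difference between local name and namespace can be checked directly on the DOM. The following is a minimal sketch and not part of the article's download: the class name NamespaceInspection and the shortened inline copy of Listing 1 are assumptions made for illustration. Note that the parser has to be switched to namespace-aware mode, which DocumentBuilderFactory does not enable by default.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceInspection {
    // Shortened inline copy of Listing 1 (an assumption for illustration).
    static final String XML =
        "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
        + "<title>Learning XPath</title></science:book>"
        + "<fiction:book><title>Faust I</title></fiction:book>"
        + "<fiction:book><title>Faust II</title></fiction:book>"
        + "</books:booklist>";

    // Returns "localName in namespaceURI" for every child element of the root.
    static List<String> describeChildren(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace support is off by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    result.add(child.getLocalName() + " in " + child.getNamespaceURI());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        for (String line : describeChildren(XML)) {
            System.out.println(line);
        }
    }
}
```

All three children report the local name book, but the first one reports the sciencebook URI and the other two the fictionbook URI.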

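A note about the sample source code: the code is optimized for readability, not for maintenance. This means that it contains some redundancy. Output is produced in the simplest way, through System.out.println(). All lines of code that deal with output are abbreviated to "..." in this article. Also, I do not cover the helper methods in this article, but they are included in the download file (see Download).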

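Theoretical background

What is the point of namespaces, and why should you care about them? A namespace is part of the identifier of an element or attribute. You can have elements or attributes with the same local name but different namespaces. They are completely different. See the example above (science:book and fiction:book). You need namespaces to resolve naming conflicts when you combine XML files from different sources. Take an XSLT file as an example. It consists of elements of the XSLT namespace, elements from your own namespaces, and (often) elements of the XHTML namespace. With namespaces, you avoid ambiguities between elements that have the same local name.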

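A namespace is defined by a URI (in this example, http://univNaSpResolver/booklist). To avoid using this long string, you define a prefix that is associated with this URI (in the example, books). Remember that the prefix is like a variable: its name does not matter. If two prefixes reference the same URI, the namespace of the prefixed elements is the same (see Example 1 in Listing 5 for an example of this).

An XPath expression uses prefixes (for example, books:booklist/science:book), and you have to provide the URI associated with each prefix. This is where the NamespaceContext comes in. It does exactly that.

This article explains the different ways to provide the mapping between the prefixes and the URIs.

In the XML file, the mapping is provided by xmlns attributes such as xmlns:books="http://univNaSpResolver/booklist" or xmlns="http://univNaSpResolver/book" (the default namespace).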

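The necessity of providing namespace resolution

If you have XML that uses namespaces, XPath expressions fail if you do not provide a NamespaceContext. Example 0 in Listing 2 shows this case. The XPath object is constructed and evaluated on the loaded XML document. First, the expression is written without any namespace prefixes (result1). In the second part, the expression is written with namespace prefixes (result2).

Listing 2. Example 0 without namespace resolution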

private static void example0(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Zero example - no namespaces provided ***");

        XPath xPath = XPathFactory.newInstance().newXPath();

...
        NodeList result1 = (NodeList) xPath.evaluate("booklist/book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
    }

This results in the following output.

Listing 3. Output of Example 0

*** Zero example - no namespaces provided ***
First try asking without namespace prefix:
--> booklist/book
Result is of length 0
Then try asking with namespace prefix:
--> books:booklist/science:book
Result is of length 0
The expression does not work in both cases.

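In both cases, the XPath evaluation returns no nodes, and no exception is thrown. XPath does not find the nodes because the mapping of prefixes to URIs is missing.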
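
The unprefixed case can be reproduced in isolation. In XPath 1.0, an unprefixed element name refers to an element in no namespace, so the default namespace of the document is not applied to booklist/book. The sketch below assumes a shortened inline copy of Listing 1 and a hypothetical class name, UnprefixedXPathDemo; it compares the unprefixed expression with a wildcard expression that ignores names entirely.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnprefixedXPathDemo {
    // Shortened inline copy of Listing 1 (an assumption for illustration).
    static final String XML =
        "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'/>"
        + "<fiction:book/><fiction:book/>"
        + "</books:booklist>";

    // Evaluates the expression against the sample and returns the node count.
    // No NamespaceContext is set, so only prefix-free expressions are used here.
    static int countMatches(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Names without prefix only match elements that are in no namespace.
        System.out.println("booklist/book: " + countMatches("booklist/book"));
        // Wildcards match elements regardless of their namespace.
        System.out.println("/*/*: " + countMatches("/*/*"));
    }
}
```

The wildcard expression shows that the three book elements are present in the tree; only the name test without namespace information fails to match them.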

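Hardcoded namespace resolution

It is possible to provide the namespaces as hardcoded values, in a class that looks like the one in Listing 4:

Listing 4. Hardcoded namespace resolution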

public class HardcodedNamespaceResolver implements NamespaceContext {

    /**
     * This method returns the uri for all prefixes needed. Wherever possible
     * it uses XMLConstants.
     * 
     * @param prefix
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return "http://univNaSpResolver/book";
        } else if (prefix.equals("books")) {
            return "http://univNaSpResolver/booklist";
        } else if (prefix.equals("fiction")) {
            return "http://univNaSpResolver/fictionbook";
        } else if (prefix.equals("technical")) {
            return "http://univNaSpResolver/sciencebook";
        } else {
            return XMLConstants.NULL_NS_URI;
        }
    }

    public String getPrefix(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

}

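Note that the namespace http://univNaSpResolver/sciencebook is bound to the prefix technical (not science as before). You will see the consequences in the output of the example (Listing 6). In Listing 5, the code that uses this resolver also uses the new prefix.

Listing 5. Example 1 with hardcoded namespace resolution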

private static void example1(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** First example - namespacelookup hardcoded ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new HardcodedNamespaceResolver());

...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/technical:book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate("books:booklist/technical:book/:author",
                example);
...
    }

Here is the output of this example.

Listing 6. Output of Example 1

*** First example - namespacelookup hardcoded ***
Using any namespaces results in a NodeList:
--> books:booklist/technical:book
Number of Nodes: 1
<?xml version="1.0" encoding="UTF-8"?>
  <science:book xmlns:science="http://univNaSpResolver/sciencebook">
    <title xmlns="http://univNaSpResolver/book">Learning XPath</title>
    <author xmlns="http://univNaSpResolver/book">Michael Schmidt</author>
  </science:book>
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust I</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust II</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
The default namespace works also:
--> books:booklist/technical:book/:author
Michael Schmidt

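As you can see, XPath now finds the nodes. The advantage is that you can rename the prefixes as you wish, which is what I did with the prefix science. The XML file contains the prefix science, while the XPath uses another prefix, technical. Because the URIs are the same, XPath finds the nodes. The disadvantage is that you have to maintain the namespaces in more places: the XML, perhaps the XSD, the XPath expressions, and the namespace context.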
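
One way to reduce the maintenance cost of hardcoded values is to keep the bindings in a map instead of an if/else chain. The following is a sketch under stated assumptions, not the article's code: the class MapNamespaceContext, its bind method, and the extra prefix b (bound explicitly to the URI of the document's default namespace) are hypothetical names chosen for illustration. It also fills in getPrefix and getPrefixes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new HashMap<>();

    // Hypothetical fluent helper for registering a prefix.
    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("No namespace URI provided!");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }

    // Evaluates a prefixed expression against a shortened copy of Listing 1.
    static String titleOfTechnicalBook() {
        String xml = "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
            + " xmlns='http://univNaSpResolver/book'>"
            + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
            + "<title>Learning XPath</title></science:book></books:booklist>";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MapNamespaceContext()
                    .bind("books", "http://univNaSpResolver/booklist")
                    .bind("technical", "http://univNaSpResolver/sciencebook")
                    .bind("b", "http://univNaSpResolver/book"));
            return xPath.evaluate("books:booklist/technical:book/b:title", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleOfTechnicalBook());
    }
}
```

The prefixes used in the expression (technical, b) do not appear anywhere in the XML; only the URIs have to agree. The trade-off named above remains: the bindings still live in the Java code and have to be kept in step with the documents.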

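Reading the namespaces from the document

The namespaces and their prefixes are documented in the XML file, so you can use them from there. The easiest way to do this is to delegate the lookup to the document.

Listing 7. Resolving namespaces directly from the document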

public class UniversalNamespaceResolver implements NamespaceContext {
    // the delegate
    private Document sourceDocument;

    /**
     * This constructor stores the source document to search the namespaces in
     * it.
     * 
     * @param document
     *            source document
     */
    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    /**
     * The lookup for the namespace uris is delegated to the stored document.
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // not implemented yet
        return null;
    }

}

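Keep these things in mind:

If the document is changed before the XPath is used, the change is still reflected in the namespace lookup, because the delegation happens at the time it is needed, against the current version of the document.
The lookup of namespaces or prefixes is done in the ancestors of the node used (in this case, the node sourceDocument). This means that, with the code provided, you only get the namespaces declared on the root node. The namespace science is not found in our example.
The lookup is called when the XPath is evaluated, so it consumes some extra time.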
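
The second point above can be observed with the DOM lookup methods alone, without any XPath. The sketch below is an illustration under assumptions: the class name LookupScopeDemo and the shortened inline copy of Listing 1 are not part of the article's download. It asks the document and then the science:book element for the same prefixes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LookupScopeDemo {
    // Shortened inline copy of Listing 1 (an assumption for illustration).
    static final String XML =
        "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'/>"
        + "<fiction:book/>"
        + "</books:booklist>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Lookup starting at the document: only declarations on the root are visible.
    static String fromDocument(String prefix) {
        return parse().lookupNamespaceURI(prefix);
    }

    // Lookup starting at science:book: its own declarations and those of its
    // ancestors are visible.
    static String fromScienceBook(String prefix) {
        Node scienceBook = parse()
                .getElementsByTagNameNS("http://univNaSpResolver/sciencebook", "book")
                .item(0);
        return scienceBook.lookupNamespaceURI(prefix);
    }

    public static void main(String[] args) {
        System.out.println("document, fiction:     " + fromDocument("fiction"));
        System.out.println("document, science:     " + fromDocument("science"));
        System.out.println("science:book, science: " + fromScienceBook("science"));
        System.out.println("science:book, books:   " + fromScienceBook("books"));
    }
}
```

Starting from the document, the prefix science resolves to null, while the same lookup starting from the science:book element succeeds and still sees the prefixes declared on the root.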

Here is the sample code:

Listing 8. Example 2 with namespace resolution directly from the document

private static void example2(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Second example - namespacelookup delegated to document ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceResolver(example));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

The output of this example is:

Listing 9. Output of Example 2

*** Second example - namespacelookup delegated to document ***
Try to use the science prefix: no result
--> books:booklist/science:book
The resolver only knows namespaces of the first level!
To be precise: Only namespaces above the node, passed in the constructor.
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust I</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust II</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

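As you can see in the output, the namespace with the prefix science, which is declared on the book element, is not resolved. The evaluate method throws an XPathExpressionException. To work around this problem, you could extract the node science:book from the document and use that node as the delegate. But this would mean additional parsing of the document, and it is not elegant.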
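
For completeness, the workaround of using the science:book node as the delegate might look like the sketch below. This is an assumption about how such a delegate could be written, not code from the article's download: the class NodeNamespaceResolver is hypothetical, and the node is located here with getElementsByTagNameNS, which already requires knowing the namespace URI and so illustrates why the approach is awkward.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeNamespaceResolver implements NamespaceContext {
    private final Node delegate; // any node; its ancestors are searched as well

    NodeNamespaceResolver(Node delegate) {
        this.delegate = delegate;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        // The DOM expects null when asking for the default namespace.
        String uri = delegate.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        return delegate.lookupPrefix(namespaceURI);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        List<String> prefixes = prefix == null
                ? Collections.<String>emptyList()
                : Collections.singletonList(prefix);
        return prefixes.iterator();
    }

    // Counts science:book elements, resolving prefixes from the science:book node.
    static int countScienceBooks() {
        String xml = "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
            + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
            + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'/>"
            + "<fiction:book/></books:booklist>";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node scienceBook = doc.getElementsByTagNameNS(
                    "http://univNaSpResolver/sciencebook", "book").item(0);
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NodeNamespaceResolver(scienceBook));
            NodeList result = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", doc, XPathConstants.NODESET);
            return result.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Number of nodes: " + countScienceBooks());
    }
}
```

Because the lookup walks up from the delegate node, both science (declared on the node itself) and books (declared on the root) resolve. Prefixes declared only in sibling subtrees would still be out of reach.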

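Reading the namespaces from the document and caching them

The next version of the NamespaceContext is better. It reads the namespaces only once, in advance, in the constructor. Every call for a namespace is answered from the cache. As a result, later changes to the document have no effect, because the list of namespaces is cached when the Java object is created.

Listing 10. Caching the namespace resolution from the document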

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    private Map<String, String> prefix2Uri = new HashMap<String, String>();
    private Map<String, String> uri2Prefix = new HashMap<String, String>();

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     * 
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        examineNode(document.getFirstChild(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (String key : prefix2Uri.keySet()) {
            System.out
                    .println("prefix " + key + ": uri " + prefix2Uri.get(key));
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     * 
     * @param node
     *            to examine
     * @param attributesOnly,
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }

        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE)
                    examineNode(child, false);
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     * 
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }

    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }

}

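Note that there is debugging output in the code. The attributes of each node are examined and stored. The children are not examined, because the boolean toplevelOnly in the constructor is set to true. If the boolean is set to false, the examination of the children starts after the attributes are stored. One thing to consider about the code: in the DOM, the first node represents the document as a whole, so to reach the root element (booklist in the sample) and read its namespaces, you have to descend to the children exactly once.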
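
A caveat about that single step to the children: getFirstChild() returns the first child node of the document, which is the root element in the sample file but can be a comment or processing instruction in other documents. The sketch below illustrates this with a hypothetical class FirstChildDemo and a made-up one-line document that starts with a comment. It suggests getDocumentElement() as a more robust starting point, which is an alternative to the listing and not the article's own code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstChildDemo {
    // Made-up document whose first child is a comment, not the root element.
    static final String XML =
        "<!-- catalog --><books:booklist"
        + " xmlns:books='http://univNaSpResolver/booklist'/>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Node type of the document's first child (a comment for this input).
    static short firstChildType() {
        return parse().getFirstChild().getNodeType();
    }

    // A non-element node has no attribute map, so getAttributes() returns null.
    static boolean firstChildHasAttributeMap() {
        return parse().getFirstChild().getAttributes() != null;
    }

    // getDocumentElement() always yields the root element.
    static String rootName() {
        return parse().getDocumentElement().getNodeName();
    }

    public static void main(String[] args) {
        System.out.println("first child is a comment: "
                + (firstChildType() == Node.COMMENT_NODE));
        System.out.println("first child has attributes: " + firstChildHasAttributeMap());
        System.out.println("document element: " + rootName());
    }
}
```

For an input like this one, a method that calls getAttributes() on the first child and then iterates over the result would fail with a NullPointerException, so starting from the document element is the safer choice when the shape of the input is not under your control.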

In this case, using the NamespaceContext is very simple:

Listing 11. Example 3 with cached namespace resolution (top level only)

private static void example3(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Third example - namespaces of toplevel node cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, true));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

This results in the following output:

Listing 12. Output of Example 3

*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust I</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
  <fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
    <title xmlns="http://univNaSpResolver/book">Faust II</title>
    <author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
  </fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

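This code finds only the namespaces of the root element. To be more precise: it finds the namespaces of the node that the constructor passes to the method examineNode. This speeds up the constructor, because it does not have to iterate through the whole document. But, as you can see from the output, the science prefix cannot be resolved. The XPath expression leads to an exception (XPathExpressionException).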

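Reading the namespaces from the document and all of its elements and caching them

This version reads all namespace declarations from the XML file. Now even the XPath on the prefix science works. One situation makes this version complicated: if a prefix is overloaded (declared in nested elements on different URIs), the last one found wins. In the real world, this is usually not a problem.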
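
The "last one wins" behavior follows from storing the declarations in a map keyed by prefix. The sketch below reproduces it on a made-up two-element document in which the prefix a is declared twice; the class name PrefixOverloadDemo, the urn:example URIs, and the simplified traversal are assumptions for illustration and only mirror the idea of the cache, not its exact code.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PrefixOverloadDemo {
    // Made-up document: the prefix "a" is bound to two different URIs.
    static final String XML =
        "<a:outer xmlns:a='urn:example:first'>"
        + "<a:inner xmlns:a='urn:example:second'/>"
        + "</a:outer>";

    // Collects all prefixed namespace declarations in document order.
    static Map<String, String> collect() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Map<String, String> cache = new HashMap<>();
            walk(doc.getDocumentElement(), cache);
            return cache;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    private static void walk(Element element, Map<String, String> cache) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                // A later declaration of the same prefix overwrites the earlier one.
                cache.put(attribute.getLocalName(), attribute.getNodeValue());
            }
        }
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, cache);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(collect());
    }
}
```

After the traversal, the map holds a single entry that binds a to urn:example:second. An XPath such as a:outer would therefore no longer match, because the outer element lives in urn:example:first. If your documents redeclare prefixes like this, a hardcoded or map-based context with distinct prefixes is the safer choice.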

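Using the NamespaceContext in this example works the same way as in the previous example. The boolean toplevelOnly in the constructor has to be set to false.

Listing 13. Example 4 with cached namespace resolution (all levels)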

private static void example4(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Fourth example - namespaces all levels cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, false));
...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }