III. XML

大橘为重yo

Category: Core Java Volume II reading notes. Tags: java, xml

Article link: https://blog.csdn.net/qq_42270665/article/details/107484201

1. XML Overview

<configuration>
    <title>
        <font>
            <name>Helvetica</name>
            <size>36</size>
        </font>
    </title>
    <body>
        <font>
            <name>Times Roman</name>
            <size>12</size>
        </font>
    </body>
    <window>
        <width>400</width>
        <height>200</height>
    </window>
    <color>
        <red>0</red>
        <green>50</green>
        <blue>100</blue>
    </color>
    <menu>
        <item>Times Roman</item>
        <item>Helvetica</item>
        <item>Goudy Old Style </item>
    </menu>
</configuration>

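The XML format can express hierarchical structure, and repeated elements are not misinterpreted.

Although HTML and XML share common roots, there are important differences between the two:
· XML is case-sensitive.
· In XML, end tags can never be omitted.
· In XML, an element that has a single tag with no matching end tag must end with a /.
· In XML, attribute values must be enclosed in quotation marks.
· In XML, all attributes must have values.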
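The rules above can be checked with a standard parser. The following is a minimal sketch, assuming the JDK's built-in DOM parser; the class WellFormedCheck and the sample strings are hypothetical and not part of the original notes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for illustration.
class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors; the built-in handler would also print them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the text violates an XML syntax rule
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>text</p>"));         // true
        System.out.println(isWellFormed("<P>text</p>"));         // false: case mismatch
        System.out.println(isWellFormed("<p>text"));             // false: end tag omitted
        System.out.println(isWellFormed("<img src='a.png'>"));   // false: no trailing /
        System.out.println(isWellFormed("<img src='a.png'/>"));  // true
        System.out.println(isWellFormed("<a href=x/>"));         // false: unquoted value
        System.out.println(isWellFormed("<input checked/>"));    // false: attribute without value
    }
}
```

Each rejected string would be tolerated by most HTML parsers, which is the practical difference the list describes.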

1.1 Structure of an XML Document

<?xml version="1.0" encoding="UTF-8" ?> <!-- An XML document should start with a header -->

<!-- The header is usually followed by a document type definition (DTD) -->
<!DOCTYPE web-app PUBLIC
        "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
        "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">
<!-- XML body -->
<web-app>

</web-app>

    <title>
        <font>
            <name>Helvetica</name>
            <!-- XML elements can contain attributes -->
            <size unit="pt">36</size>
        </font>
    </title>
    <body>
        <font>
            <name>linda</name>
            <age>12</age>
        </font>
        <!-- The same data in simplified form, using attributes -->
        <font name="linda" age="12"/>
    </body>

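Elements and text are the "bread and butter" of XML documents. Here are a few other markup constructs you may encounter:
· Character references have the form &#decimalValue; or &#xhexValue;
· Entity references have the form &name;
· CDATA sections are delimited by <![CDATA[ and ]]>
· Processing instructions are instructions for applications that process XML documents; they are delimited by <? and ?>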
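To see how a parser treats these constructs, the sketch below parses a small string and returns the text of its root element. It assumes the JDK's DOM parser; the class MarkupDemo and the sample input are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class MarkupDemo {
    // Parses the XML text and returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // &#233; and &#xE9; are character references for the same character (e with acute accent).
        System.out.println(textOf("<msg>&#233;&#xE9;</msg>"));
        // &lt; is a predefined entity reference; inside CDATA, markup characters are plain text.
        System.out.println(textOf("<msg>&lt;<![CDATA[<b>]]></msg>")); // <<b>
    }
}
```

The references are resolved by the parser, so the application only ever sees the resulting characters.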

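2. Parsing XML Documents

Java provides two kinds of XML parsers:
· Tree parsers, such as the Document Object Model (DOM) parser, which read an XML document into a tree structure.
· Streaming parsers, such as the Simple API for XML (SAX) parser, which generate events as they read an XML document.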

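import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo01 {
    public static void main(String[] args) throws Exception {
        // 1. Get a DocumentBuilder from a DocumentBuilderFactory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // 2. Read a document from a file (the path is left empty here; supply your own XML file)
        File f = Paths.get("").toFile();
        Document doc1 = builder.parse(f);
        // An arbitrary input stream can be used as well
        try (InputStream in = new FileInputStream(f)) {
            Document doc2 = builder.parse(in);
        }

        // 3. Start analyzing the document by calling getDocumentElement, which returns the root element
        Element root = doc1.getDocumentElement();

        // getTagName returns the tag name
        String tagName = root.getTagName();

        // getChildNodes returns the children of root (elements, text, and other nodes)
        NodeList childNodes = root.getChildNodes();
        // Iterate over the children
        for (int i = 0; i < childNodes.getLength(); i++) {
            Node item = childNodes.item(i);
        }

        // Alternatively, start at getFirstChild and follow getNextSibling to the next sibling node
        for (Node child = root.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            System.out.println(child);
        }

        // To enumerate the attributes of a node, call getAttributes, which returns a NamedNodeMap
        NamedNodeMap attributes = root.getAttributes();
        // Iterate over the attributes
        for (int i = 0; i < attributes.getLength(); i++) {
            Node item = attributes.item(i);
            String nodeName = item.getNodeName();
            String nodeValue = item.getNodeValue();
            System.out.println("nodeName: " + nodeName + "---" + "nodeValue: " + nodeValue);
        }
        // If you know the attribute name, you can get its value directly
        root.getAttribute("unit");
    }
}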
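One detail worth illustrating: getChildNodes returns every child node, including the whitespace text between tags, so code that only wants child elements usually has to filter. The sketch below shows one way to do that, under the assumption that no DTD or schema is in use; the class ChildElements and its sample input are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class ChildElements {
    private static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts all child nodes of the root, whitespace text nodes included.
    static int childNodeCount(String xml) {
        return rootOf(xml).getChildNodes().getLength();
    }

    // Returns the tag names of the root's child elements, skipping text nodes.
    static List<String> childElementNames(String xml) {
        List<String> names = new ArrayList<>();
        NodeList children = rootOf(xml).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                names.add(((Element) child).getTagName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<font>\n  <name>Helvetica</name>\n  <size>36</size>\n</font>";
        System.out.println(childNodeCount(xml));    // 5: three whitespace text nodes, two elements
        System.out.println(childElementNames(xml)); // [name, size]
    }
}
```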

3. Validating XML Documents

3.1 XML Schema

<?xml version="1.0" encoding="UTF-8" ?>
<configuration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="demo04.xsd" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:simpleType name="StyleType">
    <xsd:restriction base="xsd:string">
        <xsd:enumeration value="PLAIN"/>
        <xsd:enumeration value="BOLD"/>
        <xsd:enumeration value="ITALIC"/>
        <xsd:enumeration value="BOLD_ITALIC"/>
    </xsd:restriction>
</xsd:simpleType>
</configuration>

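Parsing an XML file with a schema is similar to parsing one with a DTD, but there are three differences:
1. You must turn on namespace support, even if you don't use namespaces in your XML files:
factory.setNamespaceAware(true);
2. You must prepare the factory for handling schemas as follows: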

final String JAXP_SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";
factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);

3. The parser does not discard element content whitespace.
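As a self-contained illustration of schema validation, here is a sketch that uses the javax.xml.validation API rather than the factory attribute shown above (a different route to the same kind of check). The schema reuses the StyleType enumeration; the style element, the class StyleValidation, and the sample documents are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, used only for illustration.
class StyleValidation {
    // A small schema: a <style> element whose content must be one of four values.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:simpleType name='StyleType'>"
        + "<xsd:restriction base='xsd:string'>"
        + "<xsd:enumeration value='PLAIN'/>"
        + "<xsd:enumeration value='BOLD'/>"
        + "<xsd:enumeration value='ITALIC'/>"
        + "<xsd:enumeration value='BOLD_ITALIC'/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "<xsd:element name='style' type='StyleType'/>"
        + "</xsd:schema>";

    // Returns true if the document is valid against the schema above.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            // With no error handler installed, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<style>BOLD</style>"));      // true
        System.out.println(isValid("<style>UNDERLINE</style>")); // false: not in the enumeration
    }
}
```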

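4. Locating Information with XPath

If you want to locate a specific piece of information in an XML document, the XPath language makes it easy to access tree nodes.
For example: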

<configuration>
	...
    <database>
        <username>dbuser</username>
        <password>secret</password>
    </database>
</configuration>

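You can get the value of username in database by evaluating the XPath expression /configuration/database/username

· XPath can describe a set of nodes in an XML document. For example, /gridbag/row describes all row elements that are children of the gridbag root element.
· Use the [ ] operator to select a particular element. For example, /gridbag/row[1] is the first row (index values start at 1).
· Use the @ operator to get attribute values. For example, /gridbag/row[1]/cell[1]/@anchor describes the anchor attribute of the first cell in the first row.
· XPath has a number of functions. For example, count(/gridbag/row) returns the number of row children of the gridbag root element.
· There are many more elaborate XPath expressions; see the specification at http://www.w3.org/TR/xpath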

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class Demo03 {
    public static void main(String[] args) throws Exception {
        // 1. First create an XPath object from an XPathFactory
        XPathFactory xpFactory = XPathFactory.newInstance();
        XPath xPath = xpFactory.newXPath();

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        File file = new File("...");
        Document doc = builder.parse(file);

        // 2. Call evaluate to evaluate the XPath expression
        String evaluate = xPath.evaluate("/configuration/database/username", doc);
        System.out.println(evaluate);
    }
}
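The gridbag expressions from the list above can be tried against a small in-memory document. This is a sketch only: the gridbag content, the class XPathDemo, and its method names are invented for the example, and the XPath object evaluates directly against an InputSource instead of a pre-built DOM tree.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XPathDemo {
    static final String XML =
          "<gridbag>"
        + "<row><cell anchor='west'>A</cell><cell>B</cell></row>"
        + "<row><cell anchor='east'>C</cell></row>"
        + "</gridbag>";

    // Evaluates an expression and returns the result as a string.
    static String eval(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates a numeric expression such as count(...) and returns it as an int.
    static int evalNumber(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            Double result = (Double) xPath.evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
            return result.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/gridbag/row[1]/cell[1]/@anchor")); // west
        System.out.println(eval("/gridbag/row[2]/cell[1]"));         // C
        System.out.println(evalNumber("count(/gridbag/row)"));       // 2
    }
}
```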

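6. Streaming Parsers

The DOM parser reads an XML document in its entirety and converts it into a tree data structure. However, if the document is large and the processing algorithm is simple enough to analyze nodes on the fly without seeing the complete tree, DOM can be inefficient. In those cases we should use a streaming parser.

6.1 Using the SAX Parser

The SAX parser reports events as it parses the components of the XML input, but it does not store the document in any way; it is up to the event handler to build a data structure. In fact, the DOM parser is built on top of the SAX parser: it builds the DOM tree as it receives parser events.
When using the SAX parser, you need a handler that defines the actions for the various parser events.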

import java.io.InputStream;
import java.net.URL;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SAXTest {
    public static void main(String[] args) throws Exception {
        String url;
        if (args.length == 0) {
            url = "file://D:\\codes\\study\\src\\main\\java\\com\\java02\\day03\\html\\start.html";
            System.out.println("Using: " + url);
        } else {
            url = args[0];
        }

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        SAXParser saxParser = factory.newSAXParser();

        DefaultHandler handler = new DefaultHandler() {
            // Called whenever a start tag is encountered
            @Override
            public void startElement(String nameSpaceURI, String lname, String qname, Attributes attrs) {
                // Print the href attribute of every <a> element
                if ("a".equals(lname) && attrs != null) {
                    for (int i = 0; i < attrs.getLength(); i++) {
                        String aname = attrs.getLocalName(i);
                        if ("href".equals(aname)) {
                            System.out.println(attrs.getValue(i));
                        }
                    }
                }
            }
        };

        // Close the stream once parsing is finished
        try (InputStream in = new URL(url).openStream()) {
            saxParser.parse(in, handler);
        }
    }
}
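To make the event model more concrete, the following sketch records the order in which a SAX handler is called for a tiny document. It overrides three DefaultHandler callbacks (startElement, characters, endElement); the class SaxEventLog and the sample input are hypothetical. Note that a parser is allowed to deliver text in several characters calls, so real code should accumulate text rather than rely on a single call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for illustration.
class SaxEventLog {
    // Parses the XML text and returns the handler callbacks in the order they fired.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String lname, String qname, Attributes attrs) {
                log.add("start:" + qname);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text); // whitespace-only text is ignored here
                }
            }

            @Override
            public void endElement(String uri, String lname, String qname) {
                log.add("end:" + qname);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<a><b>hi</b></a>"));
    }
}
```

Nothing is stored unless the handler stores it, which is why SAX suits large inputs with simple processing.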

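6.2 Using the StAX Parser

The StAX parser is a "pull parser". Instead of installing an event handler, you simply iterate through all the events with a basic loop like this: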

    InputStream in = url.openStream();
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader reader = factory.createXMLStreamReader(in);
    while (reader.hasNext()){
        int next = reader.next();
        //...coding...
    }

To analyze attribute values, call the appropriate methods of the XMLStreamReader class. For example, String units = reader.getAttributeValue(null, "units"); gets the units attribute of the current element.
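Putting the loop and the attribute lookup together, a minimal sketch might look like this. The class StaxUnits, the size element, and the sample input are assumptions for the example; the attribute is read before getElementText because that call advances the reader to the element's end tag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, used only for illustration.
class StaxUnits {
    // Returns the text of every <size> element, followed by its units attribute if present.
    static List<String> sizes(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "size".equals(reader.getLocalName())) {
                    // Read the attribute first; getElementText moves past the start tag.
                    String units = reader.getAttributeValue(null, "units");
                    String text = reader.getElementText();
                    result.add(units == null ? text : text + units);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<font><name>Helvetica</name><size units='pt'>36</size></font>";
        System.out.println(sizes(xml)); // [36pt]
    }
}
```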

7. Generating XML Documents

7.1 Documents without Namespaces

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

public class Demo05 {
    public static void main(String[] args) throws Exception {
        // DocumentBuilderFactory -> DocumentBuilder -> Document
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.newDocument();

        String rootName = "font";
        String childName = "item";
        // Use the createElement method of Document to construct the elements of the document
        Element rootElement = doc.createElement(rootName);
        Element childElement = doc.createElement(childName);
        // Use createTextNode to construct a text node
        String textContents = "Hello";
        Text textNode = doc.createTextNode(textContents);
        // Use appendChild to add the root element to the document and children to their parents
        doc.appendChild(rootElement);
        rootElement.appendChild(childElement);
        childElement.appendChild(textNode);
        // Use setAttribute to set attributes
        rootElement.setAttribute("name", "Tina");
    }
}
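Building the tree does not write anything out. One common way to serialize a DOM tree is the identity transformation of the javax.xml.transform API; the sketch below builds a small document and returns it as a string. The class DomToString is hypothetical, and the exact formatting of the output may vary between XML implementations.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, used only for illustration.
class DomToString {
    // Builds <font name="Tina"><item>Hello</item></font> and serializes it.
    static String build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("font");
            root.setAttribute("name", "Tina");
            Element item = doc.createElement("item");
            item.appendChild(doc.createTextNode("Hello"));
            doc.appendChild(root);
            root.appendChild(item);

            // A Transformer created without a stylesheet copies its input to its output.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```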

7.2 Documents with Namespaces

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Demo06 {
    public static void main(String[] args) throws Exception {
        // Make the factory namespace-aware
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.newDocument();
        // Use createElementNS to create elements
        String namespace = "http://www.w3.org/2000/svg";
        Element rootElement = doc.createElementNS(namespace, "svg");
        // Set an attribute with a namespace
        rootElement.setAttributeNS(namespace, "name", "Tina");
    }
}
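The second argument of createElementNS is a qualified name, so it may carry a prefix. The sketch below shows how the namespace URI, prefix, and local name of such an element can be read back; the class NamespaceDemo and the svg: prefix are chosen only for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, used only for illustration.
class NamespaceDemo {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Creates an element in the SVG namespace and reports "namespaceURI|prefix|localName".
    static String describe(String qualifiedName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element element = doc.createElementNS(SVG_NS, qualifiedName);
            return element.getNamespaceURI() + "|" + element.getPrefix() + "|" + element.getLocalName();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("svg:svg")); // http://www.w3.org/2000/svg|svg|svg
        System.out.println(describe("svg"));     // no prefix, same namespace and local name
    }
}
```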

7.3 Writing XML Documents with StAX

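import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class Demo07 {
    public static void main(String[] args) throws Exception {
        // Construct an XMLStreamWriter from an OutputStream with an XMLOutputFactory
        OutputStream out = new FileOutputStream("/../..");
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        XMLStreamWriter writer = factory.createXMLStreamWriter(out);

        // Write the XML header
        writer.writeStartDocument();
        // Start an element
        writer.writeStartElement("font");
        // Add an attribute
        writer.writeAttribute("unit", "pt");
        // Write character data
        writer.writeCharacters("Hello World!");
        // The steps above can now be repeated to write the next element
        //...

        // Write an element without children (for example <img ... />)
        writer.writeEmptyElement("img");

        // After writing all child nodes, call writeEndElement to close the element
        writer.writeEndElement();
        // Finally, call writeEndDocument at the end of the document; it closes any open elements
        writer.writeEndDocument();

        // Closing the writer does not close the underlying stream, so close both
        writer.close();
        out.close();
    }
}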
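For a quick look at what the writer produces, the same calls can target a StringWriter. This is a sketch with a hypothetical class StaxWriteDemo; it also shows that writeCharacters escapes markup characters, and the exact form of the XML declaration may differ between implementations.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, used only for illustration.
class StaxWriteDemo {
    // Writes a small document to a string and returns it.
    static String write() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("font");
            writer.writeAttribute("unit", "pt");
            writer.writeCharacters("1 < 2");   // the < is escaped as &lt; on output
            writer.writeEmptyElement("img");
            writer.writeEndDocument();         // also closes the still-open font element
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```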

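8. XSL Transformations

The XSL Transformations (XSLT) mechanism lets you specify rules for transforming XML documents into other formats, such as plain text, XHTML, or any other XML format. XSLT is commonly used to translate one machine-readable XML format into another, or to translate XML into a presentation format for human consumption.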
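As an illustration of an XML-to-text transformation, here is a minimal sketch using the javax.xml.transform API. The stylesheet, the staff/employee sample data, and the class XsltDemo are all made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
class XsltDemo {
    // A stylesheet that turns each employee into "name=salary;" as plain text.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/staff'>"
        + "<xsl:for-each select='employee'>"
        + "<xsl:value-of select='name'/>=<xsl:value-of select='salary'/>;"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet above to the given XML text.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<staff>"
                + "<employee><name>Carl</name><salary>75000</salary></employee>"
                + "<employee><name>Harry</name><salary>50000</salary></employee>"
                + "</staff>";
        System.out.println(transform(xml)); // Carl=75000;Harry=50000;
    }
}
```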