(11) XML File Parsing and a Parsing Tool Implementation (Detailed Walkthrough)

HB0o0

Last modified 2023-04-05 16:59:45

First published 2023-04-05 16:17:17

Article link: https://blog.csdn.net/SwaggerHB/article/details/129969784

Preface

Personal blog: XML, the Extensible Markup Language

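Parsing XML Documents

When learning to program, we care most about processing XML with code. A program can generate, modify, extend, and delete XML documents and their data, and it can also read data out of an XML document; the latter is what parsing an XML document means.
Parsing an XML document means programmatically extracting specific tag content or attribute values from it and converting those values into objects of the corresponding classes. The rest of this article implements reading values from XML.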

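XML Parsers and the W3C

Java ships with dedicated XML parsers, and more than one kind is available. Here we use the DOM parser standardized by the W3C.

DOM (Document Object Model) is a "tree-style parser": it loads the whole document into memory as a tree of nodes. If you look closely at the two XML documents written in the previous post, you will see that XML tags stand in a clear one-to-many relationship to each other, and one-to-many relationships form a tree structure.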
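To make the tree structure concrete, here is a minimal sketch that parses a small XML string with the DOM parser and prints every element indented by its depth. The class name DomTreeWalk, the method render, and the inline sample document are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class: shows that a parsed XML document is a tree of nodes.
public class DomTreeWalk {

    // Parses the XML text and returns one line per element, indented two spaces per level.
    public static String render(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        StringBuilder out = new StringBuilder();
        walk(document.getDocumentElement(), 0, out);
        return out.toString();
    }

    // Depth-first traversal: record the element, then visit each child element.
    private static void walk(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getTagName()).append('\n');

        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Child nodes also include text and whitespace; only elements are followed here.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: one parent with several children (the "one-to-many" relationship).
        String xml = "<informations><information>"
                + "<hobby>a</hobby><hobby>b</hobby><introduce>c</introduce>"
                + "</information></informations>";
        System.out.print(render(xml));
    }
}
```

Each level of indentation in the printed result corresponds to one parent-child step in the tree.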

We will not dwell on the details of the XML parsing process here, nor rush to dig deeper.

First, here is the XML file from the previous post again.

<?xml version="1.0" encoding="UTF-8"?>
<informations>
	<information stuId="03207076" name="HB" password="123456" sex="male" birth="2002-10-20">
		<hobby id="0">Programming</hobby>
		<hobby id="1">Basketball</hobby>
		<introduce>A cheerful, good-looking guy</introduce>
	</information>
	<information stuId="2007011076" name="SQR" password="654321" sex="female" birth="2002-10-01">
		<hobby id="2">Playing the yangqin</hobby>
		<hobby id="3">Dancing</hobby>
		<introduce>A beautiful girl</introduce>
	</information>
</informations>

Below is a simple XML parsing program:

public class TestXmlParser {
    public static void main(String[] args) {
        try {
            // Load information.xml from the root of the classpath.
            InputStream is = TestXmlParser.class.getResourceAsStream("/information.xml");
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(is);

            // Each <information> element describes one student.
            NodeList informationList = document.getElementsByTagName("information");
            for (int index = 0; index < informationList.getLength(); index++) {
                Element information = (Element) informationList.item(index);
                String stuId = information.getAttribute("stuId");
                String name = information.getAttribute("name");
                String password = information.getAttribute("password");
                String sex = information.getAttribute("sex");
                String birth = information.getAttribute("birth");

                System.out.println(stuId + "," + name + "," + password + "," + sex + "," + birth);

                // Search below the current <information> only; calling
                // document.getElementsByTagName("hobby") would return every student's hobbies.
                NodeList hobbyList = information.getElementsByTagName("hobby");
                for (int i = 0; i < hobbyList.getLength(); i++) {
                    Element hobby = (Element) hobbyList.item(i);
                    String hobbyId = hobby.getAttribute("id");
                    String hobbyName = hobby.getTextContent();

                    System.out.println("\t" + hobbyId + ":" + hobbyName);
                }

                // Each student has exactly one <introduce> element.
                Element introduceElement = (Element) information.getElementsByTagName("introduce").item(0);
                String introduce = introduceElement.getTextContent();
                System.out.println("\t" + introduce);
            }
        } catch (ParserConfigurationException | IOException | SAXException e) {
            throw new RuntimeException(e);
        }
    }
}

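Note: Document, Element, and NodeList must be imported from the W3C packages (org.w3c.dom). The full set of imports for the program above is:

import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;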

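After a successful parse, the program prints one line of attributes for each student, followed by indented lines for the hobbies and the introduction.

The Hierarchical Structure of XML

With these observations in mind, can the XML parsing process be turned into a tool?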

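Turning XML Parsing into a Tool

My teacher, Mr. Zhu, has stressed many times that the "tool-building" mindset matters in programming. XML documents and the structured data they store can make Java programming much more convenient, so parsing XML documents is something we will do often.

If every XML parsing task required code nearly identical to the program above, that would be a waste of effort. So can the parsing process be turned into a tool that keeps this code as short as possible, letting us focus on the problem we actually want to solve rather than on writing parsing code?
This is, in fact, typical object-oriented design thinking.

There are at least four ways to parse XML; only one of them is implemented here.

Analyzing What to Turn into a Tool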

InputStream is = TestXmlParser.class.getResourceAsStream("/information.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(is);

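Of the four lines above, the second and third can be treated as a fixed routine. As the last line shows, their only purpose is to obtain a DocumentBuilder object, and a single such object is enough no matter how many different XML documents are parsed. In the XML parsing class it can therefore be a static member, created as soon as the class is loaded and before any object of the parsing class exists.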

public class XMLParser {
    private static DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
}

The code above does not compile: newDocumentBuilder() declares the checked exception ParserConfigurationException, which is not handled. An improved version:

public class XMLParser {
    private static DocumentBuilder db;

    static {
        try {
            db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }
}

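Initializer Blocks and Static Initializer Blocks

An "initializer block" is a block of Java code enclosed in braces inside a class body, outside any method.
If the block is preceded by the static keyword, it is a static initializer block.
An instance initializer block runs every time an object of the class is instantiated.
A static initializer block (static block) runs once, when the class is first initialized (for example, the first time an object of the class is created or a static member is accessed), and never again.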
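The difference between the two kinds of block is easy to observe. The following minimal sketch records the order in which a static block, an instance block, and a constructor run; the class name InitBlockDemo and the LOG list are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records when each kind of initializer block runs.
public class InitBlockDemo {
    // Collects the execution order of the blocks below.
    static final List<String> LOG = new ArrayList<>();

    // Static initializer block: runs once, when the class is initialized.
    static {
        LOG.add("static block");
    }

    // Instance initializer block: runs on every instantiation, before the constructor body.
    {
        LOG.add("instance block");
    }

    public InitBlockDemo() {
        LOG.add("constructor");
    }

    public static void main(String[] args) {
        new InitBlockDemo();
        new InitBlockDemo();
        System.out.println(LOG);
    }
}
```

Run as written, main creates two objects, so "instance block" and "constructor" are logged twice while "static block" is logged only once, at the very start.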

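Because a static block runs only once, exactly one DocumentBuilder object is created; this is the idea behind the so-called "singleton" pattern. Once we have the DocumentBuilder object, we still need to call parse(InputStream) to "open" the XML document to be processed and obtain a Document object. This step needs the XML document path supplied by the user, so it should be a method:

public Document openXml(String xmlPath) throws SAXException, IOException {
    // Resolve the path against the classpath of the parser class itself.
    InputStream is = XMLParser.class.getResourceAsStream(xmlPath);
    return db.parse(is);
}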

At this point, our XML parsing tool can already obtain a Document object.
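As a rough check that one shared DocumentBuilder really is enough for several documents, the sketch below builds it once in a static block and reuses it for two parses. The class name SharedBuilderDemo and the method rootTagOf are hypothetical, and the XML comes from in-memory strings rather than the classpath so that the example runs without any resource file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class: one DocumentBuilder, created once, parses several documents.
public class SharedBuilderDemo {
    private static final DocumentBuilder BUILDER;

    static {
        try {
            BUILDER = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            // Fail fast instead of leaving the builder null.
            throw new ExceptionInInitializerError(e);
        }
    }

    // Parses the XML text with the shared builder and returns the root tag name.
    // Note: JAXP does not guarantee that a DocumentBuilder is thread-safe,
    // so this kind of sharing is meant for single-threaded use.
    public static String rootTagOf(String xml) throws SAXException, IOException {
        Document document = BUILDER.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return document.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootTagOf("<informations/>"));
        System.out.println(rootTagOf("<students><student/></students>"));
    }
}
```

Throwing from the static block is a design choice of this sketch: if the builder cannot be created, the class fails to load immediately rather than causing a NullPointerException at the first parse.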

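Analysis of the "Fixed" and "Variable" Parts

Look closely at the earlier parsing process:

NodeList informationList = document.getElementsByTagName("information");
for (int index = 0; index < informationList.getLength(); index++) {
    Element information = (Element) informationList.item(index);
    // Parse the attributes and content of information.
    NodeList hobbyList = information.getElementsByTagName("hobby");
    for (int i = 0; i < hobbyList.getLength(); i++) {
        Element hobby = (Element) hobbyList.item(i);
        // Parse the attributes and content of hobby.
    }
}

The code above is the core of XML document parsing. Lines 1 and 5 both fetch the list of nodes for a given tag, but the document object and the information object have different types, Document and Element respectively, so two methods with different parameter types are needed.
Notice also that each call to getElementsByTagName() uses a different tag name, which should be supplied by the user, so the parsing method needs a String parameter for it: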

public void parse(Document document, String tagname) {
    NodeList nodeList = document.getElementsByTagName(tagname);
    for (int index = 0; index < nodeList.getLength(); index++) {
        Element element = (Element) nodeList.item(index);
        // TODO: what goes here?
    }
}

public void parse(Element element, String tagname) {
    NodeList nodeList = element.getElementsByTagName(tagname);
    for (int index = 0; index < nodeList.getLength(); index++) {
        Element ele = (Element) nodeList.item(index);
        // TODO: what goes here?
    }
}
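The two overloads differ in search scope: getElementsByTagName() on a Document searches the whole document, while the same call on an Element searches only that element's descendants. The following sketch checks this on a small inline document; the class name ScopedLookupDemo, its helper methods, and the sample XML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: compares document-wide and element-scoped tag lookups.
public class ScopedLookupDemo {
    // Assumed sample: two students with two hobbies each.
    public static final String XML = "<informations>"
            + "<information name=\"HB\"><hobby>a</hobby><hobby>b</hobby></information>"
            + "<information name=\"SQR\"><hobby>c</hobby><hobby>d</hobby></information>"
            + "</informations>";

    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Counts <hobby> elements anywhere in the document.
    public static int hobbiesInDocument(Document document) {
        return document.getElementsByTagName("hobby").getLength();
    }

    // Counts <hobby> elements below the index-th <information> element only.
    public static int hobbiesInInformation(Document document, int index) {
        Element information =
                (Element) document.getElementsByTagName("information").item(index);
        return information.getElementsByTagName("hobby").getLength();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(XML);
        System.out.println("whole document: " + hobbiesInDocument(document));
        System.out.println("first information: " + hobbiesInInformation(document, 0));
    }
}
```

With this sample, the document-wide lookup finds four hobby elements and the scoped lookup finds two, which is why the inner loop of a nested parse should start from the Element rather than from the Document.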

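So what belongs at the // TODO in the two parse methods? This is the "fixed" versus "variable" question mentioned above.

Specifically, the bodies of Document openXml(String xmlPath) and of the two overloads void parse(Document document, String tagname) and void parse(Element element, String tagname) are the "fixed" part.

The TODO part is decided by the user. As the toolmakers, we do not need to care about it, nor do we have the right to decide how the user processes the content.

This is where an abstract method solves the problem.

Making the XML Parsing Class Abstract

Below is the complete code of the XML parsing class: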

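import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public abstract class XMLParser {
    // One shared builder, created once when the class is initialized.
    private static DocumentBuilder documentBuilder;

    static {
        try {
            // newDefaultInstance() requires Java 9 or later.
            documentBuilder = DocumentBuilderFactory.newDefaultInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
    }

    public XMLParser() {
    }

    // The "variable" part: the user decides what to do with each matching element.
    public abstract void dealElement(Element element, int index);

    // Opens an XML document located on the classpath, e.g. "/information.xml".
    public static Document getDocument(String xmlPath) throws Exception {
        try (InputStream is = XMLParser.class.getResourceAsStream(xmlPath)) {
            if (is == null) {
                throw new Exception("XML file [" + xmlPath + "] does not exist!");
            }
            return XMLParser.documentBuilder.parse(is);
        }
    }

    // Visits every element named targetName in the whole document.
    public void parse(Document document, String targetName) {
        NodeList nodeList = document.getElementsByTagName(targetName);
        for (int index = 0; index < nodeList.getLength(); index++) {
            Element element = (Element) nodeList.item(index);
            dealElement(element, index);
        }
    }

    // Visits every element named targetName below the given element.
    public void parse(Element element, String targetName) {
        NodeList nodeList = element.getElementsByTagName(targetName);
        for (int index = 0; index < nodeList.getLength(); index++) {
            Element ele = (Element) nodeList.item(index);
            dealElement(ele, index);
        }
    }
}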

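This gives us our new tool, an XML file parser. Next, package it and export it as a jar.

Import the jar into the project where the XML file was created earlier, and test it:

Write the test class Test: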

import org.w3c.dom.Element;
// Also import XMLParser from whichever package the jar declares it in.

public class Test {
    public static void main(String[] args) throws Exception {
        // Outer parser: handles each <information> element in the document.
        new XMLParser() {

            @Override
            public void dealElement(Element element, int index) {
                String stuId = element.getAttribute("stuId");
                String name = element.getAttribute("name");
                String password = element.getAttribute("password");
                String sex = element.getAttribute("sex");
                String birth = element.getAttribute("birth");

                System.out.println(stuId + "," + name + "," + password + "," + sex + "," + birth);

                // Inner parser: handles each <hobby> below the current <information>.
                new XMLParser() {

                    @Override
                    public void dealElement(Element hobby, int hobbyIndex) {
                        String hobbyId = hobby.getAttribute("id");
                        String hobbyName = hobby.getTextContent();

                        System.out.println("\t" + hobbyId + ":" + hobbyName);
                    }
                }.parse(element, "hobby");
            }
        }.parse(XMLParser.getDocument("/information.xml"), "information");
    }
}
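The test above only prints what it reads, whereas the stated goal of parsing is to turn the values into objects. Below is a minimal sketch of that last step. It assumes a hypothetical Student class and uses a condensed stand-in called MiniParser in place of XMLParser, purely so that the example compiles on its own; StudentLoader, load, and SAMPLE are likewise illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class: collects parsed values into objects instead of printing them.
public class StudentLoader {

    // Condensed stand-in for the article's XMLParser (assumption, not the real tool).
    abstract static class MiniParser {
        abstract void dealElement(Element element, int index);

        void parse(Element parent, String tagName) {
            NodeList nodes = parent.getElementsByTagName(tagName);
            for (int i = 0; i < nodes.getLength(); i++) {
                dealElement((Element) nodes.item(i), i);
            }
        }
    }

    // Hypothetical value class holding a subset of the attributes.
    public static class Student {
        public final String stuId;
        public final String name;
        public final List<String> hobbies = new ArrayList<>();

        Student(String stuId, String name) {
            this.stuId = stuId;
            this.name = name;
        }
    }

    // Assumed sample data, modelled on information.xml.
    public static final String SAMPLE = "<informations>"
            + "<information stuId=\"03207076\" name=\"HB\">"
            + "<hobby id=\"0\">Programming</hobby><hobby id=\"1\">Basketball</hobby>"
            + "</information>"
            + "<information stuId=\"2007011076\" name=\"SQR\">"
            + "<hobby id=\"2\">Yangqin</hobby><hobby id=\"3\">Dancing</hobby>"
            + "</information>"
            + "</informations>";

    // Builds one Student per <information> element, with its own hobbies attached.
    public static List<Student> load(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<Student> students = new ArrayList<>();

        new MiniParser() {
            @Override
            void dealElement(Element information, int index) {
                Student student = new Student(
                        information.getAttribute("stuId"), information.getAttribute("name"));
                // Nested parse scoped to this student's element.
                new MiniParser() {
                    @Override
                    void dealElement(Element hobby, int hobbyIndex) {
                        student.hobbies.add(hobby.getTextContent());
                    }
                }.parse(information, "hobby");
                students.add(student);
            }
        }.parse(document.getDocumentElement(), "information");

        return students;
    }

    public static void main(String[] args) throws Exception {
        for (Student student : load(SAMPLE)) {
            System.out.println(student.stuId + "," + student.name + " " + student.hobbies);
        }
    }
}
```

The structure mirrors the Test class: the outer handler creates the object, the inner handler fills in its hobbies, and the tool itself stays unaware of what a Student is.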