XML File Parsing

木头有木有

Category: Parsing. Tags: xml, dom, java

Original link: https://www.cnblogs.com/longqingyang/p/5577937.html

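XML parsing

A recent project of mine involved parsing XML messages, so I read up quite a bit on parsing XML files. Here is a summary!

There are four common ways to parse XML in Java; this post covers only the first three:

1. DOM parsing;

2. SAX parsing;

3. DOM4J parsing;

4. JDOM parsing.

bookstore.xml (the content is arbitrary and not important)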

<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
    <book id="1">
        <name>三国演义</name>
        <author>罗贯中</author>
        <year>1998</year>
        <price>59</price>
    </book>
    <book id="2">
        <name>水浒传</name>
        <author>施耐庵</author>
        <year>1997</year>
        <price>46</price>
    </book>
    <book id="3">
        <name>西游记</name>
        <author>吴承恩</author>
        <year>2013</year>
        <price>59</price>
    </book>
    <book id="4">
        <name>红楼梦</name>
        <author>曹雪芹</author>
        <year>1996</year>
        <price>56</price>
    </book>
</bookstore>

Book.java

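public class Book {

    private String id;
    private String name;
    private String author;
    private String year;
    private double price;
    // Used by the SAX example below through getLanguage()/setLanguage();
    // the sample bookstore.xml has no <language> element, so it stays null there.
    private String language;

    // Generate the getter/setter methods yourself (for example with your IDE)

}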
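
Since the parsing code later in this post calls accessors such as getAuthor() and setPrice(), it may help to see the bean written out in full. The following is only a sketch of what an IDE would generate. The `language` property is an assumption: it is not in the field list of the original class, but the SAX handler in this post calls setLanguage(). The class is left package-private here so the snippet compiles from any file name.

```java
// Sketch of the Book bean with its accessors written out by hand.
class Book {

    private String id;
    private String name;
    private String author;
    private String year;
    private double price;
    private String language; // assumed property, see the note above

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }

    public String getYear() { return year; }
    public void setYear(String year) { this.year = year; }

    public double getPrice() { return price; }
    public void setPrice(double price) { this.price = price; }

    public String getLanguage() { return language; }
    public void setLanguage(String language) { this.language = language; }

    @Override
    public String toString() {
        return "Book{id=" + id + ", name=" + name + ", author=" + author
                + ", year=" + year + ", price=" + price + ", language=" + language + "}";
    }
}
```

Fields that never get set keep their defaults, so `language` is null and `price` is 0.0 for a freshly created Book.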

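1. DOM parsing:

Full name: Document Object Model.

A DOM-based XML parser converts an XML document into an object model (the DOM tree).

Pros:
- The tree structure is easy to understand and to write code against;
- The whole tree stays in memory during parsing, so it is easy to modify.
Cons:
- Reading a file uses a lot of memory, because the entire document is loaded at once;
- Large XML files can easily cause an out-of-memory error, so DOM is not recommended for them.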

    // Imports (at the top of the enclosing class file)
    import java.io.File;
    import java.io.IOException;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    // Parse XML with DOM. path: location of your XML file
    public static void domXmlParse(String path) throws ParserConfigurationException, IOException, SAXException {
        // Create the factory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Create the builder
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parse the file at the given path into an in-memory DOM tree
        Document document = builder.parse(new File(path));
        // Get all book nodes
        NodeList bookList = document.getElementsByTagName("book");
        // Print the number of book nodes
        System.out.println("There are " + bookList.getLength() + " books in total");

        for (int i = 0; i < bookList.getLength(); i++) {
            Node book = bookList.item(i); // one book node per iteration
            NamedNodeMap attributes = book.getAttributes(); // attributes of the book node
            System.out.println("Book " + (i + 1) + " has " + attributes.getLength() + " attribute(s)");
            for (int j = 0; j < attributes.getLength(); j++) { // walk the attributes of the book node
                Node node = attributes.item(j);
                System.out.println("Attribute name: " + node.getNodeName());
                System.out.println("Attribute value: " + node.getNodeValue());
            }
            // Child nodes include whitespace text nodes, not only elements
            NodeList childNodes = book.getChildNodes();
            System.out.println("Book " + (i + 1) + " has " + childNodes.getLength() + " child node(s)");
            for (int j = 0; j < childNodes.getLength(); j++) {
                // Skip text nodes and keep only element nodes
                if (childNodes.item(j).getNodeType() == Node.ELEMENT_NODE) {
                    // Name of the element node
                    System.out.println("Name of node " + (j + 1) + ": " + childNodes.item(j).getNodeName());
                    // getTextContent() also works for empty elements, where getFirstChild() would be null
                    System.out.println("--Node value: " + childNodes.item(j).getTextContent());
                }
            }
            System.out.println("-----------------Finished book " + (i + 1));
        }
    }
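
One advantage listed above is that the in-memory tree is easy to modify. Here is a minimal sketch of that idea, assuming the same bookstore layout. The class name DomEditSketch and the helper setPrice are made up for illustration, and the XML is passed in as a string rather than a file so the example is self-contained. It uses only the JDK's javax.xml.parsers and javax.xml.transform APIs.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: edits a DOM tree in memory and serializes it again.
class DomEditSketch {

    // Sets the <price> of the book whose id attribute equals bookId
    // and returns the whole document as a string.
    static String setPrice(String xml, String bookId, String newPrice) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        NodeList books = document.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            if (bookId.equals(book.getAttribute("id"))) {
                NodeList prices = book.getElementsByTagName("price");
                if (prices.getLength() > 0) {
                    // The tree is already in memory, so a change is a single call
                    prices.item(0).setTextContent(newPrice);
                }
            }
        }

        // Write the modified tree back out without the XML declaration
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<bookstore><book id=\"1\"><name>A</name><price>59</price></book></bookstore>";
        System.out.println(setPrice(xml, "1", "45"));
    }
}
```

With SAX the same change would mean re-emitting the whole document while reading it, which is why DOM tends to be the more convenient choice for small files that need editing.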

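2. SAX parsing:

Full name: Simple API for XML.

Unlike DOM, it reads the document sequentially and never builds a tree, which makes it a fast way to read XML data. It is a read-only API.

Pros:
- It is event-driven: as the SAX parser works through the XML document it fires a series of events and calls the matching handler methods, and the application accesses the document through those handler methods;
- It suits cases where you only need to read the data in an XML file.
Cons:
- The code is more complex and harder to write;
- It is hard to access different parts of the XML file at the same time.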
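
To make the event-driven model above concrete, here is a small sketch that records the callbacks in the order the parser fires them. The class name SaxEventTraceSketch and the trace helper are hypothetical names chosen for illustration, and the XML is parsed from a string so the example needs no file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: lists SAX events in the order they arrive.
class SaxEventTraceSketch {

    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Record only non-blank text; whitespace between tags also arrives here
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<book id=\"1\"><name>A</name></book>").forEach(System.out::println);
    }
}
```

For the input in main, the recorded order is startDocument, start:book, start:name, text:A, end:name, end:book, endDocument. The parser never goes back, so any state you need later (such as the current book) has to be kept by the handler itself.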

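    // Imports (at the top of the enclosing class file)
    import java.io.File;
    import java.io.IOException;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.SAXException;

    // Parse XML with SAX
    public static void saxXmlParse(String path) {
        // Get the factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Get the parser
            SAXParser parser = factory.newSAXParser();
            // Custom handler class, listed below as SAXParserHandler.java
            SAXParserHandler handler = new SAXParserHandler();
            // Start parsing; the handler collects the books as events arrive
            parser.parse(new File(path), handler);
            System.out.println("There are " + handler.getBookList().size() + " books in total");
            for (Book book : handler.getBookList()) {
                System.out.println(book.getAuthor());
                System.out.println(book.getId());
                // Book must expose getLanguage(); it is null for the sample file,
                // which has no <language> element
                System.out.println(book.getLanguage());
                System.out.println(book.getName());
                System.out.println(book.getPrice());
                System.out.println(book.getYear());
                System.out.println("----finish-----");
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }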

SAXParserHandler.java

import java.util.ArrayList;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserHandler extends DefaultHandler {

    // Text collected for the element currently being read
    private final StringBuilder text = new StringBuilder();
    Book book = null;
    private ArrayList<Book> bookList = new ArrayList<>();

    public ArrayList<Book> getBookList() {
        return bookList;
    }

    int bookIndex = 0;

    // Marks the start of parsing
    @Override
    public void startDocument() throws SAXException {
        super.startDocument();
        System.out.println("SAX parsing started-----------");
    }

    // Marks the end of parsing
    @Override
    public void endDocument() throws SAXException {
        super.endDocument();
        System.out.println("SAX parsing finished-----------");
    }

    // Called at every opening tag
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        super.startElement(uri, localName, qName, attributes);
        // Discard whitespace collected since the previous tag
        text.setLength(0);
        if ("book".equals(qName)) {
            bookIndex++;
            // Create a new Book object
            book = new Book();
            // Read the attributes of the book element
            System.out.println("Start of a book---------------");
            int num = attributes.getLength();
            for (int i = 0; i < num; i++) {
                System.out.println("Name of attribute " + (i + 1) + " of the book element: " + attributes.getQName(i));
                System.out.println("Attribute value: " + attributes.getValue(i));
                if ("id".equals(attributes.getQName(i))) {
                    book.setId(attributes.getValue(i));
                }
            }
        } else if (!"name".equals(qName) && !"bookstore".equals(qName)) {
            System.out.println("Node name: " + qName + "---------");
        }
    }

    // Called at every closing tag; the element's text is complete at this point
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName);
        String value = text.toString().trim();
        text.setLength(0);

        if ("book".equals(qName)) {
            bookList.add(book);
            book = null;
            System.out.println("End of a book----------------");
        } else if (book == null) {
            // Element outside any <book> (for example </bookstore>): nothing to store
        } else if ("name".equals(qName)) {
            book.setName(value);
        } else if ("author".equals(qName)) {
            book.setAuthor(value);
        } else if ("year".equals(qName)) {
            book.setYear(value);
        } else if ("price".equals(qName)) {
            book.setPrice(Double.parseDouble(value));
        } else if ("language".equals(qName)) {
            book.setLanguage(value);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        super.characters(ch, start, length);
        // The parser may deliver one element's text in several chunks, so append instead of overwriting
        text.append(ch, start, length);
        String chunk = new String(ch, start, length);
        if (!chunk.trim().isEmpty()) {
            System.out.println("Node value: " + chunk);
        }
    }
}

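3. DOM4J parsing:

Features:

1. It started as an intelligent fork of JDOM and incorporates many features beyond basic XML document representation;

2. It makes heavy use of interfaces and abstract base classes;

3. It offers excellent performance, good flexibility, powerful features, and is very easy to use;

4. It is an open-source library.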

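    // Imports (at the top of the enclosing class file); requires the dom4j library on the classpath.
    // org.dom4j.Document is written out in full below because org.w3c.dom.Document
    // is already imported by the DOM example.
    import java.io.File;
    import java.util.Iterator;
    import java.util.List;
    import org.dom4j.Attribute;
    import org.dom4j.DocumentException;
    import org.dom4j.Element;
    import org.dom4j.io.SAXReader;

    // Parse XML with dom4j
    public static void dom4jXmlParse(String path) {
        SAXReader reader = new SAXReader();
        try {
            org.dom4j.Document document = reader.read(new File(path));
            // Root element, <bookstore>
            Element bookStore = document.getRootElement();
            // Iterate over the child elements of the root, one per book
            Iterator<Element> it = bookStore.elementIterator();
            while (it.hasNext()) {
                System.out.println("--------Start of a book----------");
                Element book = it.next();
                List<Attribute> bookAttrs = book.attributes();
                for (Attribute attr : bookAttrs) {
                    System.out.println("Attribute name: " + attr.getName() + ", attribute value: " + attr.getValue());
                }
                // Iterate over the child elements of the book (name, author, ...)
                Iterator<Element> itt = book.elementIterator();
                while (itt.hasNext()) {
                    Element bookChild = itt.next();
                    System.out.println("Node name: " + bookChild.getName() + ", node value: " + bookChild.getStringValue());
                }
                System.out.println("--------End of a book----------");
            }
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }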