Core Java study notes: reading XML (DOM, SAX, StAX, and JAXB)

There are generally three ways to read XML (DOM, SAX, StAX); JAXB is a special case and is covered afterward. Each approach is introduced below through code.

1. Reading with the org.w3c.dom.Document interface

With the W3C DOM approach, the call to parse builds the entire document tree in memory, so it is mainly suited to relatively small documents. The code:

    public void testDomXml() throws Exception {
        /**
         * For DTDs, see the tutorial at
         *  http://www.xmlfiles.com/dtd/
         *
         * Minimal example:
         *  DTD source: http://cs.au.dk/~amoeller/XML/schemas/dtd-example.html
         *  XML source: http://cs.au.dk/~amoeller/XML/xml/example.html
         */
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
//      factory.setIgnoringElementContentWhitespace(true);

        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException exception) throws SAXException {
                exception.printStackTrace();
            }

            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                exception.printStackTrace();
            }
            @Override
            public void error(SAXParseException exception) throws SAXException {
                exception.printStackTrace();
            }
        });

        Document doc = builder.parse(getClass().getResourceAsStream("1.xml"));
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName());
        System.out.println(root.getAttribute("id"));

        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i ++) {
            Node node = children.item(i);
            if (node instanceof Element) {
                System.out.println(node.getNodeName());
                // The first child may be an element or a comment, so check before casting.
                Node first = node.getFirstChild();
                if (first instanceof Text) {
                    System.out.println(((Text) first).getData());
                }
                NamedNodeMap attrs = node.getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Attr attr = (Attr)attrs.item(j);
                    System.out.println(attr.getName());
                    System.out.println(attr.getValue());
                }
            }
        }
    }
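
The child-by-child walk above can be shortened when you only need elements with a known name. The following is a minimal sketch, not the author's code: the class name DomTextDemo, the method readBooks, and the sample document with its library/book elements are all made up for illustration. It uses getElementsByTagName and getTextContent, which avoids casting the first child to Text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomTextDemo {
    // Hypothetical sample document; the element names are assumptions for illustration.
    static final String XML =
        "<library id=\"42\"><book lang=\"en\">Core Java</book>"
      + "<book lang=\"zh\">XML Notes</book></library>";

    /** Parses the XML into a DOM tree and returns "lang=title" for every book element. */
    static List<String> readBooks(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // getElementsByTagName returns matching descendants in document order.
        NodeList books = doc.getElementsByTagName("book");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            // getTextContent concatenates all text beneath the element.
            result.add(book.getAttribute("lang") + "=" + book.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readBooks(XML));
    }
}
```

Because the whole tree is already in memory, the lookup can be repeated with different tag names without parsing again.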
  • Side note: when working with the W3C DOM, you can use XPath to look up the nodes at a given path. Code first:
public void testXPath() throws Exception {
        /**
         * Minimal XPath example.
         * Going deeper: using QName.
         *
         * [Summary of XPath usage in Java with code samples] (https://my.oschina.net/cloudcoder/blog/223359)
         */
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        dbf.setIgnoringElementContentWhitespace(true);

        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(getClass().getResourceAsStream("xpath.xml"));

        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xpath = xPathFactory.newXPath();
        long calories = ((Double)xpath.evaluate("//collection/recipe/nutrition/@calories", doc, XPathConstants.NUMBER)).longValue();

        System.out.println(calories);
    }
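
The example above asks for XPathConstants.NUMBER, which converts only the first matching node when the path selects several. As a hedged sketch of the other common return type, the code below evaluates a path as XPathConstants.NODESET and also uses the XPath sum() function. The class name XPathNodeSetDemo and the sample recipe data are assumptions; the real xpath.xml is not shown in these notes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNodeSetDemo {
    // Hypothetical data shaped like the collection/recipe/nutrition path used above.
    static final String XML =
        "<collection>"
      + "<recipe><title>Soup</title><nutrition calories=\"120\"/></recipe>"
      + "<recipe><title>Cake</title><nutrition calories=\"450\"/></recipe>"
      + "</collection>";

    /** Returns the text of every title element, in document order. */
    static List<String> titles(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            "/collection/recipe/title",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    /** Adds up all calories attributes with the XPath 1.0 sum() function. */
    static double totalCalories(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Double) xpath.evaluate(
            "sum(//nutrition/@calories)",
            new InputSource(new StringReader(xml)),
            XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titles(XML));
        System.out.println(totalCalories(XML));
    }
}
```

Passing an InputSource lets XPath parse the document itself; when a Document is already available, as in the example above, pass that instead to avoid parsing twice.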

2. Reading with SAX

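SAX reads the XML file as a stream and does not build a tree from the whole content; you implement your own DefaultHandler to process the events the way you need (search online for further details).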

public void testSax() throws Exception {
        /**
         * A good SAX example: http://blog.csdn.net/linminqin/article/details/6456476
         */
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        List<Book> books = new ArrayList<Book>();
        sp.parse(new InputSource(new InputStreamReader(getClass().getResourceAsStream("2.xml"), "UTF-8")), new DefaultHandler(){
            private String currentQname;
            private Book book;
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes)
                    throws SAXException {
                if (qName.equals("书")) {
                    book = new Book();
                }
                currentQname = qName;
            }
            @Override
            public void characters(char[] ch, int start, int length) throws SAXException {
                if ("书名".equals(currentQname)) {
                    book.name = new String(ch, start, length);
                }
                if ("作者".equals(currentQname)) {
                    book.author = new String(ch, start, length);
                }
                if ("售价".equals(currentQname)) {
                    book.price = new String(ch, start, length);
                }
            }
            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                if (qName.equals("书")) {
                    books.add(book);
                    book = null;
                }
                currentQname = null;
            }
        });
        for (Book b : books) {
            System.out.println(b);
        }
    }
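
One caveat about the handler above: a SAX parser is allowed to deliver the text of a single element in several characters() calls, so assigning the field inside characters() can keep only the last chunk. A common way around this, sketched below under assumptions, is to collect text in a StringBuilder and read it in endElement. The class name SaxBufferDemo and the English element names (books, book, name, price) are hypothetical stand-ins for the contents of 2.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxBufferDemo {
    // Hypothetical sample; the entity reference often makes parsers split the text.
    static final String XML =
        "<books><book><name>Core Java</name><price>59.00</price></book>"
      + "<book><name>Tom &amp; Jerry</name><price>12.50</price></book></books>";

    /** Returns one "name:price" string per book element. */
    static List<String> parse(String xml) throws Exception {
        final List<String> rows = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String name;
            private String price;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may run several times per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    name = text.toString().trim();
                } else if ("price".equals(qName)) {
                    price = text.toString().trim();
                } else if ("book".equals(qName)) {
                    rows.add(name + ":" + price);
                }
                text.setLength(0); // discard whitespace between elements
            }
        });
        return rows;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse(XML));
    }
}
```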
  • Side note 1. Given a raw byte stream, a SAX parser detects the encoding from the byte order mark or the XML declaration and assumes UTF-8 when neither is present. If the file is stored in an encoding it does not declare, the text comes out garbled; the fix is to decode the stream yourself and hand the parser an InputSource built on a Reader. The testSax example above already does this with new InputSource(new InputStreamReader(getClass().getResourceAsStream("2.xml"), "UTF-8")); substitute the file's actual charset name for "UTF-8".
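  • Side note 2. Writing XML is simply the reverse of reading and is very easy to implement. Note that the code below uses the StAX XMLStreamWriter rather than SAX: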
public void testWriteXml() throws Exception {
        XMLOutputFactory xof = XMLOutputFactory.newFactory();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        XMLStreamWriter xsr = xof.createXMLStreamWriter(bos);
        xsr.writeStartDocument("UTF-8", "1.0");
        xsr.writeStartElement("A");
        xsr.writeCharacters("http://localhost:8080/");
        xsr.writeEndElement();
        xsr.writeEndDocument();
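        xsr.flush();  // push any output buffered by the writer into bos
        xsr.close();  // releases the writer; the underlying stream is left open
        System.out.println(new String(bos.toByteArray(), "UTF-8"));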
    }
<?xml version="1.0" encoding="UTF-8"?><A>http://localhost:8080/</A>
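
As a small extension of the writer example, the sketch below adds an attribute and text containing a reserved character, to show that XMLStreamWriter escapes character data for you. The class name StaxWriteDemo, the method write, and the book element are hypothetical; writing to a StringWriter is an assumption made to keep the example free of byte-to-string conversion.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class StaxWriteDemo {
    /** Writes a single book element with an id attribute and returns the XML text. */
    static String write(String id, String title) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        writer.writeStartDocument("1.0");
        writer.writeStartElement("book");
        writer.writeAttribute("id", id);   // attributes must follow the start element directly
        writer.writeCharacters(title);     // reserved characters such as & are escaped
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("1", "Tom & Jerry"));
    }
}
```

The exact form of the XML declaration can differ between StAX implementations, so the sketch does not rely on it.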

3. Parsing with StAX

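The book's author writes that this approach is easier to use than SAX, but I did not find it so. Here is the code that implements the same functionality as the SAX example (to my eye it is noticeably uglier):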

public void testStAx() throws Exception {
        XMLInputFactory xif = XMLInputFactory.newFactory();
        XMLStreamReader xsr = xif.createXMLStreamReader(getClass().getResourceAsStream("2.xml"));
        List<Book> books = new ArrayList<Book>();
        String cLocalName = null;
        for (Book book = null; xsr.hasNext(); ) {
            int status = xsr.next();
            if (status == XMLStreamConstants.START_ELEMENT) {
                String lName = xsr.getName().getLocalPart();
                if ("书".equals(lName)) {
                    book = new Book();
                }
                cLocalName = lName;
            } else if (status == XMLStreamConstants.CHARACTERS) {
                if ("书名".equals(cLocalName)) {
                    book.name = xsr.getText();
                }
                if ("作者".equals(cLocalName)) {
                    book.author = xsr.getText();
                }
                if ("售价".equals(cLocalName)) {
                    book.price = xsr.getText();
                }
            } else if (status == XMLStreamConstants.END_ELEMENT) {
                if ("书".equals(xsr.getName().getLocalPart())) {
                    books.add(book);
                    book = null;
                }
                cLocalName = null;
            }
        }
        for (Book b : books) {
            System.out.println(b);
        }
    }
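
One StAX feature the loop above does not use is XMLStreamReader.getElementText(), which reads the full text of a text-only element in one call and leaves the reader on that element's end tag. The following sketch shows how that removes the need to track the current element name; whether it reads better than the SAX version is a matter of taste. The class name StaxPullDemo and the English element names are assumptions standing in for the contents of 2.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxPullDemo {
    // Hypothetical sample document used only for this sketch.
    static final String XML =
        "<books><book><name>Core Java</name><price>59.00</price></book>"
      + "<book><name>Tom &amp; Jerry</name><price>12.50</price></book></books>";

    /** Returns one "name:price" string per book element. */
    static List<String> parse(String xml) throws Exception {
        XMLStreamReader reader =
            XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        List<String> rows = new ArrayList<>();
        String name = null;
        String price = null;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String local = reader.getLocalName();
                    if ("name".equals(local)) {
                        name = reader.getElementText();   // consumes up to </name>
                    } else if ("price".equals(local)) {
                        price = reader.getElementText();  // consumes up to </price>
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    rows.add(name + ":" + price);
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse(XML));
    }
}
```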

4. Reading and writing XML with JAXB

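If your scenario allows JAXB, you are in luck: the code is very simple and self-explanatory. Note that from Java 11 onward JAXB is no longer bundled with the JDK and has to be added as a separate dependency.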

public class JaxbTest {

    @XmlRootElement
    private static class Customer {
        String name;
        int age;
        int id;

        @XmlElement
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        @XmlElement
        public int getAge() {
            return age;
        }
        public void setAge(int age) {
            this.age = age;
        }
        @XmlAttribute
        public int getId() {
            return id;
        }
        public void setId(int id) {
            this.id = id;
        }

        @Override
        public String toString() {
            return "Customer[id=" + id + ",name=" + name + ",age=" + age + "]";
        }
    }

    @Test
    public void marshal() throws Exception {
        Customer c = new Customer();
        c.name = "kk";
        c.id = 100;
        c.age = 10;

        File file = new File("temp\\jaxb.xml");
        JAXBContext context = JAXBContext.newInstance(Customer.class);
        Marshaller marshaller = context.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.marshal(c, file);
        marshaller.marshal(c, System.out);
    }

    @Test
    public void unmarshal() throws Exception {
        JAXBContext jaxbContext = JAXBContext.newInstance(Customer.class);
        Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
        Customer cc = (Customer)unmarshaller.unmarshal(new FileReader(new File("temp\\jaxb.xml")));
        System.out.println(cc);
    }
}

Summary

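  1. W3C DOM and SAX both feel quite good: each has its own focus and both are convenient to use. StAX did not show any advantage over SAX (at least I have not seen one so far);
  2. XPath has a clear advantage when reading DOM nodes; for more, search for JsoupXpath;
  3. For scenarios that both read and write XML (especially when the XML is generated from objects), JAXB is the recommended choice: annotations take care of it easily.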
