Choosing between DOM, SAX, and StAX

desmond_assis


There are several approaches to parsing an XML source, and you should select the one that fits your needs.

You may choose one of these:

DOM - Document Object Model
SAX - Simple API for XML
StAX - Streaming API for XML

Let's discuss each one.

Parsing with DOM

If you prefer this technique, you should know that the whole XML document is loaded into memory. The advantage of this technique is that you can navigate to and read any node. You can append, delete, or update a child node because the data is available in memory. However, if the XML contains a large amount of data, loading it into memory is very expensive. Also, the whole XML is loaded into memory even if you are looking for only one particular thing.

You should consider this technique when you need to alter the XML structure and you are sure that memory consumption will not be a problem. It is also the only one of the three that lets you navigate to parent and child elements, which makes it easier to use.

If you are creating an XML document (one that is not big), you should use this technique. However, if you are going to export data from a database to XML (where you do not need navigation in the XML and/or the data is huge), you should consider the other approaches.

The DOM API is standardized by the W3C.
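
To make the navigate/append/delete/update points concrete, here is a minimal sketch using the DOM parser that ships with the JDK (JAXP). The class name DomSketch, the method editRecipe, and the sample recipe document are made up for this illustration; they are not from the original article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomSketch {
    // Hypothetical sample document, invented for this example.
    static final String XML =
        "<recipe><ingredient name=\"flour\" amount=\"2\"/>"
      + "<ingredient name=\"sugar\" amount=\"1\"/></recipe>";

    /** Parses the XML into a tree, edits the tree in memory, and serializes it back. */
    static String editRecipe(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();

        // Navigate down to the second <ingredient>, then back up to its parent.
        Element sugar = (Element) root.getElementsByTagName("ingredient").item(1);
        Node parent = sugar.getParentNode(); // this is the <recipe> root again

        // Delete a child node.
        parent.removeChild(sugar);

        // Append a new child node.
        Element salt = doc.createElement("ingredient");
        salt.setAttribute("name", "salt");
        salt.setAttribute("amount", "0.5");
        root.appendChild(salt);

        // Update an existing node (the first child is the flour element,
        // because the sample contains no whitespace between tags).
        ((Element) root.getFirstChild()).setAttribute("amount", "3");

        // Serialize the modified tree back to a string.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(editRecipe(XML));
    }
}
```

All of these edits work only because the entire document sits in memory as a tree, which is exactly the cost described above.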

Parsing with SAX

SAX takes a totally different approach. It reads the XML document from beginning to end, but it does not store anything in memory. Instead it fires events, and you add event handlers depending on your requirements.

Your event handler is called, for example, when an element begins or ends, or when processing of the document begins or ends.

So you register a handler (or more than one handler), and those handlers are called when an event occurs.

Here is sample code from another site that calculates the total amount of flour in an XML document.

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import org.apache.xerces.parsers.SAXParser; // requires Xerces on the classpath

public class Flour extends DefaultHandler {
    // Running total of the 'amount' attributes of all flour ingredients
    float amount = 0;

    // Called by the parser each time an opening tag is read
    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        if ("http://recipes.org".equals(namespaceURI)
                && "ingredient".equals(localName)) {
            String n = atts.getValue("", "name");
            if ("flour".equals(n)) { // null-safe: 'name' may be missing
                String a = atts.getValue("", "amount"); // assume 'amount' exists
                amount = amount + Float.parseFloat(a);
            }
        }
    }

    public static void main(String[] args) {
        Flour f = new Flour();
        SAXParser p = new SAXParser();
        p.setContentHandler(f); // register the handler
        try {
            p.parse(args[0]); // args[0] is the path or URI of the XML document
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println(f.amount);
    }
}

With SAX, first of all, you do not need to worry about memory consumption. If performance is the main criterion (and you are only reading the XML, not modifying it), SAX is a much better choice than DOM. However, you do not get a tree structure in which you can query parent or child elements, so you have to keep track of where you are in the document yourself.
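
One common way to "know where you are" with SAX is to keep a stack of the elements that are currently open. The sketch below uses the JDK's built-in JAXP SAX parser; the class name PathTrackingHandler, the helper pathsOf, and the sample input are assumptions made for this illustration only.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PathTrackingHandler extends DefaultHandler {
    // Names of the elements that have been opened but not yet closed.
    private final Deque<String> openElements = new ArrayDeque<>();
    // Every element path seen so far, in document order.
    final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.addLast(qName);
        // The stack contents, outermost first, describe the current position.
        paths.add("/" + String.join("/", openElements));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();
    }

    /** Parses the given XML text and returns the path of every element. */
    static List<String> pathsOf(String xml) throws Exception {
        PathTrackingHandler handler = new PathTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paths;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input; prints [/a, /a/b, /a/b/c, /a/d]
        System.out.println(pathsOf("<a><b><c/></b><d/></a>"));
    }
}
```

The handler only remembers the chain of ancestors, so memory use grows with the nesting depth of the document and not with its size.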

Parsing with StAX

StAX is a newer technology than the others we discussed, and it is the only one of the three that was specified through a JSR (JSR-173).

Parsing with StAX looks like parsing with SAX. Again, StAX does not store the document in memory, and the document is read once from beginning to end.

However, with SAX your event handler is called by the parser when an event occurs (push). With StAX, your code asks the parser to continue to the next event (pull).

You can use StAX in two ways: the "cursor model" and the "iterator model".

Here is a simple code fragment I found on Google. The "cursor model" looks like this:

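import java.io.InputStream;
import java.net.URL;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class CursorExample {
    public static void main(String[] args) throws Exception {
        URL u = new URL("http://www.cafeconleche.org/");
        InputStream in = u.openStream();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader parser = factory.createXMLStreamReader(in);
        while (true) {
            int event = parser.next(); // ask the parser for the next event
            if (event == XMLStreamConstants.END_DOCUMENT) {
                parser.close();
                break;
            }
            if (event == XMLStreamConstants.START_ELEMENT) {
                System.out.println(parser.getLocalName()); // print each element name
            }
        }
        in.close(); // parser.close() does not close the underlying stream
    }
}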

As you can see above, the next event is requested by us (parser.next()). In the "iterator model" the logic is the same, but while iterating you receive an object that contains information about the current event, like this:

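import java.io.FileInputStream;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

public class IteratorExample {
    public static void main(String[] args) throws Exception {
        XMLEventReader eventReader = XMLInputFactory.newInstance()
                .createXMLEventReader(new FileInputStream("abc.xml"));
        while (eventReader.hasNext()) {
            // nextEvent() returns an XMLEvent; next() returns a plain Object
            XMLEvent event = eventReader.nextEvent();
            // Print the text that directly follows a start tag, if there is any.
            // peek() avoids a ClassCastException when a child element comes first.
            if (event.isStartElement() && eventReader.hasNext()
                    && eventReader.peek().isCharacters()) {
                System.out.println(eventReader.nextEvent().asCharacters().getData());
            }
        }
        eventReader.close();
    }
}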

These are APIs; each of them also has several implementations.

After choosing your API, you can choose an implementation of it.

Summary

DOM: tree-based
- loads the whole XML into memory; you can navigate to and read any node, and you can append, update, and delete child nodes
- can generate XML
- if the XML contains a large amount of data, loading it into memory is very expensive
SAX: event-based (push model, observer design pattern)
- reads the XML from beginning to end without storing it in memory, so memory consumption is not a concern
- does not need to parse the whole XML; can stop as soon as a condition is met
- can only read XML, cannot modify data
- cannot access other nodes in the document
StAX: event-based (pull model, iterator design pattern)
- reads the XML from beginning to end without storing it in memory, so memory consumption is not a concern
- does not need to parse the whole XML; can stop as soon as a condition is met
- can be more efficient than SAX, since the application pulls only the events it asks for
- can generate XML
- cannot modify an existing document in place
- cannot access other nodes in the document
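
Two of the StAX points in the summary, generating XML and stopping as soon as a condition is met, can be sketched with the JDK's javax.xml.stream API. The class name StaxSketch, the methods write and amountOf, and the ingredient data are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class StaxSketch {
    /** Streams each map entry out as an element; no tree is built in memory. */
    static String write(Map<String, String> amounts) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartElement("recipe");
        for (Map.Entry<String, String> e : amounts.entrySet()) {
            w.writeEmptyElement("ingredient");
            w.writeAttribute("name", e.getKey());
            w.writeAttribute("amount", e.getValue());
        }
        w.writeEndElement();
        w.flush();
        w.close();
        return out.toString();
    }

    /** Pulls events until the wanted ingredient is found, then stops reading. */
    static String amountOf(String xml, String name) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && "ingredient".equals(r.getLocalName())
                        && name.equals(r.getAttributeValue(null, "name"))) {
                    return r.getAttributeValue(null, "amount"); // the rest is never parsed
                }
            }
            return null; // no such ingredient
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> amounts = new LinkedHashMap<>();
        amounts.put("flour", "2");
        amounts.put("sugar", "1");
        String xml = write(amounts);
        System.out.println(xml);
        System.out.println(amountOf(xml, "flour"));
    }
}
```

Because the writer emits elements one at a time, the same pattern would suit a large export such as database rows, where a DOM tree would be too expensive to hold in memory.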

from: http://blog.sanaulla.info/2013/05/23/parsing-xml-using-dom-sax-and-stax-parser-in-java
