Choosing between DOM, SAX, or StAX

There are several approaches to parsing an XML source, and you should select the one that fits your needs.

You may choose one of these:

  • DOM - Document Object Model
  • SAX - Simple API for XML
  • StAX - Streaming API for XML

Let's discuss each one.

Parsing with DOM

If you prefer this technique, you should know that the whole XML document is loaded into memory. The advantage is that you can navigate to and read any node, and you can append, delete, or update child nodes because the data is available in memory. However, if the XML contains a large amount of data, loading it into memory is very expensive. The whole document is also loaded even when you are only looking for one particular piece of it.


You should consider this technique when you need to alter the XML structure and you are sure that memory consumption will not be a problem. It is also the only one of the three that lets you navigate to parent and child elements, which makes it easier to use.


If you are creating an XML document (one that is not big), you should use this technique. However, if you are going to export data from a database to XML (where you do not need to navigate the XML and/or the data is huge), then you should consider other approaches.

The DOM API is standardized by the W3C.
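
To make the DOM trade-off concrete, here is a minimal sketch using the DocumentBuilder that ships with the JDK. The class name DomSketch, the recipe XML, and the element and attribute names are made up for illustration. It shows random access, navigation to parent and sibling nodes, and in-place edits, all of which work only because the whole tree is held in memory.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomSketch {
    // Hypothetical input, small enough to hold in memory comfortably.
    static final String XML =
        "<recipe><ingredient name=\"flour\" amount=\"2\"/>"
      + "<ingredient name=\"sugar\" amount=\"1\"/></recipe>";

    // Reads the whole document into an in-memory tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Updates, appends, and deletes nodes directly on the tree.
    static void edit(Document doc) {
        Element root = doc.getDocumentElement();

        // Random access: jump straight to the second ingredient.
        Element sugar = (Element) root.getElementsByTagName("ingredient").item(1);

        // Navigate upwards and sideways from any node.
        Node parent = sugar.getParentNode();
        Node previous = sugar.getPreviousSibling();
        System.out.println("parent: " + parent.getNodeName());
        System.out.println("previous sibling: " + ((Element) previous).getAttribute("name"));

        sugar.setAttribute("amount", "3");          // update
        Element salt = doc.createElement("ingredient");
        salt.setAttribute("name", "salt");
        root.appendChild(salt);                     // append
        root.removeChild(previous);                 // delete
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        edit(doc);
        System.out.println("ingredients left: "
            + doc.getDocumentElement().getElementsByTagName("ingredient").getLength());
    }
}
```

None of these steps has a direct equivalent in SAX or StAX, where each node is gone once the parser has moved past it.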


Parsing with SAX

SAX takes a totally different approach. It reads the XML document from beginning to end, but it does not store anything in memory. Instead it fires events, and you supply event handlers according to your requirements.


Your event handler is called, for example, when an element begins or ends, or when processing of the document begins or ends.

So you register a handler (or more than one), and those handlers are called when an event occurs.

Here is some sample code from a site that calculates the total amount of flour in the XML.

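import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Uses the SAX parser bundled with the JDK (JAXP) instead of a direct Xerces dependency.
public class Flour extends DefaultHandler {
    float amount = 0;

    // Called by the parser each time it reads an element's start tag.
    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        if ("http://recipes.org".equals(namespaceURI)
                && "ingredient".equals(localName)) {
            String n = atts.getValue("", "name");
            if ("flour".equals(n)) { // null-safe: 'name' may be missing
                String a = atts.getValue("", "amount"); // assume 'amount' exists
                amount = amount + Float.parseFloat(a);
            }
        }
    }

    public static void main(String[] args) {
        Flour f = new Flour();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespaceURI and localName
            SAXParser p = factory.newSAXParser();
            p.parse(args[0], f); // args[0] is the file or URI to parse
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println(f.amount);
    }
}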

With SAX, first of all, you do not need to worry about memory consumption. If performance is the criterion (and you are only reading the XML, not modifying it), SAX is a much better choice than DOM. However, you do not get a tree structure in which you can ask for parent or child elements, so you have to keep track of where you are in the document yourself.
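
Because SAX gives you no tree, knowing where you are usually means recording the context yourself. One common way, sketched below, is to keep a stack of the currently open element names inside the handler. The class name PathHandler, the helper pathsOf, and the sample XML are hypothetical and only serve the illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PathHandler extends DefaultHandler {
    // Names of the elements that are currently open, outermost first.
    private final Deque<String> open = new ArrayDeque<>();
    // One path per start tag, in document order.
    final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        open.addLast(qName);
        paths.add("/" + String.join("/", open));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.removeLast(); // we have left this element
    }

    // Parses the given XML text and returns the path of every element.
    static List<String> pathsOf(String xml) {
        PathHandler handler = new PathHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.paths;
    }

    public static void main(String[] args) {
        pathsOf("<recipe><ingredient/><step><note/></step></recipe>")
            .forEach(System.out::println);
    }
}
```

The stack only ever holds the ancestors of the current element, so memory use grows with nesting depth rather than with document size.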

Parsing with StAX

StAX is a newer technology than the others we discussed, and it is the only one with a JSR (JSR-173).

Parsing with StAX looks like parsing with SAX. Again, StAX does not store anything in memory, and the document is read once from beginning to end.

However, with SAX your event handler is called by the parser when an event occurs (push). With StAX, your own code asks the parser to continue to the next event (pull).

You can use StAX in two ways: the "cursor model" and the "iterator model".

Here is a simple code fragment I found on Google. The "cursor model" looks like:

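import java.io.InputStream;
import java.net.URL;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// The class name is arbitrary; it only makes the fragment compilable.
public class CursorExample {
    public static void main(String[] args) throws Exception {
        URL u = new URL("http://www.cafeconleche.org/");
        try (InputStream in = u.openStream()) {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader parser = factory.createXMLStreamReader(in);
            while (true) {
                // We pull the next event from the parser ourselves.
                int event = parser.next();
                if (event == XMLStreamConstants.END_DOCUMENT) {
                    parser.close();
                    break;
                }
                if (event == XMLStreamConstants.START_ELEMENT) {
                    System.out.println(parser.getLocalName());
                }
            }
        }
    }
}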

As you can see above, we request the next event ourselves (parser.next()). In the "iterator model" the logic is the same, but each step of the iteration returns an object that contains information about the current event, like:

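import java.io.FileInputStream;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

// The class name is arbitrary; it only makes the fragment compilable.
public class IteratorExample {
    public static void main(String[] args) throws Exception {
        XMLEventReader eventReader = XMLInputFactory.newInstance().createXMLEventReader(
                new FileInputStream("abc.xml"));
        while (eventReader.hasNext()) {
            // nextEvent() returns a typed XMLEvent; next() only returns Object.
            XMLEvent event = eventReader.nextEvent();
            if (event.isStartElement()) {
                // Print the text right after the start tag, if there is any.
                // peek() avoids a ClassCastException when a child element follows.
                XMLEvent following = eventReader.peek();
                if (following != null && following.isCharacters()) {
                    System.out.println(following.asCharacters().getData());
                }
            }
        }
        eventReader.close();
    }
}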

Those were the technologies (the APIs). Each of them also has several implementations.

After choosing your technology, you can choose an implementation.
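
In Java, the factory classes look up an implementation at run time, so one way to see which implementation you are actually getting is to print the concrete class names. This is a minimal sketch; the class name ImplementationCheck is made up, and the names it prints depend on your JDK and on which parser libraries are on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;

public class ImplementationCheck {
    // Returns the concrete factory class chosen for DOM, SAX, and StAX, in that order.
    static String[] implementations() {
        return new String[] {
            DocumentBuilderFactory.newInstance().getClass().getName(),
            SAXParserFactory.newInstance().getClass().getName(),
            XMLInputFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        String[] names = implementations();
        System.out.println("DOM:  " + names[0]);
        System.out.println("SAX:  " + names[1]);
        System.out.println("StAX: " + names[2]);
    }
}
```

Adding another parser library to the classpath can change what this prints without any change to your parsing code, because the code only depends on the API.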

Summary

  • DOM: tree-based
    • loads the whole XML into memory; you can navigate to and read any node, and you can append, update, and delete child nodes
    • can generate XML
    • if the XML contains a large amount of data, loading it into memory is very expensive
  • SAX: event-based (push model, observer design pattern)
    • reads the XML from beginning to end without storing anything in memory, so you do not need to worry about memory consumption
    • does not need to parse the whole XML; you can stop as soon as your conditions are met
    • read-only: cannot modify the document
    • cannot access other nodes in the document
  • StAX: event-based (pull model, iterator design pattern)
    • reads the XML from beginning to end without storing anything in memory, so you do not need to worry about memory consumption
    • does not need to parse the whole XML; you can stop as soon as your conditions are met
    • can be more efficient than SAX, because your code decides which events to pull
    • can generate XML
    • cannot modify an existing document in place
    • cannot access other nodes in the document
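
Two of the StAX points above, generating XML and stopping early, can be sketched with the standard javax.xml.stream API. The class name StaxSketch, both method names, and the recipe data are assumptions made for this illustration, not part of the samples quoted earlier.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class StaxSketch {
    // Generates a small document with the StAX writer API.
    static String write() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("recipe");
            for (String name : new String[] {"flour", "sugar", "salt"}) {
                w.writeEmptyElement("ingredient");
                w.writeAttribute("name", name);
            }
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    // Pulls events until the first <ingredient> is seen, then stops reading.
    static String firstIngredient(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    if (r.next() == XMLStreamConstants.START_ELEMENT
                            && "ingredient".equals(r.getLocalName())) {
                        return r.getAttributeValue(null, "name"); // rest is never parsed
                    }
                }
                return null; // no <ingredient> in the document
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = write();
        System.out.println(xml);
        System.out.println(firstIngredient(xml));
    }
}
```

With the pull model, stopping early is just a return from the loop. With SAX, the usual way to stop is to throw an exception from the handler.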

from: http://blog.sanaulla.info/2013/05/23/parsing-xml-using-dom-sax-and-stax-parser-in-java
