First, the differences between DOM and SAX
Structure
DOM: Tree-based
SAX: Event-Based
Processing
DOM: Batch. Reads the entire doc first. The application gets control only after parsing completes and a tree has been built in memory
SAX: Streaming. Fires event callbacks while the doc is being read. The application keeps full control during parsing
Ease of use
DOM: Complex and more difficult to use, but more flexible (Spec: 500 pages)
SAX: Easy to use, but does little beyond reporting parse events
CRUD
DOM: All
SAX: Read Only
Resources
DOM: Memory and processor heavy
SAX: Less memory, fast and efficient
Typical use
DOM: Non-sequential processing and updating of structure (or read doc multiple times)
SAX: Scanning large documents
About DOM: I referred to an introductory article on JR (JavaResearch)
http://www.javaresearch.org/article/showarticle.jsp?column=5&thread=37929
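By the way, I only now realized how widely the Factory Pattern is used: DOM parsing has DocumentBuilderFactory (in JAXP; dom4j has DocumentFactory), and Hibernate has SessionFactory. As for the finer details, it is not too late to learn them when I actually need them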
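To make the DOM column of the comparison concrete, here is a minimal sketch using the JAXP DocumentBuilderFactory. The class name DomSketch, its helper methods, and the sample XML are my own illustration, not something taken from the JR article. It parses the whole document into a tree, then updates the tree and reads it in non-sequential order, which is what SAX cannot do.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomSketch {
    // Hypothetical sample data, not from the original article.
    static final String XML =
        "<books><book><title>A</title></book><book><title>B</title></book></books>";

    // Reads the entire document and returns the in-memory tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Update: appends a new <book><title>...</title></book> under the root.
    static void addBook(Document doc, String title) {
        Element book = doc.createElement("book");
        Element titleElement = doc.createElement("title");
        titleElement.setTextContent(title);
        book.appendChild(titleElement);
        doc.getDocumentElement().appendChild(book);
    }

    static int countBooks(Document doc) {
        return doc.getElementsByTagName("book").getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        addBook(doc, "C");
        NodeList titles = doc.getElementsByTagName("title");
        // Non-sequential access: read the last title first, then the first one.
        System.out.println(titles.item(titles.getLength() - 1).getTextContent());
        System.out.println(titles.item(0).getTextContent());
        System.out.println(countBooks(doc));
    }
}
```

The factory hides which parser implementation is actually used, which is the same role SessionFactory plays for Hibernate sessions.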
About SAX
It breaks down into three parts: Parser Creation, Content Handler, Error Handler
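import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

private void init() {
    try {
        // Parser Creation: get a non-validating SAX parser from the factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        javax.xml.parsers.SAXParser saxParser = factory.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        // Content Handler: receives the document events
        SAXHandler xmlHndlr = new SAXHandler();
        xmlReader.setContentHandler(xmlHndlr);
        // Error Handler: receives warnings, errors and fatal errors
        SAXErrorHandler errHndlr = new SAXErrorHandler(System.err);
        xmlReader.setErrorHandler(errHndlr);
        // The callbacks fire while this call is reading the document
        xmlReader.parse("books.xml");
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}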
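SAXHandler extends DefaultHandler, which provides many callback methods with empty default implementations; you override the ones you need, for example: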
startDocument()
endDocument()
startElement(String uri, String localName, String qualifiedName, Attributes atts)
endElement(String uri, String localName, String qualifiedName)
characters(char ch[], int start, int length)
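As a rough illustration of how these callbacks fit together, here is a minimal sketch of a content handler that collects the text of every title element. The class name TitleHandler, the readTitles helper, and the sample XML are assumptions of mine; the SAXHandler in my own code is not shown here and may look different.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inTitle = false;

    @Override
    public void startElement(String uri, String localName,
                             String qualifiedName, Attributes atts) {
        // The default factory is not namespace-aware, so match on the qualified name.
        if ("title".equals(qualifiedName)) {
            inTitle = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so append.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qualifiedName) {
        if ("title".equals(qualifiedName)) {
            titles.add(text.toString());
            inTitle = false;
        }
    }

    // Hypothetical helper: parses an in-memory XML string and returns the titles.
    static List<String> readTitles(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        TitleHandler handler = new TitleHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readTitles(
            "<books><book><title>A</title></book><book><title>B</title></book></books>"));
    }
}
```

Note that the handler has to keep its own state (the inTitle flag and the text buffer), because SAX only reports events and remembers nothing about the document for you.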
ErrorHandler is an interface, so SAXErrorHandler has to implement its methods (error, fatalError and warning), for example:
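// Recoverable error: abort parsing by rethrowing with location info
public void error(SAXParseException exception) throws SAXException {
    throw new SAXException("Error: " + getExceptionInfo(exception));
}
// Builds a "where and what" message from the parse exception
private String getExceptionInfo(SAXParseException exception) {
    return "URL= " + exception.getSystemId()
        + ", Line= " + exception.getLineNumber() + ": " + exception.getMessage();
}
// Fatal error (for example, a document that is not well-formed)
public void fatalError(SAXParseException exception) throws SAXException {
    throw new SAXException("Fatal: " + getExceptionInfo(exception));
}
// Warning: just log it to the stream given to the constructor and keep parsing
public void warning(SAXParseException exception) throws SAXException {
    out.println("Warning: " + exception.getMessage());
}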
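Put together as a whole class, the error handler might look like the sketch below. I am assuming here that out is a PrintStream field set by the constructor (that matches the new SAXErrorHandler(System.err) call earlier, but it is an assumption), and the SAXParseException in main is built by hand purely for demonstration.

```java
import java.io.PrintStream;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class SAXErrorHandler implements ErrorHandler {
    // Assumption: warnings go to the PrintStream passed to the constructor.
    private final PrintStream out;

    SAXErrorHandler(PrintStream out) {
        this.out = out;
    }

    private String getExceptionInfo(SAXParseException exception) {
        return "URL= " + exception.getSystemId()
            + ", Line= " + exception.getLineNumber() + ": " + exception.getMessage();
    }

    @Override
    public void warning(SAXParseException exception) throws SAXException {
        // Warnings are only logged; parsing continues.
        out.println("Warning: " + exception.getMessage());
    }

    @Override
    public void error(SAXParseException exception) throws SAXException {
        // Throwing here makes XMLReader.parse(...) stop and propagate the problem.
        throw new SAXException("Error: " + getExceptionInfo(exception));
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXException {
        throw new SAXException("Fatal: " + getExceptionInfo(exception));
    }

    public static void main(String[] args) {
        SAXErrorHandler handler = new SAXErrorHandler(System.err);
        // Hand-made exception for illustration: message, publicId, systemId, line, column.
        SAXParseException problem =
            new SAXParseException("unexpected end tag", null, "books.xml", 3, 7);
        try {
            handler.warning(problem);
            handler.error(problem);
        } catch (SAXException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Whether error should throw or just log is a design choice: throwing stops the parse at the first problem, while logging lets the parser carry on and report more of them.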
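SAX does not load the whole document into memory; it processes the document in an event-driven way instead, so it is generally faster and uses less memory than DOM. In terms of readability, however, SAX code is not as clear and simple as DOM code. So when the document is not particularly large, DOM is still the more suitable choice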