Java offers many technologies for processing XML. They fall into two main families: DOM and SAX.
The differences are as follows:
Table 1: SAX and DOM features
SAX | DOM |
Event based model | Tree data structure |
Serial access (flow of events) | Random access (in-memory data structure) |
Low memory usage (only events are generated) | High memory usage (the document is loaded into memory) |
To process parts of the document (catching relevant events) | To edit the document (processing the in-memory data structure) |
To process the document only once (transient flow of events) | To process multiple times (document loaded in memory) |
Source: Java Technology and XML-Part 3: Performance Improvement Tips
The table is quoted from that article in its original English.
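To make the table concrete, here is a minimal sketch that reads the same small document once with DOM and once with SAX, using only the JAXP classes bundled with the JDK. The class name `SaxVsDomDemo`, the sample XML, and the method names are invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVsDomDemo {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
        "<books><book title=\"A\"/><book title=\"B\"/></books>";

    // DOM: the whole document is loaded into an in-memory tree,
    // which can then be queried (or edited) in any order.
    static List<String> titlesWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList books = doc.getElementsByTagName("book");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                titles.add(((Element) books.item(i)).getAttribute("title"));
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // SAX: the parser pushes events to a handler as it reads;
    // nothing is retained unless the handler stores it.
    static List<String> titlesWithSax(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        if ("book".equals(qName)) {
                            titles.add(attrs.getValue("title"));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println("DOM: " + titlesWithDom(XML));
        System.out.println("SAX: " + titlesWithSax(XML));
    }
}
```

Both methods return the same titles. The difference is in what stays in memory: the DOM version holds the full tree, while the SAX version keeps only the list it builds itself.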
The basic XML processing technologies, most of them W3C standards, are:
- SAX, the Simple API for XML
- DOM, the Document Object Model API from W3C
- XSLT, the Extensible Stylesheet Language Transformations from W3C
- XPath, the XML Path Language from W3C
- XQuery, the XML Query Language from W3C
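Of the technologies in this list, XSLT is perhaps the least obvious to call from Java. The following is a minimal sketch using the `javax.xml.transform` API bundled with the JDK. The class name `XsltDemo`, the sample document, and the stylesheet are invented for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // Hypothetical input document and stylesheet, for illustration only.
    static final String XML = "<greeting>hello</greeting>";
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/greeting\">"
      + "<xsl:value-of select=\"concat(., ', world')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet and applies it to the input document,
    // returning the transformation result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```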
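Building on these XML processing technologies, Java itself and third parties have developed many frameworks and APIs to improve Java's XML handling.
The ones most people know are probably JDOM, dom4j, JAXP, JAXB, SAX, and StAX.
Less familiar are Xerces, Crimson, and Xalan (probably because efficiency and compatibility problems led people to gradually stop using them directly, although the JDK's bundled JAXP implementation is itself derived from Xerces and Xalan).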
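For the differences between the DOM, push, and pull models, see:
New Features in Java 6.0: StAX, a Full Analysis of Java XML Parsing Technologies
For a simple comparison of DOM, SAX, and StAX, see:
Geronimo renegade: Using integrated packages: Codehaus' Woodstox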
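As a rough illustration of the pull model, here is a minimal StAX sketch using the `javax.xml.stream` API bundled with the JDK since Java 6. The class name `StaxPullDemo` and the sample document are invented for this illustration. Unlike SAX, where the parser calls the handler, here the application asks the reader for the next event whenever it is ready.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxPullDemo {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
        "<books><book title=\"A\"/><book title=\"B\"/></books>";

    // Pulls events one at a time and collects the title attribute
    // of every <book> start tag.
    static List<String> titles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // the application drives the parser
                if (event == XMLStreamConstants.START_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    // A null namespace means "do not check the namespace".
                    titles.add(reader.getAttributeValue(null, "title"));
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(titles(XML));
    }
}
```

Woodstox implements this same `javax.xml.stream` API, so code written this way should not need to change if Woodstox is the StAX provider on the classpath.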
For a performance comparison of JAXB, SAX, and DOM, see:
For a comparison of JAXB and Woodstox, see:
XML unmarshalling benchmark: JAXB vs STax vs Woodstox
A simple Java XPath example:
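The following is a minimal sketch using the `javax.xml.xpath` API bundled with the JDK. The class name `XPathDemo`, the helper method names, and the sample document are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathDemo {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
        "<books>"
      + "<book title=\"A\"><price>10</price></book>"
      + "<book title=\"B\"><price>25</price></book>"
      + "</books>";

    // Evaluates an XPath expression and returns the result as a string.
    static String evalString(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Evaluates an XPath expression and returns the result as a number.
    static double evalNumber(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Double) xpath.evaluate(expression,
                new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Title of the book that costs more than 20.
        System.out.println(evalString(XML, "/books/book[price > 20]/@title"));
        // Sum of all prices.
        System.out.println(evalNumber(XML, "sum(/books/book/price)"));
    }
}
```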
Codehaus built a Java library that implements StAX, named Woodstox.
Website: http://woodstox.codehaus.org/
Maven: http://mvnrepository.com/artifact/org.codehaus.woodstox
See also:
Saxon, for its part, handles XPath and XQuery very well.
Saxon Documentation - Saxonica
Maven: http://mvnrepository.com/artifact/net.sf.saxon/Saxon-HE/9.5.0.1
ciao