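SAX is an event-driven, streaming approach to processing XML. Unlike DOM, it does not build a tree of the document in memory; instead it handles the document sequentially, reacting to the events fired as the parser encounters start tags, end tags, and text. This makes it fast and light on memory, so SAX is often the first choice for large XML files. Reading XML with SAX is fairly simple, but writing XML is somewhat more involved than with DOM, and the examples available online are neither complete nor detailed. I recently had to both read and write XML at work, so I am recording the approach here for reference.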
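To make the event-driven model concrete, here is a minimal sketch, separate from the example developed in the rest of this post, that parses a small in-memory document and records the callbacks in the order the parser fires them. The class name SaxEventsDemo, the method collectEvents, and the sample XML string are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventsDemo {
    // Parses the given XML text and returns one entry per SAX callback of interest.
    static List<String> collectEvents(String xml) throws Exception {
        final List<String> events = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0);
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text node in several chunks, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    events.add("text:" + value);
                }
                text.setLength(0);
                events.add("end:" + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // prints [start:a, start:b, text:hello, end:b, start:c, end:c, end:a]
        System.out.println(collectEvents("<a><b>hello</b><c/></a>"));
    }
}
```

No tree is ever built: each callback sees only the current tag or text chunk, which is why memory use stays flat regardless of file size.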
Reading the XML file:
First, the XML file to be read is as follows:
<?xml version="1.0" encoding="UTF-8" ?>
<oes:Notifications xmlns:oes="http://xml.sax.test.com/oesAccessNotification">
<oes:Notification>
<oes:NotificationID>11111</oes:NotificationID>
<oes:NotificationType>AlarmNew</oes:NotificationType>
<oes:timeStamp>2009-02-25T08:57:17</oes:timeStamp>
<oes:Appendix>
<oes:MapItem key="key" value="value"/>
</oes:Appendix>
<oes:Content>
<alarmNew systemDN="PLMN-1/S3SN-1/SRME-BSS-2/SBSS-0">
<alarmId>400951</alarmId>
<alarmText>PIPE 0 IS SLOW OR NOT WORKING</alarmText>
<eventTime>2009-02-25T08:57:17+02:00</eventTime>
<eventType>processingError</eventType>
<perceivedSeverity>critical</perceivedSeverity>
<probableCause>0</probableCause>
<specificProblem>86600</specificProblem>
<additionalText1>A Raised by pipe supervision script, process ID 20848</additionalText1>
<additionalText2>A test additional text2</additionalText2>
<additionalText3>A test additional text3</additionalText3>
<additionalText4>A test additional text4</additionalText4>
<additionalText5>A Original Additional text: test alarm1 | Original Probable Cause: Toxic Leak1 Detected |
Original alarm time: 20090901183006+0530 | Automatic clearing:Y
</additionalText5>
<additionalText6>Original
</additionalText6>
</alarmNew>
</oes:Content>
</oes:Notification>
</oes:Notifications>
SAX reads this XML file as follows:
(1). Define the tag names that appear in the XML:
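// Imports used by the classes in this post (they can all live in one source file)
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Transformer;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class Constant {
    public static final String NAME_SPACE = "xmlns:oes";
    public static final String SCHEMA = "http://xml.sax.test.com/oesAccessNotification";
    public static final String NOTIFICATIONS = "oes:Notifications";
    public static final String NOTIFICATION = "oes:Notification";
    public static final String NOTIFICATION_ID = "oes:NotificationID";
    public static final String NOTIFICATION_TYPE = "oes:NotificationType";
    public static final String TIME_STAMP = "oes:timeStamp";
    public static final String APPENDIX = "oes:Appendix";
    public static final String MAP_ITEM = "oes:MapItem";
    public static final String KEY = "key";
    public static final String VALUE = "value";
    public static final String CONTENT = "oes:Content";
    public static final String ALARM_NEW = "alarmNew";
    public static final String SYSTEM_DN = "systemDN";
    public static final String ALARM_ID = "alarmId";
    public static final String ALRAM_TEXT = "alarmText";
    public static final String EVENT_TIME = "eventTime";
    public static final String EVENT_TYPE = "eventType";
    public static final String PERCEIVED_SEVERITY = "perceivedSeverity";
    public static final String PROBABLE_CAUSE = "probableCause";
    public static final String SPECIFIC_PROBLEM = "specificProblem";
    public static final String ADDITION_TEXT1 = "additionalText1";
    public static final String ADDITION_TEXT2 = "additionalText2";
    public static final String ADDITION_TEXT3 = "additionalText3";
    public static final String ADDITION_TEXT4 = "additionalText4";
    public static final String ADDITION_TEXT5 = "additionalText5";
    public static final String ADDITION_TEXT6 = "additionalText6";
    public static final String ADDITION_TEXT7 = "additionalText7";
}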
These constants are used while reading the XML.
(2). Define the Java object that corresponds to the XML nodes:
class EventFactory {
    private XMLReader xmlReader;

    public static class InternalEvent {
        private String notificationType = "";
        private Map<String, String> props = new HashMap<String, String>();

        public String getNotificationType() {
            return notificationType;
        }

        // Returns the stored value, or an empty string if the name is unknown
        public String getProp(String name) {
            String str = props.get(name);
            if (str == null) {
                return "";
            } else {
                return str;
            }
        }

        public Map<String, String> getProps() {
            return props;
        }

        public void setNotificationType(String notificationType) {
            this.notificationType = notificationType;
        }

        public void putAttribute(String name, String value) {
            this.props.put(name, value);
        }
    }

    // Reads the XML with SAX; the data from the file is collected into the returned List
    public List<InternalEvent> read(String xmlPath) throws ParserConfigurationException, SAXException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser saxParser = spf.newSAXParser();
        xmlReader = saxParser.getXMLReader();
        List<InternalEvent> container = new LinkedList<InternalEvent>();
        ContentHandler handler = new ReadXMLHandler(container);
        xmlReader.setContentHandler(handler);
        try {
            xmlReader.parse(new InputSource(xmlPath));
        } catch (IOException e) {
            e.printStackTrace();
        }
        return container;
    }
}
While SAX reads the XML file, this object holds the XML data as in-memory Java objects.
(3). Read the XML file with SAX:
class ReadXMLHandler extends DefaultHandler {
    private List<EventFactory.InternalEvent> eventContainer;
    private StringBuilder buf = new StringBuilder();
    private EventFactory.InternalEvent event;
    // Elements whose text content is stored as a property of the event
    private static final Set<String> ATTR_TAGS = new HashSet<String>();
    static {
        // The writer and the sample output also use these four values, so they must be collected too
        ATTR_TAGS.add(Constant.NOTIFICATION_ID);
        ATTR_TAGS.add(Constant.TIME_STAMP);
        ATTR_TAGS.add(Constant.ALARM_ID);
        ATTR_TAGS.add(Constant.PROBABLE_CAUSE);
        ATTR_TAGS.add(Constant.EVENT_TIME);
        ATTR_TAGS.add(Constant.SPECIFIC_PROBLEM);
        ATTR_TAGS.add(Constant.ALRAM_TEXT);
        ATTR_TAGS.add(Constant.PERCEIVED_SEVERITY);
        ATTR_TAGS.add(Constant.ADDITION_TEXT1);
        ATTR_TAGS.add(Constant.ADDITION_TEXT2);
        ATTR_TAGS.add(Constant.ADDITION_TEXT3);
        ATTR_TAGS.add(Constant.ADDITION_TEXT4);
        ATTR_TAGS.add(Constant.ADDITION_TEXT5);
        ATTR_TAGS.add(Constant.ADDITION_TEXT6);
        ATTR_TAGS.add(Constant.ADDITION_TEXT7);
        ATTR_TAGS.add(Constant.EVENT_TYPE);
    }

    public ReadXMLHandler(List<EventFactory.InternalEvent> eventContainer) {
        this.eventContainer = eventContainer;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // Reset the text buffer at the start of every element
        buf.setLength(0);
        if (qName.equals(Constant.NOTIFICATION)) {
            event = new EventFactory.InternalEvent();
            eventContainer.add(event);
        } else if (qName.equals(Constant.MAP_ITEM)) {
            // Read attribute values of the element, e.g. key and value in <a key="key" value="value"/>
            String key = attributes.getValue(Constant.KEY);
            event.putAttribute(Constant.KEY, key);
            String value = attributes.getValue(Constant.VALUE);
            event.putAttribute(Constant.VALUE, value);
        } else if (qName.equals(Constant.ALARM_NEW)) {
            String systemDn = attributes.getValue(Constant.SYSTEM_DN);
            event.putAttribute(Constant.SYSTEM_DN, systemDn);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (qName.equals(Constant.NOTIFICATION_TYPE)) {
            event.setNotificationType(buf.toString());
        } else if (ATTR_TAGS.contains(qName)) {
            event.putAttribute(qName, buf.toString());
        }
    }

    // Collects the element text, e.g. abc in <a>abc</a>; the parser may call this several times per text node
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        buf.append(ch, start, length);
    }
}
At this point the XML has been read successfully.
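The handler above compares qName against prefixed names such as "oes:Notification". That works because a SAXParserFactory is not namespace-aware by default, but it ties the code to the "oes" prefix, which the producer of the document is free to change. The following sketch shows one possible alternative, assuming that matching on namespace URI plus local name is acceptable for your use case. The class NamespaceAwareDemo and the method readNotificationIds are hypothetical and not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwareDemo {
    static final String OES_NS = "http://xml.sax.test.com/oesAccessNotification";

    // Returns the text of every NotificationID element in the OES namespace,
    // regardless of which prefix the document binds to that namespace.
    static List<String> readNotificationIds(String xml) throws Exception {
        final List<String> ids = new ArrayList<String>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // uri and localName are now populated
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder buf = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                buf.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buf.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (OES_NS.equals(uri) && "NotificationID".equals(localName)) {
                    ids.add(buf.toString());
                }
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Same namespace, different prefix ("n" instead of "oes")
        String xml = "<n:Notifications xmlns:n=\"" + OES_NS + "\">"
                + "<n:Notification><n:NotificationID>11111</n:NotificationID></n:Notification>"
                + "</n:Notifications>";
        System.out.println(readNotificationIds(xml)); // prints [11111]
    }
}
```

For a fixed, trusted input format the prefix-based comparison in ReadXMLHandler is simpler; the namespace-aware form is mainly useful when the prefix is outside your control.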
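Converting and writing the XML file:
Compared with reading, writing XML with SAX is somewhat more complex. The steps are as follows:
(1). Apply a simple conversion to the objects that were read: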
class Convert {
    public static String convertString(String value) {
        return value + "_TEST";
    }
}
The conversion is trivial: it appends "_TEST" to the names of leaf elements and attributes, and also appends "_TEST" to their values.
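One detail worth knowing about this conversion: Java string concatenation turns a null reference into the text "null", so a value that was never collected while reading would be written as "null_TEST". The sketch below shows a null-safe variant, assuming an empty string is an acceptable stand-in for a missing value. The class name SafeConvert is hypothetical and not part of the original code.

```java
public class SafeConvert {
    static final String SUFFIX = "_TEST";

    // Hypothetical null-safe variant: a missing value becomes an empty string
    // before the suffix is appended, instead of the literal text "null".
    static String convertString(String value) {
        return (value == null ? "" : value) + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(convertString("alarmId")); // prints alarmId_TEST
        System.out.println(convertString(null));      // prints _TEST
    }
}
```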
(2). Write the XML file:
class WriteXML {
    SAXTransformerFactory fac = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
    private TransformerHandler handler = null;
    private OutputStream outStream = null;
    private String fileName;
    private AttributesImpl atts;
    private String rootElement;
    // Current element depth, used to control the XML indentation
    private static int level = 0;
    // Each level is indented 4 spaces further than its parent
    private static String tab = "    ";
    // Line break written before each indented element; "\n" is valid XML whitespace on every platform
    private static final String separator = "\n";

    public WriteXML(String fileName, String rootElement) {
        this.fileName = fileName;
        this.rootElement = rootElement;
        init();
    }
    public void init() {
        try {
            handler = fac.newTransformerHandler();
            Transformer transformer = handler.getTransformer();
            // Encoding used for the output
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            // Whether the serializer may add extra whitespace of its own
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // Whether to omit the XML declaration
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            outStream = new FileOutputStream(fileName);
            Result resultxml = new StreamResult(outStream);
            handler.setResult(resultxml);
            atts = new AttributesImpl();
            start();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void start() {
        try {
            handler.startDocument();
            // Declare the namespace prefix and its URI on the root element;
            // "CDATA" is the SAX attribute type for plain string values
            atts.addAttribute("", "", Constant.NAME_SPACE, "CDATA", Constant.SCHEMA);
            handler.startElement("", "", rootElement, atts);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    // An element may contain nested child nodes, so its start and end are written separately,
    // e.g. <a><b>bcd</b></a>
    private void startElement(String objectElement, AttributesImpl attrs)
            throws SAXException {
        if (attrs == null) {
            attrs = new AttributesImpl();
        }
        level++;
        appendTab();
        if (objectElement != null) {
            // If attributes were added with addAttribute, the output looks like <a key="key" value="value">abc</a>;
            // without attributes it looks like <a>abc</a>
            handler.startElement("", "", objectElement, attrs);
        }
    }

    // Normal end tag, e.g. </a>; indent to the depth of the matching start tag, then leave that depth
    private void endElement(String objectElement) throws SAXException {
        appendTab();
        level--;
        if (objectElement != null) {
            handler.endElement("", "", objectElement);
        }
    }

    // Self-closing empty element, e.g. <a key="key" value="value"/>: no line break is written,
    // and because the element has no content the serializer emits it in self-closing form
    private void endEmptyElement(String objectElement) throws SAXException {
        handler.endElement("", "", objectElement);
        // Balance the level++ done by startElement
        level--;
    }

    // An element without child nodes is treated here as an "attribute" of its parent, e.g. <a>abc</a>
    private void writeAttribute(String key, String value) throws SAXException {
        atts.clear();
        level++;
        appendTab();
        handler.startElement("", "", key, atts);
        handler.characters(value.toCharArray(), 0, value.length());
        handler.endElement("", "", key);
        level--;
    }
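    public void end() {
        try {
            handler.endElement("", "", rootElement);
            // End of the document; the output is flushed to disk
            handler.endDocument();
            outStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Indentation: the SAX events written here are not indented automatically,
    // so the indentation is produced by hand from the current element depth
    private void appendTab() throws SAXException {
        StringBuilder indent = new StringBuilder(separator).append(" ");
        for (int i = 1; i < level; i++) {
            indent.append(tab);
        }
        handler.characters(indent.toString().toCharArray(), 0, indent.length());
    }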
    public void writeNotification(EventFactory.InternalEvent event) throws SAXException {
        Map<String, String> props = event.getProps();
        // keySet() is a live view: removing a key here also removes the entry from props
        Set<String> keys = props.keySet();
        level = 0;
        // Write the <oes:Notification> node
        startElement(Constant.NOTIFICATION, null);
        // Write oes:NotificationID
        writeAttribute(Convert.convertString(Constant.NOTIFICATION_ID), Convert.convertString(props.get(Constant.NOTIFICATION_ID)));
        keys.remove(Constant.NOTIFICATION_ID);
        // Write oes:NotificationType
        writeAttribute(Convert.convertString(Constant.NOTIFICATION_TYPE), Convert.convertString(event.getNotificationType()));
        // Write oes:timeStamp
        writeAttribute(Convert.convertString(Constant.TIME_STAMP), Convert.convertString(props.get(Constant.TIME_STAMP)));
        keys.remove(Constant.TIME_STAMP);
        // Write the <oes:Appendix> node
        startElement(Constant.APPENDIX, null);
        // Write oes:MapItem ("CDATA" is the SAX attribute type for plain string values)
        atts = new AttributesImpl();
        atts.addAttribute("", "", Convert.convertString(Constant.KEY), "CDATA", Convert.convertString(props.get(Constant.KEY)));
        keys.remove(Constant.KEY);
        atts.addAttribute("", "", Convert.convertString(Constant.VALUE), "CDATA", Convert.convertString(props.get(Constant.VALUE)));
        keys.remove(Constant.VALUE);
        startElement(Constant.MAP_ITEM, atts);
        // End oes:MapItem; it is a self-closing element and needs special handling
        endEmptyElement(Constant.MAP_ITEM);
        keys.remove(Constant.MAP_ITEM);
        // End the oes:Appendix node
        endElement(Constant.APPENDIX);
        keys.remove(Constant.APPENDIX);
        // Write the oes:Content node
        startElement(Constant.CONTENT, null);
        keys.remove(Constant.CONTENT);
        // Write the alarmNew node
        atts = new AttributesImpl();
        atts.addAttribute("", "", Convert.convertString(Constant.SYSTEM_DN), "CDATA", Convert.convertString(props.get(Constant.SYSTEM_DN)));
        startElement(Constant.ALARM_NEW, atts);
        keys.remove(Constant.ALARM_NEW);
        // Write the remaining properties as child elements of alarmNew
        for (String key : keys) {
            writeAttribute(Convert.convertString(key), Convert.convertString(props.get(key)));
        }
        // End the alarmNew node
        endElement(Constant.ALARM_NEW);
        // End the oes:Content node
        endElement(Constant.CONTENT);
        // End the <oes:Notification> node
        endElement(Constant.NOTIFICATION);
    }
}
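The WriteXML class mixes three concerns: driving the TransformerHandler, building attributes, and indenting by hand. As a minimal sketch of just the first two, the following writes a tiny document into a String. It assumes the JDK's built-in TransformerFactory, which accepts an empty namespace URI and local name together with a plain qualified name, as the class above also does. The class name MiniSaxWriter and the element names are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

public class MiniSaxWriter {
    // Emits a small document by firing SAX events at a TransformerHandler.
    static String write() throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        AttributesImpl attrs = new AttributesImpl();
        handler.startElement("", "", "root", attrs);

        // Element with an attribute and no content
        attrs.addAttribute("", "", "key", "CDATA", "k1");
        handler.startElement("", "", "item", attrs);
        handler.endElement("", "", "item");

        // Element with text content; the serializer escapes the '<' character
        attrs.clear();
        handler.startElement("", "", "name", attrs);
        String text = "a < b";
        handler.characters(text.toCharArray(), 0, text.length());
        handler.endElement("", "", "name");

        handler.endElement("", "", "root");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```

Every startElement call must be paired with an endElement call in strict nesting order; the handler does not close elements for you, which is the main reason writing is more work than reading.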
(3). The demo program below first reads the XML file with SAX and then writes the converted result with SAX:
public class FlexMapping {
    private static String inputFile = "input/input.xml";
    private static String outputFile = "output/output.xml";
    private static List<EventFactory.InternalEvent> events;

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        try {
            events = new EventFactory().read(inputFile);
            WriteXML xml = new WriteXML(outputFile, Constant.NOTIFICATIONS);
            for (EventFactory.InternalEvent event : events) {
                xml.writeNotification(event);
            }
            xml.end();
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Elapsed: " + (System.currentTimeMillis() - start) + "ms.");
    }
}
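The XML file that is written out looks like this (the child elements of alarmNew follow the iteration order of the HashMap, so their order differs from the input):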
<?xml version="1.0" encoding="UTF-8"?>
<oes:Notifications xmlns:oes="http://xml.sax.test.com/oesAccessNotification">
<oes:Notification>
<oes:NotificationID_TEST>11111_TEST</oes:NotificationID_TEST>
<oes:NotificationType_TEST>AlarmNew_TEST</oes:NotificationType_TEST>
<oes:timeStamp_TEST>2009-02-25T08:57:17_TEST</oes:timeStamp_TEST>
<oes:Appendix>
<oes:MapItem key_TEST="key_TEST" value_TEST="value_TEST"/>
</oes:Appendix>
<oes:Content>
<alarmNew systemDN_TEST="PLMN-1/S3SN-1/SRME-BSS-2/SBSS-0_TEST">
<additionalText1_TEST>A Raised by pipe supervision script, process ID 20848_TEST</additionalText1_TEST>
<systemDN_TEST>PLMN-1/S3SN-1/SRME-BSS-2/SBSS-0_TEST</systemDN_TEST>
<additionalText2_TEST>A test additional text2_TEST</additionalText2_TEST>
<eventTime_TEST>2009-02-25T08:57:17+02:00_TEST</eventTime_TEST>
<probableCause_TEST>0_TEST</probableCause_TEST>
<additionalText3_TEST>A test additional text3_TEST</additionalText3_TEST>
<alarmText_TEST>PIPE 0 IS SLOW OR NOT WORKING_TEST</alarmText_TEST>
<specificProblem_TEST>86600_TEST</specificProblem_TEST>
<additionalText6_TEST>Original_TEST</additionalText6_TEST>
<additionalText5_TEST>A Original Additional text: test alarm1 | Original Probable Cause: Toxic Leak1 Detected |
Original alarm time: 20090901183006+0530 | Automatic clearing:Y
_TEST</additionalText5_TEST>
<perceivedSeverity_TEST>critical_TEST</perceivedSeverity_TEST>
<additionalText4_TEST>A test additional text4_TEST</additionalText4_TEST>
<alarmId_TEST>400951_TEST</alarmId_TEST>
<eventType_TEST>processingError_TEST</eventType_TEST>
</alarmNew>
</oes:Content>
</oes:Notification>
</oes:Notifications>
This is only a demo for reference, but it covers the basic flow of reading and writing XML with SAX.