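Problem description
When an XML file is parsed with DOM and written back out to a file, the entity declarations from the DOCTYPE are missing from the output.
Original XML: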
<?xml version="1.0" encoding="UTF-8"?>
<!--Arbortext, Inc., 1988-2011, v.4002-->
<!DOCTYPE book PUBLIC "-//DocBook//DTD DocBook XML V4.0//EN" "dockbook.dtd" [
<!ENTITY logo SYSTEM "../graphics/logo.png" NDATA PNG>
]>
<book>
<title>Programming in Java</title>
<para>XXXX XXX</para>
</book>
The following code parses the file and writes it back out:
String xmlFilePath = "c:\\temp\\source.xml";
File xml = new File(xmlFilePath);
// Build a non-validating, namespace-aware DOM parser
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
// Resolve the external DTD through a catalog resolver
CatalogResolver resolver = CatalogResolverFactory.getInstance().getCatalogResolver();
builder.setEntityResolver(resolver);
Document document = builder.parse(xml);
// Write the DOM back out with an identity Transformer
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
Properties props = new Properties();
props.put(OutputKeys.INDENT, "yes");
props.put(OutputKeys.ENCODING, "utf-8");
transformer.setOutputProperties(props);
DOMSource source = new DOMSource(document);
String outputPath = "C:\\temp\\test.xml";
StreamResult result = new StreamResult(new File(outputPath));
transformer.transform(source, result);
Output:
<?xml version="1.0" encoding="UTF-8"?>
<!--Arbortext, Inc., 1988-2011, v.4002-->
<book>
<title>Programming in Java</title>
<para>XXXX XXX</para>
</book>
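Since the DOCTYPE is gone only in the written file, it can help to confirm that the parsed DOM still holds it. The sketch below is an illustration under assumptions, not the original code: it parses the sample from a string, and instead of the catalog resolver it uses a stand-in EntityResolver that maps every external entity to empty content so that no DTD file is needed. The names DoctypeCheck, SAMPLE and parseDoctype are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypeCheck {
    // The sample document from above, inlined so the sketch is self-contained.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE book PUBLIC \"-//DocBook//DTD DocBook XML V4.0//EN\" \"dockbook.dtd\" [\n"
        + "<!ENTITY logo SYSTEM \"../graphics/logo.png\" NDATA PNG>\n"
        + "]>\n"
        + "<book>\n"
        + "<title>Programming in Java</title>\n"
        + "<para>XXXX XXX</para>\n"
        + "</book>\n";

    // Parses the XML text and returns the DocumentType node of the resulting DOM.
    static DocumentType parseDoctype(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Assumption: stand-in for the catalog resolver; every external entity
        // (here the DTD) resolves to empty content.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        return document.getDoctype();
    }

    public static void main(String[] args) throws Exception {
        DocumentType doctype = parseDoctype(SAMPLE);
        System.out.println("name:            " + doctype.getName());
        System.out.println("public id:       " + doctype.getPublicId());
        System.out.println("system id:       " + doctype.getSystemId());
        System.out.println("internal subset: " + doctype.getInternalSubset());
    }
}
```

If the internal subset printed here contains the logo entity, the information is being lost at the serialization step, not during parsing. The parser may normalize quoting and whitespace in the subset, so the text need not match the source character for character.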
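Solution
The code above writes the XML with a Transformer, which does not emit the DOCTYPE's internal subset. Use LSSerializer for the output instead. The code is as follows: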
String xmlFileName = "C:\\temp\\test.xml";
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
LSSerializer writer = impl.createLSSerializer();
LSOutput output = impl.createLSOutput();
// Write bytes with an explicit encoding so the file content matches its XML
// declaration (a FileWriter may use the platform default charset instead).
output.setEncoding("UTF-8");
try (OutputStream out = new FileOutputStream(xmlFileName)) {
    output.setByteStream(out);
    // "document" is the Document returned by builder.parse(...)
    writer.write(document, output);
}
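The LSSerializer approach can be tried end to end without touching the file system. The following is a minimal sketch under assumptions: it parses the sample from a string, swaps the catalog resolver for a stand-in EntityResolver that returns empty content for the DTD, and serializes into a StringWriter. The names LsRoundTrip, SAMPLE and roundTrip are invented for this example. The "format-pretty-print" parameter is the DOM Load and Save counterpart of OutputKeys.INDENT and is set only if the implementation accepts it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class LsRoundTrip {
    // The sample document, inlined so the sketch is self-contained.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE book PUBLIC \"-//DocBook//DTD DocBook XML V4.0//EN\" \"dockbook.dtd\" [\n"
        + "<!ENTITY logo SYSTEM \"../graphics/logo.png\" NDATA PNG>\n"
        + "]>\n"
        + "<book>\n"
        + "<title>Programming in Java</title>\n"
        + "<para>XXXX XXX</para>\n"
        + "</book>\n";

    // Parses the XML text into a DOM and serializes it again with LSSerializer.
    static String roundTrip(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Assumption: stand-in for the catalog resolver; the external DTD
        // resolves to empty content so no file is required.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS impl =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSSerializer writer = impl.createLSSerializer();
        // Optional indentation, enabled only when the serializer supports it.
        if (writer.getDomConfig().canSetParameter("format-pretty-print", Boolean.TRUE)) {
            writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        }

        LSOutput output = impl.createLSOutput();
        StringWriter buffer = new StringWriter();
        output.setCharacterStream(buffer);
        writer.write(document, output);
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(SAMPLE));
    }
}
```

The printed result should include the DOCTYPE line together with the logo entity declaration. The serializer rebuilds the declaration from the DOM, so quoting and line breaks may differ from the source file.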