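I am parsing XML in Java that contains character entities such as (but not limited to) &#10; &#13; &lt; &gt; (line feed, carriage return, <, >). While parsing, I append the text content of the nodes to a StringBuilder so that I can write it to a text file later.
However, when I write the String to a file or print it, these characters have already been resolved, so they come out as actual line breaks and whitespace instead of the original entity notation.
When iterating over the nodes of an XML file in Java and storing their text content in a String, how can I preserve the original character entity notation?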
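To illustrate the behavior described above, here is a minimal sketch that parses a hypothetical inline document instead of the demo file. The class name `EntityResolutionDemo`, the helper `parseAttribute`, and the sample XML string are assumptions made for this illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EntityResolutionDemo {
    // Hypothetical helper: parses an XML string and returns attribute "b" of the root element.
    static String parseAttribute(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("b");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        String value = parseAttribute("<a b=\"x&#13;y&lt;z\"/>");
        // By the time the DOM is built, the references are already replaced
        // by the characters they denote.
        System.out.println(value.indexOf('\r'));      // position of the real carriage return
        System.out.println(value.contains("&#13;"));  // whether the original notation survived
    }
}
```

In this sketch the attribute value holds a real carriage return and a real `<`, which suggests the original notation is no longer available from the DOM once parsing has finished.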
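Example of the demo XML file:
Sample Java code. It loads the XML, iterates over the nodes, and collects the text content of each node into a StringBuilder. After the iteration, it writes the StringBuilder to the console and also to a file (but without the entity symbols).
What is the way to keep these symbols when storing them in a String? Could you help me? Thanks.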
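import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class Demo {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
        Document document = documentBuilder.parse(new File("path/to/demo.xml"));
        StringBuilder sb = new StringBuilder();
        // "*" matches every element in the document
        NodeList nodeList = document.getElementsByTagName("*");
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                // Collect the value of every attribute on this element
                NamedNodeMap nnp = node.getAttributes();
                for (int j = 0; j < nnp.getLength(); j++) {
                    sb.append(nnp.item(j).getTextContent());
                }
            }
        }
        System.out.println(sb.toString());
        // Write the collected text to a file as UTF-8
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream("path/to/demo_output.xml"), "UTF-8"))) {
            writer.write(sb.toString());
        }
    }
}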
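One possible direction, offered only as a sketch: since the parser has already resolved the references by the time the text reaches the StringBuilder, the collected string could be re-encoded before it is written. The class `EntityEscaper` and its `escape` method are hypothetical names, and the choice of which characters to re-encode (markup characters plus control characters below U+0020) is an assumption. This approach cannot recover the exact original spelling: a source that used `&#xD;`, or a literal `>`, would still come out as `&#13;` and `&gt;`.

```java
public class EntityEscaper {
    // Hypothetical helper: re-encodes characters that the XML parser already resolved.
    // Which characters get escaped is an assumption; adjust the rules to the actual data.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp < 0x20) {
                        // Control characters become decimal numeric character references
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A carriage return and a '<' as they would appear after parsing
        System.out.println(escape("x\ry<z")); // prints x&#13;y&lt;z
    }
}
```

With such a helper, the loop in the code above could append `EntityEscaper.escape(nnp.item(j).getTextContent())` instead of the raw value. Keeping the escaping in a separate step leaves the parsing code unchanged and makes the escaping rules easy to adjust.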