Java character entities: preserving numeric character entities such as `&#10;&#13;` when parsing XML in Java

I am parsing XML in Java that contains numeric character entities such as (but not limited to) `&#10; &#13; &#60; &#62;` (line feed, carriage return, <, >). While parsing, I append the text content of nodes to a StringBuilder so that I can later write it out to a text file.

However, the parser resolves these references into the characters they denote, so they show up as newlines or whitespace when I write the String to a file or print it out.
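To illustrate the behavior described above, here is a minimal sketch. It assumes a hypothetical one-element document in place of the demo file, and the class and method names are made up for this example. It reads an attribute that ends in `&#10;&#13;` and shows that the parsed value holds the actual line feed and carriage return characters rather than the literal reference text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class EntityResolutionDemo {

    // Parses the XML string and returns the "text" attribute of the root element.
    static String rootTextAttribute(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getAttribute("text");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the real demo.xml is not shown here.
        String xml = "<item text=\"abc&#10;&#13;\"/>";
        String value = rootTextAttribute(xml);

        // The references have been replaced by the characters U+000A and U+000D.
        System.out.println(value.equals("abc\n\r"));   // prints "true"
        System.out.println(value.contains("&#10;"));   // prints "false"
    }
}
```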

How can I keep the original numeric character entities when iterating over the nodes of an XML file in Java and storing their text content in a String?

Example of demo xml file:

Example Java code. It loads the XML, iterates over the nodes, and collects the text content of each node in a StringBuilder. After the iteration is over, it writes the StringBuilder to the console and also to a file, but the `&#10;&#13;` symbols are missing from the output.

What would be a way to keep these symbols when storing them to a String? Could you please help me? Thank you.

public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerException {
    DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
    Document document = documentBuilder.parse(new File("path/to/demo.xml"));

    StringBuilder sb = new StringBuilder();

    // "*" matches every element in the document.
    NodeList nodeList = document.getElementsByTagName("*");
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node node = nodeList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            // Collect the value of every attribute on this element.
            NamedNodeMap nnp = node.getAttributes();
            for (int j = 0; j < nnp.getLength(); j++) {
                sb.append(nnp.item(j).getTextContent());
            }
        }
    }

    System.out.println(sb.toString());

    // Write the collected text to a UTF-8 encoded file.
    try (Writer writer = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream("path/to/demo_output.xml"), "UTF-8"))) {
        writer.write(sb.toString());
    }
}

Solution

You need to escape all the XML entities before parsing the file into a Document. You do that by replacing the ampersand `&` itself with its corresponding XML entity `&amp;`. Something like:

DocumentBuilder documentBuilder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();

String xmlContents = new String(Files.readAllBytes(Paths.get("demo.xml")), "UTF-8");

// Escaping every "&" makes the parser treat each entity as literal text.
Document document = documentBuilder.parse(
        new InputSource(new StringReader(xmlContents.replaceAll("&", "&amp;"))));

Output:

2A string followed by special symbols &#10;&#13;
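
For a version that runs without the demo file, the following is a minimal sketch of the same ampersand-escaping idea. It assumes a hypothetical inline XML string, and the class and method names are made up for illustration. Each `&` is turned into `&amp;` before parsing, so the parser hands back `&#10;&#13;` as literal text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class PreserveEntitiesDemo {

    // Escapes every ampersand so that character references survive parsing as plain text.
    static String escapeAmpersands(String xml) {
        return xml.replace("&", "&amp;");
    }

    // Parses the XML string and returns the "text" attribute of the root element.
    static String rootTextAttribute(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getAttribute("text");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the real demo.xml is not shown here.
        String xml = "<item id=\"2\" text=\"A string followed by special symbols &#10;&#13;\"/>";

        String value = rootTextAttribute(escapeAmpersands(xml));

        // prints "A string followed by special symbols &#10;&#13;"
        System.out.println(value);
    }
}
```

Note that this approach is indiscriminate: named references such as `&lt;` and `&amp;` also stay unresolved in the extracted text, and an `&` inside a CDATA section would be rewritten as well, so it fits best when the goal is to copy the text through exactly as written.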
