I have some XML documents with errors in - sometimes end tags are missing - and I want to find the places where this happens and fix them (manually).
I've used XOM to parse the documents and it handily says "missing end tag" at the right times, and tells me the name of the element, but doesn't guide me very well to where the problem is in the file.
I could write my own parser that helps to do this, but I wonder if there's already a solution? I don't want automatic tidying, as I want to make sure end tags are inserted in the right place. I just want to know the line number of the start tag.
Solution
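I think this is simple and can be done without any third-party library. Java has a standard interface,
javax.xml.stream.XMLEventReader, which throws an XMLStreamException when it hits a missing end tag. Calling e.getLocation().getLineNumber() on the exception gives the line where the parser noticed the problem. To get the line of the start tag itself, keep a stack of the StartElement events that are still open and look at the top one when the exception is thrown.
A slightly more complicated sample: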
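import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class FindUnclosedTag {
    public static void main(String[] args) throws Exception {
        try (InputStream is = new FileInputStream("test.xml")) {
            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            XMLEventReader eventReader = inputFactory.createXMLEventReader(is, "utf-8");
            // Start tags that are open but not yet closed, innermost on top.
            Deque<StartElement> stack = new ArrayDeque<>();
            while (eventReader.hasNext()) {
                try {
                    XMLEvent event = eventReader.nextEvent();
                    if (event.isStartElement()) {
                        StartElement startElement = event.asStartElement();
                        System.out.println("processing element: " + startElement.getName().getLocalPart());
                        stack.push(startElement);
                    } else if (event.isEndElement()) {
                        stack.pop();
                    }
                } catch (XMLStreamException e) {
                    // Line where the parser noticed the problem.
                    System.out.println("error in line: " + e.getLocation().getLineNumber());
                    // The top of the stack is the element that was still open at that point.
                    StartElement se = stack.peek();
                    if (se != null) {
                        System.out.println("non-closed tag: " + se.getName().getLocalPart() + " " + se.getLocation().getLineNumber());
                    }
                    throw e;
                }
            }
        }
    }
}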
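
If you want to call this from other code or try it on a small string, the same idea can be wrapped in a method. This is only a sketch: the class name UnclosedTagFinder, the method name findUnclosedStartLine and the "-1 means nothing found" convention are made up for illustration and are not part of any library. As far as I know, with the StAX implementation bundled in the JDK the line reported for a StartElement is the line on which the parser finished reading that tag, so a start tag spread over several lines may report its last line.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper; the names and the -1 convention are illustrative only.
class UnclosedTagFinder {

    /**
     * Returns the line reported for the innermost start tag that is still open
     * when the parser fails, or -1 if the document parses cleanly or the
     * failure happens while no element is open.
     */
    static int findUnclosedStartLine(String xml) {
        // Open elements, innermost on top.
        Deque<StartElement> open = new ArrayDeque<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    open.push(event.asStartElement());
                } else if (event.isEndElement()) {
                    open.pop();
                }
            }
            return -1; // reached the end without a parse error
        } catch (XMLStreamException e) {
            StartElement innermost = open.peek();
            return innermost == null ? -1 : innermost.getLocation().getLineNumber();
        }
    }

    public static void main(String[] args) {
        // <b> on the second line is never closed.
        String broken = "<a>\n<b>\n</a>";
        System.out.println("unclosed start tag at line: " + findUnclosedStartLine(broken));
    }
}
```

Note that this reports only the first problem in the document, because the parser cannot continue after the exception. Fix that tag and run it again to find the next one.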