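When parsing XML files with dom4j, you may want to check certain nodes against a schema and report the ones that do not conform. To make that report friendlier, it should also say where the error is. dom4j does not offer this out of the box, or at least the feature is buried very deep. A search turned up very few answers. A mailing-list post (https://www.mail-archive.com/dom4j-user@lists.sourceforge.net/msg02769.html) gave me the hint I needed, and I worked out the rest, which was a real hassle. The post outlines the main steps: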
org.xml.sax.Locator locator = new …;
DocumentFactory documentFactory = new DocumentFactoryWithLocator(locator);
SAXContentHandler contentHandler = new SAXContentHandler(documentFactory);
contentHandler.setDocumentLocator(locator);
org.xml.sax.XMLReader reader = …;
reader.setContentHandler(contentHandler);
reader.parse(…);
Document document = contentHandler.getDocument();
In the DocumentFactory subclass:
public Element createElement(QName qname) {
    ElementWithLocation element = new ElementWithLocation(qname);
    element.setLocation(locator.getLineNumber(),
            locator.getColumnNumber());
    return element;
}
Followed as written, though, the locator created up front is not the one in effect during parsing. The cause is in the parser's AbstractSAXParser:
public void startDocument(XMLLocator locator, String encoding,
        NamespaceContext namespaceContext, Augmentations augs)
        throws XNIException {
    fNamespaceContext = namespaceContext;
    try {
        // SAX1
        if (fDocumentHandler != null) {
            if (locator != null) {
                fDocumentHandler.setDocumentLocator(new LocatorProxy(locator));
            }
            fDocumentHandler.startDocument();
        }
        // SAX2
        if (fContentHandler != null) {
            if (locator != null) {
                fContentHandler.setDocumentLocator(new LocatorProxy(locator));
            }
            fContentHandler.startDocument();
        }
    }
    catch (SAXException e) {
        throw new XNIException(e);
    }
}
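This startDocument method, in AbstractSAXParser, is what overwrites the locator set on the contentHandler. My approach is to capture the new value at the moment AbstractSAXParser reassigns the locator and pass it on to the DocumentFactory.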
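The replacement can be observed with the SAX classes that ship with the JDK, without dom4j. The following is only a minimal sketch under stated assumptions: the class LocatorReplacementDemo and its nested Handler are hypothetical names made up for this illustration, and it assumes the JDK's bundled parser supplies a locator (SAX encourages this but does not strictly require it). It sets a hand-made LocatorImpl on the handler first, as the mail's snippet does, and then checks which locator actually carries positions.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.LocatorImpl;

// Hypothetical demo class, not part of dom4j or of the original code.
class LocatorReplacementDemo {

    /** Remembers the locator it was last given and the line of the last start tag. */
    static class Handler extends DefaultHandler {
        Locator current;
        int lastLine = -1;

        @Override
        public void setDocumentLocator(Locator locator) {
            // Called once by us, then again by the parser with its own locator.
            this.current = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (current != null) {
                lastLine = current.getLineNumber();
            }
        }
    }

    /** Returns { line reported by our own locator, line of the last start tag via the handler }. */
    static int[] parse(String xml) throws Exception {
        LocatorImpl own = new LocatorImpl();   // nothing ever updates this object
        Handler handler = new Handler();
        handler.setDocumentLocator(own);       // set up front, before parsing
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return new int[] { own.getLineNumber(), handler.lastLine };
    }

    public static void main(String[] args) throws Exception {
        int[] lines = parse("<root>\n  <child/>\n</root>");
        System.out.println("own locator: " + lines[0] + ", parser locator: " + lines[1]);
    }
}
```

With the JDK's bundled parser, the hand-made locator should stay at line 0 while the handler sees line 2 for the child element. This is why the factory has to be given the parser's locator and not the one created beforehand. Note that a SAX locator reports the position where the current event ends, so the recorded position is the end of the start tag.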
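Now the code. First, extend the original Element class so that it can record a node's position (the same applies if you need positions for Attribute and other node types).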
import org.dom4j.Namespace;
import org.dom4j.QName;
import org.dom4j.tree.DefaultElement;

// A DefaultElement that also remembers where it was found in the source file.
public class GokuElement extends DefaultElement {
    // Both stay 0 until setLocation is called.
    private int lineNum = 0, colNum = 0;

    public GokuElement(QName qname) {
        super(qname);
    }

    public GokuElement(QName qname, int attrCount) {
        super(qname, attrCount);
    }

    public GokuElement(String name) {
        super(name);
    }

    public GokuElement(String name, Namespace namespace) {
        super(name, namespace);
    }

    public int getColumnNumber() {
        return this.colNum;
    }

    public int getLineNumber() {
        return this.lineNum;
    }

    public void setLocation(int lineNum, int colNum) {
        this.lineNum = lineNum;
        this.colNum = colNum;
    }
}
Next, extend DocumentFactory so that the factory produces our own Element:
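import org.dom4j.DocumentFactory;
import org.dom4j.Element;
import org.dom4j.QName;
import org.xml.sax.Locator;

// A DocumentFactory that stamps each new element with the locator's current position.
public class DocumentFactoryWithLocator extends DocumentFactory {
    private Locator locator;

    public DocumentFactoryWithLocator(Locator locator) {
        super();
        this.locator = locator;
    }

    @Override
    public Element createElement(QName qname) {
        return locate(new GokuElement(qname));
    }

    @Override
    public Element createElement(String name) {
        return locate(new GokuElement(name));
    }

    // Called by GokuSAXContentHandler once the parser hands over its own locator.
    public void setLocator(Locator locator) {
        this.locator = locator;
    }

    // Copies the current line and column onto the element; skipped when no locator is set.
    private GokuElement locate(GokuElement element) {
        if (this.locator != null) {
            element.setLocation(this.locator.getLineNumber(), this.locator.getColumnNumber());
        }
        return element;
    }
}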
Then extend SAXContentHandler:
import org.dom4j.DocumentFactory;
import org.dom4j.ElementHandler;
import org.dom4j.io.SAXContentHandler;
import org.xml.sax.Locator;

// Forwards the locator supplied by the parser to the DocumentFactoryWithLocator.
public class GokuSAXContentHandler extends SAXContentHandler {
    private DocumentFactoryWithLocator documentFactory = null;

    public GokuSAXContentHandler(DocumentFactory documentFactory2, ElementHandler dispatchHandler) {
        super(documentFactory2, dispatchHandler);
    }

    public void setDocFactory(DocumentFactoryWithLocator fac) {
        this.documentFactory = fac;
    }

    // The parser calls this with its own locator when parsing starts.
    @Override
    public void setDocumentLocator(Locator documentLocator) {
        super.setDocumentLocator(documentLocator);
        if (this.documentFactory != null) {
            this.documentFactory.setLocator(documentLocator);
        }
    }
}
Finally, extend SAXReader:
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentFactory;
import org.dom4j.io.SAXContentHandler;
import org.dom4j.io.SAXReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// A SAXReader that wires GokuSAXContentHandler to the DocumentFactoryWithLocator.
public class GokuSAXReader extends SAXReader {
    DocumentFactory docFactory;
    Locator locator;

    public GokuSAXReader(DocumentFactory docFactory) {
        super(docFactory);
        this.docFactory = docFactory;
    }

    public GokuSAXReader(DocumentFactory docFactory, Locator locator) {
        super(docFactory);
        this.locator = locator;
        this.docFactory = docFactory;
    }

    @Override
    protected SAXContentHandler createContentHandler(XMLReader reader) {
        return new GokuSAXContentHandler(this.getDocumentFactory(), super.getDispatchHandler());
    }

    @Override
    public Document read(InputSource in) throws DocumentException {
        try {
            XMLReader reader = this.getXMLReader();
            reader = this.installXMLFilter(reader);
            EntityResolver thatEntityResolver = super.getEntityResolver();
            if (thatEntityResolver == null) {
                thatEntityResolver = this.createDefaultEntityResolver(in.getSystemId());
                super.setEntityResolver(thatEntityResolver);
            }
            reader.setEntityResolver(thatEntityResolver);
            SAXContentHandler contentHandler = this.createContentHandler(reader);
            contentHandler.setEntityResolver(thatEntityResolver);
            contentHandler.setInputSource(in);
            boolean internal = this.isIncludeInternalDTDDeclarations();
            boolean external = this.isIncludeExternalDTDDeclarations();
            contentHandler.setIncludeInternalDTDDeclarations(internal);
            contentHandler.setIncludeExternalDTDDeclarations(external);
            contentHandler.setMergeAdjacentText(this.isMergeAdjacentText());
            contentHandler.setStripWhitespaceText(this.isStripWhitespaceText());
            contentHandler.setIgnoreComments(this.isIgnoreComments());
            reader.setContentHandler(contentHandler);
            this.configureReader(reader, contentHandler);
            // Let the handler push the parser's locator into our factory.
            if (this.docFactory instanceof DocumentFactoryWithLocator) {
                ((GokuSAXContentHandler) contentHandler)
                        .setDocFactory((DocumentFactoryWithLocator) this.docFactory);
            }
            // Initial locator only; the parser replaces it when parsing starts.
            if (this.locator != null) {
                contentHandler.setDocumentLocator(this.locator);
            }
            reader.parse(in);
            return contentHandler.getDocument();
        } catch (Exception e) {
            if (e instanceof SAXParseException) {
                SAXParseException parseException = (SAXParseException) e;
                String systemId = parseException.getSystemId();
                if (systemId == null) {
                    systemId = "";
                }
                String message = "Error on line " + parseException.getLineNumber() + " of document " + systemId + " : "
                        + parseException.getMessage();
                throw new DocumentException(message, e);
            } else {
                throw new DocumentException(e.getMessage(), e);
            }
        }
    }
}
This overrides read(InputSource) so that it uses our own SAXContentHandler. The read methods with other signatures all end up calling this one.
Code to parse a file:
Locator locator = new LocatorImpl();
DocumentFactory docFactory = new DocumentFactoryWithLocator(locator);
SAXReader reader = new GokuSAXReader(docFactory, locator);
Document doc = reader.read(new File("goku.xml"));
When you need an Element's position:
System.out.println(((GokuElement) element).getLineNumber());
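Since the goal is a friendlier error report, the recorded position would typically end up in a message string. The following is a rough sketch only: LocationMessages, Located and describe are hypothetical names invented for this illustration, and Located merely stands in for the two getters on GokuElement so that the example runs without dom4j.

```java
// Hypothetical helper, not part of dom4j or of the classes above.
class LocationMessages {

    /** Stand-in for the getLineNumber/getColumnNumber getters on GokuElement. */
    interface Located {
        int getLineNumber();
        int getColumnNumber();
    }

    /** Builds a "file:line:column: problem" message for a node that failed a check. */
    static String describe(String file, Located where, String problem) {
        return file + ":" + where.getLineNumber() + ":" + where.getColumnNumber() + ": " + problem;
    }

    public static void main(String[] args) {
        // Fixed sample position, chosen only for the demonstration.
        Located where = new Located() {
            public int getLineNumber() { return 12; }
            public int getColumnNumber() { return 5; }
        };
        System.out.println(describe("goku.xml", where, "missing attribute 'id'"));
    }
}
```

In real use, GokuElement could implement such an interface directly, since it already has both getters.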