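We often split a schema into several files to make the XSD easier to read, maintain, and reuse. This article focuses on validating a single XML file against multiple schema files (XML validation for multiple schemas).
The following steps show how to validate a single XML file against multiple schema (XSD) files:
1. Create the XML file to be validated
2. Generate the XSD files from the XML
3. Validate the XML file against multiple schemas
4. Run the test
Each step is described below.
1. Create the XML file to be validated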
<?xml version="1.0" encoding="utf-8" ?>
<employees xmlns:admin="http://www.company.com/management/employees/admin">
  <admin:employee>
    <admin:userId>johnsmith@company.com</admin:userId>
    <admin:password>abc123_</admin:password>
    <admin:name>John Smith</admin:name>
    <admin:age>24</admin:age>
    <admin:gender>Male</admin:gender>
  </admin:employee>
  <admin:employee>
    <admin:userId>christinechen@company.com</admin:userId>
    <admin:password>123456</admin:password>
    <admin:name>Christine Chen</admin:name>
    <admin:age>27</admin:age>
    <admin:gender>Female</admin:gender>
  </admin:employee>
</employees>
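2. Generate the XSD files from the XML
Note: in this article the XSD files are generated from the XML. If you already have XSD files, you can skip this step.
By inspecting the structure of employees.xml we could write employees.xsd by hand, but to save time we can use an XML-to-XSD conversion tool instead. Here I use trang: http://www.thaiopensource.com/relaxng/trang.html
First download the latest trang.jar, put employees.xml and trang.jar in the same directory, and run the following command: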
java -jar trang.jar employees.xml employees.xsd
After it runs, two XSD files are generated in the current directory, employees.xsd and admin.xsd, as follows:
employees.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           xmlns:admin="http://www.company.com/management/employees/admin">
  <xs:import namespace="http://www.company.com/management/employees/admin" schemaLocation="admin.xsd"/>
  <xs:element name="employees">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="admin:employee"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
admin.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified"
           targetNamespace="http://www.company.com/management/employees/admin"
           xmlns:admin="http://www.company.com/management/employees/admin">
  <xs:import schemaLocation="employees.xsd"/>
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="admin:userId"/>
        <xs:element ref="admin:password"/>
        <xs:element ref="admin:name"/>
        <xs:element ref="admin:age"/>
        <xs:element ref="admin:gender"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="userId" type="xs:string"/>
  <xs:element name="password" type="xs:NMTOKEN"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="age" type="xs:integer"/>
  <xs:element name="gender" type="xs:NCName"/>
</xs:schema>
Of course, you can also write the XSD files by hand.
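Two files come out of this step because a single XSD document declares at most one targetNamespace, while employees.xml mixes a root element in no namespace with children in the admin namespace. The sketch below is not from the original article; the class name NamespaceLister and its helper method are made up for illustration. It lists the distinct element namespaces in a document using only JDK classes, which gives a rough idea of how many schema files a document is likely to need.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class NamespaceLister {

    /** Returns the distinct element namespaces in document order ("" stands for no namespace). */
    static Set<String> elementNamespaces(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, getNamespaceURI() returns null everywhere
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            Set<String> namespaces = new LinkedHashSet<>();
            NodeList all = doc.getElementsByTagNameNS("*", "*"); // every element, any namespace
            for (int i = 0; i < all.getLength(); i++) {
                String ns = all.item(i).getNamespaceURI();
                namespaces.add(ns == null ? "" : ns);
            }
            return namespaces;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employees xmlns:admin='http://www.company.com/management/employees/admin'>"
                + "<admin:employee><admin:name>John Smith</admin:name></admin:employee>"
                + "</employees>";
        System.out.println(elementNamespaces(xml));
    }
}
```

For the sample document this reports two entries, the empty namespace of the root element and the admin namespace, which matches the two generated files.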
3. Validate the XML file against multiple schemas
Validating XML against a single schema is usually straightforward. For example:
public static boolean validateSingleSchema(File xml, File xsd) {
    boolean legal = false;
    try {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(xsd);
        Validator validator = schema.newValidator();
        validator.validate(new StreamSource(xml));
        legal = true;
    } catch (Exception e) {
        legal = false;
        log.error(e.getMessage());
    }
    return legal;
}
With multiple schemas, however, XSD files outside the classpath that are pulled in through <xs:import>/<xs:include> cannot be loaded, which produces the following error message:
org.xml.sax.SAXParseException: src-resolve: Cannot resolve the name 'admin:employee' to a(n) 'element declaration' component.
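The failure can be reproduced without any files on disk. This is a minimal sketch rather than the article's code: UnresolvedImportDemo, the urn:example:admin namespace, and missing-admin.xsd are hypothetical names. It compiles a schema from a string, so the relative schemaLocation has nothing useful to resolve against, and returns whatever message the JDK's SchemaFactory reports.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class, for illustration only.
final class UnresolvedImportDemo {

    // A main schema whose import points at a file that does not exist.
    static final String MAIN_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + "           xmlns:admin='urn:example:admin'>"
        + "  <xs:import namespace='urn:example:admin' schemaLocation='missing-admin.xsd'/>"
        + "  <xs:element name='employees'>"
        + "    <xs:complexType><xs:sequence>"
        + "      <xs:element ref='admin:employee' maxOccurs='unbounded'/>"
        + "    </xs:sequence></xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Returns the schema compilation error message, or null if the schema compiled. */
    static String compileError(String xsd) {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // No system ID is supplied, so the import cannot be located relative to this schema.
            sf.newSchema(new StreamSource(new StringReader(xsd)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(compileError(MAIN_XSD));
    }
}
```

With the schema processor bundled in the JDK I would expect a src-resolve message similar to the one above, although the exact wording can differ between implementations.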
To solve this problem we need an LSResourceResolver. While parsing a schema, SchemaFactory can use an LSResourceResolver to load external resources.
The code is as follows:
package com.javaeye.terrencexu.jaxb;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.Reader;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.log4j.Logger;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
/**
 * Implement LSResourceResolver to customize resource resolution when parsing schemas.
 * <p>
 * SchemaFactory uses a LSResourceResolver when it needs to locate external resources
 * while parsing schemas, although exactly what constitutes "locating external resources"
 * is up to each schema language.
 * </p>
 * <p>
 * For example, for W3C XML Schema, this includes files {@code <include>}d or {@code <import>}ed,
 * and DTD referenced from schema files, etc.
 * </p>
 */
class SchemaResourceResolver implements LSResourceResolver {

    private static final Logger log = Logger.getLogger(SchemaResourceResolver.class);

    /**
     * Allow the application to resolve external resources.
     *
     * <p>
     * The LSParser will call this method before opening any external resource, including
     * the external DTD subset, external entities referenced within the DTD, and external
     * entities referenced within the document element (however, the top-level document
     * entity is not passed to this method). The application may then request that the
     * LSParser resolve the external resource itself, that it use an alternative URI,
     * or that it use an entirely different input source.
     * </p>
     *
     * <p>
     * Application writers can use this method to redirect external system identifiers to
     * secure and/or local URI, to look up public identifiers in a catalogue, or to read
     * an entity from a database or other input source (including, for example, a dialog box).
     * </p>
     */
    public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
        log.info("\n>> Resolving " + "\n"
                + "TYPE: " + type + "\n"
                + "NAMESPACE_URI: " + namespaceURI + "\n"
                + "PUBLIC_ID: " + publicId + "\n"
                + "SYSTEM_ID: " + systemId + "\n"
                + "BASE_URI: " + baseURI + "\n");

        // Nothing to locate: returning null lets the parser fall back to its default lookup.
        if (systemId == null) {
            return null;
        }
        try {
            // Resolve a relative schemaLocation against the schema that references it.
            URI uri = new URI(systemId);
            if (!uri.isAbsolute() && baseURI != null) {
                uri = new URI(baseURI).resolve(uri);
            }
            // Only local files are opened here; http:// and other schemes are left to the parser.
            if (!"file".equals(uri.getScheme())) {
                return null;
            }
            LSInput lsInput = new LSInputImpl();
            lsInput.setPublicId(publicId);
            lsInput.setSystemId(uri.toString());
            lsInput.setBaseURI(baseURI);
            lsInput.setByteStream(new FileInputStream(new File(uri)));
            return lsInput;
        } catch (URISyntaxException | FileNotFoundException | IllegalArgumentException e) {
            // Malformed URI, missing file, or a file: URI that cannot be mapped to a path.
            log.error("Cannot resolve " + systemId + " against " + baseURI + ": " + e.getMessage());
            return null;
        }
    }

    /**
     * Represents an input source for data.
     */
    class LSInputImpl implements LSInput {

        private String publicId;
        private String systemId;
        private String baseURI;
        private InputStream byteStream;
        private Reader charStream;
        private String stringData;
        private String encoding;
        private boolean certifiedText;

        public LSInputImpl() {}

        public LSInputImpl(String publicId, String systemId, InputStream byteStream) {
            this.publicId = publicId;
            this.systemId = systemId;
            this.byteStream = byteStream;
        }

        public String getBaseURI() {
            return baseURI;
        }

        public InputStream getByteStream() {
            return byteStream;
        }

        public boolean getCertifiedText() {
            return certifiedText;
        }

        public Reader getCharacterStream() {
            return charStream;
        }

        public String getEncoding() {
            return encoding;
        }

        public String getPublicId() {
            return publicId;
        }

        public String getStringData() {
            return stringData;
        }

        public String getSystemId() {
            return systemId;
        }

        public void setBaseURI(String baseURI) {
            this.baseURI = baseURI;
        }

        public void setByteStream(InputStream byteStream) {
            this.byteStream = byteStream;
        }

        public void setCertifiedText(boolean certifiedText) {
            this.certifiedText = certifiedText;
        }

        public void setCharacterStream(Reader characterStream) {
            this.charStream = characterStream;
        }

        public void setEncoding(String encoding) {
            this.encoding = encoding;
        }

        public void setPublicId(String publicId) {
            this.publicId = publicId;
        }

        public void setStringData(String stringData) {
            this.stringData = stringData;
        }

        public void setSystemId(String systemId) {
            this.systemId = systemId;
        }
    }
}
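The resolver does not have to read from the file system, and the LSInput bean does not have to be written by hand. The following sketch uses assumed names (InMemoryResolverDemo, urn:example:admin) and a cut-down pair of schemas. It serves the imported schema from a map keyed by namespace and obtains an LSInput from the DOM Level 3 DOMImplementationLS factory. It illustrates the same callback in a different setting and is not meant as a replacement for the class above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

// Hypothetical demo class, for illustration only.
final class InMemoryResolverDemo {

    static final String ADMIN_NS = "urn:example:admin"; // made-up namespace

    static final String MAIN_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:admin='" + ADMIN_NS + "'>"
        + "  <xs:import namespace='" + ADMIN_NS + "' schemaLocation='admin.xsd'/>"
        + "  <xs:element name='employees'><xs:complexType><xs:sequence>"
        + "    <xs:element ref='admin:employee' maxOccurs='unbounded'/>"
        + "  </xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static final String ADMIN_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + "           targetNamespace='" + ADMIN_NS + "' elementFormDefault='qualified'>"
        + "  <xs:element name='employee'><xs:complexType><xs:sequence>"
        + "    <xs:element name='name' type='xs:string'/>"
        + "    <xs:element name='age' type='xs:integer'/>"
        + "  </xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Creates an empty LSInput through the standard DOM Load and Save factory. */
    static LSInput newInput() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return ((DOMImplementationLS) registry.getDOMImplementation("LS")).createLSInput();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM LS implementation available", e);
        }
    }

    /** Serves imported schemas from a namespace-to-text map instead of the file system. */
    static LSResourceResolver resolverFor(Map<String, String> schemaByNamespace) {
        return (type, namespaceURI, publicId, systemId, baseURI) -> {
            String text = namespaceURI == null ? null : schemaByNamespace.get(namespaceURI);
            if (text == null) {
                return null; // unknown namespace: let the parser use its default lookup
            }
            LSInput input = newInput();
            input.setPublicId(publicId);
            input.setSystemId(systemId);
            input.setBaseURI(baseURI);
            input.setCharacterStream(new StringReader(text));
            return input;
        };
    }

    /** Compiles the main schema with the in-memory resolver and validates the given XML text. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            sf.setResourceResolver(resolverFor(Collections.singletonMap(ADMIN_NS, ADMIN_XSD)));
            Schema schema = sf.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<employees xmlns:admin='" + ADMIN_NS + "'><admin:employee>"
                + "<admin:name>John Smith</admin:name><admin:age>24</admin:age>"
                + "</admin:employee></employees>";
        System.out.println("valid: " + isValid(xml));
    }
}
```

Keying the lookup on the namespace rather than the schemaLocation is a design choice for this sketch; a file-based resolver such as the one above keys on the system ID instead.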
The last step is to create a validator class that encapsulates the XML validation logic:
package com.javaeye.terrencexu.jaxb;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.apache.log4j.Logger;
import org.xml.sax.SAXException;
public final class XMLParser {

    private static final Logger log = Logger.getLogger(XMLParser.class);

    private XMLParser() {}

    public static boolean validateWithSingleSchema(File xml, File xsd) {
        boolean legal = false;
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(xsd);
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xml));
            legal = true;
        } catch (Exception e) {
            legal = false;
            log.error(e.getMessage());
        }
        return legal;
    }

    public static boolean validateWithMultiSchemas(InputStream xml, List<File> schemas) {
        boolean legal = false;
        try {
            Schema schema = createSchema(schemas);
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xml));
            legal = true;
        } catch (Exception e) {
            legal = false;
            log.error(e.getMessage());
        }
        return legal;
    }

    /**
     * Create a Schema object from the given schema files.
     *
     * @param schemas the XSD files to combine into one Schema
     * @return the compiled Schema
     * @throws ParserConfigurationException if no DocumentBuilder can be created
     * @throws SAXException if a schema file cannot be parsed or compiled
     * @throws IOException if a schema file cannot be read
     */
    private static Schema createSchema(List<File> schemas) throws ParserConfigurationException, SAXException, IOException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        sf.setResourceResolver(new SchemaResourceResolver());

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setValidating(false);
        docFactory.setNamespaceAware(true);
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        Source[] sources = new Source[schemas.size()];
        for (int i = 0; i < schemas.size(); i++) {
            File schemaFile = schemas.get(i);
            org.w3c.dom.Document doc = docBuilder.parse(schemaFile);
            // Use a file: URI as the system ID so that relative schemaLocation values resolve against it.
            sources[i] = new DOMSource(doc, schemaFile.toURI().toString());
        }
        return sf.newSchema(sources);
    }
}
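The validator above stops at the first problem and reports only a boolean. When a full list of problems is more useful, a custom org.xml.sax.ErrorHandler can collect recoverable errors instead of throwing. The sketch below is an assumption-laden illustration and not part of the original class: CollectingValidatorDemo and its inline schema are made up, and the schema is reduced to a single namespace to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, for illustration only.
final class CollectingValidatorDemo {

    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='employee'><xs:complexType><xs:sequence>"
        + "    <xs:element name='name' type='xs:string'/>"
        + "    <xs:element name='age' type='xs:integer'/>"
        + "    <xs:element name='gender' type='xs:NCName'/>"
        + "  </xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns every validation problem found; an empty list means the XML is valid. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // Warnings do not make the document invalid, so they are ignored here.
                }
                public void error(SAXParseException e) {
                    problems.add(describe(e)); // record and keep validating
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // the document is not well-formed; validation cannot continue
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        String xml = "<employee><name>John Smith</name><age>old</age><gender>not a name</gender></employee>";
        for (String problem : validate(xml)) {
            System.out.println(problem);
        }
    }
}
```

The same handler could be installed in validateWithMultiSchemas before calling validate, at the cost of changing its return type from boolean to a list.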
4. Run the test
public static void testValidate() throws IOException {
    String dir = "C:\\eclipse\\workspace1\\JavaStudy\\test\\";
    List<File> schemas = new ArrayList<File>();
    schemas.add(new File(dir + "employees.xsd"));
    schemas.add(new File(dir + "admin.xsd"));
    // Close the XML stream once validation has finished.
    try (InputStream xml = new FileInputStream(new File(dir + "employees.xml"))) {
        boolean valid = XMLParser.validateWithMultiSchemas(xml, schemas);
        System.out.println("employees.xml valid: " + valid);
    }
}
Note: if the two schema files are in the same directory, it is enough to pass only the main schema file (employees.xsd); SchemaResourceResolver will load admin.xsd for us.
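The note can be checked with a throwaway directory. The sketch below is an illustration under stated assumptions: MainSchemaOnlyDemo, the urn:example:admin namespace, and the cut-down schemas are hypothetical, and the resolver only records what it is asked for before returning null, which asks the parser to open the resource itself. On the JDK's bundled schema processor I would expect the relative admin.xsd to be found next to employees.xsd because the main schema is loaded from a File and therefore has a file: system ID; other runtimes or stricter security settings may behave differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class, for illustration only.
final class MainSchemaOnlyDemo {

    static final String MAIN_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:admin='urn:example:admin'>"
        + "  <xs:import namespace='urn:example:admin' schemaLocation='admin.xsd'/>"
        + "  <xs:element name='employees'><xs:complexType><xs:sequence>"
        + "    <xs:element ref='admin:employee' maxOccurs='unbounded'/>"
        + "  </xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static final String ADMIN_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + "           targetNamespace='urn:example:admin' elementFormDefault='qualified'>"
        + "  <xs:element name='employee'><xs:complexType><xs:sequence>"
        + "    <xs:element name='name' type='xs:string'/>"
        + "  </xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** System IDs the schema compiler asked the resolver for. */
    static final List<String> requested = new ArrayList<>();

    /** Writes both schemas to a temp directory, compiles only the main one, and validates the XML. */
    static boolean validate(String xml) {
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Path main = Files.write(dir.resolve("employees.xsd"), MAIN_XSD.getBytes(StandardCharsets.UTF_8));
            Path admin = Files.write(dir.resolve("admin.xsd"), ADMIN_XSD.getBytes(StandardCharsets.UTF_8));
            main.toFile().deleteOnExit();
            admin.toFile().deleteOnExit();

            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            sf.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
                requested.add(systemId); // record the request
                return null;             // null: the parser opens the resource itself
            });
            Schema schema = sf.newSchema(main.toFile()); // only the main schema is passed in
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<employees xmlns:admin='urn:example:admin'>"
                + "<admin:employee><admin:name>John Smith</admin:name></admin:employee></employees>";
        System.out.println("valid: " + validate(xml) + ", requested: " + requested);
    }
}
```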