TinyXmlParser Is Now Open Source

weixin_33895516

Original link: https://my.oschina.net/tinyframework/blog/194574

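Advantages:

An efficient, simple, and easy-to-use XML parser.

Learning it takes only a few minutes.

Supports Chinese tag names and attribute names, as well as separators such as underscores and hyphens.

Parsing is very fast, lookups are very fast, and formatting is supported.

Limitation: XML Schema and DTD validation are not supported.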

Maven coordinates:

<dependency>
<groupId>org.tinygroup</groupId>
<artifactId>xmlparser</artifactId>
<version>0.0.12</version>
</dependency>

Parse the following XML:

<?xml version="1.0"?>
<students>
    <student>
        <name>John</name>
        <grade>B</grade>
        <age>12</age>
    </student>
    <student>
        <name>Mary</name>
        <grade>A</grade>
        <age>11</age>
    </student>
    <student>
        <name>Simon</name>
        <grade>A</grade>
        <age>18</age>
    </student>
</students>

Sample code:

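import java.io.File;
import java.io.FileInputStream;
// XmlStringParser, XmlDocument, XmlNode and IOUtils come from the Tiny framework
// (the xmlparser artifact above and its dependencies); import them from the
// packages used by your version.

public class TestXmlParser {
    public static void main(String[] args) throws Throwable {
        // Path to the sample file shown above (no trailing space in the name)
        File file = new File("E:/test/students.xml");
        XmlStringParser parser = new XmlStringParser();
        // Read the whole file as UTF-8 text, then parse the resulting string
        XmlDocument document = parser.parse(IOUtils.readFromInputStream(
                new FileInputStream(file), "utf-8"));
        printStudents(document.getRoot());
    }

    // Prints every <student> child of the <students> root
    private static void printStudents(XmlNode studentsNode) {
        for (XmlNode studentNode : studentsNode.getSubNodes("student")) {
            printStudent(studentNode);
        }
    }

    // Prints the name, grade and age of one student
    private static void printStudent(XmlNode studentNode) {
        printSubTagByName(studentNode, "name");
        printSubTagByName(studentNode, "grade");
        printSubTagByName(studentNode, "age");
    }

    // Prints the text content of the first child tag with the given name
    private static void printSubTagByName(XmlNode studentNode, String tagName) {
        System.out.println(studentNode.getSubNode(tagName).getContent());
    }
}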

Formatting example:

XmlDocument doc;
doc = new XmlStringParser()
        .parse("<html 中='文'><head><title>aaa</title></head></html>");
XmlFormater f = new XmlFormater();
System.out.println(f.format(doc));

Output:

<html 中="文">
  <head>
    <title>
      aaa      
    </title>
  </head>
</html>

Performance test:

Build a node tree of the following size:

HtmlNode node = null;

	public NameFilterTest() {
		node = new HtmlNode("root");
		for (int i = 0; i < 60; i++) {
			HtmlNode a = node.addNode(new HtmlNode("a" + i));
			for (int j = 0; j < 60; j++) {
				HtmlNode b = a.addNode(new HtmlNode("b" + j));
				for (int k = 0; k < 60; k++) {
					b.addNode(new HtmlNode("c" + k));
				}
			}
		}
	}
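
The constructor above builds a three-level tree with a fan-out of 60 under a single root. As a quick check of the resulting size, the sketch below sums the levels and adds the root. `NodeCount` and `total` are illustrative names and are not part of TinyXmlParser.

```java
// Illustrative helper (not part of TinyXmlParser): counts the nodes of a tree
// in which every node down to the given depth has the same number of children.
final class NodeCount {
    /** Returns 1 (the root) + fanout + fanout^2 + ... + fanout^depth. */
    static long total(int fanout, int depth) {
        long total = 1; // the root node
        long level = 1; // number of nodes on the current level
        for (int d = 0; d < depth; d++) {
            level *= fanout;
            total += level;
        }
        return total;
    }

    public static void main(String[] args) {
        // 1 + 60 + 3600 + 216000
        System.out.println(total(60, 3)); // prints 219661
    }
}
```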

That is, with 60 + 60×60 + 60×60×60 = 219,660 nodes below the root (219,661 including the root), run the following lookup:

// Time the one-off initialization of the filter
long t21 = System.currentTimeMillis();
FastNameFilter fast = new FastNameFilter(node);
long t22 = System.currentTimeMillis();
System.out.println("fast init time (ms): " + (t22 - t21));
// Time 10,000 lookups of the same node name
long t1 = System.currentTimeMillis();
String nodeName = null;
for (int x = 0; x < 10000; x++) {
	nodeName = fast.findNode("b6").getNodeName();
}
long t2 = System.currentTimeMillis();
System.out.println("FastNameFilter time (ms): " + (t2 - t1));

Output:

fast init time (ms): 130
FastNameFilter time (ms): 39

In other words, on a tree of 219,661 nodes, looking up a given node 10,000 times took only 39 ms. Is there anything faster?
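
The article does not show how FastNameFilter works internally. One plausible reading of the numbers is that the initialization step builds an index from node name to nodes, so that each later lookup is a single hash-map access. The sketch below illustrates only that idea. `SimpleNode` and `NameIndex` are hypothetical names, and this is an assumption, not the library's implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Assumed technique: index every node by name once, then answer lookups from the map.
final class NameIndex {
    private final Map<String, List<SimpleNode>> byName = new HashMap<>();

    NameIndex(SimpleNode root) {
        // Iterative depth-first walk, so deep trees cannot overflow the call stack.
        Deque<SimpleNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            SimpleNode n = stack.pop();
            byName.computeIfAbsent(n.name, k -> new ArrayList<>()).add(n);
            // Push children in reverse so they are visited in document order.
            for (int i = n.children.size() - 1; i >= 0; i--) {
                stack.push(n.children.get(i));
            }
        }
    }

    // Returns the first node with the given name in document order, or null if none.
    SimpleNode findNode(String name) {
        List<SimpleNode> hits = byName.get(name);
        return hits == null ? null : hits.get(0);
    }

    // Returns all nodes with the given name (an empty list if none).
    List<SimpleNode> findNodeList(String name) {
        return byName.getOrDefault(name, new ArrayList<>());
    }
}

// Hypothetical minimal tree node; stands in for the library's node type.
final class SimpleNode {
    final String name;
    final List<SimpleNode> children = new ArrayList<>();

    SimpleNode(String name) {
        this.name = name;
    }

    // Adds a child and returns it, mirroring the addNode calls in the test setup.
    SimpleNode addNode(SimpleNode child) {
        children.add(child);
        return child;
    }
}
```

With this design the cost of walking the tree is paid once in the constructor, which would match the pattern of a comparatively slow initialization followed by very cheap repeated lookups.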

If that were all, it would not be much to talk about. The filtering it provides covers the vast majority of use cases. Here is the interface:

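import java.util.List;
import java.util.Map;

public interface NodeFilter<T extends Node<T>> {
	/**
	 * Initializes the filter with a node.
	 * 
	 * @param node the node to initialize with
	 */
	void init(T node);

	/**
	 * Sets attributes that must be present, together with the values they must have.
	 * 
	 * @param includeAttributes attribute names mapped to required values
	 */
	void setIncludeAttribute(Map<String, String> includeAttributes);

	/**
	 * Sets a single attribute that must be present, together with the value it must have.
	 * 
	 * @param key   attribute name
	 * @param value required attribute value
	 */
	void setIncludeAttribute(String key, String value);

	/**
	 * Sets attributes that must be present.
	 * 
	 * @param includeAttribute attribute names
	 */
	void setIncludeAttributes(String... includeAttribute);

	/**
	 * Sets attribute/value pairs that must be excluded. A node that has the
	 * attribute with a value different from the one in the map is accepted; a
	 * node that has the attribute with the same value as in the map is rejected.
	 * 
	 * @param excludeAttribute attribute names mapped to excluded values
	 */
	void setExcludeAttribute(Map<String, String> excludeAttribute);

	/**
	 * Sets attributes that must be excluded; the named attributes must not be present.
	 * 
	 * @param excludeAttribute attribute names
	 */
	void setExcludeAttribute(String... excludeAttribute);

	/**
	 * Sets text that must be included; the node content only needs to contain the given value.
	 * 
	 * @param includeText text fragments
	 */
	void setIncludeText(String... includeText);

	/**
	 * Sets text that must be excluded.
	 * 
	 * @param excludeText text fragments
	 */
	void setExcludeText(String... excludeText);

	/**
	 * Sets child nodes that must be present.
	 * 
	 * @param includeNode child node names
	 */
	void setIncludeNode(String... includeNode);

	/**
	 * Sets node names that are not allowed for the parent node.
	 * 
	 * @param excludeByNode parent node names
	 */
	void setExcludeByNode(String... excludeByNode);

	/**
	 * Sets node names that are required of the parent node.
	 * 
	 * @param includeByNode parent node names
	 */
	void setIncludeByNode(String... includeByNode);

	/**
	 * Sets child nodes that must be excluded.
	 * 
	 * @param excludeNode child node names
	 */
	void setExcludeNode(String... excludeNode);

	/**
	 * Requires at least one child node with one of the given names.
	 * 
	 * @param xorSubNode child node names
	 */
	void setXorSubNode(String... xorSubNode);

	/**
	 * Requires at least one attribute with one of the given names.
	 * 
	 * @param xorProperties attribute names
	 */
	void setXorProperties(String... xorProperties);

	/**
	 * Clears the filter conditions.
	 */
	void clearCondition();

	/**
	 * Sets the name of the node to search for.
	 */
	void setNodeName(String nodeName);

	/**
	 * Finds the list of nodes that have the given name and satisfy the other conditions.
	 * 
	 * @param nodeName name of the nodes to find
	 * @return the matching nodes
	 */
	List<T> findNodeList(String nodeName);

	/**
	 * Finds a node by name and the other conditions; if there are several
	 * matches, only the first is returned.
	 * 
	 * @param nodeName name of the node to find
	 * @return the first matching node
	 */
	T findNode(String nodeName);

	/**
	 * Searches for a node matching the configured node name; if there are
	 * several matches, only the first one found is returned.
	 * 
	 * @return the first matching node
	 */
	T findNode();

	/**
	 * Searches for the list of nodes matching the configured node name.
	 * 
	 * @return the matching nodes
	 */
	List<T> findNodeList();
}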

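As the interface shows, it supports filtering by attribute and attribute value, by attribute name, by excluded attribute name, by contained text, by contained child node name, by the name of the containing node, by excluded child node name, by at least one of a set of child node names, by at least one of a set of attributes, and by node name. These conditions can be combined.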
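
To make "combined" concrete, the sketch below models a few of these conditions as `java.util.function.Predicate` values joined with `and`. It does not use the TinyXmlParser classes: `Elem`, `Conditions`, and the helper methods are hypothetical stand-ins, and the real `NodeFilter` implementations may evaluate their conditions differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helpers: each method returns one condition; conditions compose with and().
final class Conditions {
    // Node name must match.
    static Predicate<Elem> named(String name) {
        return e -> e.name.equals(name);
    }

    // Attribute must be present with exactly this value.
    static Predicate<Elem> includeAttribute(String key, String value) {
        return e -> value.equals(e.attributes.get(key));
    }

    // Attribute must be absent.
    static Predicate<Elem> excludeAttribute(String key) {
        return e -> !e.attributes.containsKey(key);
    }

    // Text content must contain the fragment.
    static Predicate<Elem> includeText(String fragment) {
        return e -> e.text.contains(fragment);
    }

    // Returns every element that satisfies the combined condition, in input order.
    static List<Elem> filter(List<Elem> elems, Predicate<Elem> condition) {
        List<Elem> out = new ArrayList<>();
        for (Elem e : elems) {
            if (condition.test(e)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Elem> elems = Arrays.asList(
                new Elem("student").attr("grade", "A").text("Mary"),
                new Elem("student").attr("grade", "A").attr("absent", "true").text("Simon"),
                new Elem("student").attr("grade", "B").text("John"));
        // name = student AND grade = "A" AND no "absent" attribute
        Predicate<Elem> condition = named("student")
                .and(includeAttribute("grade", "A"))
                .and(excludeAttribute("absent"));
        for (Elem e : filter(elems, condition)) {
            System.out.println(e.text); // prints Mary
        }
    }
}

// Hypothetical element type with a name, attributes, and text content.
final class Elem {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    String text = "";

    Elem(String name) {
        this.name = name;
    }

    Elem attr(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    Elem text(String value) {
        this.text = value;
        return this;
    }
}
```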

With node filtering this powerful, working with XML becomes much simpler and more convenient for programmers.

Reposted from: https://my.oschina.net/tinyframework/blog/194574