Parsing XML with dom4j

iteye_123

Article link: https://blog.csdn.net/iteye_123/article/details/82443392

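In XML, everything can be treated as a node (written as "node" below).
The DOM defines the following:
The whole document: document node
Each XML element: element node
Text inside an XML element: text node
An attribute: attribute node
A comment: comment node
CDATA: CDATA section node
These are the commonly used node types; for the others, see: DOM Node Types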

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>

<price>30.00</price>

</book>

<book category="children" title="Harry Potter" author="J K. Rowling" year="2005" price="$29.9"/>



<![CDATA[
Everything in here is ignored by the parser
]]>

</bookstore>
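In this example (referred to later as snippet 2-1):
The whole document is a document node.
bookstore, book, title, author and so on are all element nodes.
category is an attribute node of book; likewise, lang is an attribute node of title.
Note in particular that "Giada De Laurentiis" inside <author> is the value of a text node, not the value of the <author> element itself.
<!-- --> marks a comment in XML, and a comment is also a node.
A CDATA section is also a node.

XML DOM defines a Node object to describe nodes, together with its properties and methods. Every node type mentioned above inherits from this Node object. For details, see: DOM Node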
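The node kinds listed above can be made visible by walking a parsed tree. The following is a minimal sketch that uses the W3C DOM API bundled with the JDK (not dom4j) so that it runs without extra libraries. The class name NodeTypesDemo, the shortened sample and its comment and CDATA text are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTypesDemo {
    // Shortened variant of the bookstore example; comment and CDATA text are invented for the demo.
    static final String SAMPLE =
        "<bookstore>"
        + "<book category=\"cooking\"><title lang=\"en\">Everyday Italian</title></book>"
        + "<!-- a comment -->"
        + "<![CDATA[raw <text>]]>"
        + "</bookstore>";

    /** Returns one line per node ("kind value"), parents before children. */
    static List<String> describe(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            walk(doc, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static void walk(Node node, List<String> lines) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: lines.add("document"); break;
            case Node.ELEMENT_NODE: lines.add("element " + node.getNodeName()); break;
            case Node.TEXT_NODE: lines.add("text " + node.getNodeValue().trim()); break;
            case Node.COMMENT_NODE: lines.add("comment " + node.getNodeValue().trim()); break;
            case Node.CDATA_SECTION_NODE: lines.add("cdata " + node.getNodeValue()); break;
            default: lines.add("other " + node.getNodeName());
        }
        NamedNodeMap attrs = node.getAttributes(); // null for everything except elements
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                lines.add("attribute " + attr.getNodeName() + "=" + attr.getNodeValue());
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, lines);
        }
    }

    public static void main(String[] args) {
        describe(SAMPLE).forEach(System.out::println);
    }
}
```

Note that "Everyday Italian" shows up as its own text line under the title element, which is the point made above about <author>: the string belongs to a text node, not to the element.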

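Points to note:

1) Not every property and method of the Node object applies to every kind of node. For example, Node defines methods for working with child nodes, but a text node has no children, so those calls return nothing useful or fail outright (adding a child to a text node raises an error);

2) Data that would otherwise be stored in child nodes can instead be written as attributes of the parent node, as in the second book node of snippet 2-1. If the book element then needs no child nodes, its tag can be closed with / (a self-closing tag);

3) An XML document must have exactly one root element, and every other element is a descendant of it.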
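The three notes above can be checked directly. This is a hedged sketch using the JDK's built-in W3C DOM API (not dom4j); the class NodeNotesDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NodeNotesDemo {
    /** Parses a string, or returns null when the XML is not well-formed. */
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrow fatal errors without printing them
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Note 1: a text node has no children and refuses to accept one. */
    static boolean textNodeRejectsChildren() {
        Document doc = parseOrNull("<author>Giada De Laurentiis</author>");
        Node text = doc.getDocumentElement().getFirstChild();
        if (text.hasChildNodes() || text.getChildNodes().getLength() != 0) {
            return false;
        }
        try {
            text.appendChild(doc.createElement("child"));
            return false; // not expected: the DOM should reject this
        } catch (DOMException e) {
            return true;
        }
    }

    /** Note 2: the same datum may live in an attribute or in a child element. */
    static String titleOf(Element book) {
        if (book.hasAttribute("title")) {
            return book.getAttribute("title");
        }
        Node child = book.getElementsByTagName("title").item(0);
        return child == null ? null : child.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("text node rejects children: " + textNodeRejectsChildren());
        // Note 3: two top-level elements means the document is not well-formed.
        System.out.println("two roots parse: " + (parseOrNull("<a/><b/>") != null));
    }
}
```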

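Taking the XML in snippet 2-1 as the input to parse, these are the main dom4j types and methods involved:

class org.dom4j.io.SAXReader

read: offers several overloads for reading XML (from a File, URL, InputStream, Reader and so on) and returns a Document object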
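As an illustration of the read overloads, here is a small sketch that reads the same XML once through a Reader and once through an InputStream. It assumes dom4j is on the classpath; the class ReadDemo and its helper names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

class ReadDemo {
    static final String XML = "<bookstore><book category=\"cooking\"/></bookstore>";

    /** Reads XML through the read(Reader) overload. */
    static Document fromReader(String xml) {
        try {
            return new SAXReader().read(new StringReader(xml));
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    /** Reads XML through the read(InputStream) overload. */
    static Document fromStream(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return new SAXReader().read(new ByteArrayInputStream(bytes));
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromReader(XML).getRootElement().getName());
        System.out.println(fromStream(XML).getRootElement().getName());
    }
}
```

DocumentException is a checked exception, so callers either declare it or, as in this sketch, wrap it.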

interface org.dom4j.Document

nodeIterator: returns an iterator over the document's child nodes
getRootElement: returns the root element

interface org.dom4j.Node

getName: returns the node's name, e.g. bookstore for the root element
getNodeType: returns the node type as a numeric constant, e.g. 1 (Element) for bookstore
getNodeTypeName: returns the name of the node type, e.g. Element for bookstore
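A short sketch of these three Node methods applied to the root element follows. It assumes dom4j is on the classpath and uses DocumentHelper.parseText to avoid needing a file; NodeInfoDemo and its methods are illustrative names only.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

class NodeInfoDemo {
    /** Parses the string and returns its root element. */
    static Element rootOf(String xml) {
        try {
            return DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    /** Joins name, numeric type and type name of a node. */
    static String describe(Node node) {
        return node.getName() + " / " + node.getNodeType() + " / " + node.getNodeTypeName();
    }

    public static void main(String[] args) {
        Element root = rootOf("<bookstore><book/></bookstore>");
        System.out.println(describe(root));
        // Comparing against the constant reads better than comparing against 1.
        System.out.println(root.getNodeType() == Node.ELEMENT_NODE);
    }
}
```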
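
interface org.dom4j.Element

attributes: returns the list of the element's attributes
attributeValue: returns the value of the attribute with the given name
elementIterator: returns an iterator over the child elements
elements: returns a list of the child elements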
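The Element methods above could be exercised on the two book elements of snippet 2-1 roughly as follows. This is a sketch assuming dom4j is on the classpath; ElementDemo and its helpers are hypothetical. The loops go through Object and cast, so the code does not depend on whether the installed dom4j version returns generic collections.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.dom4j.Attribute;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class ElementDemo {
    static final String XML =
        "<bookstore>"
        + "<book category=\"cooking\"><title lang=\"en\">Everyday Italian</title>"
        + "<author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book>"
        + "<book category=\"children\" title=\"Harry Potter\" author=\"J K. Rowling\""
        + " year=\"2005\" price=\"$29.9\"/>"
        + "</bookstore>";

    /** Returns the book at the given 0-based position, using elements(). */
    static Element book(int index) {
        try {
            Element root = DocumentHelper.parseText(XML).getRootElement();
            return (Element) root.elements().get(index);
        } catch (DocumentException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Collects attribute names and values, using attributes(). */
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Object o : element.attributes()) {
            Attribute attribute = (Attribute) o;
            result.put(attribute.getName(), attribute.getValue());
        }
        return result;
    }

    /** Collects child element names, using elementIterator(). */
    static List<String> childNames(Element element) {
        List<String> names = new ArrayList<>();
        for (Iterator<?> it = element.elementIterator(); it.hasNext();) {
            names.add(((Element) it.next()).getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(childNames(book(0)));
        System.out.println(attributesOf(book(1)));
        System.out.println(book(1).attributeValue("title"));
    }
}
```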
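
interface org.dom4j.Attribute

getName: returns the attribute name
getValue: returns the attribute value

interface org.dom4j.Text

getText: returns the value of the Text node

interface org.dom4j.CDATA

getText: returns the content of the CDATA section

interface org.dom4j.Comment

getText: returns the comment text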
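To show how Text, CDATA and Comment nodes can be told apart, the sketch below walks the direct content of one element and calls getText on each node. It assumes dom4j is on the classpath and that the parser is left at its defaults (comments kept, CDATA sections reported as such); the class name and the sample string are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.CDATA;
import org.dom4j.Comment;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.Text;

class CharacterNodesDemo {
    // Hypothetical sample: a comment, a CDATA section and plain text inside one element.
    static final String XML =
        "<url><!-- primary DB --><![CDATA[jdbc:mysql://localhost:3306/db_test?a=1&b=2]]>tail</url>";

    /** Lists the direct child nodes of the root as "Kind: text". */
    static List<String> describeContent(String xml) {
        Element root;
        try {
            root = DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < root.nodeCount(); i++) {
            Node node = root.node(i);
            if (node instanceof CDATA) {
                lines.add("CDATA: " + node.getText());
            } else if (node instanceof Comment) {
                lines.add("Comment: " + node.getText().trim());
            } else if (node instanceof Text) {
                lines.add("Text: " + node.getText());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        describeContent(XML).forEach(System.out::println);
    }
}
```

The & inside the CDATA section needs no escaping, which is why connection URLs are a typical use for CDATA.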

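import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

/**
 * Parses an XML file that contains DB connection information.
 * The file must follow these rules:
 * 1. At most three levels; the node names at each level are up to you.
 * 2. Second-level nodes may carry attributes; attributes are treated like child nodes.
 * 3. CDATA must be wrapped in a node and cannot appear on its own.
 *
 * Example 1, three levels:
 * <db-connections>
 *   <connection>
 *     <name>DBTest</name>
 *     <jndi></jndi>
 *     <url>
 *       <![CDATA[jdbc:mysql://localhost:3306/db_test?useUnicode=true&characterEncoding=UTF8]]>
 *     </url>
 *     <driver>org.gjt.mm.mysql.Driver</driver>
 *     <user>test</user>
 *     <password>test2012</password>
 *     <max-active>10</max-active>
 *     <max-idle>10</max-idle>
 *     <min-idle>2</min-idle>
 *     <max-wait>10</max-wait>
 *     <validation-query>SELECT 1+1</validation-query>
 *   </connection>
 * </db-connections>
 *
 * Example 2, node attributes:
 * <bookstore>
 *   <book category="cooking">
 *     <title lang="en">Everyday Italian</title>
 *     <author>Giada De Laurentiis</author>
 *     <year>2005</year>
 *     <price>30.00</price>
 *   </book>
 *
 *   <book category="children" title="Harry Potter" author="J K. Rowling" year="2005" price="$29.9"/>
 * </bookstore>
 *
 * @param configFile path of the XML file, resolved as a classpath resource
 * @return one map per second-level node, keyed by attribute name or child node name
 * @throws Exception if the file cannot be read or parsed
 */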
public static List<Map<String, String>> parseDBXML(String configFile) throws Exception {
    List<Map<String, String>> dbConnections = new ArrayList<Map<String, String>>();
    // Parser is the class that contains this method; try-with-resources closes the stream.
    try (InputStream is = Parser.class.getResourceAsStream(configFile)) {
        if (is == null) {
            throw new IllegalArgumentException("Config file not found on classpath: " + configFile);
        }
        Document document = new SAXReader().read(is);
        Element connections = document.getRootElement();

        // Each second-level element becomes one map.
        Iterator<Element> rootIter = connections.elementIterator();
        while (rootIter.hasNext()) {
            Element connection = rootIter.next();
            Map<String, String> connectionInfo = new HashMap<String, String>();
            // Add the element's attributes.
            List<Attribute> attributes = connection.attributes();
            for (Attribute attribute : attributes) {
                connectionInfo.put(attribute.getName(), attribute.getValue());
            }
            // Add the child elements; a child overwrites an attribute of the same name.
            Iterator<Element> childIter = connection.elementIterator();
            while (childIter.hasNext()) {
                Element child = childIter.next();
                connectionInfo.put(child.getName().trim(), child.getText().trim());
            }
            dbConnections.add(connectionInfo);
        }
    }

    return dbConnections;
}
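To see what the method returns for the two documented examples, here is a self-contained sketch of the same flattening logic that reads from a string instead of a classpath resource. That change, the class name FlattenDemo and the trimmed sample documents are assumptions made so the example needs no config file; dom4j is assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.dom4j.Attribute;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class FlattenDemo {
    static final String CONNECTIONS =
        "<db-connections><connection><name>DBTest</name><url>\n"
        + "<![CDATA[jdbc:mysql://localhost:3306/db_test?useUnicode=true&characterEncoding=UTF8]]>\n"
        + "</url><user>test</user></connection></db-connections>";

    static final String BOOKS =
        "<bookstore><book category=\"cooking\"><title lang=\"en\">Everyday Italian</title></book>"
        + "<book category=\"children\" title=\"Harry Potter\"/></bookstore>";

    /** Turns every second-level element into a map of its attributes and child texts. */
    static List<Map<String, String>> flatten(String xml) {
        Element root;
        try {
            root = DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        List<Map<String, String>> records = new ArrayList<>();
        for (Object r : root.elements()) {
            Element record = (Element) r;
            Map<String, String> info = new LinkedHashMap<>();
            for (Object a : record.attributes()) {
                Attribute attribute = (Attribute) a;
                info.put(attribute.getName(), attribute.getValue());
            }
            for (Object c : record.elements()) {
                Element child = (Element) c;
                // getText() includes CDATA content; trim() drops the surrounding line breaks.
                info.put(child.getName().trim(), child.getText().trim());
            }
            records.add(info);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(flatten(CONNECTIONS));
        System.out.println(flatten(BOOKS));
    }
}
```

Attributes on third-level elements, such as lang on title, are not captured by this approach, which matches rule 2 in the method's documentation.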