Parsing XML in Java with dom4j

DanielMeng9527

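There are two main approaches to XML parsing: DOM and SAX.

DOM approach
The parser builds a tree in memory that mirrors the hierarchy of the XML document, wrapping tags, attributes, text, and other items as node objects in that tree.
- Advantage: adding, deleting, modifying, and querying nodes is straightforward.
- Disadvantage: a very large XML file can exhaust memory.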
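To make the DOM idea concrete, here is a minimal sketch that uses the JDK's built-in JAXP DOM parser rather than dom4j, so it runs without any extra jar. The class name `DomSketch` and the inline XML are illustrative assumptions. It loads the whole document into a tree and then modifies that tree in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomSketch {
    // Parses the XML into an in-memory tree, appends one <book>,
    // and returns how many <book> elements the tree then contains.
    static int addBookAndCount(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            Element book = doc.createElement("book"); // new node, not yet attached
            book.setAttribute("id", "003");
            root.appendChild(book);                   // the tree is modified in memory
            return root.getElementsByTagName("book").getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book id=\"001\"/><book id=\"002\"/></books>";
        System.out.println(addBookAndCount(xml)); // prints 3
    }
}
```

Because the whole tree lives in memory, editing is a matter of ordinary method calls, which is also why very large files are a problem for this approach.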
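SAX approach
The parser uses an event-driven model and parses while reading: it walks the document from top to bottom and, on reaching each element, calls the corresponding handler method.
- Advantage: the whole document is never held in memory, so memory overflow is not a concern.
- Disadvantage: querying is inconvenient, and nodes cannot be added, deleted, or modified.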
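The event-driven model described above can be sketched with the JDK's built-in JAXP SAX parser (again not dom4j, so no extra jar is needed). The class name `SaxSketch` and the choice to collect `<title>` text are assumptions made for illustration. The parser calls the handler's callbacks as it streams through the input, and the handler keeps only what it needs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxSketch {
    // Collects the text of every <title> element while the parser streams the input.
    static List<String> titles(String xml) {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null only while inside a <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length); // text may arrive in several chunks
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    result.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return result;
    }
}
```

Nothing in the handler can go back and change an element that has already been reported, which is the reason SAX is read-only in practice.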

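Various companies and organizations provide parsers that implement the DOM and SAX approaches:

JAXP from Sun
dom4j from the dom4j project (the most commonly used, e.g. in Spring)
JDOM from the JDOM project
For the background of these three parsers, see the article "Four ways to parse XML files in Java".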

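1. Preface

Dom4j is a fork of JDOM: it split off from the original JDOM project and provides a parser that is more feature-rich and performs better than JDOM (for example, it supports XPath).

2. Code details

dom4j is a third-party library, so before using it you need to download the jar for the matching dom4j version and add it to your project.

1) The XML file: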

<?xml version="1.0" encoding="UTF-8"?> 
<books> 
  <book id="001"> 
      <title>Harry Potter</title> 
      <author>J K. Rowling</author> 
  </book> 
  <book id="002"> 
      <title>Learning XML</title> 
      <author>Erik T. Ray</author> 
  </book> 
</books>

Example 1: parsing XML using List

SAXReader acts as a pipeline: it reads the XML file as a stream and builds a Document from it.

import java.io.File;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Demo {
    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        File file = new File("books.xml");
        Document document = reader.read(file);
        Element root = document.getRootElement();
        List<Element> childElements = root.elements();
        for (Element child : childElements) {
            // When the attribute names are unknown, iterate over all attributes:
            // attr.getName() returns the attribute name, attr.getValue() returns its value
            /*List<Attribute> attributeList = child.attributes();
            for (Attribute attr : attributeList) {
                System.out.println(attr.getName() + ": " + attr.getValue());
            }*/
            // When the attribute name is known
            System.out.println("id: " + child.attributeValue("id"));
            // When the child element names are unknown, iterate over all child elements
            /*List<Element> elementList = child.elements();
            for (Element ele : elementList) {
                System.out.println(ele.getName() + ": " + ele.getText());
            }
            System.out.println();*/
            // When the child element name is known, elementText("title") returns its text
            System.out.println("title: " + child.elementText("title"));
        }
        // Converting between a String and XML:
        //   String -> XML:  Document doc = DocumentHelper.parseText(xmlStr);
        //   XML -> String:  String text = doc.asXML();
    }
}

The sample XML is shown below; in the following sections we use Dom4j to add, delete, modify, and query its contents:

config.xml

<?xml version="1.0" encoding="utf-8"?>
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.fq.me/context"
       xsi:schemaLocation="http://www.fq.me/context http://www.fq.me/context/context.xsd">
    <bean id="id1" class="com.fq.benz">
        <property name="name" value="benz"/>
    </bean>
    <bean id="id2" class="com.fq.domain.Bean">
        <property name="isUsed" value="true"/>
        <property name="complexBean" ref="id1"/>
    </bean>
</beans>
 

context.xsd

<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.fq.me/context"
        elementFormDefault="qualified">
    <element name="beans">
        <complexType>
            <sequence>
                <element name="bean" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="property" maxOccurs="unbounded">
                                <complexType>
                                    <attribute name="name" type="string" use="required"/>
                                    <attribute name="value" type="string" use="optional"/>
                                    <attribute name="ref" type="string" use="optional"/>
                                </complexType>
                            </element>
                        </sequence>
                        <attribute name="id" type="string" use="required"/>
                        <attribute name="class" type="string" use="required"/>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>
 

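import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;
import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/18 4:02 PM.
 */
public class Dom4jRead {

    @Test
    public void client() throws DocumentException {
        SAXReader reader = new SAXReader();
        // Load config.xml from the classpath and parse it into a dom4j Document
        Document document = reader.read(ClassLoader.getSystemResource("config.xml"));
        // ...
    }
}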
 

As in JAXP, Document is an interface (in the org.dom4j package). It inherits from Node by way of Branch; the other sub-interfaces of Node include Element, Attribute, Text, CDATA, and Branch.

Node

Common `Node` methods	Description
`Element getParent()`	Returns the parent Element if this node supports the parent relationship, or null if it is the root element or does not support the parent relationship.

Document

Common `Document` methods	Description
`Element getRootElement()`	Returns the root Element for this document.

Element

Common `Element` methods	Description
`void add(Attribute/Text param)`	Adds the given Attribute/Text to this element.
`Element addAttribute(String name, String value)`	Adds the attribute value of the given local name.
`Attribute attribute(int index)`	Returns the attribute at the specified index.
`Attribute attribute(String name)`	Returns the attribute with the given name.
`Element element(String name)`	Returns the first element for the given local name and any namespace.
`Iterator elementIterator()`	Returns an iterator over all of this element's child elements.
`Iterator elementIterator(String name)`	Returns an iterator over the elements contained in this element which match the given local name and any namespace.
`List elements()`	Returns the elements contained in this element.
`List elements(String name)`	Returns the elements contained in this element with the given local name and any namespace.

Branch

Common `Branch` methods	Description
`Element addElement(String name)`	Adds a new Element node with the given name to this branch and returns a reference to the new node.
`boolean remove(Node node)`	Removes the given Node if the node is an immediate child of this branch.

Querying with Dom4j

Print all attribute information:

/**
 * @author jifang
 * @since 16/1/18 4:02 PM.
 */
public class Dom4jRead {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = new SAXReader()
                .read(ClassLoader.getSystemResource("config.xml"));
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();

        for (Iterator iterator = beans.elementIterator(); iterator.hasNext(); ) {
            Element bean = (Element) iterator.next();
            String id = bean.attributeValue("id");
            String clazz = bean.attributeValue("class");
            System.out.println("id: " + id + ", class: " + clazz);

            scanProperties(bean.elements());
        }
    }

    public void scanProperties(List<? extends Element> properties) {
        for (Element property : properties) {
            System.out.print("name: " + property.attributeValue("name"));
            Attribute value = property.attribute("value");
            if (value != null) {
                System.out.println("," + value.getName() + ": " + value.getValue());
            }
            Attribute ref = property.attribute("ref");
            if (ref != null) {
                System.out.println("," + ref.getName() + ": " + ref.getValue());
            }
        }
    }
}
 

Adding nodes with Dom4j

Append a <property/> tag at the end of the first <bean/> tag

<bean id="id1" class="com.fq.benz"> 
    <property name="name" value="benz"/>  
    <property name="refBean" ref="id2">新添加的标签</property>
</bean>  
 

/**
 * @author jifang
 * @since 16/1/19 9:50 AM.
 */
public class Dom4jAppend {

    //...

    @Test
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        Element property = firstBean.addElement("property");
        property.addAttribute("name", "refBean");
        property.addAttribute("ref", "id2");
        property.setText("新添加的标签");
    }

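    @After
    public void tearDown() throws IOException {
        // Write the modified document back to the XML file
        OutputFormat format = OutputFormat.createPrettyPrint();
        XMLWriter writer = new XMLWriter(new FileOutputStream("src/main/resources/config.xml"), format);
        try {
            writer.write(document);
        } finally {
            writer.close(); // flush buffered output and release the file handle
        }
    }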
}
 

We can wrap the XML read and write operations in a utility class so that later calls are more convenient:

/**
 * @author jifang
 * @since 16/1/19 2:12 PM.
 */
public class XmlUtils {

    public static Document getXmlDocument(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config));
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

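    public static void writeXmlDocument(String path, Document document) {
        try {
            XMLWriter writer = new XMLWriter(new FileOutputStream(path), OutputFormat.createPrettyPrint());
            try {
                writer.write(document);
            } finally {
                writer.close(); // flush buffered output and release the file handle
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }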
}
 

Insert a <property/> tag after the first <property/> of the first <bean/>

<bean id="id1" class="com.fq.benz"> 
    <property name="name" value="benz"/>  
    <property name="rate" value="3.14"/>
    <property name="refBean" ref="id2">新添加的标签</property> 
</bean>  
 

public class Dom4jAppend {

    private Document document;

    @Before
    public void setUp() {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        List<Element> properties = firstBean.elements();

        //Element property = DocumentHelper
        // .createElement(QName.get("property", firstBean.getNamespaceURI()));
        Element property = DocumentFactory.getInstance()
                .createElement("property", firstBean.getNamespaceURI());
        property.addAttribute("name", "rate");
        property.addAttribute("value", "3.14");
        properties.add(1, property);
    }

    @After
    public void tearDown() {
        XmlUtils.writeXmlDocument("src/main/resources/config.xml", document);
    }
}
 

Modifying nodes with Dom4j

Change the first <property/> of the id1 bean to the following:

<property name="name" value="翡青"/>  
 

@Test
public void client() {
    Element beans = document.getRootElement();
    // The first <bean/> is the one with id="id1"
    Element firstBean = beans.element("bean");
    Element firstProperty = firstBean.element("property");

    // Overwrite the value attribute of the existing <property/> in place
    Attribute value = firstProperty.attribute("value");
    if (value != null) {
        value.setValue("翡青");
    }
}
 

Deleting nodes with Dom4j

Delete the node that was just modified

@Test
@SuppressWarnings("unchecked")
public void delete() {
    List<Element> beans = document.getRootElement().elements("bean");
    for (Element bean : beans) {
        if ("id1".equals(bean.attributeValue("id"))) {
            List<Element> properties = bean.elements("property");
            for (Element property : properties) {
                if ("name".equals(property.attributeValue("name"))) {
                    // Remove the node from its parent
                    property.getParent().remove(property);
                    break;
                }
            }
            break;
        }
    }
}
 

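Dom4j example

In the article on Java reflection we implemented an object pool that loads beans from a JSON configuration file; now we can add support for loading them from an XML configuration (the same XML file as before):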

/**
 * @author jifang
 * @since 16/1/18 9:18 PM.
 */
public class XmlParse {

    private static final ObjectPool POOL = ObjectPoolBuilder.init(null);

    public static Element parseBeans(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config)).getRootElement();
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

    public static void processObject(Element bean, List<? extends Element> properties)
            throws ClassNotFoundException, IllegalAccessException, InstantiationException, NoSuchFieldException {
        Class<?> clazz = Class.forName(bean.attributeValue(CommonConstant.CLASS));
        Object targetObject = clazz.newInstance();

        for (Element property : properties) {
            String fieldName = property.attributeValue(CommonConstant.NAME);
            Field field = clazz.getDeclaredField(fieldName);
            field.setAccessible(true);
            // The property carries a value attribute
            if (property.attributeValue(CommonConstant.VALUE) != null) {
                SimpleValueSetUtils.setSimpleValue(field, targetObject, property.attributeValue(CommonConstant.VALUE));
            } else if (property.attributeValue(CommonConstant.REF) != null) {
                String refId = property.attributeValue(CommonConstant.REF);
                Object object = POOL.getObject(refId);
                field.set(targetObject, object);
            } else {
                throw new RuntimeException("neither value nor ref");
            }
        }

        POOL.putObject(bean.attributeValue(CommonConstant.ID), targetObject);
    }
}
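The class above depends on project types that are not shown here (`ObjectPool`, `SimpleValueSetUtils`, `CommonConstant`). As a rough, self-contained stand-in for the same idea, the sketch below uses the JDK's DOM parser and a plain `HashMap` as the pool. `BeanLoaderSketch`, the `Car` bean, and the restriction to `String` fields are assumptions made for illustration, not part of the original project.

```java
import java.io.StringReader;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BeanLoaderSketch {
    // Hypothetical bean type, used only for this illustration.
    static class Car {
        private String name;
        String getName() { return name; }
    }

    // Instantiates every <bean> via reflection and returns the objects keyed by id.
    static Map<String, Object> load(String xml) {
        Map<String, Object> pool = new HashMap<>();
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            NodeList beans = root.getElementsByTagName("bean");
            for (int i = 0; i < beans.getLength(); i++) {
                Element bean = (Element) beans.item(i);
                Class<?> clazz = Class.forName(bean.getAttribute("class"));
                Object target = clazz.getDeclaredConstructor().newInstance();
                NodeList props = bean.getElementsByTagName("property");
                for (int j = 0; j < props.getLength(); j++) {
                    Element prop = (Element) props.item(j);
                    Field field = clazz.getDeclaredField(prop.getAttribute("name"));
                    field.setAccessible(true);
                    if (prop.hasAttribute("value")) {
                        // Simplification: only String fields are handled here
                        field.set(target, prop.getAttribute("value"));
                    } else if (prop.hasAttribute("ref")) {
                        // The referenced bean must appear earlier in the file
                        field.set(target, pool.get(prop.getAttribute("ref")));
                    } else {
                        throw new IllegalStateException("neither value nor ref");
                    }
                }
                pool.put(bean.getAttribute("id"), target);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return pool;
    }
}
```

A real pool would also need type conversion for non-String fields and some way to resolve references to beans declared later in the file; both are left out to keep the sketch short.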
 

Note: the code above is only the XML-parsing part of the object pool project; for the complete project see git@git.oschina.net:feiqing/commons-frame.git

XPath

XPath is a language for finding information in an XML document; it can be used to navigate the elements and attributes of an XML document.

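Expression	Description
`/`	Selects starting from the root node (`/beans`: matches `<beans/>` under the root; `/beans/bean`: matches each `<bean/>` under `<beans/>`)
`//`	Searches the whole document regardless of position (`//property`: matches every `<property/>` in the document)
`*`	Matches any element node (`/*`: matches the root element; `//*`: matches all elements)
`@`	Matches attributes (e.g. `//@name`: matches all `name` attributes)
`[position]`	Position predicate (e.g. `//property[1]`: matches each `<property/>` that is the first one under its parent; `//property[last()]`: matches each `<property/>` that is the last one under its parent)
`[@attr]`	Attribute predicate (e.g. `//bean[@id]`: matches every `<bean/>` that has an id attribute; `//bean[@id='id1']`: matches every `<bean/>` whose id attribute equals 'id1')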

Predicates: a predicate is used to find a specific node, or a node that contains a specific value.

For the full XPath syntax, see the W3School XPath tutorial.
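To check how these expressions behave, the following sketch evaluates them with the JDK's built-in `javax.xml.xpath` API instead of dom4j, so it runs without extra jars. The class name `XPathSketch` is an assumption, and the inline XML is a copy of config.xml with the namespace declarations removed so that unprefixed names match.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // config.xml without its namespace declarations (an assumption for this sketch)
    static final String XML =
            "<beans>"
          + "<bean id=\"id1\" class=\"com.fq.benz\">"
          + "<property name=\"name\" value=\"benz\"/>"
          + "</bean>"
          + "<bean id=\"id2\" class=\"com.fq.domain.Bean\">"
          + "<property name=\"isUsed\" value=\"true\"/>"
          + "<property name=\"complexBean\" ref=\"id1\"/>"
          + "</bean>"
          + "</beans>";

    // Returns how many nodes the given XPath expression selects in XML.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/beans/bean"));       // prints 2
        System.out.println(count("//property"));        // prints 3
        System.out.println(count("//property[1]"));     // prints 2, one per <bean/>
        System.out.println(count("//bean[@id='id1']")); // prints 1
    }
}
```

Note that `//property[1]` selects two nodes here: the position predicate is applied per parent, so each `<bean/>` contributes its own first `<property/>`.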

Dom4j support for XPath

By default Dom4j does not support XPath; you need to add the following dependency to the pom:

<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>
 

The Dom4j Node interface provides the following methods for XPath support:

Method
`List selectNodes(String xpathExpression)`
`List selectNodes(String xpathExpression, String comparisonXPathExpression)`
`List selectNodes(String xpathExpression, String comparisonXPathExpression, boolean removeDuplicates)`
`Object selectObject(String xpathExpression)`
`Node selectSingleNode(String xpathExpression)`

Querying with XPath

Query the attribute values on all bean tags

/**
 * @author jifang
 * @since 16/1/20 9:28 AM.
 */
public class XPathRead {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
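        // config.xml declares a default namespace, so an unprefixed "//bean" matches nothing; match by local name instead
        List<Element> beans = document.selectNodes("//*[local-name()='bean']");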
        for (Element bean : beans) {
            System.out.println("id: " + bean.attributeValue("id") +
                    ", class: " + bean.attributeValue("class"));
        }
    }
}
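One point deserves care when running XPath against config.xml: the file declares a default namespace (`http://www.fq.me/context`), and under XPath 1.0 rules an unprefixed name test such as `//bean` only matches elements in no namespace. The sketch below demonstrates the effect with the JDK's `javax.xml.xpath` API rather than dom4j; the class name `NamespaceXPathSketch`, the prefix `ctx`, and the trimmed XML are assumptions for illustration. It shows two common workarounds: matching by `local-name()`, or binding a prefix to the namespace URI.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceXPathSketch {
    static final String NS = "http://www.fq.me/context";
    static final String XML =
            "<beans xmlns=\"" + NS + "\"><bean id=\"id1\"/><bean id=\"id2\"/></beans>";

    // Returns how many nodes the expression selects; the prefix "ctx" is bound to NS.
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // without this the parser ignores namespaces
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "ctx".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return NS.equals(uri) ? "ctx" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                            ? Collections.singletonList("ctx").iterator()
                            : Collections.<String>emptyIterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//bean"));                   // prints 0
        System.out.println(count("//*[local-name()='bean']")); // prints 2
        System.out.println(count("//ctx:bean"));               // prints 2
    }
}
```

In dom4j the prefix-binding variant would typically go through `document.createXPath(...)` followed by `XPath.setNamespaceURIs(Map)`; treat that as a pointer to verify against the dom4j version in use rather than as tested code.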
 

Updating with XPath

Delete the <bean/> whose id="id2"

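@Test
public void client() {
    // Match by local name because config.xml declares a default namespace
    Node bean = document.selectSingleNode("//*[local-name()='bean'][@id='id2']");
    if (bean != null) {
        bean.getParent().remove(bean);
    }
}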
