dom4j Study Notes


I. Introduction to dom4j


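dom4j is a library that was created after a split in the JDOM development team; it is used by Hibernate and JAXM, among others.
As a rough ranking by performance: dom4j > JDOM > JAXP.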

II. The dom4j API


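The DocumentHelper class provides:


(1) Document document = DocumentHelper.createDocument(); // creates an empty Document, typically used to build a new XML file

(2) Element element = DocumentHelper.createElement("name"); // creates an Element, i.e. a tag with the given name (the name argument is required)

(3) Document document = DocumentHelper.parseText(String xml); // parses an XML string into a DOM tree rooted at a Document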
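
The three DocumentHelper calls above can be combined into a small round trip. The following is a minimal sketch rather than a definitive implementation: the class name DocumentHelperDemo and its two helper methods are made up for this illustration, and it assumes dom4j is on the classpath.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class; only the DocumentHelper/Element calls come from dom4j.
public class DocumentHelperDemo {

    // Builds <person><name>...</name></person> from scratch and serializes it.
    static String build(String name) {
        Document document = DocumentHelper.createDocument();
        Element person = DocumentHelper.createElement("person");
        document.add(person); // person becomes the root element
        person.addElement("name").addText(name);
        return document.asXML();
    }

    // Parses an XML string and returns the text of the root's <name> child.
    static String nameOf(String xml) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        Element root = document.getRootElement();
        return root.elementText("name");
    }

    public static void main(String[] args) throws DocumentException {
        String xml = build("xiazdong");
        System.out.println(xml);
        System.out.println(nameOf(xml));
    }
}
```

Here build() goes from objects to text and nameOf() goes from text back to objects, so calling one on the result of the other is a quick sanity check.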


The SAXReader class provides:


(1) SAXReader reader = new SAXReader();

(2) Document document = reader.read(new File("1.xml")); // reads and parses 1.xml and returns the resulting Document
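
As a sketch of reading from a file, the example below writes a small XML string to a temporary file and parses it back with SAXReader.read(File). The class SaxReaderDemo, its method names, and the temporary-file detour are assumptions made only so that the example is self-contained; in real code you would pass your own file.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Hypothetical demo class for SAXReader.read(File).
public class SaxReaderDemo {

    // Parses the given file and returns the name of its root element.
    static String rootNameOf(File file) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(file);
        return document.getRootElement().getName();
    }

    // Writes xml to a temporary file (illustration only), then parses it.
    static String rootNameViaTempFile(String xml) throws IOException, DocumentException {
        File file = File.createTempFile("dom4j-demo", ".xml");
        try {
            Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            return rootNameOf(file);
        } finally {
            file.delete(); // best-effort cleanup of the temporary file
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootNameViaTempFile("<person><name>xiazdong</name></person>"));
    }
}
```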


The Document interface provides:


(1) String text = document.asXML(); // serializes the DOM tree to an XML string (takes no arguments)

(2) Element root = document.getRootElement(); // returns the root element


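The Element interface provides:


(1) Element newelem = elem.addElement("child"); // adds a child tag named child and returns the new element

(2) newelem.addAttribute("name","value"); // adds an attribute to the tag

(3) newelem.addText("xxxx"); // adds text content to the tag

(4) newelem.getText(); // returns the tag's text content

(5) String value = newelem.attributeValue("name"); // returns the value of the attribute called name

(6) Iterator iter = newelem.attributeIterator(); // iterator over the tag's attributes

(7) List childs = newelem.elements(); // returns all child elements of the tag

(8) Element child = newelem.element("name"); // returns the first <name> child if there are several (null if there is none)

(9) List childs = newelem.elements("name"); // returns all <name> children of the tag

(10) newelem.remove(elem); // removes the child element elem from newelem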
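
To show how the Element methods listed above fit together, here is a short sketch that walks a small document and then removes some children. The sample XML, the class name ElementApiDemo, and its methods are invented for this illustration. The loop iterates over Object and casts, on the assumption that this compiles both with older dom4j releases that return raw lists and with newer ones that return List<Element>.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class exercising the Element API.
public class ElementApiDemo {

    // Sample document used only for this illustration.
    static final String XML =
          "<people>"
        + "<person><name a=\"x\">Tom</name><age>20</age></person>"
        + "<person><name a=\"y\">Ann</name><age>30</age></person>"
        + "</people>";

    // Collects "text:attribute" for the <name> child of every <person>.
    static List<String> names(Element root) {
        List<String> result = new ArrayList<String>();
        for (Object o : root.elements("person")) {
            Element person = (Element) o;
            Element name = person.element("name"); // first <name> child
            result.add(name.getText() + ":" + name.attributeValue("a"));
        }
        return result;
    }

    // Removes the <age> child from every <person>; returns how many were removed.
    static int removeAges(Element root) {
        int removed = 0;
        for (Object o : root.elements("person")) {
            Element person = (Element) o;
            Element age = person.element("age");
            if (age != null && person.remove(age)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws DocumentException {
        Element root = DocumentHelper.parseText(XML).getRootElement();
        System.out.println(names(root));
        System.out.println(removeAges(root));
        System.out.println(root.asXML());
    }
}
```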


The XMLWriter class provides:


(1) XMLWriter writer = new XMLWriter(OutputStream out, OutputFormat format);

(2) writer.write(document); // writes the document to the output

(3) writer.close(); // closes the XMLWriter and its underlying stream


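The OutputFormat class provides:


(1) OutputFormat format = OutputFormat.createPrettyPrint(); // indented, human-readable output

(2) OutputFormat format = OutputFormat.createCompactFormat(); // compact output without extra whitespace

(3) format.setEncoding("UTF-8"); // sets the encoding attribute of the <?xml ?> declaration (and the byte encoding used when writing to an OutputStream); the default is UTF-8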
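
The difference between the two formats is easiest to see side by side. The sketch below renders the same document with both; OutputFormatDemo, sample() and render() are hypothetical names, and a StringWriter is used instead of a file only so that the result can be printed directly.

```java
import java.io.IOException;
import java.io.StringWriter;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Hypothetical demo class comparing pretty and compact output.
public class OutputFormatDemo {

    // Builds a tiny sample document: <person><name>xiazdong</name></person>.
    static Document sample() {
        Document document = DocumentHelper.createDocument();
        document.addElement("person").addElement("name").addText("xiazdong");
        return document;
    }

    // Serializes the document to a String using the given format.
    static String render(Document document, OutputFormat format) throws IOException {
        StringWriter out = new StringWriter();
        XMLWriter writer = new XMLWriter(out, format);
        writer.write(document);
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(render(sample(), OutputFormat.createPrettyPrint()));
        System.out.println(render(sample(), OutputFormat.createCompactFormat()));
    }
}
```

The pretty format puts each element on its own indented line, while the compact format keeps the elements on one line after the XML declaration.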


The Attribute interface provides:


(1) attr.setValue("value"); // sets the attribute's value

(2) String value = attr.getValue(); // returns the attribute's value


III. CRUD in dom4j


1.Create


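Creating a document:

// Imports used by the CRUD snippets in this section
import java.io.FileOutputStream;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

private static void create() throws Exception {
    Document document = DocumentHelper.createDocument();
    Element person = DocumentHelper.createElement("person");
    document.add(person); // person becomes the root element
    person.addElement("name").addAttribute("a", "x").addText("xiazdong");
    person.addElement("age").addText("20");
    OutputFormat format = OutputFormat.createPrettyPrint();
    format.setEncoding("utf-8");
    XMLWriter writer = new XMLWriter(new FileOutputStream("output.xml"), format);
    writer.write(document);
    writer.close();
}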

Inserting (this assumes the root element has at least two <person> children):

private static void insert(Document document) throws Exception {
    Element root = document.getRootElement();
    List list = root.elements("person");
    Element person = (Element) list.get(1); // the second <person>
    Element tmpElement = person.addElement("tmpChild");
    tmpElement.setText("tmp"); // set the tag's text
    tmpElement.addAttribute("tmpname", "tmpvalue"); // add an attribute
    Element tmp2 = DocumentHelper.createElement("tmpChild2"); // create a standalone element
    tmp2.setText("tmp2");
    list.add(1, tmp2); // insert the element at the given position
    XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"));
    writer.write(document);
    writer.close();
}

2.Read


private static void read(Document document) throws Exception {
    Element root = document.getRootElement();
    Element person = (Element) root.elements("person").get(1); // the second <person>
    String value = person.element("name").getText();
    String attri = person.element("name").attributeValue("a");
    System.out.println(value);
    System.out.println(attri);
}

3.Update


private static void update(Document document) throws Exception {
    Element root = document.getRootElement();
    Element name = root.element("person").element("name");
    name.setText("xiazdong"); // update the tag's text
    name.attribute("a").setValue("bb"); // update the attribute's value
    OutputFormat format = OutputFormat.createPrettyPrint();
    format.setEncoding("utf-8");
    XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"), format);
    writer.write(document);
    writer.close();
}

4.Delete

private static void delete(Document document) throws Exception {
    Element root = document.getRootElement();
    Element name = root.element("person").element("name");
    name.remove(name.attribute("a")); // delete the attribute
    name.getParent().remove(name); // delete the element
    OutputFormat format = OutputFormat.createPrettyPrint();
    format.setEncoding("utf-8");
    XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"), format);
    writer.write(document);
    writer.close();
}

IV. Garbled Characters (Encoding Problems)



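When the document contains Chinese text, the written output may come out garbled.



Solution: set the encoding explicitly,

format.setEncoding("UTF-8");

and write through a byte stream (for example FileOutputStream) rather than a character stream (for example FileWriter). Given an OutputStream, XMLWriter encodes the text itself using the format's encoding, so the bytes match the encoding declared in the XML header; a FileWriter would apply its own charset (the platform default) instead.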
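
A minimal sketch of this fix, under the assumption that writing into an in-memory byte stream behaves the same as writing into a FileOutputStream as far as encoding is concerned. EncodingDemo and writeUtf8() are hypothetical names; the Chinese sample text is written with Unicode escapes so that the example does not depend on the source file's own encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Hypothetical demo class: writes non-ASCII text through a byte stream as UTF-8.
public class EncodingDemo {

    // Serializes <name>text</name> to UTF-8 bytes via XMLWriter(OutputStream, format).
    static byte[] writeUtf8(String text) throws IOException {
        Document document = DocumentHelper.createDocument();
        document.addElement("name").addText(text);

        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("UTF-8"); // declared encoding and byte encoding now agree

        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stands in for FileOutputStream
        XMLWriter writer = new XMLWriter(out, format);
        writer.write(document);
        writer.close();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeUtf8("\u4e2d\u6587"); // the two characters of "Chinese" in Chinese
        // Decoding with the same charset recovers the original text.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```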




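Supplement: Handling Large Files with dom4j (e.g. 100 GB)


The DOM approach reads the entire XML file into memory, so very large files cause problems.

ElementHandler solves this: each branch node is processed as soon as it has been read, and can then be dropped from the tree.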

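SAXReader reader = new SAXReader();
// Fires for every <card> child of the <subwaycard> root element
reader.addHandler("/subwaycard/card", new ElementHandler() {
    @Override
    public void onStart(ElementPath path) {
        // Called when the opening <card> tag is read; nothing to do here
    }

    @Override
    public void onEnd(ElementPath path) {
        // Called when the closing </card> tag is read
        Element card = path.getCurrent(); // the completed <card> node
        // process the card here if needed
        card.detach(); // prune the card from the DOM tree (removes it from its parent)
    }
});
The callbacks above are invoked while Document document = reader.read(new File("1.xml")); is running.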
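
As a sketch of what "process, then prune" can look like, the example below counts the <card> elements as they stream past and detaches each one, so the tree that read() finally returns holds none of them. StreamingCountDemo, countAndRemaining(), and the sample XML are made up for this illustration, and a StringReader stands in for a large file.

```java
import java.io.StringReader;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.SAXReader;

// Hypothetical demo class: counts <card> elements while pruning them during parsing.
public class StreamingCountDemo {

    // Returns {cards seen by the handler, cards left in the tree after parsing}.
    static int[] countAndRemaining(String xml) throws DocumentException {
        final int[] seen = {0};
        SAXReader reader = new SAXReader();
        reader.addHandler("/subwaycard/card", new ElementHandler() {
            @Override
            public void onStart(ElementPath path) {
                // nothing to do when <card> opens
            }

            @Override
            public void onEnd(ElementPath path) {
                Element card = path.getCurrent();
                seen[0]++;      // "process" the card: here we only count it
                card.detach();  // then drop it so memory use stays bounded
            }
        });
        Document document = reader.read(new StringReader(xml));
        int remaining = document.getRootElement().elements("card").size();
        return new int[] {seen[0], remaining};
    }

    public static void main(String[] args) throws DocumentException {
        String xml = "<subwaycard><card id=\"1\"/><card id=\"2\"/><card id=\"3\"/></subwaycard>";
        int[] result = countAndRemaining(xml);
        System.out.println("seen=" + result[0] + ", remaining=" + result[1]);
    }
}
```

Because each card is detached in onEnd, at most one <card> subtree is held in memory at a time, regardless of how many the input contains.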


The following program uses this approach to delete every <card> element from a large file:

package org.xiazdong.xml;

import java.io.File;
import java.io.FileOutputStream;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class ElementHandlerTest {
    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        reader.addHandler("/subwaycard/card", new ElementHandler() {
            @Override
            public void onStart(ElementPath path) {
            }

            @Override
            public void onEnd(ElementPath path) {
                Element card = path.getCurrent();
                card.detach(); // remove the finished <card> from the tree
            }
        });
        // The handler runs during read(); the returned tree no longer contains any <card>
        Document document = reader.read(new File("1.xml"));
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("GBK");
        XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"), format);
        writer.write(document);
        writer.close();
    }
}

