XML Parsing

Parsing XML with DOM

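DOM loads the entire document into memory as a tree; once you hold the root node, you can reach every other node.

Advantage: adding, modifying, and deleting nodes is straightforward.

Disadvantage: the whole tree is allocated in memory at once, so large documents can easily exhaust memory.

Programming flow: DocumentBuilderFactory -> DocumentBuilder -> Document -> NodeList -> Node

When modifying the XML, remember to write the in-memory Document back to the file; otherwise the file on disk stays unchanged.

Write-back flow: TransformerFactory -> Transformer -> transform(DOMSource, StreamResult), where the DOMSource wraps the Document (the root of the tree) and the StreamResult is the target file.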

import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.junit.Test;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParse {
	@Test
	public void domparse() throws ParserConfigurationException, SAXException, IOException
	{
		DocumentBuilderFactory builderFactory=DocumentBuilderFactory.newInstance();
		DocumentBuilder builder=builderFactory.newDocumentBuilder();
		Document document =builder.parse("book.xml");
		NodeList list=document.getElementsByTagName("Name");
		Node node=list.item(1);  // second node (the index is 0-based)
		String content=node.getTextContent();
		System.out.println(content);	
	}
	@Test
	public void domModify() throws ParserConfigurationException, SAXException, IOException, TransformerException
	{
		DocumentBuilderFactory builderFactory=DocumentBuilderFactory.newInstance();
		DocumentBuilder builder=builderFactory.newDocumentBuilder();
		Document document =builder.parse("book.xml");
		NodeList list=document.getElementsByTagName("Name");
		Node node=list.item(1);  // second node (the index is 0-based)
		node.setTextContent("wanhao");
		
		TransformerFactory factory=TransformerFactory.newInstance();
		Transformer transformer=factory.newTransformer();
		Source xmlSource=new DOMSource(document);
		Result outputTarget=new StreamResult("book.xml");
		transformer.transform(xmlSource, outputTarget);
	}
}

book.xml is placed in the project root directory:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<AddressBook>
    <Address id="1" isLocal="true">
        <Name solutation="Mr.">Sam</Name>
        <Street>QuanKouRd.</Street>
        <City>ShangHai</City>
        <State>US</State>
        <Country>China</Country>
        <Pin>111</Pin>
    </Address>
    <Address id="2" isLocal="false">
        <Name solutation="Mrs.">wanhao</Name>
        <Street>JiaHangRd.</Street>
        <City>ShangHai</City>
        <State>US</State>
        <Country>India</Country>
        <Pin>222</Pin>
    </Address>
</AddressBook>
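
The example above only modifies a node. The add and delete operations listed as DOM's advantage can be sketched in the same way. This is a minimal sketch, not part of the original code: the class name DomAddRemove, the method addAndRemove, and the Phone element are made up for illustration, and the XML is passed as a string so the snippet runs without book.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomAddRemove {
    // Adds a <Phone> child (hypothetical element) to the first <Address>,
    // removes its <Pin> child, and returns the serialized document.
    static String addAndRemove(String xml, String phone) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            Element address = (Element) document.getElementsByTagName("Address").item(0);

            // Add: new nodes are created by the Document, then attached to a parent.
            Element phoneElement = document.createElement("Phone");
            phoneElement.setTextContent(phone);
            address.appendChild(phoneElement);

            // Delete: a node is removed through its parent.
            Node pin = address.getElementsByTagName("Pin").item(0);
            pin.getParentNode().removeChild(pin);

            // Serialize the in-memory tree; a StreamResult on a file works the same way.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Checked parser/transformer exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<AddressBook><Address id=\"1\"><Name>Sam</Name><Pin>111</Pin></Address></AddressBook>";
        System.out.println(addAndRemove(xml, "555-0100"));
    }
}
```

As with setTextContent, the changes only exist in memory until the Document is passed through a Transformer.
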
Parsing XML with SAX

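SAX parses the document one element at a time and reports each piece as an event, so it never holds the whole tree in memory and does not run into the memory problem above. The trade-off is that it is read-only: it cannot modify the XML.

Flow: SAXParserFactory -> SAXParser -> parse -> anonymous inner class extending DefaultHandler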

import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.junit.Test;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

	@Test
	public void saxParse() throws ParserConfigurationException, SAXException, IOException{
		SAXParserFactory parseFactory=SAXParserFactory.newInstance();
		SAXParser parser=parseFactory.newSAXParser();
		parser.parse(new File("book.xml"), new DefaultHandler(){

			@Override
			public void startDocument() throws SAXException {
				System.out.println("Document start");
			}

			@Override
			public void endDocument() throws SAXException {
				System.out.println("Document end");
			}
			// Called at the start of a tag. qName is the tag name; attributes holds the
			// attribute values. It is not a List, but it can be traversed by index.
			@Override
			public void startElement(String uri, String localName, String qName, Attributes attributes)
					throws SAXException {
				System.out.print("<"+qName);
				for(int  i=0;i<attributes.getLength();++i)
				{
					System.out.print(" "+attributes.getQName(i)+"="+attributes.getValue(i));
				}
				System.out.print(">");
			}

			@Override
			public void endElement(String uri, String localName, String qName) throws SAXException {
				System.out.print("</"+qName+">");
			}
			// Text content
			@Override
			public void characters(char[] ch, int start, int length) throws SAXException {
				String content=new String(ch,start,length);
				System.out.print(content);
			}
		});
	}
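Getting the value inside the second tag named Name is just as simple.

Keep an int that counts how many Name tags have been seen so far, and read the text content when the count reaches two.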

	// Number of Name start tags seen so far.
	private static int cur=0;
	@Test
	public void getSecondName() throws ParserConfigurationException, SAXException, IOException
	{
		SAXParserFactory factory=SAXParserFactory.newInstance();
		SAXParser parser=factory.newSAXParser();
		parser.parse(new File("book.xml"), new DefaultHandler(){

			@Override
			public void startElement(String uri, String localName, String qName, Attributes attributes)
					throws SAXException {
				if("Name".equals(qName))
				{
					cur++;
				}
			}

			@Override
			public void characters(char[] ch, int start, int length) throws SAXException {
				if(cur==2)
				{
					// Bump the counter so that later text (including the whitespace
					// between tags) is not printed again.
					cur++;
					System.out.println("Value of the second Name tag: "+new String(ch,start,length));
				}
			}
		});
	}
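
One caveat with reading text in characters(): a SAX parser is allowed to deliver the text of a single element in several chunks, so taking only the first call can truncate the value. A slightly more defensive variant appends to a StringBuilder and reads the result in endElement. The sketch below is an illustration under assumptions, not the original code: the class NthElementText and the method nthText are made-up names, the XML comes from a string rather than book.xml, and the tag is assumed not to be nested inside itself.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NthElementText {
    // Collects the text of the n-th (1-based) element with the given tag name.
    static class Handler extends DefaultHandler {
        private final String tag;
        private final int n;
        private int seen;        // start tags named `tag` seen so far
        private boolean inside;  // true while inside the n-th matching element
        private final StringBuilder text = new StringBuilder();
        String result;           // stays null if the n-th element never appears

        Handler(String tag, int n) {
            this.tag = tag;
            this.n = n;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (tag.equals(qName) && ++seen == n) {
                inside = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append instead of overwrite: the text may arrive in several calls.
            if (inside) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (inside && tag.equals(qName)) {
                inside = false;
                result = text.toString();
            }
        }
    }

    static String nthText(String xml, String tag, int n) {
        Handler handler = new Handler(tag, n);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return handler.result;
    }

    public static void main(String[] args) {
        String xml = "<AddressBook><Address><Name>Sam</Name></Address>"
                + "<Address><Name>wanhao</Name></Address></AddressBook>";
        System.out.println(nthText(xml, "Name", 2));
    }
}
```

Keeping the state in handler fields instead of a static counter also means the method can be called more than once without resetting anything.
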

To convert the XML file into a List<JavaBean> that can then be traversed freely, here is an example:

		List<Address>list=new ArrayList<>();
		Address address;
		Name name;
		boolean isName;
		boolean isStreet;
		boolean isCity;
		boolean isState;
		boolean isCountry;
		boolean isPin;
		@Test
		public void saxParserList() throws ParserConfigurationException, SAXException, IOException
		{
			SAXParserFactory factory=SAXParserFactory.newInstance();
			SAXParser parser=factory.newSAXParser();
			
			parser.parse(new File("book.xml"),new DefaultHandler(){
				@Override
				public void endDocument() throws SAXException {
					System.out.println(list);
				}

				@Override
				public void startElement(String uri, String localName, String qName, Attributes attributes)
						throws SAXException {
					if("Name".equals(qName)){
						isName=true;
						name=new Name();
						// Look attributes up by name; SAX does not guarantee their order.
						name.setSolutation(attributes.getValue("solutation"));
					}
					else if("Address".equals(qName)){
						address=new Address();
						// Look attributes up by name; SAX does not guarantee their order.
						address.setId(Integer.parseInt(attributes.getValue("id")));
						address.setLocal(Boolean.parseBoolean(attributes.getValue("isLocal")));
					}
					else if("Street".equals(qName)){
						isStreet=true;
					}
					else if("City".equals(qName)){
						isCity=true;
					}
					else if("State".equals(qName)){
						isState=true;
					}
					else if("Country".equals(qName)){
						isCountry=true;
					}
					else if("Pin".equals(qName)){
						isPin=true;
					}
				}

				@Override
				public void endElement(String uri, String localName, String qName) throws SAXException {
					if("Street".equals(qName)){
						isStreet=false;
					}
					else if("City".equals(qName)){
						isCity=false;
					}
					else if("State".equals(qName)){
						isState=false;
					}
					else if("Country".equals(qName)){
						isCountry=false;
					}
					else if("Pin".equals(qName)){
						isPin=false;
					}
					else if("Name".equals(qName)){
						isName=false;
						address.setName(name);
					}	
					else if("Address".equals(qName)){
						list.add(address);
					}
				}

				@Override
				public void characters(char[] ch, int start, int length) throws SAXException {
					if(isName){
						name.setContent(new String(ch,start,length));
					}
					if(isStreet){
						address.setStreet(new String(ch,start,length));
					}
					if(isCity){
						address.setCity(new String(ch,start,length));
					}
					if(isState){
						address.setState(new String(ch,start,length));
					}
					if(isCountry){
						address.setCountry(new String(ch,start,length));
					}
					if(isPin){
						address.setPin(new String(ch,start,length));
					}
				}
				
			});
			
		}
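Approach: create a list; when a start tag is read, store its attributes in an object; when its end tag is read, add the object to the list or set it as a property of another object. Once the whole document has been read, the list can be traversed.
When writing the JavaBeans, remember to override toString().
To avoid picking up the newlines and spaces between tags, set the flag back to false once the element's end tag has been read.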
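
The Address and Name beans used above are not shown in the post. A minimal sketch of what they might look like follows; the field names and the toString() format are assumptions, and only the setter names are taken from the calls in the parsing code (including the spelling setSolutation, which matches the attribute name in book.xml).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean for the <Name> element.
class Name {
    private String solutation; // spelled as in book.xml
    private String content;

    public void setSolutation(String solutation) { this.solutation = solutation; }
    public void setContent(String content) { this.content = content; }

    @Override
    public String toString() {
        return "Name[solutation=" + solutation + ", content=" + content + "]";
    }
}

// Hypothetical bean for the <Address> element.
class Address {
    private int id;
    private boolean local;
    private Name name;
    private String street;
    private String city;
    private String state;
    private String country;
    private String pin;

    public void setId(int id) { this.id = id; }
    public void setLocal(boolean local) { this.local = local; }
    public void setName(Name name) { this.name = name; }
    public void setStreet(String street) { this.street = street; }
    public void setCity(String city) { this.city = city; }
    public void setState(String state) { this.state = state; }
    public void setCountry(String country) { this.country = country; }
    public void setPin(String pin) { this.pin = pin; }

    // Without this override, printing the list would show only class names and hash codes.
    @Override
    public String toString() {
        return "Address[id=" + id + ", local=" + local + ", name=" + name
                + ", street=" + street + ", city=" + city + ", state=" + state
                + ", country=" + country + ", pin=" + pin + "]";
    }
}

class BeanDemo {
    public static void main(String[] args) {
        Name name = new Name();
        name.setSolutation("Mr.");
        name.setContent("Sam");

        Address address = new Address();
        address.setId(1);
        address.setLocal(true);
        address.setName(name);
        address.setCity("ShangHai");

        List<Address> list = new ArrayList<>();
        list.add(address);
        System.out.println(list); // readable because both beans override toString()
    }
}
```
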
Parsing XML with Pull

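Pull works on basically the same principle as SAX, but it has a couple of advantages over it:

1. Pull fits object-oriented code better: the calling code asks the parser for the next event, so all the information can be gathered inside a single method without maintaining a set of flags, which simplifies the code.
2. Pull parsing can be stopped at any point, which saves CPU time. A SAX parse normally runs to the end of the document unless a handler aborts it by throwing an exception. To stop a Pull parse partway through, a labeled break (break out;) works well.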
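
For comparison, a SAX parse can also be cut short, just less directly: a handler can throw a SAXException, which propagates out of parse(). The sketch below illustrates that workaround using only the JDK. The names SaxEarlyStop, StopParsing, and startTagsUntilFirstEnd are made up for this example and are not part of any library.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxEarlyStop {
    // Hypothetical marker exception, used only to unwind out of the parser.
    static class StopParsing extends SAXException {
        StopParsing() {
            super("stop requested");
        }
    }

    // Counts start tags until the first end tag named stopTag, then aborts the parse.
    static int startTagsUntilFirstEnd(String xml, String stopTag) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                count[0]++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                if (stopTag.equals(qName)) {
                    throw new StopParsing(); // propagates out of parse(...)
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Stopped on purpose; the remaining elements were never reported.
        } catch (Exception e) {
            // Any other failure is a real error; wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<a><b>1</b><c/><d/></a>";
        System.out.println(startTagsUntilFirstEnd(xml, "b"));
    }
}
```

Using an exception for control flow is clumsier than a labeled break, which is part of why Pull reads more naturally when only the beginning of a document is needed.
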

Following the previous example:

Loading the XML into an ArrayList

	private List<Address>list=new ArrayList<>();
	private Address address;
	private Name name;
	@Test
	public void pullTest() throws XmlPullParserException, IOException{
		XmlPullParserFactory  factory=XmlPullParserFactory.newInstance();
		XmlPullParser parser=factory.newPullParser();
		parser.setInput(new FileInputStream("book.xml"), "utf-8");
		int eventType=parser.getEventType();
		while(eventType!=XmlPullParser.END_DOCUMENT)
		{
			switch(eventType)
			{
			case XmlPullParser.START_TAG:
				if("Address".equals(parser.getName())){
					address=new Address();
					address.setId(Integer.parseInt(parser.getAttributeValue(0)));
					address.setLocal(Boolean.parseBoolean(parser.getAttributeValue(1)));
				}
				else if("Name".equals(parser.getName())){
					name=new Name();
					name.setSolutation(parser.getAttributeValue(0));
					name.setContent(parser.nextText());
					address.setName(name);
				}
				else if("Street".equals(parser.getName())){
					address.setStreet(parser.nextText());	
				}		
				else if("City".equals(parser.getName())){
					address.setCity(parser.nextText());
				}	
				else if("State".equals(parser.getName())){
					address.setState(parser.nextText());
				}	
				else if("Country".equals(parser.getName())){
					address.setCountry(parser.nextText());
				}
				else if("Pin".equals(parser.getName())){
					address.setPin(parser.nextText());
				}
				break;
			case XmlPullParser.END_TAG:	
				if("Address".equals(parser.getName())){
					list.add(address);
				}
				break;
			}
			
			eventType=parser.next();
		}
		System.out.println(list);
		System.out.print("Parsing finished");
		
	}

To parse only the first Address, modify the code above slightly:

		int i=0;  // number of Address elements fully parsed so far
		out:
		while(eventType!=XmlPullParser.END_DOCUMENT)
		{
			switch(eventType)
			{
			case XmlPullParser.START_TAG:
				if("Address".equals(parser.getName())){
					address=new Address();
					address.setId(Integer.parseInt(parser.getAttributeValue(0)));
					address.setLocal(Boolean.parseBoolean(parser.getAttributeValue(1)));
				}
				else if("Name".equals(parser.getName())){
					name=new Name();
					name.setSolutation(parser.getAttributeValue(0));
					name.setContent(parser.nextText());
					address.setName(name);
				}
				else if("Street".equals(parser.getName())){
					address.setStreet(parser.nextText());	
				}		
				else if("City".equals(parser.getName())){
					address.setCity(parser.nextText());
				}	
				else if("State".equals(parser.getName())){
					address.setState(parser.nextText());
				}	
				else if("Country".equals(parser.getName())){
					address.setCountry(parser.nextText());
				}
				else if("Pin".equals(parser.getName())){
					address.setPin(parser.nextText());
				}
				break;
			case XmlPullParser.END_TAG:	
				if("Address".equals(parser.getName())){
					i++;
					if(i==1)
					{
						System.out.println(address);
						break out;
					}
				}
				break;
			}
			
			eventType=parser.next();
		}
		System.out.print("Parsing finished");

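Remember:

eventType must be updated by calling the parser's next() method, which moves on to the next event; otherwise the loop never terminates.

One more point: XmlPullParser pullParser=Xml.newPullParser(); is another way to obtain an XmlPullParser (on Android, Xml here is android.util.Xml).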



