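XML Parsing: Using SAX

1. What is SAX?

SAX, short for Simple API for XML, is an event-driven "push" model for processing XML. It is not a W3C standard, but it is a widely accepted API. Unlike DOM, a SAX parser does not build a complete document tree. Instead, it fires a series of events as it reads the document; these events are pushed to event handlers, which give the application access to the document content.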
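To make the "push" model concrete, here is a minimal sketch that records the events in the order the parser delivers them. The class name SaxPushDemo and the method collectEvents are made up for this illustration; the sketch assumes the JDK's built-in JAXP parser and parses an in-memory string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxPushDemo {
    /** Parses the XML text and returns the events in the order the parser pushed them. */
    static List<String> collectEvents(String xml) {
        List<String> events = new ArrayList<>();
        // DefaultHandler provides empty implementations; override only what is needed.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text:" + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Prints the events for a tiny document, one list entry per callback.
        System.out.println(collectEvents("<a><b>x</b></a>"));
    }
}
```

The application never asks the parser for the next node; it only reacts to callbacks, which is why no document tree has to exist in memory.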

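Event handler types:

  • DTDHandler, for access to the content of the XML DTD;
  • ErrorHandler, for low-level access to parse errors;
  • ContentHandler, for access to the document content; this is the most commonly used event handler.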
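As an illustration of the ErrorHandler role, the following sketch registers an ErrorHandler and reports the line of the first fatal (well-formedness) error. SaxErrorHandlerDemo and fatalErrorLine are hypothetical names chosen for this example, and the return value -1 for "no fatal error" is an assumption of the sketch, not part of SAX.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

class SaxErrorHandlerDemo {
    /** Returns the line number of the first fatal error, or -1 if the document parsed cleanly. */
    static int fatalErrorLine(String xml) {
        final int[] line = {-1};
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable errors (for example validation errors) are ignored here.
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    line[0] = e.getLineNumber();
                    throw e; // stop parsing; the document is not well-formed
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Expected for malformed input; the line was recorded by the handler.
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
        return line[0];
    }

    public static void main(String[] args) {
        System.out.println(fatalErrorLine("<a><b></a>")); // mismatched tags on line 1
        System.out.println(fatalErrorLine("<a/>"));       // well-formed
    }
}
```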

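Advantages

  • Efficient low-level access to the content of an XML document;
  • Low memory consumption, because the whole document does not have to be loaded into memory at once;
  • No need to create an object for every node, as DOM does;
  • Suitable for broadcast scenarios: events can be delivered to several ContentHandlers at once. An XMLReader holds a single ContentHandler, so this is done through a handler that forwards each event to the others.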
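The broadcast point can be sketched with a small forwarding handler. TeeHandler and SaxBroadcastDemo are hypothetical helpers written for this example, not SAX classes, and the sketch forwards only the three most common events.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Forwards element and text events to every target handler (illustrative helper). */
class TeeHandler extends DefaultHandler {
    private final List<ContentHandler> targets;

    TeeHandler(ContentHandler... targets) {
        this.targets = Arrays.asList(targets);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        for (ContentHandler t : targets) t.startElement(uri, localName, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (ContentHandler t : targets) t.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        for (ContentHandler t : targets) t.endElement(uri, localName, qName);
    }
}

class SaxBroadcastDemo {
    /** Returns {element count, text character count}, each computed by its own handler. */
    static int[] countWithTwoHandlers(String xml) {
        final int[] counts = new int[2];
        DefaultHandler elementCounter = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                counts[0]++;
            }
        };
        DefaultHandler textCounter = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1] += length;
            }
        };
        try {
            // A single pass over the document feeds both handlers.
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new TeeHandler(elementCounter, textCounter));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = countWithTwoHandlers("<a><b>xy</b><b>z</b></a>");
        System.out.println(c[0] + " elements, " + c[1] + " text characters");
    }
}
```

Both handlers see the same single pass over the input, so the document is read only once however many consumers are attached.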

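Disadvantages

  • Several event handler methods must be implemented in order to deal with all incoming events;
  • The application code has to maintain the event state itself (for example, which element is currently open);
  • Random access is not supported.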
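The state-keeping burden mentioned above usually comes down to a stack of open elements plus a text buffer. The sketch below is one possible way to do it; SaxStateDemo and textByPath are names invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxStateDemo {
    /** Maps the path of each element that contains text (such as "/a/b") to that text. */
    static Map<String, String> textByPath(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            // State the application must keep itself: open elements and pending text.
            private final Deque<String> path = new ArrayDeque<>();
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.addLast(qName);
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    result.put("/" + String.join("/", path), value);
                }
                path.removeLast();
                text.setLength(0);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(textByPath("<a><b>x</b><c>y</c></a>"));
    }
}
```

With DOM the same question ("what is the text of /a/b?") is a tree lookup; with SAX the handler has to remember where it is while the events stream past.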

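2. SAX classes used:

org.xml.sax: the core API, including XMLReader, ContentHandler, ErrorHandler, DTDHandler, EntityResolver, Attributes, Locator, InputSource and SAXException.

org.xml.sax.ext: SAX2 extensions such as LexicalHandler, DeclHandler and DefaultHandler2.

org.xml.sax.helpers: helper classes such as DefaultHandler, XMLReaderFactory, AttributesImpl and XMLFilterImpl.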
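As a small illustration of the org.xml.sax.ext package, the sketch below uses DefaultHandler2 (which also implements LexicalHandler) to collect XML comments, something ContentHandler alone does not report. SaxCommentDemo and collectComments are hypothetical names, and the sketch assumes the parser supports the standard lexical-handler property, as the JDK's built-in parser does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

class SaxCommentDemo {
    /** Returns the trimmed text of every comment in the document, in document order. */
    static List<String> collectComments(String xml) {
        List<String> comments = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                comments.add(new String(ch, start, length).trim());
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Lexical events are only delivered if the handler is registered via this property.
            parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return comments;
    }

    public static void main(String[] args) {
        System.out.println(collectComments("<!-- first --><root><!-- second --></root>"));
    }
}
```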

3. Example

The test file, test.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!--  <!DOCTYPE country SYSTEM "country.dtd">  -->
<!DOCTYPE country [
    <!ELEMENT country (provinces?,states?,municipalities?)>
    <!ATTLIST country name CDATA #REQUIRED>
       
    <!ELEMENT provinces (province+)>
    <!ELEMENT province (cities)>
    <!ATTLIST province name CDATA #REQUIRED>
    
    <!ELEMENT cities (city+)>    
    <!ELEMENT city (#PCDATA)> 
    <!ATTLIST city name CDATA #REQUIRED>     
]>
<country name="China">
    <provinces>
        <province name="GuangDong">
            <cities>
                <city name="GuangZhou">广州</city>
                <city name="ShenZhen">深圳</city>
                <city name="ZhuHai">珠海</city>
            </cities>
        </province>
        <province name="HuNan">
            <cities>
                <city name="ChangSha">长沙</city>
                <city name="HengYang">衡阳</city>
                <city name="ChangDe">常德</city>
            </cities>
        </province>
    </provinces>
</country>

Define a MyContentHandler class that implements the ContentHandler interface:

// Imports needed in the enclosing file:
//   org.xml.sax.Attributes, org.xml.sax.ContentHandler,
//   org.xml.sax.Locator, org.xml.sax.SAXException
public static class MyContentHandler implements ContentHandler {
    private Locator locator;
    // Length of the most recent run of ignorable whitespace. It is used to
    // indent the printed messages so the nesting is easier to see.
    private int indentLength = 0;

    // Prints the current indentation before a message.
    private void printIndent() {
        for (int i = 0; i < indentLength; i++) {
            System.out.print(" ");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        printIndent();
        System.out.println("characters():\"" + new String(ch, start, length) + "\"");
    }

    @Override
    public void endDocument() throws SAXException {
        printIndent();
        System.out.println("endDocument() called");
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        printIndent();
        System.out.println("endElement():</" + qName + ">");
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        printIndent();
        System.out.println("endPrefixMapping():" + prefix);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        // Reported only because the DTD declares element-only content.
        // The length includes the line break, so it doubles as an indent width.
        indentLength = length;
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        printIndent();
        System.out.println("processingInstruction():<" + target + "," + data + ">");
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
        printIndent();
        System.out.println("setDocumentLocator():[" + locator + "]");
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        printIndent();
        System.out.println("skippedEntity():" + name);
    }

    @Override
    public void startDocument() throws SAXException {
        printIndent();
        System.out.println("startDocument() called");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        printIndent();
        System.out.println("startElement():<" + qName + ">");
    }

    @Override
    public void startPrefixMapping(String prefix, String uri)
            throws SAXException {
        printIndent();
        System.out.println("startPrefixMapping():" + prefix);
    }
}
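First, to see the order in which these methods are called, each method prints its own name and the arguments it received. The main function that runs the program: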

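// Imports needed in the enclosing file:
//   java.io.IOException, org.xml.sax.SAXException,
//   org.xml.sax.XMLReader, org.xml.sax.helpers.XMLReaderFactory
public static void main(String[] args) throws SAXException, IOException {
    // Note: XMLReaderFactory still works but is deprecated since Java 9.
    XMLReader xmlReader = XMLReaderFactory.createXMLReader();
    // Turn validation off; the internal DTD subset is still read.
    xmlReader.setFeature("http://xml.org/sax/features/validation", false);

    xmlReader.setContentHandler(new MyContentHandler());
    xmlReader.parse("./test.xml"); // system identifier of the document to parse
}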
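On newer JDKs the same reader can be obtained through JAXP's SAXParserFactory. The following is a sketch under that assumption; JaxpSaxDemo and localNames are hypothetical names, and the input is an in-memory string so the example is self-contained. Namespace awareness is switched on explicitly because SAXParserFactory leaves it off by default, and handlers that compare localName depend on it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class JaxpSaxDemo {
    /** Returns the local name of every element, in document order. */
    static List<String> localNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for localName to be filled in
            factory.setValidating(false);    // same effect as turning the validation feature off
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    names.add(localName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(localNames("<country name=\"China\"><provinces/></country>"));
    }
}
```

To parse the test file instead of a string, the InputSource can be built from a system identifier, for example `new InputSource(new java.io.File("test.xml").toURI().toString())`.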

Output:

setDocumentLocator():[com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$LocatorProxy@1a2961b]
startDocument() called
startElement():<country>
     startElement():<provinces>
         startElement():<province>
             startElement():<cities>
                 startElement():<city>
                 characters():"广州"
                 endElement():</city>
                 startElement():<city>
                 characters():"深圳"
                 endElement():</city>
                 startElement():<city>
                 characters():"珠海"
                 endElement():</city>
             endElement():</cities>
         endElement():</province>
         startElement():<province>
             startElement():<cities>
                 startElement():<city>
                 characters():"长沙"
                 endElement():</city>
                 startElement():<city>
                 characters():"衡阳"
                 endElement():</city>
                 startElement():<city>
                 characters():"常德"
                 endElement():</city>
             endElement():</cities>
         endElement():</province>
     endElement():</provinces>
 endElement():</country>
 endDocument() called
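The Locator received in setDocumentLocator() is only printed in the run above, but it can also be queried during later callbacks to find out where in the document an event occurred. A minimal sketch, assuming the parser supplies a locator (the JDK's built-in parser does); SaxLocatorDemo and startLines are names made up for this example:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class SaxLocatorDemo {
    /** Maps each element name to the line reported when its start tag was seen. */
    static Map<String, Integer> startLines(String xml) {
        Map<String, Integer> lines = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called once, before startDocument(); keep the reference for later.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // A parser is not required to supply a locator, hence the null check.
                lines.put(qName, locator == null ? -1 : locator.getLineNumber());
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(startLines("<a>\n  <b/>\n</a>"));
    }
}
```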

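Next, we parse the XML file into a Country object. First define three classes:

    public static class City {
        public String enName; // English name, from the name attribute
        public String chName; // Chinese name, from the element text

        public City() {
            enName = "";
            chName = "";
        }
    }

    public static class Province {
        public String name;
        public ArrayList<City> cities;

        public Province() {
            name = "";
            cities = new ArrayList<City>(5);
        }
    }

    public static class Country {
        public String name;
        public ArrayList<Province> provinces;

        public Country() {
            name = "";
            provinces = new ArrayList<Province>();
        }
    }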
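Declare the following member variables in MyContentHandler:

private Country country;
private Province curProvince;
private City curCity;
private boolean isInCityElement = false; // true while inside a city element; used to pick up the city's Chinese name

Because the structure of this XML file is fairly simple, only four member methods need to change: startElement(), characters(), endElement() and endDocument():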

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Inside a city element the text is the city's Chinese name.
        // characters() may be called more than once per text node, so append.
        if (isInCityElement && curCity != null) {
            String text = new String(ch, start, length);
            curCity.chName = (curCity.chName == null ? "" : curCity.chName) + text;
        }
    }

    @Override
    public void endDocument() throws SAXException {
        // Print the parsed Country object to verify the result.
        System.out.println("country:" + country.name);
        for (Province prc : country.provinces) {
            System.out.println("  |--" + prc.name);
            for (City city : prc.cities) {
                System.out.println("  |    |--" + city.enName + "(" + city.chName + ")");
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // localName is only filled in by a namespace-aware reader.
        if ("province".equals(localName)) {
            if (curProvince != null) {
                country.provinces.add(curProvince);
                curProvince = null;
            }
        } else if ("city".equals(localName)) {
            if (curProvince != null && curCity != null) {
                curProvince.cities.add(curCity);
                curCity = null;
            }
            isInCityElement = false; // we have left the city element
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        if ("country".equals(localName)) {
            country = new Country();
            country.name = atts.getValue("name");
        } else if ("province".equals(localName)) {
            curProvince = new Province();
            curProvince.name = atts.getValue("name");
        } else if ("city".equals(localName)) {
            isInCityElement = true; // we have entered a city element
            curCity = new City();
            curCity.enName = atts.getValue("name");
        }
    }
   

Output:

country:China
  |--GuangDong
  |    |--GuangZhou(广州)
  |    |--ShenZhen(深圳)
  |    |--ZhuHai(珠海)
  |--HuNan
  |    |--ChangSha(长沙)
  |    |--HengYang(衡阳)
  |    |--ChangDe(常德)


 