Android in Practice: Parsing XML Files with the Pull Method

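Pull parsing gives the application full control over how a document is parsed. Android supports it through an API built mainly around two classes:

org.xmlpull.v1.XmlPullParser; 

org.xmlpull.v1.XmlPullParserFactory; 

XmlPullParser does most of the work; XmlPullParserFactory is a factory used to build XmlPullParser objects.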

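The application produces events by calling XmlPullParser.next() and related methods, then handles each event. This is the difference from the Push method: with Push, the parser generates events on its own and calls back into the application; with Pull, an event is produced only when the application calls a parser method.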
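For contrast, here is roughly what the Push side looks like with SAX, the callback-based parser API that ships with both the JDK and Android. This is only a sketch for comparison; the class and method names (PushStyleDemo, collectEvents) are invented for illustration. Note that the application never asks for the next event: it hands over a handler and the parser calls back.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PushStyleDemo {
    /** Parses the XML with SAX and records every callback the parser pushes at us. */
    static List<String> collectEvents(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("START_TAG " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX is allowed to deliver one run of text in several chunks.
                events.add("TEXT " + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("END_TAG " + qName);
            }
        };
        // The parser drives the whole traversal; control returns only when it is done.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : collectEvents("<author country=\"United States\">James Elliott</author>")) {
            System.out.println(event);
        }
    }
}
```

With Pull, the loop that decides when to advance lives in the application instead of inside the parser.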

Suppose the XML contains: <author country="United States">James Elliott</author>. Here author is the TAG, country is an ATTRIBUTE, and "James Elliott" is the TEXT.

To parse a document, first build an XmlPullParser object:


final XmlPullParserFactory factory = XmlPullParserFactory.newInstance(); 

factory.setNamespaceAware(true); 

final XmlPullParser parser = factory.newPullParser(); 

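Pull parsing is a forward-only traversal of the document: each call to next(), nextTag(), nextToken() or nextText() advances the parser and leaves it positioned on some event; it cannot move backwards.

Then hand the document to the parser:

parser.setInput(new StringReader("<author country=\"United States\">James Elliott</author>")); 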

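At this point the document has only just been set, so the parser sits at the very beginning and the current event is START_DOCUMENT, which XmlPullParser.getEventType() reports. From there, each call to next() produces, in order:

START_TAG: an element has opened; getName() returns "author".

TEXT: getText() returns "James Elliott".

END_TAG: the element has been fully processed.

END_DOCUMENT: the whole document has been consumed.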
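The event sequence described above can be checked with a short loop. This is a minimal sketch, assuming an XmlPull implementation is available (Android bundles one; on a desktop JVM a library such as kxml2 would have to be on the classpath). The class and method names (EventWalk, walk) are made up for illustration; XmlPullParser.TYPES is the interface's own array of event names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

class EventWalk {
    /** Returns one line per event, beginning with the initial START_DOCUMENT. */
    static List<String> walk(String xml) throws XmlPullParserException, IOException {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser parser = factory.newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> seen = new ArrayList<>();
        int eventType = parser.getEventType(); // START_DOCUMENT before any next()
        while (true) {
            String line = XmlPullParser.TYPES[eventType];
            if (eventType == XmlPullParser.START_TAG) {
                line += " " + parser.getName();
                // Attributes can only be read while positioned on START_TAG.
                for (int i = 0; i < parser.getAttributeCount(); i++) {
                    line += " " + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i);
                }
            } else if (eventType == XmlPullParser.END_TAG) {
                line += " " + parser.getName();
            } else if (eventType == XmlPullParser.TEXT) {
                line += " " + parser.getText();
            }
            seen.add(line);
            if (eventType == XmlPullParser.END_DOCUMENT) {
                break;
            }
            eventType = parser.next(); // the application, not the parser, decides when to advance
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        for (String line : walk("<author country=\"United States\">James Elliott</author>")) {
            System.out.println(line);
        }
    }
}
```

For the author element this should yield five lines, one per event, with the country attribute listed on the START_TAG line.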

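Besides next(), nextToken() can also drive the parser. It reports finer-grained events such as COMMENT, CDSECT, DOCDECL, ENTITY_REF, PROCESSING_INSTRUCTION and IGNORABLE_WHITESPACE. If the program needs this lower-level information, drive the parse with nextToken() and handle those events. One thing to watch for: a TEXT event may contain nothing but whitespace, such as line breaks or spaces; isWhitespace() tells you when that is the case.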
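To see the difference between the two drivers, the sketch below walks the same document once with next() and once with nextToken() and records the event names. It assumes an XmlPull implementation on the classpath, as on Android; TokenWalk and eventNames are hypothetical names, and the sample document is made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

class TokenWalk {
    /**
     * Collects event names up to (not including) END_DOCUMENT.
     * detailed == true drives the parser with nextToken(), otherwise with next().
     */
    static List<String> eventNames(String xml, boolean detailed)
            throws XmlPullParserException, IOException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> names = new ArrayList<>();
        int eventType = parser.getEventType();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            if (eventType == XmlPullParser.TEXT && parser.isWhitespace()) {
                // A TEXT event that holds only spaces or line breaks.
                names.add("TEXT(whitespace)");
            } else {
                names.add(XmlPullParser.TYPES[eventType]);
            }
            eventType = detailed ? parser.nextToken() : parser.next();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book><!-- draft -->\n<title>What Is Hibernate</title></book>";
        System.out.println("next():      " + eventNames(xml, false));
        System.out.println("nextToken(): " + eventNames(xml, true));
    }
}
```

Only the nextToken() run should contain a COMMENT entry; next() skips the comment silently.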

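Two other methods are especially handy: nextTag() and nextText().

nextTag() -- It first skips any whitespace, then expects the next event to be a START_TAG or an END_TAG and moves straight to it. If anything else comes next it throws an XmlPullParserException, so use it only when you know what follows. It typically serves two purposes: on a START_TAG, when you know the element has child elements, nextTag() produces the child's START_TAG; on an END_TAG, when you know the document has not ended, nextTag() produces the next element's START_TAG (or the parent's END_TAG if there are no more siblings). In both cases next() would first deliver a TEXT event holding nothing but line breaks or spaces.

nextText() -- It may only be called on a START_TAG. If the next event is TEXT, the text content is returned; if the next event is END_TAG, meaning the element is empty, an empty string is returned; any other event, such as a child element, makes it throw an XmlPullParserException. After it returns, the parser is positioned on the END_TAG. For example:
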

<author>James Elliott</author> 

<author></author> 

<author/> 

Calling nextText() on the START_TAG of each of these returns, in order:

"James Elliott"

""(empty)

""(empty)
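The three cases above can be reproduced with a few lines. This is a minimal sketch assuming an XmlPull implementation on the classpath, as on Android; NextTextDemo and rootText are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

class NextTextDemo {
    /** Returns the text content of the root element, or "" if it is empty. */
    static String rootText(String xml) throws XmlPullParserException, IOException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));

        parser.nextTag();                // START_DOCUMENT -> START_TAG of the root element
        String text = parser.nextText(); // only legal on START_TAG

        // Verify the documented postcondition: we are now on the END_TAG.
        // Passing null for namespace and name means "any".
        parser.require(XmlPullParser.END_TAG, null, null);
        return text;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("[" + rootText("<author>James Elliott</author>") + "]");
        System.out.println("[" + rootText("<author></author>") + "]");
        System.out.println("[" + rootText("<author/>") + "]");
    }
}
```

The require() call is there only to make the END_TAG postcondition explicit; production code can drop it or keep it as a cheap sanity check.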

This method is very useful for elements that have no child elements. For example:


<title>What Is Hibernate</title> 

<author>James Elliott</author> 

<category>Web</category> 

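can be handled with the following code:

// Precondition: eventType was returned by nextTag(), so the parser is on the
// first child's START_TAG (or already on the parent's END_TAG).
while (eventType != XmlPullParser.END_TAG) { 
    switch (eventType) { 
    case XmlPullParser.START_TAG: 
        tag = parser.getName(); 
        // nextText() leaves the parser on this element's END_TAG.
        final String content = parser.nextText(); 
        Log.e(TAG, tag + ": [" + content + "]"); 
        // Move to the next sibling's START_TAG, or to the parent's END_TAG.
        eventType = parser.nextTag(); 
        break; 
    default: 
        // Not reached: nextTag() only ever returns START_TAG or END_TAG.
        break; 
    } 
} 

This is far more convenient than driving the parse with next(), and much more readable.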
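A self-contained version of this loop might look like the sketch below. It assumes an XmlPull implementation on the classpath, as on Android. The wrapping <book> element, the indentation in the sample document, and the names ChildReader and readChildren are assumptions added for illustration; the original fragment does not show the parent element.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

class ChildReader {
    /** Reads the text-only children of the root element into an ordered map. */
    static Map<String, String> readChildren(String xml)
            throws XmlPullParserException, IOException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));

        parser.nextTag(); // onto the root's START_TAG
        Map<String, String> fields = new LinkedHashMap<>();

        // nextTag() skips the indentation and lands on the first child's
        // START_TAG, or on the root's END_TAG if there are no children.
        int eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            String tag = parser.getName();
            fields.put(tag, parser.nextText()); // now on the child's END_TAG
            eventType = parser.nextTag();       // next sibling or the root's END_TAG
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book>\n"
                + "  <title>What Is Hibernate</title>\n"
                + "  <author>James Elliott</author>\n"
                + "  <category>Web</category>\n"
                + "</book>";
        System.out.println(readChildren(xml));
    }
}
```

Because nextTag() swallows the whitespace between elements, the loop never has to handle a TEXT event itself.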

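Finally, here is a complete Android example that parses an RSS feed. It relies on RssParser, RssFeed, FeedSettings and ReaderBaseException from the author's project, which are not shown here.
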

import java.io.IOException; 

import java.io.InputStream; 

 

import org.xmlpull.v1.XmlPullParser; 

import org.xmlpull.v1.XmlPullParserException; 

import org.xmlpull.v1.XmlPullParserFactory; 

 

import android.util.Log; 

 

public class RssPullParser extends RssParser { 

    private final String TAG = FeedSettings.GLOBAL_TAG; 

     

    private InputStream mInputStream; 

     

    public RssPullParser(InputStream is) { 

        mInputStream = is; 

    } 

     

    public void parse() throws ReaderBaseException, XmlPullParserException, IOException { 

        if (mInputStream == null) { 

            throw new ReaderBaseException("no input source, did you initialize this class correctly?"); 

        } 

        final XmlPullParserFactory factory = XmlPullParserFactory.newInstance(); 

        factory.setNamespaceAware(true); 

        final XmlPullParser parser = factory.newPullParser(); 

         

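        // XmlPullParser has no setInput(InputStream) overload; a null encoding lets the parser detect it.
        parser.setInput(mInputStream, null); 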

        int eventType = parser.getEventType(); 

        if (eventType != XmlPullParser.START_DOCUMENT) { 

            throw new ReaderBaseException("Not starting with 'start_document'"); 

        } 

        eventType = parseRss(parser); 

        if (eventType != XmlPullParser.END_DOCUMENT) { 

            throw new ReaderBaseException("not ending with 'end_document', do you finish parsing?"); 

        } 

        if (mInputStream != null) { 

            mInputStream.close(); 

        } else { 

            Log.e(TAG, "inputstream is null, XmlPullParser closed it??"); 

        } 

    } 

     

    /**

     * Parsing the Xml document. Current type must be Start_Document.

     * After calling this, Parser is positioned at END_DOCUMENT.

     * @param parser

     * @return event end_document

     * @throws XmlPullParserException

     * @throws ReaderBaseException

     * @throws IOException

     */ 

    private int parseRss(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException { 

        int eventType = parser.getEventType(); 

        if (eventType != XmlPullParser.START_DOCUMENT) { 

            throw new ReaderBaseException("not starting with 'start_document', is this a new document?"); 

        } 

        Log.e(TAG, "starting document, are you aware of that!"); 

        eventType = parser.next(); 

        while (eventType != XmlPullParser.END_DOCUMENT) { 

            switch (eventType) { 

            case XmlPullParser.START_TAG: { 

                Log.e(TAG, "start tag: '" + parser.getName() + "'"); 

                final String tagName = parser.getName(); 

                if (tagName.equals(RssFeed.TAG_RSS)) { 

                    Log.e(TAG, "starting an RSS feed <<"); 

                    final int attrSize = parser.getAttributeCount(); 

                    for (int i = 0; i < attrSize; i++) { 

                        Log.e(TAG, "attr '" + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i) + "'"); 

                    } 

                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) { 

                    Log.e(TAG, "\tstarting an Channel <<"); 

                    parseChannel(parser); 

                } 

                break; 

            } 

            case XmlPullParser.END_TAG: { 

                Log.e(TAG, "end tag: '" + parser.getName() + "'"); 

                final String tagName = parser.getName(); 

                if (tagName.equals(RssFeed.TAG_RSS)) { 

                    Log.e(TAG, ">> ending an RSS feed"); 

                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) { 

                    Log.e(TAG, "\t>> ending an Channel");      

                } 

                break; 

            } 

            default: 

                break; 

            } 

            eventType = parser.next(); 

        } 

        Log.e(TAG, "end of document, it is over"); 

        return parser.getEventType(); 

    } 

     

    /**

     * Parse a channel. MUST be start tag of an channel, otherwise exception thrown.

     * Param XmlPullParser

     * After calling this function, parser is positioned at END_TAG of Channel.

     * return end tag of a channel

     * @throws XmlPullParserException

     * @throws ReaderBaseException

     * @throws IOException

     */ 

    private int parseChannel(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException { 

        int eventType = parser.getEventType(); 

        String tagName = parser.getName(); 

        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_CHANNEL.equals(tagName)) { 

            throw new ReaderBaseException("not start with 'start tag', is this a start of a channel?"); 

        } 

        Log.e(TAG, "\tstarting " + tagName); 

        eventType = parser.nextTag(); 

        while (eventType != XmlPullParser.END_TAG) { 

            switch (eventType) { 

            case XmlPullParser.START_TAG: { 

                final String tag = parser.getName(); 

                if (tag.equals(RssFeed.TAG_IMAGE)) { 

                    parseImage(parser); 

                } else if (tag.equals(RssFeed.TAG_ITEM)) { 

                    parseItem(parser); 

                } else { 

                    final String content = parser.nextText(); 

                    Log.e(TAG, tag + ": [" + content + "]"); 

                } 

                // now it SHOULD be at END_TAG, ensure it 

                if (parser.getEventType() != XmlPullParser.END_TAG) { 

                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?"); 

                } 

                eventType = parser.nextTag(); 

                break; 

            } 

            default: 

                break; 

            } 

        } 

        Log.e(TAG, "\tending " + parser.getName()); 

        return parser.getEventType(); 

    } 

     

    /**

     * Parse image in a channel.

     * Precondition: position must be at START_TAG and tag MUST be 'image'

     * Postcondition: position is END_TAG of '/image'

     * @throws IOException

     * @throws XmlPullParserException

     * @throws ReaderBaseException

     */ 

    private int parseImage(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException { 

        int eventType = parser.getEventType(); 

        String tag = parser.getName(); 

        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_IMAGE.equals(tag)) { 

            throw new ReaderBaseException("not start with 'start tag', is this a start of an image?"); 

        } 

        Log.e(TAG, "\t\tstarting image " + tag); 

        eventType = parser.nextTag(); 

        while (eventType != XmlPullParser.END_TAG) { 

            switch (eventType) { 

            case XmlPullParser.START_TAG: 

                tag = parser.getName(); 

                Log.e(TAG, tag + ": [" + parser.nextText() + "]");  

                // now it SHOULD be at END_TAG, ensure it 

                if (parser.getEventType() != XmlPullParser.END_TAG) { 

                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?"); 

                } 

                eventType = parser.nextTag(); 

                break; 

            default: 

                break; 

            } 

        } 

        Log.e(TAG, "\t\tending image " + parser.getName()); 

        return parser.getEventType(); 

    } 

     

    /**

     * Parse an item in a channel.

     * Precondition: position must be at START_TAG and tag MUST be 'item'

     * Postcondition: position is END_TAG of '/item'

     * @throws IOException

     * @throws XmlPullParserException

     * @throws ReaderBaseException

     */ 

    private int parseItem(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException { 

        int eventType = parser.getEventType(); 

        String tag = parser.getName(); 

        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_ITEM.equals(tag)) { 

            throw new ReaderBaseException("not start with 'start tag', is this a start of an item?"); 

        } 

        Log.e(TAG, "\t\tstarting " + tag); 

        eventType = parser.nextTag(); 

        while (eventType != XmlPullParser.END_TAG) { 

            switch (eventType) { 

            case XmlPullParser.START_TAG: 

                tag = parser.getName(); 

                final String content = parser.nextText(); 

                Log.e(TAG, tag + ": [" + content + "]"); 

 

                // now it SHOULD be at END_TAG, ensure it 

                if (parser.getEventType() != XmlPullParser.END_TAG) { 

                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?"); 

                } 

                eventType = parser.nextTag(); 

                break; 

            default: 

                break; 

            } 

        } 

        Log.e(TAG, "\t\tending " + parser.getName()); 

        return parser.getEventType(); 

    } 

}   
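The listing references four types that the article does not show. The stand-ins below are purely hypothetical, written only so that the class has something to compile against: the log tag value is invented, the tag constants are assumed to be the standard RSS 2.0 element names, and the real RssParser and ReaderBaseException in the author's project may look quite different.

```java
/** Hypothetical stand-in: a single log tag shared by the app. */
class FeedSettings {
    static final String GLOBAL_TAG = "RssReader"; // assumed value, not from the article
}

/** Hypothetical stand-in: element names, assumed to follow RSS 2.0. */
class RssFeed {
    static final String TAG_RSS = "rss";
    static final String TAG_CHANNEL = "channel";
    static final String TAG_IMAGE = "image";
    static final String TAG_ITEM = "item";
}

/** Hypothetical stand-in: a checked exception carrying only a message. */
class ReaderBaseException extends Exception {
    ReaderBaseException(String message) {
        super(message);
    }
}

/** Hypothetical stand-in: the base class that RssPullParser extends. */
abstract class RssParser {
    // Declared broadly so that subclasses may narrow the throws clause.
    public abstract void parse() throws Exception;
}
```

With these in place, using the parser would come down to constructing RssPullParser with an InputStream for the feed and calling parse().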
