Android Practical Tips: Parsing XML Files with the Pull Method

Latest recommended article published on 2022-12-19 17:26:49

qjbagu

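Pull parsing gives the application full control over how a document is parsed. Android supports pull parsing through an API that consists mainly of two types:

org.xmlpull.v1.XmlPullParser;

org.xmlpull.v1.XmlPullParserFactory;

XmlPullParser, an interface, does most of the work. XmlPullParserFactory is a factory used to create XmlPullParser objects.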

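The application generates events by calling XmlPullParser.next() and related methods, and then handles each event. This is what distinguishes it from the push approach: a push parser produces events on its own initiative and delivers them to the application through callbacks, whereas a pull parser produces an event only when the application calls one of its methods.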
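The contrast between the two models can be sketched without any XML at all. The toy class below is an illustration only: the events are plain strings, and PushVsPull, push and pullUntil are hypothetical names, not part of any Android API. In the push version the producer owns the loop; in the pull version the caller owns it and can therefore stop early.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Toy illustration only: "events" are plain strings, not real XML events.
final class PushVsPull {

    // Push style: the producer owns the loop and calls back into the application.
    static void push(List<String> events, Consumer<String> handler) {
        for (String event : events) {
            handler.accept(event);
        }
    }

    // Pull style: the application owns the loop and asks for the next event.
    static List<String> pullUntil(Iterator<String> source, String stopEvent) {
        List<String> seen = new ArrayList<>();
        while (source.hasNext()) {
            String event = source.next();
            seen.add(event);
            if (event.equals(stopEvent)) {
                break; // the caller decides when to stop; later events are never requested
            }
        }
        return seen;
    }
}
```

With push, the handler sees every event whether it wants it or not. With pullUntil, the caller stops asking once it has what it needs.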

Suppose the XML contains this statement: <author country="United States">James Elliott</author>. Here author is the TAG, country is an ATTRIBUTE, and "James Elliott" is the TEXT.

To parse a document, first create an XmlPullParser object:

    final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(true);
    final XmlPullParser parser = factory.newPullParser();

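Pull parsing is a forward traversal of the document. Every call to next(), nextTag(), nextToken() or nextText() advances through the document and leaves the parser positioned on some event; it can never move backward.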

Next, hand the document to the parser:

    parser.setInput(new StringReader("<author country=\"United States\">James Elliott</author>"));

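At this point the document has just been initialized, so the parser is at the start of the document and the current event is START_DOCUMENT, which you can read with XmlPullParser.getEventType(). Successive calls to next() then produce the following events in order:

START_TAG: tells the application that a tag has begun; getName() returns "author".

TEXT: getText() returns "James Elliott".

END_TAG: tells you that the tag has been fully processed.

END_DOCUMENT: tells you that the whole document has been processed.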
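The event sequence can be checked on a desktop JVM. org.xmlpull.v1 ships with Android rather than with the JDK, so this sketch swaps in the JDK's own pull parser, StAX (javax.xml.stream.XMLStreamReader), which follows the same model. Its constants are named START_ELEMENT, CHARACTERS and END_ELEMENT; the trace relabels them with the XmlPullParser names used in this article. EventWalk, trace and describe are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch using JDK StAX as a stand-in for Android's XmlPullParser.
final class EventWalk {

    // Returns a readable trace of every event the reader stops on.
    static List<String> trace(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask for one text event per run of character data.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            int event = reader.getEventType(); // START_DOCUMENT before the first next()
            while (true) {
                out.add(describe(reader, event));
                if (event == XMLStreamConstants.END_DOCUMENT) {
                    break;
                }
                event = reader.next(); // forward only, exactly one event per call
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed XML", e);
        }
        return out;
    }

    private static String describe(XMLStreamReader reader, int event) {
        switch (event) {
            case XMLStreamConstants.START_DOCUMENT:
                return "START_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT: {
                StringBuilder sb = new StringBuilder("START_TAG " + reader.getLocalName());
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    sb.append(' ').append(reader.getAttributeLocalName(i))
                      .append('=').append(reader.getAttributeValue(i));
                }
                return sb.toString();
            }
            case XMLStreamConstants.CHARACTERS:
                return "TEXT " + reader.getText();
            case XMLStreamConstants.END_ELEMENT:
                return "END_TAG " + reader.getLocalName();
            case XMLStreamConstants.END_DOCUMENT:
                return "END_DOCUMENT";
            default:
                return "OTHER";
        }
    }
}
```

For the author element used in this article, trace returns five entries, from START_DOCUMENT to END_DOCUMENT, with the country attribute reported on the START_TAG entry.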

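Besides next(), you can also drive parsing with nextToken(), which reports finer-grained events such as COMMENT, CDSECT, DOCDECL and ENTITY_REF. If the program needs this lower-level information, use nextToken() and handle the detailed events. One thing to note: a TEXT event may contain nothing but white space, such as line breaks or spaces.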
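The white-space caveat is easy to see on an indented document. The sketch below again uses JDK StAX as a stand-in for XmlPullParser (an assumption made so that it runs outside Android); XMLStreamReader.isWhiteSpace() plays the role of XmlPullParser.isWhitespace(). WhitespaceEvents and countText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch using JDK StAX as a stand-in for Android's XmlPullParser.
final class WhitespaceEvents {

    // Returns {white-space-only text events, text events with real content}.
    static int[] countText(String xml) {
        int whitespaceOnly = 0;
        int withContent = 0;
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.SPACE) {
                    if (reader.isWhiteSpace()) {
                        whitespaceOnly++; // indentation or a line break between tags
                    } else {
                        withContent++;
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed XML", e);
        }
        return new int[] { whitespaceOnly, withContent };
    }
}
```

Code that walks a document with next() usually needs a check like this, which is the chore that nextTag() removes.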

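Two other very handy methods are nextTag() and nextText().

nextTag(): it first skips any white space. If you know that the next event is a START_TAG or an END_TAG, you can call nextTag() to jump straight to it; if anything else comes next, it throws an XmlPullParserException. It typically has two uses. At a START_TAG, if you know that the tag contains child tags, nextTag() produces the START_TAG of the first child. At an END_TAG, if you know that it is not the end of the document, nextTag() produces the START_TAG of the next tag (or the END_TAG of the parent when there are no more siblings). In both cases, if the document is indented, next() would first produce a TEXT event containing only line breaks or other white space.

nextText(): it may only be called at a START_TAG. If the next event is TEXT, the text content is returned; if the next event is END_TAG, meaning the element is empty, an empty string is returned. After the method returns, the parser is positioned on the END_TAG. For example: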

<author>James Elliott</author>

<author></author>

Calling nextText() at the START_TAG of each of these returns, respectively:

"James Elliott"

"" (empty)

This method is very useful for tags that have no child tags. For example:

<title>What Is Hibernate</title>

<author>James Elliott</author>

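can be handled with the following code:

    // Assumes the parser is on the START_TAG of the first child (<title> here);
    // the loop ends when nextTag() reaches the END_TAG of the enclosing element.
    int eventType = parser.getEventType();
    while (eventType != XmlPullParser.END_TAG) {
        switch (eventType) {
            case XmlPullParser.START_TAG: {
                final String tag = parser.getName();
                final String content = parser.nextText(); // parser is now on this tag's END_TAG
                Log.e(TAG, tag + ": [" + content + "]");
                eventType = parser.nextTag(); // next sibling's START_TAG or the parent's END_TAG
                break;
            }
            default:
                break;
        }
    }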

This is much more convenient than driving the parser with next(), and far more readable.
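A self-contained version of the same loop can be run on a desktop JVM. As an assumption made for portability, it uses JDK StAX rather than XmlPullParser: XMLStreamReader.nextTag() behaves like the nextTag() described here, and getElementText() is the StAX counterpart of nextText(). FlatFields and read are hypothetical names, and the wrapping book element is invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch using JDK StAX as a stand-in for Android's XmlPullParser.
final class FlatFields {

    // Reads the text-only children of the root element into an ordered map.
    static Map<String, String> read(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag();             // START_DOCUMENT -> START_ELEMENT of the root
            int event = reader.nextTag(); // first child, or the root's END_ELEMENT if it is empty
            while (event != XMLStreamConstants.END_ELEMENT) {
                String tag = reader.getLocalName();
                String content = reader.getElementText(); // leaves the reader on the child's END_ELEMENT
                fields.put(tag, content);
                event = reader.nextTag(); // skips the white space between siblings
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("unexpected XML structure", e);
        }
        return fields;
    }
}
```

Passing a book element that wraps the title and author tags shown earlier yields a two-entry map in document order. A child that itself contains child tags makes getElementText() throw, which this sketch surfaces as an IllegalStateException.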

Finally, here is a sample Android program that parses XML:

    import java.io.IOException;
    import java.io.InputStream;

    import org.xmlpull.v1.XmlPullParser;
    import org.xmlpull.v1.XmlPullParserException;
    import org.xmlpull.v1.XmlPullParserFactory;

    import android.util.Log;

    public class RssPullParser extends RssParser {
        private final String TAG = FeedSettings.GLOBAL_TAG;
        private InputStream mInputStream;

        public RssPullParser(InputStream is) {
            mInputStream = is;
        }

        public void parse() throws ReaderBaseException, XmlPullParserException, IOException {
            if (mInputStream == null) {
                throw new ReaderBaseException("no input source, did you initialize this class correctly?");
            }
            final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
            factory.setNamespaceAware(true);
            final XmlPullParser parser = factory.newPullParser();
            // setInput(InputStream, String) requires an encoding argument;
            // passing null lets the parser detect the encoding itself.
            parser.setInput(mInputStream, null);
            int eventType = parser.getEventType();
            if (eventType != XmlPullParser.START_DOCUMENT) {
                throw new ReaderBaseException("Not starting with 'start_document'");
            }
            eventType = parseRss(parser);
            if (eventType != XmlPullParser.END_DOCUMENT) {
                throw new ReaderBaseException("not ending with 'end_document', do you finish parsing?");
            }
            if (mInputStream != null) {
                mInputStream.close();
            } else {
                Log.e(TAG, "inputstream is null, XmlPullParser closed it??");
            }
        }

        /**
         * Parses the XML document. The current event must be START_DOCUMENT.
         * After this call the parser is positioned at END_DOCUMENT.
         * @param parser the parser to read from
         * @return the END_DOCUMENT event
         * @throws XmlPullParserException
         * @throws ReaderBaseException
         * @throws IOException
         */
        private int parseRss(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
            int eventType = parser.getEventType();
            if (eventType != XmlPullParser.START_DOCUMENT) {
                throw new ReaderBaseException("not starting with 'start_document', is this a new document?");
            }
            Log.e(TAG, "starting document, are you aware of that!");
            eventType = parser.next();
            while (eventType != XmlPullParser.END_DOCUMENT) {
                switch (eventType) {
                    case XmlPullParser.START_TAG: {
                        Log.e(TAG, "start tag: '" + parser.getName() + "'");
                        final String tagName = parser.getName();
                        if (tagName.equals(RssFeed.TAG_RSS)) {
                            Log.e(TAG, "starting an RSS feed <<");
                            final int attrSize = parser.getAttributeCount();
                            for (int i = 0; i < attrSize; i++) {
                                Log.e(TAG, "attr '" + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i) + "'");
                            }
                        } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                            Log.e(TAG, "\tstarting a Channel <<");
                            // returns with the parser on the channel's END_TAG
                            parseChannel(parser);
                        }
                        break;
                    }
                    case XmlPullParser.END_TAG: {
                        Log.e(TAG, "end tag: '" + parser.getName() + "'");
                        final String tagName = parser.getName();
                        if (tagName.equals(RssFeed.TAG_RSS)) {
                            Log.e(TAG, ">> ending an RSS feed");
                        } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                            Log.e(TAG, "\t>> ending a Channel");
                        }
                        break;
                    }
                    default:
                        break;
                }
                eventType = parser.next();
            }
            Log.e(TAG, "end of document, it is over");
            return parser.getEventType();
        }

        /**
         * Parses a channel. The parser MUST be at the START_TAG of a channel,
         * otherwise an exception is thrown.
         * After this call the parser is positioned at the END_TAG of the channel.
         * @param parser the parser to read from
         * @return the END_TAG event of the channel
         * @throws XmlPullParserException
         * @throws ReaderBaseException
         * @throws IOException
         */
        private int parseChannel(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
            int eventType = parser.getEventType();
            String tagName = parser.getName();
            if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_CHANNEL.equals(tagName)) {
                throw new ReaderBaseException("not start with 'start tag', is this a start of a channel?");
            }
            Log.e(TAG, "\tstarting " + tagName);
            eventType = parser.nextTag();
            while (eventType != XmlPullParser.END_TAG) {
                switch (eventType) {
                    case XmlPullParser.START_TAG: {
                        final String tag = parser.getName();
                        if (tag.equals(RssFeed.TAG_IMAGE)) {
                            parseImage(parser);
                        } else if (tag.equals(RssFeed.TAG_ITEM)) {
                            parseItem(parser);
                        } else {
                            // any other child is assumed to hold text only
                            final String content = parser.nextText();
                            Log.e(TAG, tag + ": [" + content + "]");
                        }
                        // now it SHOULD be at END_TAG, ensure it
                        if (parser.getEventType() != XmlPullParser.END_TAG) {
                            throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                        }
                        eventType = parser.nextTag();
                        break;
                    }
                    default:
                        break;
                }
            }
            Log.e(TAG, "\tending " + parser.getName());
            return parser.getEventType();
        }

        /**
         * Parses the image in a channel.
         * Precondition: the parser must be at a START_TAG and the tag MUST be 'image'.
         * Postcondition: the parser is at the END_TAG of '/image'.
         * @throws IOException
         * @throws XmlPullParserException
         * @throws ReaderBaseException
         */
        private int parseImage(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
            int eventType = parser.getEventType();
            String tag = parser.getName();
            if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_IMAGE.equals(tag)) {
                throw new ReaderBaseException("not start with 'start tag', is this a start of an image?");
            }
            Log.e(TAG, "\t\tstarting image " + tag);
            eventType = parser.nextTag();
            while (eventType != XmlPullParser.END_TAG) {
                switch (eventType) {
                    case XmlPullParser.START_TAG:
                        tag = parser.getName();
                        Log.e(TAG, tag + ": [" + parser.nextText() + "]");
                        // now it SHOULD be at END_TAG, ensure it
                        if (parser.getEventType() != XmlPullParser.END_TAG) {
                            throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                        }
                        eventType = parser.nextTag();
                        break;
                    default:
                        break;
                }
            }
            Log.e(TAG, "\t\tending image " + parser.getName());
            return parser.getEventType();
        }

        /**
         * Parses an item in a channel.
         * Precondition: the parser must be at a START_TAG and the tag MUST be 'item'.
         * Postcondition: the parser is at the END_TAG of '/item'.
         * @throws IOException
         * @throws XmlPullParserException
         * @throws ReaderBaseException
         */
        private int parseItem(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
            int eventType = parser.getEventType();
            String tag = parser.getName();
            if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_ITEM.equals(tag)) {
                throw new ReaderBaseException("not start with 'start tag', is this a start of an item?");
            }
            Log.e(TAG, "\t\tstarting " + tag);
            eventType = parser.nextTag();
            while (eventType != XmlPullParser.END_TAG) {
                switch (eventType) {
                    case XmlPullParser.START_TAG: {
                        tag = parser.getName();
                        final String content = parser.nextText();
                        Log.e(TAG, tag + ": [" + content + "]");
                        // now it SHOULD be at END_TAG, ensure it
                        if (parser.getEventType() != XmlPullParser.END_TAG) {
                            throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                        }
                        eventType = parser.nextTag();
                        break;
                    }
                    default:
                        break;
                }
            }
            Log.e(TAG, "\t\tending " + parser.getName());
            return parser.getEventType();
        }
    }
