NSXMLParser解析多层嵌套xml

As promised, here is a little How-I-did-it / How-To.

First off: I am not an experienced SAX-User.. So this approach might be packing the problem at it’s tail, but this is how DOM-Users feel comfortable with ;)

Let’s assume we want to parse the following XML:

tranist.xml

n human readable format, this means: We have multiple schedules with from/to etc. These schedules consist of multiple links (different connections for the same route) with departure/arrival etc. These links consist then of multiple parts/sections with various elements which are not sure to be there..

With the let’s find the element called ‘part’ - approach, you won’t get anywhere..

The Basics
So what do we want to achieve? We want a list/array of Schedules, which have the given members. On member is a list/array of Links, also consisting of the given members and a list/array of parts with the respective members.

This is also the basic idea behind my approach: for every new node-container, use a new class/object (an array will also work, but it’s kinda crap..)

Now we have a Schedule class, a Link class and a Part class.

This is an example of the Link class interface:

Link.h

We use an accessor method for the parts, because it just feels better when dealing with arrays. (Instead of later using [foo.myArray addObject:..] we have [foo addMe:..])

Also we make it easier for us, using retain properties..

The Parser setup
A short introduction into SAX:

The parsing goes node by node and is not nesting-sensitive. That means that first we get root, then schedules, then schedule, then from, then to, then links, then link, then departure etc. As soon as the parser returns you the node for example, you don’t know anymore in what schedule you were. As long as you have a clearly defined structure where always every element must be present, you could do this using a counter, but as soon as you have multiple nodes with no defined count, you have a problem.

What we do is known as recursive parsing. What does this mean? We implement some kind of memory.

In our parser, we have 4 members and 1 method (to make actual use of the parser..):

(Yes, this needs to be a NSMutableString..)

Your parseScheduleData method should look similar to the following:

arseJourneyData

Now we need those delegate methods.

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
This function is called by the parser, when it reads something between nodes. (Text that is..) Like with blah it would read “blah”. It is possible, that this method is called multiple times in one node. As you will see later, we define the property “currentProperty” only if we find a node, we care about. That’s why we test it against this property to make sure, that we need this property. This will then look something like this:

Parser

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
This is called, when the parser finds an opening element. In this case, we have a few cases, we need to distinguish. These are:

It’s standard property in the schedule (like <form> etc.) or it’s a deeper nested node (like <links>), the same for all the other nodes.

How to? We define, that we only set a member, if we are in that node. That means, only when we have entered a <part>, then currentPart is set, otherwise it’s nil. The same with the others.

We do then need to check them in reverse order of their nesting level.. Why? Because if we would check for currentLink before currentPart, currentLink would also evaluate to YES/True and hence we will have a problem if their are elements with the same name. If we aren’t in any node, then there is probably a new main node comming -> in the else..

When we hit a nested node, we need to allocate the respective member of our class, so we can use it when the parser gets deeper into it.

This will look like this:

Parser

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
Basically, the same things apply as for didStartElement above. This time, we need to clean things up and assign them if they are set :) This is a bit a pitty, since it’s a lot of code.. *(for not so much)

It’s the same checker-structure..

If we are in a deeper nested node (like <Link>) and we hit an ending element of that nested node (like </Link>), Then we need to add this element to the parent (like <Schedule>) and set it to nil

See yourself:

arser

Finally..
Well, that’s it. You can expand / shrink this principle as you like. You can also add a maxElements counter, like in the SeismicXML example of the iPhone SDK to get only a certain number of elements. You can abort the parser with [parser abortParsing]; It is important, that you don’t abort while in a deeper nested node, because this could lead to inconsistencies. You will need to skip them..

Please note, that I wrote this, while watching TV, so you may need to fix some syntax errors ;) But I hope you get the idea..

from:http://codesofa.com/blog/archive/2008/07/23/make-nsxmlparser-your-friend.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值