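After reading this post you'll see that parsing HTML can actually be pretty simple~
In my project the backend returns its data as HTML, which was a real pain. I spent a lot of time digging through resources, and every parsing approach I found felt cumbersome. After studying the problem for a while, I noticed that HTML data like this usually follows a pattern you can spot. So here's the treat: an uncommon but very simple and easy-to-follow approach, string slicing.
Enough talk, here's the code~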
// Header file
@interface GKTopic : NSObject

/// Topic ID
@property (nonatomic, copy) NSString *id;
/// Topic title
@property (nonatomic, copy) NSString *title;
/// Author of the topic
@property (nonatomic, copy) NSString *author;
/// Avatar URL
@property (nonatomic, copy) NSString *avatarImageUrl;

+ (NSArray *)topics;

@end
Implementation file
+ (NSArray *)topics {
    // Load the HTML
    NSString *html = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"v2ex" ofType:@"html"]
                                               encoding:NSUTF8StringEncoding
                                                  error:nil];
    NSMutableArray *topics = [NSMutableArray array];
    // Nothing to parse if the file failed to load. Without this guard the loop
    // below never ends, because messaging nil yields location 0, not NSNotFound.
    if (html.length == 0) {
        return topics;
    }
    // Where each slice begins. You have to read the HTML source and find the
    // pattern yourself; the same goes for matchingEnd.
    NSString *matchingBegin = @"cell from_";
    // Where each slice ends
    NSString *matchingEnd = @"</div>";
    NSRange lastRange = NSMakeRange(0, 0);
    // Slice in a loop
    while ((lastRange = [html rangeOfString:matchingBegin
                                    options:0
                                      range:NSMakeRange(lastRange.location, html.length - lastRange.location)]).location != NSNotFound) {
        NSRange endRange = [html rangeOfString:matchingEnd
                                       options:0
                                         range:NSMakeRange(lastRange.location, html.length - lastRange.location)];
        if (endRange.location != NSNotFound) {
            // Take the substring between the two markers
            NSString *topicString = [html substringWithRange:NSMakeRange(lastRange.location, endRange.location - lastRange.location)];
            // Pull the fields out of the tags
            GKTopic *topic = [self topicWithString:topicString];
            [topics addObject:topic];
            // Continue searching after this slice
            lastRange = endRange;
        } else {
            break;
        }
    }
    return topics;
}

+ (GKTopic *)topicWithString:(NSString *)string {
    GKTopic *topic = [[GKTopic alloc] init];
    // Find the author
    topic.author = [string gk_rangeFromeStartString:@"<a href=\"/member/" toEndString:@"\">"];
    // Find the avatar URL
    topic.avatarImageUrl = [string gk_rangeFromeStartString:@"<img src=\"" toEndString:@"\" class=\"avatar\""];
    // Find the topic ID, e.g. for <a href="/t/291493"> the ID is 291493
    topic.id = [string gk_rangeFromeStartString:@"<a href=\"/t/" toEndString:@"\">"];
    // Find the topic title, which follows the link that carries the ID
    NSString *fromStr = [NSString stringWithFormat:@"t/%@\">", topic.id];
    topic.title = [string gk_rangeFromeStartString:fromStr toEndString:@"</a>"];
    return topic;
}
The NSString category method used above
- (NSString *)gk_rangeFromeStartString:(NSString *)startString toEndString:(NSString *)endString
{
    // Find the start marker; without it there is nothing to extract
    NSRange range = [self rangeOfString:startString];
    if (range.location == NSNotFound) {
        return nil;
    }
    // Keep everything after the start marker
    NSString *string = [self substringFromIndex:range.location + range.length];
    // Cut at the end marker; if it is missing, the remainder is returned as-is
    range = [string rangeOfString:endString];
    if (range.location != NSNotFound) {
        string = [string substringToIndex:range.location];
    }
    return string;
}
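To make the slicing concrete, here is a minimal sketch that runs the same four lookups on one hand-written fragment. The fragment, the `alice` member name, the `cdn.example.com` URL, and the names `GKSliceDemo`, `gk_demoSliceFrom:to:` and `GKDemoFields` are all made up for illustration; only the markers and the ID 291493 come from the code above.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the category method above (same slicing logic).
@interface NSString (GKSliceDemo)
- (NSString *)gk_demoSliceFrom:(NSString *)start to:(NSString *)end;
@end

@implementation NSString (GKSliceDemo)
- (NSString *)gk_demoSliceFrom:(NSString *)start to:(NSString *)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location == NSNotFound) {
        return nil;
    }
    // Everything after the start marker
    NSString *rest = [self substringFromIndex:NSMaxRange(startRange)];
    NSRange endRange = [rest rangeOfString:end];
    // If the end marker is missing, fall back to the remainder
    return endRange.location == NSNotFound ? rest : [rest substringToIndex:endRange.location];
}
@end

// Extracts the four topic fields from one made-up fragment.
static NSDictionary<NSString *, NSString *> *GKDemoFields(void) {
    NSString *fragment = @"cell from_1\"><a href=\"/member/alice\">"
                         @"<img src=\"//cdn.example.com/a.png\" class=\"avatar\"></a>"
                         @"<a href=\"/t/291493\">Hello V2EX</a>";
    NSString *topicID = [fragment gk_demoSliceFrom:@"<a href=\"/t/" to:@"\">"];
    // The title follows the link that carries the ID
    NSString *titleStart = [NSString stringWithFormat:@"t/%@\">", topicID];
    return @{
        @"author": [fragment gk_demoSliceFrom:@"<a href=\"/member/" to:@"\">"],
        @"avatar": [fragment gk_demoSliceFrom:@"<img src=\"" to:@"\" class=\"avatar\""],
        @"id": topicID,
        @"title": [fragment gk_demoSliceFrom:titleStart to:@"</a>"],
    };
}
```

For this fragment the dictionary holds author `alice`, avatar `//cdn.example.com/a.png`, id `291493` and title `Hello V2EX`. Note that each lookup matches only the first occurrence of its start marker, so the markers have to be specific enough for the page you are slicing.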
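Only a few fields are extracted here; you can try the rest yourself. The method above that returns the array can easily be pulled out into a more general one, for example: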
/**
 * @param beginString marker where each slice begins
 * @param endString marker where each slice ends
 * @return array of models
 */
+ (NSArray *)topicsWithBeginString:(NSString *)beginString endString:(NSString *)endString;
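One way the body of such a method could look is sketched below. This is not the demo's code: `GKChunks` is a hypothetical helper, it takes the HTML as a parameter instead of loading it from the bundle, and it returns the raw slices rather than models so that the example stays self-contained.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring that starts at beginString
// and stops just before the next endString.
static NSArray<NSString *> *GKChunks(NSString *html, NSString *beginString, NSString *endString) {
    NSMutableArray<NSString *> *chunks = [NSMutableArray array];
    // Guard against nil or empty input so the loop below always terminates
    if (html.length == 0 || beginString.length == 0 || endString.length == 0) {
        return chunks;
    }
    NSUInteger cursor = 0;
    while (cursor < html.length) {
        NSRange searchRange = NSMakeRange(cursor, html.length - cursor);
        NSRange beginRange = [html rangeOfString:beginString options:0 range:searchRange];
        if (beginRange.location == NSNotFound) {
            break;
        }
        // Look for the end marker only after the begin marker
        NSUInteger contentStart = NSMaxRange(beginRange);
        NSRange endRange = [html rangeOfString:endString
                                       options:0
                                         range:NSMakeRange(contentStart, html.length - contentStart)];
        if (endRange.location == NSNotFound) {
            break;
        }
        NSRange chunkRange = NSMakeRange(beginRange.location, endRange.location - beginRange.location);
        [chunks addObject:[html substringWithRange:chunkRange]];
        // Resume after the end marker
        cursor = NSMaxRange(endRange);
    }
    return chunks;
}
```

Each returned slice could then be handed to `topicWithString:` to build a model, which would give the method declared above its array of models.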
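The method name may not quite follow naming conventions; feel free to name it however you like. This is only meant to show the idea~
That's about it. If anything here is wrong, corrections are welcome.
Finally, the demo is here: https://github.com/ChrisCaixx/HtmlToObject
If you find it useful, feel free to give it a star. Thanks!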