This post uses hpple as the HTML parser.
You could also use Objective-C HTML Parser: https://github.com/zootreeves/Objective-C-HMTL-Parser
or work directly with libxml2 or Foundation's NSXMLParser: http://www.theappcodeblog.com/2011/07/21/iphone-development-tutorial-parse-html/
(I have not tried either of these two.)
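1. Create a Single View Application project (with Storyboard and ARC).
2. Add the libxml2 library to the project.
step 1 In the left pane, select the project root node. An editor area appears next to it; select the node under "TARGETS".
step 2 Select the "Build Phases" tab, expand "Link Binary With Libraries", then click the "+" button.
step 3 Search for "libxml2", select "libxml2.dylib" (newer Xcode versions list it as "libxml2.tbd"), and click the "Add" button. libxml2.dylib is now added to the project; to keep things organized, drag and drop it into the "Frameworks" folder.
step 4 Repeat step 1, then select the "Build Settings" tab, search for "Header Search Paths" and expand it. For both the "Debug" and "Release" nodes, click the "+" button and add an item whose value is "${SDK_DIR}"/usr/include/libxml2 (note: the double quotes are part of the value).
step 5 Quick check that libxml2 was added correctly: add the following line to your view controller .m file and build. If the build succeeds, libxml2 is ready to use.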
#import <libxml/HTMLparser.h>
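The import above only proves that the header search path is set; it does not call into the library, so a missing entry in "Link Binary With Libraries" would go unnoticed. A minimal sketch of a stronger check follows. The helper name Libxml2CanParseHTML is made up for this example and is not part of the original steps; htmlReadMemory and xmlFreeDoc are regular libxml2 functions.

```objc
#import <Foundation/Foundation.h>
#import <libxml/HTMLparser.h>
#import <string.h>

// Hypothetical helper (not from the original post): parses a tiny in-memory
// HTML document with libxml2. It only builds if the header path is set and
// the library is linked, and it returns YES only if parsing yields a document.
static BOOL Libxml2CanParseHTML(void) {
    const char *html = "<html><head><title>ok</title></head><body></body></html>";
    htmlDocPtr doc = htmlReadMemory(html, (int)strlen(html), NULL, "UTF-8",
                                    HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING);
    if (doc == NULL) {
        return NO;
    }
    xmlFreeDoc(doc); // release the parsed tree
    return YES;
}
```

Calling it once, for example NSLog(@"libxml2 ok: %d", Libxml2CanParseHTML()); from viewDidLoad, should be enough to tell a header-only setup from a fully linked one.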
3. Add the hpple source code to the project.
step 1 Download hpple from https://github.com/topfunky/hpple
step 2 In your project, create a group (i.e. a folder) named "hpple" to keep things organized, then drag the following 6 files into it. In the dialog, check the option "Copy items into destination group's folder", select "Create groups for any added folders", check "Add to Targets", and click the Finish button.
- TFHpple.h
- TFHpple.m
- TFHppleElement.h
- TFHppleElement.m
- XPathQuery.h
- XPathQuery.m
4. The simplest way to use hpple
In your view controller .m file:
step 1: add the import
#import "TFHpple.h"
step 2: add the following method
- (void)testparser {
    // Alternative page to try:
    // NSURL *url = [NSURL URLWithString:@"https://cap.cityu.edu.hk/default.aspx"];
    NSURL *url = [NSURL URLWithString:@"http://www.cwb.gov.tw/eng/index.htm"];
    // Synchronous download: fine for a quick test, but it blocks the calling thread.
    NSData *htmlData = [NSData dataWithContentsOfURL:url];
    if (htmlData == nil) {
        NSLog(@"failed to download %@", url);
        return;
    }
    TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
    NSArray *elements = [xpathParser searchWithXPathQuery:@"//title"]; // get the title
    // NSArray *elements = [xpathParser searchWithXPathQuery:@"//td[@class='compact']/a"];
    if ([elements count] == 0) { // objectAtIndex:0 would throw on an empty array
        NSLog(@"no matching element found");
        return;
    }
    TFHppleElement *element = [elements objectAtIndex:0];
    NSString *elementContent = [element content];
    NSLog(@"result = %@", elementContent);
}
step 3: add the following line to the view controller's "viewDidLoad"
[self testparser];
step 4: run your app!
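Once the title query works, the same searchWithXPathQuery: call can return many elements, and attributes can be read from each one. The sketch below collects the href of every link in a page. The helper name LinkTargetsInHTML and the sample markup are made up for this example; objectForKey: is the TFHppleElement method for reading an attribute by name.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not from the original post): returns the href value
// of every <a> element found in the given HTML data.
static NSArray *LinkTargetsInHTML(NSData *htmlData) {
    NSMutableArray *hrefs = [NSMutableArray array];
    if (htmlData == nil) {
        return hrefs; // nothing to parse
    }
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:htmlData];
    NSArray *links = [parser searchWithXPathQuery:@"//a"];
    for (TFHppleElement *link in links) {
        NSString *href = [link objectForKey:@"href"];
        if (href != nil) { // skip anchors that have no href attribute
            [hrefs addObject:href];
        }
    }
    return hrefs;
}
```

A quick way to try it without the network is to feed it a literal string, for example [@"<a href='/a'>A</a>" dataUsingEncoding:NSUTF8StringEncoding], and log the returned array. A narrower query such as //td[@class='compact']/a can be swapped in for //a when only some links are wanted.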