Fetching HTML Content from a URL and Parsing It

饭盆

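1. First method: fetch with stringWithContentsOfURL
    // The URL below is not a real endpoint; it only shows that the URL string can be built dynamically.
    NSString *urlstring = [NSString stringWithFormat:@"http://baidu.com/=%@", string1];
    // Percent-escape the string so it forms a valid URL, then fetch the page (here, its web-definition content).
    // CFURLCreateStringByAddingPercentEscapes returns an owned string, so it is autoreleased (manual reference counting).
    NSString *encodedString1 = [(NSString *)CFURLCreateStringByAddingPercentEscapes(kCFAllocatorDefault, (CFStringRef)urlstring, NULL, NULL, kCFStringEncodingUTF8) autorelease];
    NSURL *url1 = [NSURL URLWithString:encodedString1];
    NSString *retStr = [NSString stringWithContentsOfURL:url1 encoding:NSUTF8StringEncoding error:nil];
    NSLog(@" html = %@", retStr);

This method has one drawback: stringWithContentsOfURL is synchronous, so on a poor network the call blocks and the app appears frozen. One workaround is to fetch the data on a separate thread so that the main thread is not blocked; the other is the second method, which uses NSURLConnection.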
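The separate-thread workaround could be sketched with Grand Central Dispatch as follows. This is a minimal sketch, not the author's code: the helper name FetchHTMLInBackground and its completion block are assumptions made for illustration. It uses only autoreleased objects, so it should compile with or without ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original article): runs the blocking
// fetch on a background queue and reports the result on the main queue.
static void FetchHTMLInBackground(NSURL *url,
                                  void (^completion)(NSString *html, NSError *error))
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSError *error = nil;
        // Still a synchronous call, but it now blocks a worker thread only.
        NSString *html = [NSString stringWithContentsOfURL:url
                                                  encoding:NSUTF8StringEncoding
                                                     error:&error];
        // Hand the result back on the main queue, where UI updates are safe.
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(html, error);
        });
    });
}
```

A caller would pass url1 from the snippet above and update the UI inside the completion block; html is nil and error is set when the fetch fails.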
   
2. Second method: fetch the content of a given page with NSURLConnection

    NSString *urlString = @"http://baidu.com";
    // The escaped string is owned by the caller, so it is autoreleased (manual reference counting).
    NSString *encodedString = [(NSString *)CFURLCreateStringByAddingPercentEscapes(kCFAllocatorDefault, (CFStringRef)urlString, NULL, NULL, kCFStringEncodingUTF8) autorelease];
    NSURL *url = [NSURL URLWithString:encodedString];
    NSMutableURLRequest *req = [[NSMutableURLRequest alloc]
                                initWithURL:url
                                cachePolicy:NSURLRequestReloadIgnoringLocalCacheData
                                timeoutInterval:30.0];

    receivedData = [[NSMutableData alloc] init]; // buffer for the received data
    [req setHTTPMethod:@"POST"]; // use @"GET" if the server expects a plain page request
    NSURLConnection *connection = [[NSURLConnection alloc] initWithRequest:req delegate:self startImmediately:YES];
    [req release];
    [connection release];
   
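Receive the data in the delegate methods. This method is asynchronous and does not block the current thread, so the app can do other work while the data arrives.
 - (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
 { // Simple error handling
     UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Error" message:@"Network connection error" delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
     [alert show];
     [alert release];
 }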
   
 - (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response
 {
     [receivedData setLength:0]; // reset the buffer when a new response starts
 }

 // Append each chunk of NSData as it arrives
 - (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
 {
     [receivedData appendData:data];
 }
   
 // Loading finished: show the result
 - (void)connectionDidFinishLoading:(NSURLConnection *)connection
 {
     [connection cancel];
     // Mind the encoding: choose the one the page actually uses, or the text will be garbled
     NSString *results = [[NSString alloc]
                          initWithBytes:[receivedData bytes]
                          length:[receivedData length]
                          encoding:NSUTF8StringEncoding];
     [self loadingNetDescription:results];
     [results release];
 //    UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Data" message:@"Download finished" delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
 //    [alert show];
 //    [alert release];
 //    NSLog(@"%@", results);
 }
   
There is also a more roundabout way to get the content of a page: load the URL in a web view,
    [mWebView loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:encodedString]]];
and then, in its delegate method webViewDidFinishLoad:, read the page content through JavaScript. For example:
    NSString *allHTML = [wView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];
For details, see the documentation for stringByEvaluatingJavaScriptFromString:.
   
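Once the HTML has been obtained with one of the methods above, ordinary string functions can extract the parts you want (an XML parsing library would also work; only simple string handling is covered here).
A simple example that extracts the body section, where htmlString is the HTML string obtained above:
    NSRange range1 = [htmlString rangeOfString:@"<body>"]; // location is the index at which <body> starts in htmlString; keep this in mind
    NSRange range2 = [htmlString rangeOfString:@"</body>"];
    // A simple check only; extend it before real use
    if (range1.location != NSNotFound && range2.location != NSNotFound && range2.location > range1.location)
    {
        // The range runs from the start of <body> to the end of </body>, so both tags are included
        NSString *bodyString = [htmlString substringWithRange:NSMakeRange(range1.location, range2.location - range1.location + range2.length)];
        NSLog(@"%@", bodyString);
    }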
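For comparison, here is a variant that returns only the text strictly between the two tags, wrapped in a function. It is a minimal sketch under stated assumptions: the name BodyContentOfHTML is hypothetical, and it matches only a literal <body> tag, so a tag with attributes (such as <body class="x">) is not found.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original article): returns the text
// between <body> and </body>, or nil when a tag is missing or out of order.
static NSString *BodyContentOfHTML(NSString *html)
{
    if (html == nil) {
        return nil;
    }
    NSRange open = [html rangeOfString:@"<body>"];
    NSRange close = [html rangeOfString:@"</body>"];
    if (open.location == NSNotFound || close.location == NSNotFound) {
        return nil;
    }
    // NSMaxRange gives the index just past <body>, so the tag itself is skipped.
    NSUInteger start = NSMaxRange(open);
    if (close.location < start) {
        return nil;
    }
    return [html substringWithRange:NSMakeRange(start, close.location - start)];
}
```

For example, BodyContentOfHTML(@"<html><body>Hello</body></html>") returns @"Hello", while a string without a <body> tag returns nil.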

Some commonly used string functions:
1. Replacing substrings
    // Replace each CRLF in the data with the HTML line break <br/>
    NSString *localString = [descriptionOne stringByReplacingOccurrencesOfString:@"\r\n" withString:@"<br/>"];

    // Take everything from the start of the string up to, but not including, the given index
    // NSString *first = [string1 substringToIndex:range.location-1];
    // Take everything from the given index to the end (including the character at that index)
    // NSString *last = [string1 substringFromIndex:range.location+1];
    // Take a substring of any length at any position
    // NSString *word = [string1 substringWithRange:NSMakeRange(range.location-1, 1)];
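To make the index rules of the substring methods concrete, the sketch below splits a string at the first occurrence of a separator. The function name SplitAtFirstSeparator and the sample input are assumptions chosen for illustration; the separator is expected to be non-nil.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original article): splits `string` at the
// first occurrence of `separator` and returns the two parts as
// [before, after], or nil when the separator does not occur.
static NSArray *SplitAtFirstSeparator(NSString *string, NSString *separator)
{
    NSRange range = [string rangeOfString:separator];
    if (range.location == NSNotFound) {
        return nil;
    }
    // substringToIndex: excludes the character at the given index.
    NSString *before = [string substringToIndex:range.location];
    // substringFromIndex: includes the character at the given index, so start
    // just past the separator.
    NSString *after = [string substringFromIndex:NSMaxRange(range)];
    return [NSArray arrayWithObjects:before, after, nil];
}
```

With the input @"key=value" and the separator @"=", the range has location 3 and length 1, so the result holds @"key" and @"value".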
