C# Web Data Collection (Part 3): HttpWebRequest

Author: 正在输入代码中

Article link: https://blog.csdn.net/qq_26744901/article/details/50033211

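The snippet below fetches a page's HTML source and decodes it with the page's own encoding. Note that HttpWebRequest (like HtmlWeb.Load) returns the markup exactly as the server sends it and does not execute JavaScript, so content that scripts inject after the page loads will not appear in the result.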

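// Requires: using HtmlAgilityPack; using System.Text;
HtmlWeb webClient = new HtmlWeb();
// URL of the page to parse
string _url = "http://news.baidu.com/";
HtmlAgilityPack.HtmlDocument html1 = webClient.Load(_url);
// Get the name of the page's encoding (for example "utf-8" or "gb2312")
var end3 = html1.Encoding.BodyName;
// Fetch the source again with that encoding set explicitly, to avoid garbled text
string _htmlSource = GetHtmlSource(_url, System.Text.Encoding.GetEncoding(end3));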
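
Encoding.GetEncoding throws an ArgumentException when it is given a name it does not recognize, so passing the detected name straight through can fail on pages that declare an unusual or misspelled charset. A minimal sketch of a guard follows. The EncodingHelper class and ResolveEncoding method are hypothetical names introduced here, not part of the original post, and falling back to UTF-8 is an assumption that may not suit every site.

```csharp
using System;
using System.Text;

public static class EncodingHelper
{
    // Hypothetical helper (not in the original post): map a charset name to an
    // Encoding, falling back to UTF-8 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charsetName)
    {
        if (string.IsNullOrWhiteSpace(charsetName))
        {
            return Encoding.UTF8;
        }
        try
        {
            return Encoding.GetEncoding(charsetName.Trim());
        }
        catch (ArgumentException)
        {
            // Thrown for names the runtime does not recognize or support
            return Encoding.UTF8;
        }
    }
}
```

With this in place, the last line above could read GetHtmlSource(_url, EncodingHelper.ResolveEncoding(end3)). On .NET Core and .NET 5 or later, legacy code pages such as gb2312 are likely to need Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) to be called once at startup before they resolve.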

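// Requires: using System; using System.IO; using System.Net; using System.Text;
public static string GetHtmlSource(string url, Encoding charset)
{
    string _html = string.Empty;
    try
    {
        HttpWebRequest _request = (HttpWebRequest)WebRequest.Create(url);
        // Dispose the response so the underlying connection is released
        using (HttpWebResponse _response = (HttpWebResponse)_request.GetResponse())
        using (Stream _stream = _response.GetResponseStream())
        using (StreamReader _reader = new StreamReader(_stream, charset))
        {
            _html = _reader.ReadToEnd();
        }
    }
    catch (WebException ex)
    {
        // ex.Response is null when no response arrived (for example a DNS failure or timeout)
        if (ex.Response == null)
        {
            _html = ex.Message;
        }
        else
        {
            // On an HTTP error (for example 404), return the body of the error page
            using (StreamReader sr = new StreamReader(ex.Response.GetResponseStream(), charset))
            {
                _html = sr.ReadToEnd();
            }
        }
    }
    catch (Exception ex)
    {
        // Any other failure: return the exception message
        _html = ex.Message;
    }
    return _html;
}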
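
The approach above downloads the page twice: once through HtmlWeb.Load to learn the encoding, and once through GetHtmlSource to read the text. One possible alternative, sketched below, is to download the raw bytes a single time, look for a charset declaration near the top of the markup, and decode with it. The HtmlDecoder class and its methods are hypothetical names introduced for illustration. The 4096-byte scan window and the UTF-8 fallback are assumptions, and the sketch ignores the HTTP Content-Type header and byte order marks.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class HtmlDecoder
{
    // Matches charset=... in <meta charset="utf-8"> or in
    // <meta http-equiv="Content-Type" content="text/html; charset=gb2312">.
    private static readonly Regex CharsetPattern =
        new Regex(@"charset\s*=\s*[""']?\s*([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase);

    // Returns the declared charset name, or an empty string if none is found.
    public static string SniffCharset(byte[] rawHtml)
    {
        // Only the head of the document is scanned (assumed window: 4096 bytes).
        int length = Math.Min(rawHtml.Length, 4096);
        // Tag names and charset labels are plain ASCII, so an ASCII view is
        // enough to find the declaration whatever the real encoding is.
        string head = Encoding.ASCII.GetString(rawHtml, 0, length);
        Match match = CharsetPattern.Match(head);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    // Decodes the bytes with the declared charset, or UTF-8 as a fallback.
    public static string Decode(byte[] rawHtml)
    {
        Encoding encoding = Encoding.UTF8;
        string name = SniffCharset(rawHtml);
        if (name.Length > 0)
        {
            try
            {
                encoding = Encoding.GetEncoding(name);
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the UTF-8 fallback
            }
        }
        return encoding.GetString(rawHtml);
    }
}
```

The byte array could come from, for example, new WebClient().DownloadData(_url), after which HtmlDecoder.Decode(bytes) yields the page text in a single request.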
