Downloading Web Page Content with WebClient.DownloadData

Latest recommended article published on 2022-04-28 21:13:14

wwwen


Tags: C#, WebClient

Article link: https://blog.csdn.net/wwwen/article/details/75287361

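When WebClient.DownloadData downloads a page correctly, the content should look like this:
[img=http://img.bbs.csdn.net/upload/201707/18/1500349160_835235.png][/img]
When something goes wrong, the result sometimes looks like this instead:
[img=http://img.bbs.csdn.net/upload/201707/18/1500348453_479511.jpg][/img]
Code used to fetch the page content: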
// Requires: using System.Net; using System.Text; using System.Text.RegularExpressions;
string strWebData = string.Empty;

WebClient myWebClient = new WebClient();
myWebClient.Credentials = CredentialCache.DefaultCredentials;
myWebClient.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

// Download the raw bytes, then decode them with the system default encoding as a first pass.
byte[] myDataBuffer = myWebClient.DownloadData(url);
strWebData = Encoding.Default.GetString(myDataBuffer);

// Determine the page's character encoding.
if (!string.IsNullOrEmpty(charSet))
{
    // The caller specified a charset: decode with it directly.
    strWebData = Encoding.GetEncoding(charSet).GetString(myDataBuffer);
}
else
{
    // Otherwise look for the charset declared in the page's "text/html; charset=..." meta tag.
    Match charSetMatch = Regex.Match(strWebData, "(?<=\"(T|t)(E|e)(X|x)(T|t)/(H|h)(T|t)(M|m)(L|l);[\\s]*?charset=)[\\s\\S]+?(?=\")");
    if (charSetMatch.Success)
    {
        strWebData = Encoding.GetEncoding(charSetMatch.Value).GetString(myDataBuffer);
    }
}
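This code sometimes downloads the page correctly and sometimes returns the kind of content shown in the second image. I could not work out what format that content was in: I tried several Encoding conversions and none of them could decode it. While debugging, I noticed that the number of bytes in byte[] myDataBuffer varied between runs, and the garbled content appeared whenever the byte count was abnormal. It looked like an incomplete download or a timeout. Searching along those lines, I found the actual cause: the downloaded content may be in a compressed (gzip) format.
Following this idea, decompressing the data with the code below yields the correct content: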
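One way to tell whether the downloaded bytes are gzip-compressed is to look for the gzip magic number, since a gzip stream begins with the bytes 0x1F 0x8B. The following is a minimal sketch of that check. The class and method names (GzipCheck, LooksLikeGzip) are hypothetical and do not come from the original code.

```csharp
public static class GzipCheck
{
    // Hypothetical helper: returns true when the buffer starts with
    // the gzip magic number (0x1F 0x8B).
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null
            && data.Length >= 2
            && data[0] == 0x1F
            && data[1] == 0x8B;
    }
}
```

The server's Content-Encoding response header, available through myWebClient.ResponseHeaders["Content-Encoding"] after the download, is another signal that could be checked alongside the magic number.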

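// Requires: using System.IO; using System.Text;
// Decompresses a gzip-compressed byte array and decodes the result as UTF-8 text.
public static string gzFile(byte[] cbytes)
{
    using (MemoryStream dms = new MemoryStream())
    {
        using (MemoryStream cms = new MemoryStream(cbytes))
        {
            using (System.IO.Compression.GZipStream gzip = new System.IO.Compression.GZipStream(cms, System.IO.Compression.CompressionMode.Decompress))
            {
                byte[] bytes = new byte[1024];
                int len = 0;
                // Reading from the gzip stream decompresses the data as it is read.
                while ((len = gzip.Read(bytes, 0, bytes.Length)) > 0)
                {
                    dms.Write(bytes, 0, len);
                }
            }
        }
        // Note: the decompressed bytes are always decoded as UTF-8 here.
        return (Encoding.UTF8.GetString(dms.ToArray()));
    }
}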

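When the downloaded content turns out to be compressed, pass myDataBuffer to this method to decompress it. The output is then correct, and the problem is finally solved.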
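An alternative that avoids manual decompression, offered here as a sketch and not as the method used above, is to let the framework decompress the response. WebClient exposes a protected GetWebRequest method that a subclass can override in order to set HttpWebRequest.AutomaticDecompression. The class name GzipWebClient is hypothetical.

```csharp
using System;
using System.Net;

// Hypothetical WebClient subclass that asks the framework to
// decompress gzip and deflate responses automatically.
public class GzipWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests support automatic decompression.
        HttpWebRequest httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }

        return request;
    }
}
```

Under this assumption, replacing new WebClient() with new GzipWebClient() in the download code would make DownloadData return already-decompressed bytes, so the charset detection could run on them directly.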
