When reading a page's HTML source in C#, the usual approach is:
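// Requires: using System.IO; using System.Net;
WebClient client = new WebClient();
Stream strm = client.OpenRead(textBoxWebSite.Text); // textBoxWebSite holds the URL to fetch
StreamReader rd = new StreamReader(strm);
string html = rd.ReadToEnd();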
However, some websites have anti-scraping measures, and for those sites the request above fails with a 404 error.
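To see which status code the server actually sent, the failure can be caught as a WebException and its response inspected. The following is only a minimal sketch: the class name FetchDiagnostics and the helpers GetStatus and TryRead are hypothetical names chosen for illustration, not part of the original code.

```csharp
using System;
using System.IO;
using System.Net;

static class FetchDiagnostics
{
    // Hypothetical helper: returns the HTTP status code carried by a failed
    // request, or null when the failure had no HTTP response at all
    // (for example a DNS error or a timeout).
    public static HttpStatusCode? GetStatus(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        return response == null ? (HttpStatusCode?)null : response.StatusCode;
    }

    // Hypothetical helper: same WebClient read as above, but reports the
    // status code instead of letting the exception escape.
    public static string TryRead(string url)
    {
        try
        {
            using (var client = new WebClient())
            using (var stream = client.OpenRead(url))
            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: {0} (HTTP status: {1})",
                ex.Message, GetStatus(ex));
            return null;
        }
    }
}
```

Calling FetchDiagnostics.TryRead(textBoxWebSite.Text) on a site that blocks the request should print the status code, which helps distinguish a blocked client from a URL that really does not exist.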
After trying many approaches, the fix turned out to be simple: set a UserAgent on the request.
// Requires: using System.IO; using System.Net;
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(textBoxWebSite.Text);
// Send a User-Agent header; some servers reject requests that do not include one.
myHttpWebRequest.UserAgent = ".NET Framework Test Client";
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
Stream streamResponse = myHttpWebResponse.GetResponseStream();
StreamReader streamRead = new StreamReader(streamResponse);
// Read the whole response body and strip leading and trailing whitespace.
string sReturn = streamRead.ReadToEnd().Trim();
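The snippet above can be wrapped into a reusable helper that also disposes the response, stream, and reader. This is a sketch under a few assumptions: the class name HtmlFetcher and its method names are hypothetical, the User-Agent string is just the placeholder value used above, and StreamReader is left on its default UTF-8 decoding, which may not match every page. The last method shows that WebClient can send the same header through its Headers collection, so the original WebClient code could likely be kept with one extra line.

```csharp
using System.IO;
using System.Net;

static class HtmlFetcher
{
    // Placeholder value taken from the snippet above; any descriptive string works.
    const string DefaultUserAgent = ".NET Framework Test Client";

    // Builds the request without sending it; no network traffic happens here.
    public static HttpWebRequest CreateRequest(string url, string userAgent = DefaultUserAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        return request;
    }

    // Sends the request and returns the trimmed response body.
    public static string FetchHtml(string url)
    {
        var request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd().Trim();
        }
    }

    // Alternative: keep WebClient and set the header on it directly.
    public static string FetchHtmlWithWebClient(string url)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.UserAgent] = DefaultUserAgent;
            return client.DownloadString(url).Trim();
        }
    }
}
```

Usage would then be a single call such as string sReturn = HtmlFetcher.FetchHtml(textBoxWebSite.Text);. Some sites check for a browser-like User-Agent, so the string may need to be adjusted for a given server.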