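import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class WebSpider {
    public static void main(String[] args) throws Exception {
        String urlString = "http://lggege.iteye.com/blog/173840";
        URL url = new URL(urlString);
        // Sends the request; for an HTML page the content is normally an InputStream.
        Object contentObj = url.getContent();
        if (contentObj instanceof InputStream) {
            // Decode the bytes while reading. UTF-8 is assumed here; the page's
            // actual encoding still needs to be detected.
            StringBuilder sb = new StringBuilder();
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader((InputStream) contentObj, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null at end of stream; ready() can return false early.
                while ((line = br.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            }
            System.out.println(sb);
        }
    }
}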
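The code is shown above.
At this step:
Object contentObj = url.getContent();
the request is actually sent to the server behind the URL, and the data that comes back is the page's source code.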
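
The encoding question can be approached from the same request. url.getContent() is shorthand for url.openConnection().getContent(), so going through the URLConnection directly also gives access to the Content-Type response header, which often names the charset. The following is only a rough sketch under some assumptions: the class CharsetAwareFetch and the methods charsetOf and fetch are hypothetical names made up for this illustration, falling back to UTF-8 is an arbitrary choice, and a charset declared only in an HTML meta tag is not looked at.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetAwareFetch {

    // Hypothetical helper: pull the charset out of a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an assumption) when
    // the header is missing, has no charset parameter, or names an unknown charset.
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // malformed or unsupported name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Hypothetical helper: fetch the page and decode it with the charset
    // announced by the server.
    public static String fetch(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        // Reading a header field makes the connection send the request.
        Charset charset = charsetOf(conn.getContentType());
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java CharsetAwareFetch <url>");
            return;
        }
        System.out.println(fetch(new URL(args[0])));
    }
}
```

Running it as java CharsetAwareFetch http://lggege.iteye.com/blog/173840 would print the page source decoded with whatever charset the server reports. Decoding while reading avoids the round trip through getBytes(), which re-encodes with the platform default charset and can corrupt the text.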