Java: parsing the content of an HTML page with jsoup
http://blog.csdn.net/zzq900503/article/details/10071307
Java: using HttpClient to crawl related page links from Baidu by title
http://blog.csdn.net/zzq900503/article/details/10006751
Handling broken links
http://segmentfault.com/blog/rainystars/1190000000415113
API documentation for this class (org.jsoup.safety.Whitelist)
http://jsoup.org/apidocs/org/jsoup/safety/Whitelist.html