Ajax Data Collection: A Survey of Research on Data Collection from Ajax Sites*

[1] Garrett J. Ajax: A New Approach to Web Applications[EB/OL]. (2005-02-18). [2010-01-15]. http://www.adaptivepath.com/ideas/essays/archives/000385.php.

[2] Mesbah A, Van Deursen A. An Architectural Style for Ajax[C]. In: Proceedings of the 6th Working IEEE/IFIP Conference on Software Architecture, Mumbai, India. Washington, DC, USA: IEEE Computer Society, 2007: 44-53.

[3] Bozdag E, Mesbah A, Van Deursen A. A Comparison of Push and Pull Techniques for Ajax[C]. In: Proceedings of the 9th IEEE International Symposium on Web Site Evolution, Paris, France. 2007: 15-22.

[4] Mesbah A, Van Deursen A. Exposing the Hidden-Web Induced by Ajax[R/OL]. [2009-08-01]. http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2008-001.pdf.

[5] Frey G. Indexing Ajax Web Applications[D]. Zurich: Swiss Federal Institute of Technology Zurich, 2007.

[6] Matter R. Ajax Crawl: Making Ajax Applications Searchable[D]. Zurich: Swiss Federal Institute of Technology Zurich, 2008.

[7] Mesbah A, Bozdag E, Van Deursen A. Crawling Ajax by Inferring User Interface State Changes[C]. In: Proceedings of the 8th International Conference on Web Engineering, Yorktown Heights, NJ. Washington, DC, USA: IEEE Computer Society, 2008: 122-134.

[8] Guo H, Lu Y, Liu J. An Ajax Crawling Algorithm Based on State Transition Graphs[J]. Application Research of Computers, 2009, 26(11): 4266-4269. (in Chinese)

[9] Duda C, Frey G, Kossmann D, et al. AjaxSearch: Crawling, Indexing and Searching Web 2.0 Applications[J]. Proceedings of the VLDB Endowment Archive, 2008, 1(2): 1440-1443.

[10] Xia B, Gao J, Wang T, et al. An Efficient Method for Obtaining Valid Pages from Dynamic Script Web Sites[J]. Journal of Software, 2009, 20(z): 176-183. (in Chinese)

[11] Xia T. Extracting Structured Data from Ajax Site[C]. In: Proceedings of 2009 International IEEE Workshop on Database Technology and Applications, Wuhan, China. 2009: 259-262.

[12] Shah S. Crawling Ajax-driven Web 2.0 Applications[R/OL]. (2007-02-14). [2010-01-15]. http://www.infosecwriters.com/text_resources/pdf/Crawling_AJAX_SShah.pdf.

[13] Luo B. Design and Implementation of an Internet Search Engine Crawler Supporting Ajax[D]. Hangzhou: Zhejiang University, 2007. (in Chinese)

[14] Xiao Z. Research on Search Engines Based on Ajax Technology[D]. Wuhan: Wuhan University of Technology, 2009. (in Chinese)

[15] Zeng W, Li M. Research on Web Crawler Technology for Ajax Frameworks Based on JavaScript Slicing[J]. Computer Systems & Applications, 2009, 18(7): 169-171. (in Chinese)

[16] Mozilla. Rhino: JavaScript for Java[EB/OL]. [2009-03-22]. http://www.mozilla.org/rhino/.

[17] Cobra: Java HTML Renderer & Parser[EB/OL]. [2009-01-19]. http://lobobrowser.org/cobra.jsp.

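[18] Yuan X. Research and Implementation of an Integrated Focused Crawler Based on Protocol-driven and Event-driven Approaches[D]. Changsha: National University of Defense Technology, 2009. (in Chinese)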

[19] Reis D C, Golgher P B, Silva A S, et al. Automatic Web News Extraction Using Tree Edit Distance[C]. In: Proceedings of the 13th International Conference on World Wide Web, New York. New York, NY, USA: ACM Press, 2004: 502-511.

[20] Xia T. Extracting Multi-Records from Web Pages[C]. In: Proceedings of the 4th International Conference on Semantics, Knowledge and Grid, Beijing, China. 2008: 396-399.

[21] Marzal A, Vidal E. Computation of Normalized Edit Distance and Applications[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993, 15(9): 926-932.

[22] Buttler D. A Short Survey of Document Structure Similarity Algorithms[C]. In: Proceedings of the 5th International Conference on Internet Computing, Las Vegas, US. 2004: 3-9.

[23] Webrenderer[EB/OL]. [2010-02-16]. http://www.webrenderer.com/.

[24] Webclient[EB/OL]. (2007-09-23). [2010-02-16]. http://www.mozilla.org/projects/blackwood/webclient/.

[25] JRex: The Java Browser Component[EB/OL]. [2009-06-21]. http://jrex.mozdev.org/.

[26] JExplorer[EB/OL]. [2010-01-29]. http://www.teamdev.com/jexplorer/.

[27] Watij[EB/OL]. [2009-11-16]. http://watij.com/.

[28] Watir[EB/OL]. [2009-11-16]. http://watir.com/.

[29] HtmlUnit[EB/OL]. [2010-02-09]. http://htmlunit.sourceforge.net/.

[30] XHTML Renderer Project[EB/OL]. [2009-07-01]. https://xhtmlrenderer.dev.java.net/.

[31] CSS Parser[EB/OL]. [2009-11-16]. http://cssparser.sourceforge.net/.

[32] Crowbar[EB/OL]. [2010-01-16]. http://simile.mit.edu/wiki/Crowbar.

[33] FireWatir[EB/OL]. [2010-01-14]. http://code.google.com/p/firewatir/.
