http://computer.howstuffworks.com/internet/basics/search-engine1.htm How Internet Search Engines Work
http://www.wordtracker.com/academy/learn-seo/technical-guides/google-spider-crawling The Google spider & you: What you need to know to get your site indexed
http://www.lunametrics.com/blog/2014/08/07/bot-spider-filtering-google-analytics/ Understanding Bot and Spider Filtering from Google Analytics
http://searchenginewatch.com/sew/news/2067357/bye-bye-crawler-blocking-parasites Bye-bye, Crawler: Blocking the Parasites
http://www.google.com/insidesearch/howsearchworks/crawling-indexing.html Google tutorial documentation
http://www.htmlbasictutor.ca/web-crawler-search-engine.htm Web Crawler - Search Engine Robots - Search Engine Spiders
http://www.htmlbasictutor.ca/search-engine-read-web-pages.htm How Search Engines Read Web Pages
http://www.htmlbasictutor.ca/search-engine-submission.htm Search Engine Submissions
http://www.htmlbasictutor.ca/search-engine-web-content.htm Web Page Content Search Engines See
http://www.wisegeek.org/what-is-a-web-crawler.htm What is a Web Crawler?
http://ruby.bastardsbook.com/chapters/web-crawling/ Writing a Web Crawler
http://www.gotomanage.com/help/about/about_crawler About the GoToAssist® Open Source Crawler
Crawlers such as honeyspider and lanspider can be found online
http://monstercrawler.com/ Crawler website
http://socscibot.wlv.ac.uk/ Free SocSciBot
https://github.com/yasserg/crawler4j Open Source Web Crawler for Java