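The Robots protocol (also called the crawler protocol or robot protocol) is formally known as the Robots Exclusion Protocol. A website uses it to tell search engines which pages may be crawled and which may not.
Below is a robots.txt file tailored for phpweb sites: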
# robots.txt generated at http://www.100cm.cn
User-agent: *
Disallow: /advs/admin
Disallow: /base/admin
Disallow: /comment/admin
Disallow: /dingcan/admin
Disallow: /down/admin
Disallow: /feedback/admin
Disallow: /job/admin
Disallow: /member/admin
Disallow: /menu/admin
Disallow: /news/admin
Disallow: /page/admin
Disallow: /photo/admin
Disallow: /product/admin
Disallow: /shop/admin
Disallow: /webmall/admin
Disallow: /kedit/
Disallow: /service/admin
Disallow: /tools/admin
Disallow: /""
Disallow: /-1
Allow: /comment/
Allow: /webmall/
Allow: /news/
Allow: /down/
Allow: /service/
Allow: /member/
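# Assumption: www.example.com is a placeholder; the Sitemap line needs your site's own absolute URL.
Sitemap: http://www.example.com/sitemap.xml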
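
To check how a crawler might read rules like these, here is a minimal sketch using Python's standard `urllib.robotparser` module. It loads only a small subset of the rules from the file above. The host `www.example.com`, the helper name `is_allowed`, and the sample paths are assumptions made for illustration and are not part of phpweb.

```python
from urllib.robotparser import RobotFileParser

# A small subset of the rules from the robots.txt above.
RULES = """\
User-agent: *
Disallow: /news/admin
Disallow: /kedit/
Allow: /news/
"""

# Hypothetical host used only to build full URLs for the check.
BASE_URL = "http://www.example.com"


def is_allowed(path, user_agent="*"):
    """Return True if the rules permit user_agent to fetch the given path."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, BASE_URL + path)


if __name__ == "__main__":
    for path in ("/news/", "/news/admin/index.php", "/kedit/upload.php"):
        print(path, is_allowed(path))
```

Python's parser applies the first rule whose path is a prefix of the requested URL, so the Disallow lines for the admin directories take effect because they appear before the broader Allow lines. Crawlers that follow the longest-match rule reach the same result for this file, since each admin path is longer than the matching Allow path.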