robots.txt
The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform a web robot about which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize websites.