I need to grab some data from other websites in my Django website.
Now I am confused about whether I should use Python parsing libraries or web crawling libraries. Do search engine libraries also fall into the same category?
I want to know how big the difference between the two is, and which one I should use if I want to call those functions from inside my website.
Solution
If you can get away with crawling in the background, use scrapy. If you need to grab something immediately, parse it with html5lib (more robust) or lxml (faster). If you are going to be doing the latter, fetch the page with the awesome requests library. I would avoid using BeautifulSoup, mechanize, urllib2, and httplib.
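
To make the "grab something immediately" case concrete, here is a minimal sketch that fetches a page with requests and parses it with lxml. It assumes both packages are installed; the function names `extract_links` and `fetch_links`, the 5-second timeout, and the choice to extract links are my own picks for illustration, not anything required by either library.

```python
import requests
from lxml import html


def extract_links(page_source, base_url):
    """Return (text, absolute_url) pairs for every <a href> in the page."""
    tree = html.fromstring(page_source)
    # Resolve relative hrefs such as "/about" against the page's own URL.
    tree.make_links_absolute(base_url)
    return [
        (a.text_content().strip(), a.get("href"))
        for a in tree.xpath("//a[@href]")
    ]


def fetch_links(url, timeout=5):
    """Download url and extract its links (hypothetical helper).

    Raises a requests exception on HTTP errors or timeouts.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # response.url is the final URL after any redirects.
    return extract_links(response.content, response.url)
```

If you call something like `fetch_links` from a Django view, the view blocks until the download finishes, so keep the timeout short. If that wait is not acceptable, that is the point where background crawling with scrapy makes more sense.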