CSS vs jQuery selectors
- css selectors vs jquery traversal
- Performance of jquery selectors vs css3 selectors
jQuery’s selector engine shares most of the same syntax as CSS, effectively extending the selector standard. This means you can pass most valid CSS selectors (with some exceptions) to jQuery and it’ll handle them just fine.
Python
- jquery-like HTML parsing in Python?
- bs4 CSS selectors
- Scrapy selectors
- Parsing HTML in python - lxml or BeautifulSoup? Which of these is better for what kinds of purposes?
- What’s the best way of scraping data from a website?
- Scrape the web using CSS Selectors in Python
- Easy Web Scraping with Python
- High-performance XML parsing in Python with lxml
- Python XML processing with lxml
Performance
- Python HTML Parser Performance
- BeautifulSoup and lxml.html - what to prefer?
- Parsing HTML in python - lxml or BeautifulSoup? Which of these is better for what kinds of purposes?
A comparison using this gist:
==== Total trials: 1000 =====
bs4 total time: 0.9
pq total time: 0.2
lxml (cssselect) total time: 0.1
lxml (xpath) total time: 0.1
regex total time: 0.1 (does not find all <p> elements)
Beautiful Soup may be slower than pyquery, but pyquery is the less mature of the two. [1]
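The gist itself is not reproduced here, so the following is only a minimal sketch of how a comparison of this shape could be timed. It uses standard-library stand-ins (`html.parser` and a deliberately naive regex) rather than bs4, pyquery, or lxml, and every name in it (`HTML`, `PCounter`, `count_with_parser`, `count_with_regex`, `time_trials`) is hypothetical. It also shows one plausible way a regex can miss `<p>` elements; whether the regex in the gist fails for the same reason is an assumption.

```python
import re
import timeit
from html.parser import HTMLParser

# Hypothetical sample document: 30 <p> elements, some with attributes
# or upper-case tag names.
HTML = "<div><p>one</p><p class='x'>two</p><P>three</P></div>" * 10


class PCounter(HTMLParser):
    """Counts <p> start tags; HTMLParser lower-cases tag names for us."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.count += 1


def count_with_parser(markup):
    parser = PCounter()
    parser.feed(markup)
    parser.close()
    return parser.count


# Deliberately naive: matches only the literal string "<p>".
NAIVE_P = re.compile(r"<p>")


def count_with_regex(markup):
    return len(NAIVE_P.findall(markup))


def time_trials(func, trials=1000):
    """Total wall-clock seconds for `trials` calls of func(HTML)."""
    return timeit.timeit(lambda: func(HTML), number=trials)


if __name__ == "__main__":
    print("==== Total trials: 1000 =====")
    for name, func in [("html.parser", count_with_parser),
                       ("regex", count_with_regex)]:
        print(f"{name} total time: {time_trials(func):.1f} "
              f"(found {func(HTML)} p)")
```

The naive pattern skips `<p class='x'>` and `<P>`, so it reports fewer matches than the parser does. A fast total time therefore says little unless the results are also checked for completeness.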
HTML parsing
HTML parsing covers building a parse tree, then navigating, searching, and modifying it. [2]
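The four operations can be sketched with the standard library alone. This is an illustrative sketch, not how Beautiful Soup or lxml work internally: it assumes well-formed markup (every non-void element is closed), and `TreeParser`, `parse_html`, and `VOID_TAGS` are hypothetical names introduced for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML (partial list).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TreeParser(HTMLParser):
    """Forwards parser events to an ElementTree TreeBuilder."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as "".
        self.builder.start(tag, {name: value or "" for name, value in attrs})
        if tag in VOID_TAGS:
            self.builder.end(tag)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.builder.end(tag)

    def handle_data(self, data):
        self.builder.data(data)


def parse_html(markup):
    """Build a parse tree and return its root element."""
    parser = TreeParser()
    parser.feed(markup)
    parser.close()
    return parser.builder.close()


# Build the tree.
root = parse_html(
    '<html><body><div id="main">'
    '<p class="intro">Hello</p><p>World</p>'
    "</div></body></html>"
)

# Navigate: step from the root down a known path.
main = root.find("body/div")

# Search: collect the text of every <p> in document order.
texts = [p.text for p in root.iter("p")]

# Modify: change one node's text and remove another node.
root.find(".//p[@class='intro']").text = "Hi"
main.remove(main.findall("p")[1])

result = ET.tostring(root, encoding="unicode")
print(texts)
print(result)
```

Real-world HTML is rarely this tidy. Unclosed tags would break the sketch, which is one reason to reach for lxml or Beautiful Soup instead.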