Reference material for crawler request libraries
Based on Python 3.x
urllib: https://blog.csdn.net/jiuweideqixu/article/details/80511112
urllib documentation: https://docs.python.org/3/library/urllib.html
requests: https://blog.csdn.net/gyq1998/article/details/78583841
requests documentation: https://2.python-requests.org/en/master/
re:
- Regular expressions: https://blog.csdn.net/weixin_40907382/article/details/79654372
- Commonly used methods of the re library: https://www.cnblogs.com/zjltt/p/6955965.html
- re library documentation: https://docs.python.org/3/library/re.html
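As a quick orientation to what the urllib and re references above cover, here is a minimal sketch that uses only the standard library. The base URL, query parameters, User-Agent string, and HTML snippet are made up for illustration, and no network request is actually sent.

```python
import re
from urllib.parse import urlencode, urljoin
from urllib.request import Request

# Hypothetical base URL and query parameters, chosen only for illustration.
BASE_URL = "https://example.com/search"
params = {"q": "python", "page": 1}

# Build a GET request with a custom User-Agent header. Nothing is sent
# until urllib.request.urlopen(request) is called.
request = Request(
    BASE_URL + "?" + urlencode(params),
    headers={"User-Agent": "Mozilla/5.0 (reference-sketch)"},
)

# Made-up HTML standing in for a downloaded page body.
html = '<a href="/a.html">A</a> <a href="https://example.com/b.html">B</a>'

# Extract href values with a regular expression, then resolve relative
# links against the page URL.
links = [urljoin(BASE_URL, href) for href in re.findall(r'href="([^"]+)"', html)]
```

A regular expression like this is fine for quick extraction from simple markup; for anything nested or irregular, a real parser is usually the safer choice.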
Reference material for parsing web pages
Parsing web pages with XPath: https://blog.csdn.net/a417197457/article/details/81143112
- Reference documentation: http://www.w3school.com.cn/xpath/xpath_axes.asp
Parsing web pages with pyquery: https://blog.csdn.net/meiqi0538/article/details/81047453
- Reference documentation: https://pythonhosted.org/pyquery/api.html
Parsing web pages with BeautifulSoup: https://blog.csdn.net/z714405489/article/details/83245087
- Chinese documentation (Beautiful Soup 3): https://www.crummy.com/software/BeautifulSoup/bs3/documentation.zh.html
- English documentation (Beautiful Soup 4): https://www.crummy.com/software/BeautifulSoup/bs4/doc/index.html
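To give a feel for XPath-style selection without installing anything, the sketch below uses the standard library's xml.etree.ElementTree as a stand-in for the third-party parsers listed above. This is an assumption made for illustration: ElementTree supports only a limited subset of XPath and requires well-formed markup, so real-world HTML generally needs pyquery, BeautifulSoup, or a similar library. The markup itself is made up.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup standing in for a downloaded page.
html = """
<html>
  <body>
    <ul id="posts">
      <li class="post"><a href="/p/1">First post</a></li>
      <li class="post"><a href="/p/2">Second post</a></li>
      <li class="ad"><a href="/ad">Sponsored</a></li>
    </ul>
  </body>
</html>
"""

root = ET.fromstring(html.strip())

# ElementTree understands a limited XPath subset: descendant search (.//),
# attribute predicates ([@class='post']) and child steps (/a).
anchors = root.findall(".//li[@class='post']/a")

# Collect (link text, href) pairs for the matched anchors.
posts = [(a.text, a.get("href")) for a in anchors]
```

The same path expression style carries over to full XPath engines, which add axes and functions that ElementTree does not implement.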
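Note: These references are collected in one place for easier lookup. All of the links above were gathered from the web; if any of them infringes your rights, please let me know.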