Python: Fetching Web Page Source Code with Requests (Web Scraping Basics)

Latest recommended article published on 2024-07-30 15:54:01

Cass Lin

Tags: Python crawler, fetching JSP pages

Article link: https://blog.csdn.net/weixin_31955465/article/details/112372278

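This article covers the basics of making HTTP requests with Python's Requests library: downloading and installing Requests, fetching a page's source code with the GET method, setting HTTP headers, extracting information with regular expressions, and submitting data to a page with the POST method. It also explains in detail how to handle asynchronously loaded pages so that multiple pages can be crawled.

Summary generated by CSDN using AI technology.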

1 Download and Installation

See other tutorials. (Requests is typically installed with pip: pip install requests.)

2 Introduction to Requests

Requests is an Apache2 Licensed HTTP library, written in Python, for human beings.

Python’s standard urllib2 module provides most of the HTTP capabilities you need, but the API is thoroughly broken. It was built for a different time, and a different web. It requires an enormous amount of work (even method overrides) to perform the simplest of tasks.

Requests takes all of the work out of Python HTTP/1.1, making your integration with web services seamless. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, powered by urllib3, which is embedded within Requests.

------from http://www.python-requests.org/en/latest/
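
To make the point about query strings and form encoding concrete, here is a minimal sketch. It only prepares requests and never sends them, so it runs without network access. The URL and the parameter names (q, page, name, lang) are hypothetical and chosen purely for illustration; they do not come from the original article.

```python
import requests

# Hypothetical endpoint, used only for illustration; nothing is sent over the network.
BASE_URL = "https://example.com/search"

# Build (but do not send) a GET request.
# Requests appends the params dict to the URL as an encoded query string.
get_req = requests.Request(
    "GET", BASE_URL, params={"q": "python", "page": 2}
).prepare()
print(get_req.url)

# Build (but do not send) a POST request.
# Requests form-encodes the data dict and sets the Content-Type header.
post_req = requests.Request(
    "POST", BASE_URL, data={"name": "Cass", "lang": "python"}
).prepare()
print(post_req.body)
print(post_req.headers["Content-Type"])
```

In everyday use the same params and data arguments can be passed straight to requests.get and requests.post; preparing the request first is just a convenient way to inspect what would be sent.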