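This module supports both GET and POST requests. The example below fetches the page source of the Baidu home page and is for reference only.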
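The walk-through in this post only uses GET. For comparison, here is a minimal sketch of how a POST request could be prepared with the same module. The URL and the payload are hypothetical placeholders, not taken from any real service, and the request is only built here, not sent.

```python
import json
from urllib import request

# Hypothetical endpoint and payload, used only for illustration.
url = "https://example.com/api/login"
payload = {"username": "demo", "password": "secret"}

# urllib expects the request body as bytes, so serialize and encode it.
body = json.dumps(payload).encode("utf-8")

req = request.Request(
    url,
    data=body,
    method="POST",
    headers={"Content-Type": "application/json;charset=UTF-8"},
)

# Inspect the prepared request without touching the network.
print(req.get_method())
print(req.data)
```

Passing `req` to `request.urlopen` would actually send it, exactly as with the GET request shown in the rest of this post.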
Here is the urllib import I use:
from urllib import request
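Below are the headers sent with the request. The most important one is User-Agent, which makes the request look as if it came from a browser; the others are optional. You can find suitable header values with a packet-capture tool or your browser's developer tools (search online for details).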
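header = {
    "Accept": "application/json, text/plain, */*",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36 Edg/98.0.1108.43",
    "Content-Type": "application/json;charset=UTF-8",
    "Origin": "http://xiaobei.dalaola.com",
    "Referer": "http://xiaobei.dalaola.com/user",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "Cookie": "AOAOSTAR_SESSID=c94a1d760ba93dc0e79aad1b9e03dd47"
}

Next comes the preparation step. The first argument is the URL to request (it identifies the target host), method is the HTTP method, and headers carries the request headers. The method name should be written in upper case ("GET"), because HTTP method names are case-sensitive.

req = request.Request("https://www.baidu.com/", method="GET", headers=header)

Now send the request. res is the response object.

res = request.urlopen(req)

htmls holds the page source that came back, as raw bytes.

htmls = res.read()

Because the request advertises Accept-Encoding: gzip, deflate, the server usually returns a compressed body. The code below decompresses it when the response is gzip-encoded, then decodes it as UTF-8. It needs two more standard-library modules, io and gzip.

import gzip
import io

if res.headers.get("Content-Encoding") == "gzip":
    buff = io.BytesIO(htmls)
    f = gzip.GzipFile(fileobj=buff)
    htmls = f.read()
htmls = htmls.decode('utf-8')
print(htmls)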
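The decompression step can be pulled out into a small helper that also covers uncompressed and deflate responses. This is only a sketch: `decode_body` is a name made up for this post, not part of urllib, and it assumes the page is UTF-8 unless another charset is passed in. The sample data is compressed locally so the snippet runs without a network connection.

```python
import gzip
import zlib


def decode_body(raw: bytes, content_encoding: str = "", charset: str = "utf-8") -> str:
    """Decompress a response body according to Content-Encoding, then decode it."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        try:
            # Most servers send zlib-wrapped deflate data.
            raw = zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            raw = zlib.decompress(raw, -zlib.MAX_WBITS)
    # Any other value (including an empty one) is treated as uncompressed.
    return raw.decode(charset)


sample = "<html>hello</html>"
print(decode_body(gzip.compress(sample.encode("utf-8")), "gzip"))
print(decode_body(sample.encode("utf-8")))
```

With a real response it would be called as `decode_body(res.read(), res.headers.get("Content-Encoding", ""))`.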
Here is the complete code example:
import gzip
import io
from urllib import request

# Request headers; User-Agent is the important one, the rest are optional.
header = {
    "Accept": "application/json, text/plain, */*",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36 Edg/98.0.1108.43",
    "Content-Type": "application/json;charset=UTF-8",
    "Origin": "http://xiaobei.dalaola.com",
    "Referer": "http://xiaobei.dalaola.com/user",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "Cookie": "AOAOSTAR_SESSID=c94a1d760ba93dc0e79aad1b9e03dd47"
}

# Prepare the request, then send it.
req = request.Request("https://www.baidu.com/", method="GET", headers=header)
res = request.urlopen(req)

# Read the raw bytes and decompress them if the server used gzip.
htmls = res.read()
if res.headers.get("Content-Encoding") == "gzip":
    buff = io.BytesIO(htmls)
    f = gzip.GzipFile(fileobj=buff)
    htmls = f.read()
htmls = htmls.decode('utf-8')
print(htmls)
The console output is as follows: