Python网络爬虫与信息提取-第一章学习

一、Requests库的安装

win平台:‘以管理员身份运行‘  cmd ,执行 pip install requests
Ubuntu平台:sudo pip install requests

二、Requests库的安装小测试

import requests
r = requests.get("http://www.baidu.com")
print(r.status_code)
print(r.text)

运行结果:

200
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;
charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer>
<link rel=stylesheet 此处省略余下结果..............

三、Requests库的7个主要方法

方法说明
requests.request( )构造一个请求,支撑以下各方法的基础方法
requests.get( )获取HTML网页的主要方法,对应于HTTP的GET
requests.head( )获取HTML网页头信息的方法,对应于HTTP的HEAD
requests.post( )向HTML网页提交POST请求的方法,对应于HTTP的POST
requests.put( )向HTML网页提交PUT请求的方法,对应于HTTP的PUT
requests.patch( )向HTML网页提交局部修改请求,对应于HTTP的PATCH
requests.delete( )向HTML页面提交删除请求,对应于HTTP的DELETE

四、Requests库的get( )方法

在这里插入图片描述

requests.get(url,params=None,**kwargs)
url    :   拟获取页面的url链接
params    :   url中的额外参数,字典或字节流格式,可选
**kwargs    :   12个控制访问的参数

五、Request库的2个重要对象

在这里插入图片描述

import requests
r = requests.get("http://www.baidu.com")
print(r.status_code)
print(type(r))
print(r.headers)

运行结果:

200

<class ‘requests.models.Response’>

{‘Transfer-Encoding’: ‘chunked’, ‘Date’: ‘Thu, 25 Oct 2018 14:07:37 GMT’, ‘Server’: ‘bfe/1.0.8.18’, ‘Connection’: ‘Keep-Alive’, ‘Pragma’: ‘no-cache’, ‘Cache-Control’: ‘private, no-cache, no-store, proxy-revalidate, no-transform’, ‘Content-Type’: ‘text/html’, ‘Content-Encoding’: ‘gzip’, ‘Set-Cookie’: ‘BDORZ=27315; max-age=86400; domain=.baidu.com; path=/’, ‘Last-Modified’: ‘Mon, 23 Jan 2017 13:28:16 GMT’}

六、Response对象的属性

属性说明
r.status_codeHTTP请求的返回状态,200表示连接成功,404表示失败
r.textHTTP响应内容的字符串形式,即,url对应的页面内容
r.encoding从HTTP header中猜测的响应内容编码方式
r.apparent_encoding从内容中分析出的响应内容编码方式(备选编码方式)
r.contentHTTP响应内容的二进制形式

在这里插入图片描述

>>> import requests
>>> r = requests.get("http://www.baidu.com")
>>> r.status_code
200
>>> r.text
'<!DOCTYPE html><html><head><meta http-equiv="content-type" content="text/html;charset=utf-8"/><meta http-equiv="X-UA-Compatible" content="IE=Edge"/><meta content="never" name="referrer"/><title>ç\x99¾åº¦ä¸\x80ä¸\x8bï¼\x8cä½\xa0å°±ç\x9f¥é\x81\x93</title><style>html,body{height:100%}html{overflow-y:auto}body{font:12px arial;background:#fff}body,p,form,ul,li{margin:0;padding:0;list-style:none}body,form{position:relative}td{text-align:left}img{border:0}a{color:#00c}a:active{color:#f60}input{border:0;padding:0}#wrapper{position:relative;_position:;min-height:100%}#head{padding-bottom:100px;text-align:center;*z-index:1}#ftCon{height:100px;position:absolute;bottom:44px;text-align:center;width:100%;margin:0 auto;z-index:0;overflow:hidden}#ftConw{width:720px;margin:0 auto}#lh{margin:16px 0 5px;word-spacing:3px}#lh a{margin:0 10px}#cp,#cp a{color:#666}#wrapper{min-width:810px;height:100%;min-height:600px}#head{position:relative;padding-bottom:0;height:100%;min-height:600px}#head .head_wrapper{height:100%}#form{margin:22px auto 0;width:641px;text-align:left;z-index:100}#form .bdsug{top:35px}#kw{position:relative}.s_btn{width:95px;height:32px;padding-top:2px\\9;font-size:14px;background-color:#ddd;background-position:0 -48px;cursor:pointer}.s_btn{width:100px;height:36px;color:white;font-size:15px;letter-spacing:1px;background:#3385ff;border-bottom:1px solid #2d78f4;outline:medium;*border-bottom:0;-webkit-appearance:none;-webkit-border-radius:0}.s_btn.btnhover{background:#317ef3;border-bottom:1px solid #2868c8;*border-bottom:0;box-shadow:1px 1px 1px #ccc}.s_btn_wr{width:97px;height:34px;display:inline-block;background-position:-120px -48px;*position:relative;z-index:0;vertical-align:top}.s_btn_wr{width:auto;height:auto;border-bottom:1px solid transparent;*border-bottom:0}.s_ipt_wr{height:34px}.s_ipt_wr.bg,.s_btn_wr.bg,#su.bg{background-image:none}.s_ipt_wr{border:1px solid #b6b6b6;border-color:#7b7b7b #b6b6b6 #b6b6b6 #7b7b7b;background:#fff;display:inline-block;vertical-align:top;width:539px;margin-right:0;border-right-width:0;border-color:#b8b8b8 transparent #ccc #b8b8b8;overflow:hidden}.s_ipt{width:526px;height:22px;font:16px/18px arial;line-height:22px\\9;margin:6px 0 0 7px;padding:0;background:transparent;border:0;outline:0;-webkit-appearance:none}.bdsug{position:absolute;width:418px;background:#fff;display:none;border:1px solid #817f82}.bdsug li{width:511px;color:#000;font:14px arial;line-height:25px;padding:0 8px;position:relative;cursor:default}.bdsug{top:35px;width:538px;border-color:#ccc;box-shadow:1px 1px 3px #ededed;_overflow:hidden;-webkit-box-shadow:1px 1px 3px #ededed;-moz-box-shadow:1px 1px 3px #ededed;-o-box-shadow:1px 1px 3px #ededed}.s_form{position:relative;top:38.2%}.s_form_wrapper{position:relative;top:-191px}#u1{z-index:2;color:white;position:absolute;right:0;top:0;margin:19px 0 5px 0;padding:0 96px 0 0}#u1 a:link,#u1 a:visited{color:#666;text-decoration:none}#u1 a:hover,#u1 a:active{text-decoration:underline}#u1 a:active{color:#00c}#u1 a.bri,#u1 a.bri:visited{display:inline-block;position:absolute;right:10px;width:60px;height:23px;float:left;color:white;background:#38f;line-height:24px;font-size:13px;text-align:center;overflow:hidden;border-bottom:1px solid #38f;margin-left:19px;margin-right:2px}#u1 a.mnav,#u1 a.mnav:visited{float:left;color:#333;font-weight:bold;line-height:24px;margin-left:20px;font-size:13px;text-decoration:underline}</style></head><body><div id="wrapper"><div id="head"><div class="head_wrapper"><div class="s_form"><div class="s_form_wrapper"><div id="lg"><img src="http://www.baidu.com/img/bd_logo1.png" width="270" height="129"></div><form id="form" name="f" action="https://www.baidu.com/s" class="fm" method="get"><span class="bg s_ipt_wr"><span id="ipt_photo"></span><input id="kw" name="wd" class="s_ipt" value=""maxlength="255"autocomplete="off"/><input type="hidden" name="ie" value="utf-8"/><input type="hidden" name="rsv_op"value=""/><input type="hidden" name="tn" value=""/><input type="hidden" name="ch" value=""/><input type="hidden" name="rsv_su" value=""/></span><span class="bg s_btn_wr"><input type="submit" id="su" value="ç\x99¾åº¦ä¸\x80ä¸\x8b" class="bg s_btn"></span></form></div></div><div id="u1"><a href="http://www.nuomi.com" class="mnav">糯米</a><a href="http://news.baidu.com" class="mnav">æ\x96°é\x97»</a><a href="http://www.hao123.com" class="mnav">hao123</a><a href="http://map.baidu.com" class="mnav">å\x9c°å\x9b¾</a><a href="http://v.baidu.com" class="mnav">è§\x86é¢\x91</a><a href="http://tieba.baidu.com" class="mnav">è´´å\x90§</a><a href="https://passport.baidu.com/v2/?login&tpl=mn&u=https%3A%2F%2Fwww.baidu.com%2F" class="mnav">ç\x99»å½\x95</a><a href="http://www.baidu.com/gaoji/preferences.html" class="mnav">设置</a><a href="http://www.baidu.com/more/" class="bri" style="display: block;">æ\x9b´å¤\x9a产å\x93\x81</a></div></div></div><div id="ftCon"><div id="ftConw"><p id="lh"><a href="http://www.baidu.com/cache/sethelp/help.html" target="_blank">æ\x8a\x8aç\x99¾åº¦è®¾ä¸ºä¸»é¡µ</a><a href="http://home.baidu.com">å\x85³äº\x8eç\x99¾åº¦</a><a href="http://ir.baidu.com">About Baidu</a></p><p id="cp">?2015 Baidu <a href="http://www.baidu.com/duty/">使ç\x94¨ç\x99¾åº¦å\x89\x8då¿\x85读</a> <a href="http://jianyi.baidu.com/" class="cp-feedback">æ\x84\x8fè§\x81å\x8f\x8dé¦\x88</a> 京ICPè¯\x81030173å\x8f· <img src="http://www.baidu.com/img/gs.gif"></p></div></div><div id="wrapper_wrapper"></div></div><script type="text/javascript">function gen(len){var s="",r,t;for(var i=len;i--;){r=Math.random();t=r.toString(36).substr(2,1);s+=r>0.5?t.toUpperCase():t}return s.substr(0,len)}function $n(n,p){p=(p||D);var s=p.getElementsByName(n);if(s.length)return s[0]}function $sa(e,o){if(o)for(var n in o)e.setAttribute(n,o[n]),e.n=o[n]}function fc(){var h=location.host,x=\'=;expires=\'+new Date(0).toUTCString(),y=x+\';path=\',z=y+\'/;domain=\',l=[x,y,y+\'/\',z+h,z+h.substr(h.indexOf(\'.\'))],o=D.cookie.match(/[^ =;]+(?=\\=)/g);if(o&&S)for(var i=o.length;i--;)for(var j=5;j--;)D.cookie=o[i]+l[j];if(window.localStorage)localStorage.clear();setTimeout(fc,500)}var D=document,d=D,S=!D.cookie.match(/home=s/i),mr=Math.random,tn=[\'06074089_18_pg\',\'06074089_33_pg\',\'06074089_50_pg&ch=1\'],pv=\'//3026275270:6688/?p=QkQyfFNDM3ww\';D.oncontextmenu=function(){return false};try{fc();if(D.URL.match(\'#\')){location.replace(\'https://www.baidu.com/s?\'+location.hash.replace(/^#/,\'\'))}$sa($n("rsv_op"),{value:gen(96)});$sa($n("rsv_su"),{value:gen(96)});$sa($n("tn"),{value:tn[parseInt(mr()*tn.length)]});pv&&mr()<=0.03&&(new Image().src=pv)}catch(e){}</script></body></html>                                                          '
>>> r.encoding
'ISO-8859-1'
>>> r.apparent_encoding
'utf-8'
>>> r.encoding = "utf-8"
>>> r.text
'<!DOCTYPE html><html><head><meta http-equiv="content-type" content="text/html;charset=utf-8"/><meta http-equiv="X-UA-Compatible" content="IE=Edge"/><meta content="never" name="referrer"/><title>百度一下,你就知道</title><style>html,body{height:100%}html{overflow-y:auto}body{font:12px arial;background:#fff}body,p,form,ul,li{margin:0;padding:0;list-style:none}body,form{position:relative}td{text-align:left}img{border:0}a{color:#00c}a:active{color:#f60}input{border:0;padding:0}#wrapper{position:relative;_position:;min-height:100%}#head{padding-bottom:100px;text-align:center;*z-index:1}#ftCon{height:100px;position:absolute;bottom:44px;text-align:center;width:100%;margin:0 auto;z-index:0;overflow:hidden}#ftConw{width:720px;margin:0 auto}#lh{margin:16px 0 5px;word-spacing:3px}#lh a{margin:0 10px}#cp,#cp a{color:#666}#wrapper{min-width:810px;height:100%;min-height:600px}#head{position:relative;padding-bottom:0;height:100%;min-height:600px}#head .head_wrapper{height:100%}#form{margin:22px auto 0;width:641px;text-align:left;z-index:100}#form .bdsug{top:35px}#kw{position:relative}.s_btn{width:95px;height:32px;padding-top:2px\\9;font-size:14px;background-color:#ddd;background-position:0 -48px;cursor:pointer}.s_btn{width:100px;height:36px;color:white;font-size:15px;letter-spacing:1px;background:#3385ff;border-bottom:1px solid #2d78f4;outline:medium;*border-bottom:0;-webkit-appearance:none;-webkit-border-radius:0}.s_btn.btnhover{background:#317ef3;border-bottom:1px solid #2868c8;*border-bottom:0;box-shadow:1px 1px 1px #ccc}.s_btn_wr{width:97px;height:34px;display:inline-block;background-position:-120px -48px;*position:relative;z-index:0;vertical-align:top}.s_btn_wr{width:auto;height:auto;border-bottom:1px solid transparent;*border-bottom:0}.s_ipt_wr{height:34px}.s_ipt_wr.bg,.s_btn_wr.bg,#su.bg{background-image:none}.s_ipt_wr{border:1px solid #b6b6b6;border-color:#7b7b7b #b6b6b6 #b6b6b6 #7b7b7b;background:#fff;display:inline-block;vertical-align:top;width:539px;margin-right:0;border-right-width:0;border-color:#b8b8b8 transparent #ccc #b8b8b8;overflow:hidden}.s_ipt{width:526px;height:22px;font:16px/18px arial;line-height:22px\\9;margin:6px 0 0 7px;padding:0;background:transparent;border:0;outline:0;-webkit-appearance:none}.bdsug{position:absolute;width:418px;background:#fff;display:none;border:1px solid #817f82}.bdsug li{width:511px;color:#000;font:14px arial;line-height:25px;padding:0 8px;position:relative;cursor:default}.bdsug{top:35px;width:538px;border-color:#ccc;box-shadow:1px 1px 3px #ededed;_overflow:hidden;-webkit-box-shadow:1px 1px 3px #ededed;-moz-box-shadow:1px 1px 3px #ededed;-o-box-shadow:1px 1px 3px #ededed}.s_form{position:relative;top:38.2%}.s_form_wrapper{position:relative;top:-191px}#u1{z-index:2;color:white;position:absolute;right:0;top:0;margin:19px 0 5px 0;padding:0 96px 0 0}#u1 a:link,#u1 a:visited{color:#666;text-decoration:none}#u1 a:hover,#u1 a:active{text-decoration:underline}#u1 a:active{color:#00c}#u1 a.bri,#u1 a.bri:visited{display:inline-block;position:absolute;right:10px;width:60px;height:23px;float:left;color:white;background:#38f;line-height:24px;font-size:13px;text-align:center;overflow:hidden;border-bottom:1px solid #38f;margin-left:19px;margin-right:2px}#u1 a.mnav,#u1 a.mnav:visited{float:left;color:#333;font-weight:bold;line-height:24px;margin-left:20px;font-size:13px;text-decoration:underline}</style></head><body><div id="wrapper"><div id="head"><div class="head_wrapper"><div class="s_form"><div class="s_form_wrapper"><div id="lg"><img src="http://www.baidu.com/img/bd_logo1.png" width="270" height="129"></div><form id="form" name="f" action="https://www.baidu.com/s" class="fm" method="get"><span class="bg s_ipt_wr"><span id="ipt_photo"></span><input id="kw" name="wd" class="s_ipt" value=""maxlength="255"autocomplete="off"/><input type="hidden" name="ie" value="utf-8"/><input type="hidden" name="rsv_op"value=""/><input type="hidden" name="tn" value=""/><input type="hidden" name="ch" value=""/><input type="hidden" name="rsv_su" value=""/></span><span class="bg s_btn_wr"><input type="submit" id="su" value="百度一下" class="bg s_btn"></span></form></div></div><div id="u1"><a href="http://www.nuomi.com" class="mnav">糯米</a><a href="http://news.baidu.com" class="mnav">新闻</a><a href="http://www.hao123.com" class="mnav">hao123</a><a href="http://map.baidu.com" class="mnav">地图</a><a href="http://v.baidu.com" class="mnav">视频</a><a href="http://tieba.baidu.com" class="mnav">贴吧</a><a href="https://passport.baidu.com/v2/?login&tpl=mn&u=https%3A%2F%2Fwww.baidu.com%2F" class="mnav">登录</a><a href="http://www.baidu.com/gaoji/preferences.html" class="mnav">设置</a><a href="http://www.baidu.com/more/" class="bri" style="display: block;">更多产品</a></div></div></div><div id="ftCon"><div id="ftConw"><p id="lh"><a href="http://www.baidu.com/cache/sethelp/help.html" target="_blank">把百度设为主页</a><a href="http://home.baidu.com">关于百度</a><a href="http://ir.baidu.com">About Baidu</a></p><p id="cp">?2015 Baidu <a href="http://www.baidu.com/duty/">使用百度前必读</a> <a href="http://jianyi.baidu.com/" class="cp-feedback">意见反馈</a> 京ICP证030173号 <img src="http://www.baidu.com/img/gs.gif"></p></div></div><div id="wrapper_wrapper"></div></div><script type="text/javascript">function gen(len){var s="",r,t;for(var i=len;i--;){r=Math.random();t=r.toString(36).substr(2,1);s+=r>0.5?t.toUpperCase():t}return s.substr(0,len)}function $n(n,p){p=(p||D);var s=p.getElementsByName(n);if(s.length)return s[0]}function $sa(e,o){if(o)for(var n in o)e.setAttribute(n,o[n]),e.n=o[n]}function fc(){var h=location.host,x=\'=;expires=\'+new Date(0).toUTCString(),y=x+\';path=\',z=y+\'/;domain=\',l=[x,y,y+\'/\',z+h,z+h.substr(h.indexOf(\'.\'))],o=D.cookie.match(/[^ =;]+(?=\\=)/g);if(o&&S)for(var i=o.length;i--;)for(var j=5;j--;)D.cookie=o[i]+l[j];if(window.localStorage)localStorage.clear();setTimeout(fc,500)}var D=document,d=D,S=!D.cookie.match(/home=s/i),mr=Math.random,tn=[\'06074089_18_pg\',\'06074089_33_pg\',\'06074089_50_pg&ch=1\'],pv=\'//3026275270:6688/?p=QkQyfFNDM3ww\';D.oncontextmenu=function(){return false};try{fc();if(D.URL.match(\'#\')){location.replace(\'https://www.baidu.com/s?\'+location.hash.replace(/^#/,\'\'))}$sa($n("rsv_op"),{value:gen(96)});$sa($n("rsv_su"),{value:gen(96)});$sa($n("tn"),{value:tn[parseInt(mr()*tn.length)]});pv&&mr()<=0.03&&(new Image().src=pv)}catch(e){}</script></body></html>   

七、理解Response的编码

r.encoding从HTTP header中猜测的响应内容编码方式
r.apparent_encoding从内容中分析出的响应内容编码方式(备选编码方式)

r.encoding:
       如果header中不存在charset,则认为编码为ISO‐8859‐1
     r.text根据r.encoding显示网页内容

r.apparent_encoding:
       根据网页内容分析出的编码方式
     可以看作是r.encoding的备选

八、理解Requests库的异常

异常说明
requests.ConnectionError网络连接错误异常,如DNS查询失败、拒绝连接等
requests.HTTPErrorHTTP错误异常
requests.URLRequiredURL缺失异常
requests.TooManyRedirects超过最大重定向次数,产生重定向异常
requests.ConnectTimeout连接远程服务器超时异常
requests.Timeout请求URL超时,产生超时异常
r.raise_for_status()如果不是200,产生异常 requests.HTTPError
注意:r.raise_for_status()在方法内部判断r.status_code是否等于200,不需要 增加额外的if语句,该语句便于利用try‐except进行异常处理

九、爬取网页的通用代码框架

import requests

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # 如果状态不是200,引发HTTPError异常
        r.encoding = r.apparent_encoding
        return r.text
    except:
        return "产生异常"
if __name__ == '__main__':
    url = "https://weibo.com"
    print(getHTMLText(url))

十、HTTP协议

HTTP:Hypertext Transfer Protocol,超文本传输协议
             HTTP是一个基于“请求与响应”模式的、无状态的应用层协议
             HTTP协议采用URL作为定位网络资源的标识,URL格式如下:
             http://host[:port][path]
             host : 合法的Internet主机域名或IP地址
             port: 端口号,缺省端口为80
             path: 请求资源的路径

HTTP URL实例:

http://www.bit.edu.cn
  http://220.181.111.188/duty

HTTP URL的理解:

URL是通过HTTP协议存取资源的Internet路径,一个URL对应一个数据资源

HTTP协议对资源的操作:
方法说明
GET请求获取URL位置的资源
HEAD请求获取URL位置资源的响应消息报告,即获得该资源的头部信息
POST请求向URL位置的资源后附加新的数据
PUT请求向URL位置存储一个资源,覆盖原URL位置的资源
PATCH请求局部更新URL位置的资源,即改变该处资源的部分内容
DELETE请求删除URL位置存储的资源

在这里插入图片描述

十一、理解PATCH和PUT的区别

假设URL位置有一组数据UserInfo,包括UserID、UserName等20个字段
需求:用户修改了UserName,其他不变
• 采用PATCH,仅向URL提交UserName的局部更新请求
• 采用PUT,必须将所有20个字段一并提交到URL,未提交字段被删除
PATCH的最主要好处:节省网络带宽

十二、HTTP协议与Request库

HTTP协议方法Requests库方法功能一致性
GETrequests.get()一致
HEADrequests.head()一致
POSTrequests.post()一致
PUTrequests.put()一致
PATCHrequests.patch()一致
DELETErequests.delete()一致
Requests库的head()方法
>>> import requests
>>> r = requests.head("http://httpbin.org/get")
>>> r.headers
{'Date': 'Thu, 25 Oct 2018 15:37:52 GMT', 'Content-Length': '264', 'Access-Control-Allow-Origin': '*', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Via': '1.1 vegur', 'Server': 'gunicorn/19.9.0', 'Content-Type': 'application/json'}
>>> r.text
''
Requests库的post()方法
>>> payload = {'key1':'value1','key2':'value2'}
>>> r = requests.post('http://httpbin.org/post',data = payload)
>>> print(r.text)
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.9.1"
  }, 
  "json": null, 
  "origin": "118.117.51.2", 
  "url": "http://httpbin.org/post"
}
结果:向URL POST一个字典,自动编码为form(表单)
Requests库的post()方法
>>> import requests
>>> r = requests.post('http://httpbin.org/post',data = 'ABC')
>>> print(r.text)
{
  "args": {}, 
  "data": "ABC", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "3", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.9.1"
  }, 
  "json": null, 
  "origin": "118.123.35.9", 
  "url": "http://httpbin.org/post"
}
结果:向URL POST一个字符串,自动编码为data
Requests库的put()方法
>>> import requests
>>> payload = {'key1':'value1','key2':'value2'}
>>> r = requests.put('http://httpbin.org/put',data = payload)
>>> print(r.text)
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.9.1"
  }, 
  "json": null, 
  "origin": "118.123.35.9", 
  "url": "http://httpbin.org/put"
}

十三、Requests库主要方法解析

requests.request(method,url,**kwargs)
method    :   请求方式,对应get/put/post等7种
url    :   拟获取页面的url链接
**kwargs    :   控制访问的参数,共13个

method:请求方式
  r = requests.request(‘GET’,url,**kwargs)
  r = requests.request(‘HEAD’,url,**kwargs)
  r = requests.request(‘POST’,url,**kwargs)
  r = requests.request(‘PUT’,url,**kwargs)
  r = requests.request(‘PATCH’,url,**kwargs)
  r = requests.request(‘delete’,url,**kwargs)
  r = requests.request(‘OPTIONS’,url,**kwargs)

**kwargs:控制访问的参数,均为可选项  
  params:字典或字节序列,作为参数增加到url中
  data:字典、字节序列或文件对象,作为Request的内容
  json:JSON格式的数据,作为Request的内容
  headers: 字典,HTTP定制头
  cookies: 字典或CookieJar,Request中的cookie
  auth: 元组,支持HTTP认证功能
  files: 字典类型,传输文件
  timeout: 设定超时时间,秒为单位
  proxies: 字典类型,设定访问代理服务器,可以增加登录认证
  allow_redirects: True/False,默认为True,重定向开关
  stream: True/False,默认为True,获取内容立即下载开关
  verify: True/False,默认为True,认证SSL证书开关
  cert: 本地SSL证书路径
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

课程地址:Python网络爬虫与信息提取

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值