Python requests

qq_39746270

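Requests is a module for making HTTP requests. There are plenty of similar modules, such as urllib, urllib2, httplib, and httplib2 (urllib2 and httplib are Python 2 names), and they all offer broadly similar functionality. So why does Requests stand out? Its official site describes it as an HTTP library "for humans". What makes it so friendly to use? If you have worked with urllib or its relatives before, a side-by-side comparison makes the difference obvious.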

I. Importing

Once the package is installed (for example with pip install requests), importing the module is simple:

    import requests

II. Requesting a URL

Here are the most common ways to send a GET or POST request.

1. Sending a GET request without parameters:

    r = requests.get("http://pythontab.com/justTest")

We now have a response object, r, from which we can read everything the server sent back.

The GET request above carries no parameters. What if the request needs some?

2. Sending a GET request with parameters:

    payload = {'key1': 'value1', 'key2': 'value2'}
    r = requests.get("http://pythontab.com/justTest", params=payload)

GET parameters are passed through the params keyword argument.

We can print the URL that was actually requested to check that it is correct:

    >>> print(r.url)
    http://pythontab.com/justTest?key1=value1&key2=value2

The parameters were indeed encoded into the URL.

A parameter can also take a list of values:

    >>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
    >>> r = requests.get("http://pythontab.com/justTest", params=payload)
    >>> print(r.url)
    http://pythontab.com/justTest?key1=value1&key2=value2&key2=value3

That covers the basic forms of a GET request.
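
To check how params is encoded without touching the network, one option is to build the request and prepare it locally. The sketch below assumes Python 3.7 or later (so dictionaries keep insertion order). The helper name build_url is made up for this illustration; requests.Request and its prepare() method are part of the Requests API.

```python
import requests


def build_url(base_url, params):
    """Return the final URL Requests would use, without sending anything.

    build_url is a hypothetical helper written for this illustration.
    """
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


if __name__ == "__main__":
    base = "http://pythontab.com/justTest"
    # Plain key/value pairs
    print(build_url(base, {"key1": "value1", "key2": "value2"}))
    # A list value repeats the key once per item
    print(build_url(base, {"key1": "value1", "key2": ["value2", "value3"]}))
    # A key whose value is None is left out of the query string
    print(build_url(base, {"key1": "value1", "key2": None}))
```

Preparing a request this way is handy in unit tests, because the URL can be asserted on without a server being available.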

3. Sending a POST request:

    r = requests.post("http://pythontab.com/postTest", data={"key": "value"})

POST parameters are passed through the data keyword argument.

Here data is a dictionary, which Requests sends form-encoded. We can also pass a JSON string instead:

    >>> import json
    >>> import requests
    >>> payload = {"key": "value"}
    >>> r = requests.post("http://pythontab.com/postTest", data=json.dumps(payload))

Sending JSON is so common that newer versions of Requests (2.4.2 and later) added a json keyword argument. It serializes the payload for us, so the json module is no longer needed:

    >>> payload = {"key": "value"}
    >>> r = requests.post("http://pythontab.com/postTest", json=payload)
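
The three ways of passing a POST body differ in what ends up on the wire, and the difference is easy to inspect on a locally prepared request. The following is a minimal sketch rather than a definitive recipe: describe_post is a hypothetical helper, and nothing is actually sent to the URL.

```python
import json

import requests

URL = "http://pythontab.com/postTest"  # URL from the article; nothing is sent


def describe_post(**kwargs):
    """Prepare a POST locally and return (Content-Type header, body as text).

    describe_post is a hypothetical helper written for this illustration.
    """
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    body = prepared.body
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return prepared.headers.get("Content-Type"), body


payload = {"key": "value"}

form = describe_post(data=payload)                  # form-encoded dictionary
raw_json = describe_post(data=json.dumps(payload))  # JSON string, header not set
auto_json = describe_post(json=payload)             # JSON body plus header

if __name__ == "__main__":
    for label, result in [("data=dict", form),
                          ("data=json.dumps(...)", raw_json),
                          ("json=dict", auto_json)]:
        print(label, "->", result)
```

The practical consequence is that data=json.dumps(payload) leaves the Content-Type header unset, so a server that insists on application/json may reject the request unless the header is added by hand; json=payload sets it automatically.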

What if we want to POST a file? That is what the files argument is for:

    >>> url = 'http://pythontab.com/postTest'
    >>> files = {'file': open('report.xls', 'rb')}
    >>> r = requests.post(url, files=files)
    >>> r.text

We can also specify the filename, content type, and extra headers when posting a file:

    >>> url = 'http://pythontab.com/postTest'
    >>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    >>> r = requests.post(url, files=files)

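Tip: opening files in binary mode is strongly recommended. Requests may try to set the Content-Length header for you, and its value is the number of bytes in the file, so a file opened in text mode can lead to errors.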
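
The upload examples above never close the file they open. One way to tidy that up, sketched here under a few assumptions, is to open the file in a with block and build the multipart body inside it. The helper prepare_upload and the file contents are made up for this illustration, and the request is only prepared, not sent.

```python
import os
import tempfile

import requests


def prepare_upload(path, url="http://pythontab.com/postTest"):
    """Build (but do not send) a multipart upload of the file at `path`.

    prepare_upload is a hypothetical helper written for this illustration.
    """
    # Binary mode, and a with block so the handle is closed afterwards.
    with open(path, "rb") as handle:
        files = {
            "file": (os.path.basename(path), handle, "application/vnd.ms-excel"),
        }
        # The file is read while the request is being prepared.
        return requests.Request("POST", url, files=files).prepare()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.xls")
        with open(path, "wb") as out:
            out.write(b"fake spreadsheet bytes")  # made-up content
        prepared = prepare_upload(path)
        print(prepared.headers["Content-Type"])    # multipart/form-data; boundary=...
        print(prepared.headers["Content-Length"])  # size of the encoded body in bytes
```

With a real server the same pattern applies to requests.post(url, files=files), as long as the call happens inside the with block.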

As you can see, sending requests with Requests is straightforward.

III. Reading the response

Now let's look at how to read the response after sending a request. We continue with the first example:

    >>> import requests
    >>> r = requests.get('http://pythontab.com/justTest')
    >>> r.text

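Which encoding does r.text use to decode the body?

    >>> r.encoding
    'utf-8'

In this case it is UTF-8, which Requests worked out from the response headers. What if we want r.text to be decoded with a different encoding?

    >>> r.encoding = 'ISO-8859-1'

From now on, r.text decodes the body as ISO-8859-1.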
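
It may help to see that r.encoding only affects how the bytes are decoded, not the bytes themselves. The sketch below builds a Response by hand and points its raw attribute at an in-memory buffer; that construction, the helper canned_response, and the sample text are assumptions made so the example runs offline, not how responses are normally obtained.

```python
import io

import requests


def canned_response(body, encoding):
    """Hypothetical helper: a hand-built Response standing in for a real reply."""
    response = requests.Response()
    response.status_code = 200
    response.raw = io.BytesIO(body)  # the bytes as they would arrive on the wire
    response.encoding = encoding
    return response


r = canned_response("café".encode("utf-8"), "utf-8")
as_utf8 = r.text             # decoded with the declared encoding

r.encoding = "ISO-8859-1"    # only changes how r.text decodes the body
as_latin1 = r.text           # same bytes, wrong codec: mojibake
raw_bytes = r.content        # the underlying bytes never change

if __name__ == "__main__":
    print(repr(as_utf8), repr(as_latin1), raw_bytes)
```

Overriding r.encoding is therefore mainly useful when the server declares the wrong charset or none at all.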

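There is also r.content. How does it differ from r.text? r.content returns the body as raw bytes, which is what we need when, for example, requesting an image URL and saving the image. A code snippet:

    import requests

    def saveImage(imgUrl, imgName="default.jpg"):
        # r.content loads the whole body into memory, so stream=True is not needed
        r = requests.get(imgUrl)
        image = r.content
        destDir = "D:\\"  # the backslash has to be escaped
        print("Saving image " + destDir + imgName + "\n")
        try:
            # the with block closes the file, so no explicit close() is needed
            with open(destDir + imgName, "wb") as jpg:
                jpg.write(image)
        except IOError:
            print("IO Error")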
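
For large files, reading r.content pulls the whole body into memory. A common alternative is to pass stream=True and write the body in chunks with iter_content. The following is a sketch under stated assumptions: save_stream and download are hypothetical helpers, the chunk size and timeout are arbitrary, and the demo at the bottom uses a hand-built Response so that it runs offline.

```python
import io
import os
import tempfile

import requests


def save_stream(response, path, chunk_size=8192):
    """Write a response body to `path` chunk by chunk; return bytes written."""
    written = 0
    with open(path, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            out.write(chunk)
            written += len(chunk)
    return written


def download(url, path, timeout=10):
    """Hypothetical wrapper: stream `url` to `path`, failing on HTTP errors."""
    # The with block releases the connection even if writing fails.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return save_stream(response, path)


if __name__ == "__main__":
    # Offline demo: a hand-built Response stands in for a real download.
    fake = requests.Response()
    fake.status_code = 200
    fake.raw = io.BytesIO(b"\x00" * 20000)  # made-up payload
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "default.jpg")
        print(save_stream(fake, target), "bytes written")
```

With a real URL, download(imgUrl, "default.jpg") would play the same role as saveImage, without holding the whole image in memory.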

r.text returns a string. If the response body is JSON, can we get the parsed data directly? That is what r.json() is for.

We can also get the raw socket response from the server with r.raw.read(). If you really need the raw response, remember to set stream=True in the request, for example:

    r = requests.get('https://api.github.com/events', stream=True)

We can also get the response status code:

    >>> r = requests.get('http://pythontab.com/justTest')
    >>> r.status_code
    200

requests.codes.ok can be used in place of the literal 200:

    >>> r.status_code == requests.codes.ok
    True
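
Comparing status codes by hand works, but Requests also offers r.ok and r.raise_for_status() for the common case of treating every 4xx or 5xx response as a failure. The sketch below uses hand-built Response objects so that it runs without a server; canned_response and check are hypothetical helpers for this illustration.

```python
import io

import requests


def canned_response(status_code, body=b""):
    """Hypothetical helper: a hand-built Response, so the demo runs offline."""
    response = requests.Response()
    response.status_code = status_code
    response.raw = io.BytesIO(body)
    return response


def check(response):
    """Return 'ok' for a successful response, else the failing status code."""
    try:
        response.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except requests.exceptions.HTTPError as err:
        return err.response.status_code
    return "ok"


if __name__ == "__main__":
    print(check(canned_response(200)))
    print(check(canned_response(404)))
    print(canned_response(500).ok, requests.codes.not_found)
```

In application code, calling r.raise_for_status() right after the request is a compact way to stop early on an error response.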

IV. Headers

We can inspect the response headers:

    >>> r = requests.get("http://pythontab.com/justTest")
    >>> r.headers

r.headers is a dictionary-like object whose keys are case-insensitive, for example:

    {
        'content-encoding': 'gzip',
        'transfer-encoding': 'chunked',
        'connection': 'close',
        'server': 'nginx/1.0.4',
        'x-runtime': '147ms',
        'etag': '"e1ca502697e5c9317743dc078f67693a"',
        'content-type': 'application/json'
    }

We can read individual response headers to make decisions:

    r.headers['Content-Type']

or

    r.headers.get('Content-Type')

What about the request headers, that is, the headers we sent to the server? They are available as r.request.headers.

We can also add custom headers to a request through the headers keyword argument:

    >>> headers = {'user-agent': 'myagent'}
    >>> r = requests.get("http://pythontab.com/justTest", headers=headers)
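
Header names are matched without regard to case, both for response headers and for the custom headers we send. A minimal offline sketch, assuming a made-up response header value and a locally prepared (unsent) request:

```python
import requests
from requests.structures import CaseInsensitiveDict

# Response headers behave like this mapping: lookups ignore case.
response_headers = CaseInsensitiveDict({"Content-Type": "application/json"})
same_value = (
    response_headers["content-type"],
    response_headers.get("CONTENT-TYPE"),
)

# Custom request headers, inspected on a locally prepared request (not sent).
prepared = requests.Request(
    "GET",
    "http://pythontab.com/justTest",
    headers={"user-agent": "myagent"},
).prepare()
sent_user_agent = prepared.headers["User-Agent"]

if __name__ == "__main__":
    print(same_value)
    print(sent_user_agent)
```

When a request is really sent with requests.get, r.request.headers also contains the defaults Requests adds on its own, such as Accept-Encoding; a custom user-agent replaces the default one.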

V. Cookies

If a response contains cookies, we can read them like this:

    >>> url = 'http://www.pythontab.com'
    >>> r = requests.get(url)
    >>> r.cookies['example_cookie_name']
    'example_cookie_value'

We can also send our own cookies with the cookies keyword argument:

    >>> url = 'http://pythontab.com/cookies'
    >>> cookies = {'cookies_are': 'working'}
    >>> r = requests.get(url, cookies=cookies)
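
A dictionary of cookies is sent with every request regardless of the target. When cookies should only go to a particular domain or path, a RequestsCookieJar can be passed instead. The sketch below prepares requests locally to show which Cookie header would be sent; the cookie names, values, and paths are made up for illustration.

```python
import requests

URL = "http://pythontab.com/cookies"  # URL from the article; nothing is sent

# Cookies passed as a plain dictionary end up in a single Cookie header.
prepared = requests.Request(
    "GET", URL, cookies={"cookies_are": "working"}
).prepare()
cookie_header = prepared.headers["Cookie"]

# A cookie jar lets us scope each cookie to a domain and path.
jar = requests.cookies.RequestsCookieJar()
jar.set("first_cookie", "yes", domain="pythontab.com", path="/cookies")
jar.set("second_cookie", "no", domain="pythontab.com", path="/elsewhere")

# Only the cookie whose path matches the request URL is included.
scoped = requests.Request("GET", URL, cookies=jar).prepare()
scoped_header = scoped.headers["Cookie"]

if __name__ == "__main__":
    print(cookie_header)
    print(scoped_header)
```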

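VI. Redirects

Sometimes the server redirects our request automatically; GitHub, for example, redirects HTTP requests to HTTPS. We can use r.history to inspect the redirects:

    >>> r = requests.get('http://pythontab.com/')
    >>> r.url
    'http://pythontab.com/'
    >>> r.history
    []

r.url holds the final URL and r.history lists the intermediate responses, oldest first. In the output above no redirect took place, so the URL is unchanged and the history is empty; for a site that redirects HTTP to HTTPS, r.url would show the https address and r.history would contain the redirect response. What if we want to stay on the URL we asked for, that is, stop Requests from following redirects automatically? Use the allow_redirects parameter:

    r = requests.get('http://pythontab.com', allow_redirects=False)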

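VII. Timeouts

We can use the timeout parameter to set how long to wait for the server, in seconds. It is not a limit on the total download time; it bounds how long Requests waits to connect and how long it waits for data to arrive on the socket:

    requests.get('http://pythontab.com', timeout=1)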
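
When the timeout expires, Requests raises an exception rather than returning a response, so calling code usually wraps the request in try/except. The sketch below is one possible shape for that, not a definitive pattern: fetch_text and AlwaysSlow are hypothetical names, and the (connect, read) values are arbitrary.

```python
import requests


def fetch_text(url, timeout=(3.05, 10.0), http=requests):
    """Return the body of `url`, or None if the request timed out.

    `timeout` is a (connect, read) pair in seconds; a single number applies
    to both. `http` is anything with a requests-style get(), such as the
    requests module itself or a Session.
    """
    try:
        response = http.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Timeout is the common base of ConnectTimeout and ReadTimeout.
        return None
    return response.text


class AlwaysSlow:
    """Hypothetical stand-in for a server that never answers in time."""

    def get(self, url, timeout=None):
        raise requests.exceptions.ReadTimeout(f"no data within {timeout} s")


if __name__ == "__main__":
    # Offline demo: the fake client simulates a read timeout.
    print(fetch_text("http://pythontab.com", timeout=1, http=AlwaysSlow()))
```

Passing the client in as a parameter keeps the function testable without a slow server; in normal use the default, the requests module, is enough.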

VIII. Proxies

We can also route HTTP or HTTPS requests through a proxy by passing the proxies keyword argument:

    proxies = {
        "http": "http://10.10.1.10:3128",
        "https": "http://10.10.1.10:1080",
    }
    requests.get("http://pythontab.com", proxies=proxies)

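IX. Sessions

Sometimes we must log in to a site before we can request other URLs. This is where a session helps: we log in through the site's login endpoint, the session stores the cookies it receives, and we then use the same session to request other URLs:

    s = requests.Session()
    login_data = {'form_email': 'youremail@example.com', 'form_password': 'yourpassword'}
    s.post("http://pythontab.com/testLogin", data=login_data)
    r = s.get('http://pythontab.com/notification/')
    print(r.text)

Here form_email and form_password are the name attributes of the corresponding input elements in the login form; these particular names come from Douban's login form.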
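
What makes the login flow work is that a Session carries its cookies and headers over to every request made through it. This can be seen without a server by asking the session to prepare a request. A minimal sketch, in which the header name, token, and cookie are made-up values:

```python
import requests

with requests.Session() as s:  # the with block closes pooled connections on exit
    # State stored on the session is applied to every request made through it.
    s.headers.update({"X-Demo-Token": "abc123"})  # made-up header and value
    s.cookies.set("sessionid", "xyz789")          # made-up cookie

    # prepare_request merges the session state into the request; nothing is sent.
    request = requests.Request("GET", "http://pythontab.com/notification/")
    prepared = s.prepare_request(request)

    merged_token = prepared.headers["X-Demo-Token"]
    merged_cookie = prepared.headers["Cookie"]
    has_default_agent = "User-Agent" in prepared.headers

if __name__ == "__main__":
    print(merged_token, merged_cookie, has_default_agent)
```

In the login example, the cookie is set by the server's response to s.post rather than by hand, but it is attached to later requests in the same way.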

X. Downloading a page

Requests can also be used to download a web page:

    r = requests.get("http://www.pythontab.com")
    # the with block closes the file on exit, so no explicit close() is needed
    with open("haha.html", "wb") as html:
        html.write(r.content)
