Crawler Task 1: Learning GET and POST Requests

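This post covers using GET and POST requests in a Python crawler, looks at what a request returns when the network is disconnected, and explains what request headers are and how to add them. In practice, a GET request made with the requests library produced garbled Chinese text, and with the network disconnected the request raised an error whose details need further lookup.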

1. Learn GET and POST requests

Try using requests or urllib to send a GET request to the Baidu homepage (百度一下,你就知道) and print the returned result.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/3/1 17:19
# @Author  : StalloneYang
# @File    : day01.py
# @desc:

import requests

url = "https://www.baidu.com"

req = requests.get(url)

print(req.text)

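PS: This GET request to Baidu was sent without any request headers or parameters, and the response was not decoded explicitly, so the Chinese text came out garbled.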
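The garbling is most likely an encoding mismatch: when the Content-Type header names no charset, requests falls back to ISO-8859-1 for text responses, so UTF-8 bytes get decoded with the wrong table. Below is a minimal standard-library sketch that reproduces the effect and the fix. The helper name `decode_body` and the sample string are assumptions made for illustration, not real output from Baidu.

```python
def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a response body, replacing undecodable bytes instead of raising."""
    return raw.decode(encoding, errors="replace")


# Bytes as a server would send them for a UTF-8 page (sample text only).
raw_body = "百度一下".encode("utf-8")

garbled = decode_body(raw_body, "iso-8859-1")  # wrong table: mojibake
readable = decode_body(raw_body, "utf-8")      # right table: Chinese text

print(repr(garbled))
print(readable)
```

With requests, the equivalent is to set `req.encoding = "utf-8"` before reading `req.text`, or to decode `req.content` yourself.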

2. What happens if the network is disconnected and the request is sent again? Learn about the status codes a request returns.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/3/1 17:19
# @Author  : StalloneYang
# @File    : day01.py
# @desc:

import requests

url = "https://www.baidu.com"

req = requests.get(url)

print(req.text)

The error output is as follows:

"D:\Program Files\Python37\python.exe" D:/workspace/test/spider/day01.py
Traceback (most recent call last):
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connection.py", line 159, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "D:\Program Files\Python37\lib\site-packages\urllib3\util\connection.py", line 57, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "D:\Program Files\Python37\lib\socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connectionpool.py", line 839, in _validate_conn
    conn.connect()
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connection.py", line 301, in connect
    conn = self._new_conn()
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connection.py", line 168, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x0000026291663D30>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Program Files\Python37\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "D:\Program Files\Python37\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "D:\Program Files\Python37\lib\site-packages\urllib3\util\retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.baidu.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0000026291663D30>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/workspace/test/spider/day01.py", line 12, in <module>
    req = requests.get(url)
  File "D:\Program Files\Python37\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\Program Files\Python37\lib\site-packages\requests\sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\requests\adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.baidu.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0000026291663D30>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

Process finished with exit code 1

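After the network was disconnected, the request raised an exception instead of returning a response, so there was no status code to inspect. The traceback shows a DNS lookup failure (getaddrinfo failed) surfacing as requests.exceptions.ConnectionError; the finer details still need to be looked up.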
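To keep the script from crashing when offline, the request can be wrapped in try/except and the status code read only when a response actually arrives. This is a minimal sketch using the standard-library urllib (the alternative the task statement mentions). `fetch_status` and its `opener` parameter are hypothetical names introduced here so the failure path can be exercised without unplugging the network.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=5.0):
    """Return (status_code, error_message); the part that does not apply is None."""
    try:
        with opener(url, timeout=timeout) as resp:
            # A response arrived, so a status code exists (200 means success).
            return resp.status, None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return exc.code, None
    except urllib.error.URLError as exc:
        # No answer at all: DNS failure, refused connection, no network.
        return None, str(exc.reason)
```

Calling `fetch_status("https://www.baidu.com")` while offline should return `None` plus a reason string rather than a traceback. With requests, the same split is `except requests.exceptions.ConnectionError` for the no-answer case and `req.status_code` when a response does arrive.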

3. Understand what request headers are and how to add them.

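At first I understood a request header simply as the header part of a request sent to an interface and never thought about what it was actually for, so I looked it up on Baidu: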

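HTTP request headers: when an HTTP client (such as a browser) sends a request to a server, it must specify the request type (usually GET or POST). If necessary, the client can also choose to send other request headers.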
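As a rough illustration of the definition above, the sketch below builds one GET and one POST request with extra headers attached, using the standard-library urllib so nothing is sent over the network. The header values, the `/s` search path, the `example.com` URL, and the helper names `build_get` and `build_post` are all assumptions chosen for the example.

```python
import urllib.parse
import urllib.request

# Example header values; the User-Agent string is a placeholder, not a required value.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; tutorial-sketch)",
    "Accept-Language": "zh-CN,zh;q=0.9",
}


def build_get(url, params):
    """Build a GET request: parameters go into the query string."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(f"{url}?{query}", headers=HEADERS, method="GET")


def build_post(url, form):
    """Build a POST request: parameters go into the request body."""
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=HEADERS, method="POST")


get_req = build_get("https://www.baidu.com/s", {"wd": "python"})   # assumed search path
post_req = build_post("https://example.com/form", {"wd": "python"})  # placeholder URL

print(get_req.get_method(), get_req.full_url)
print(get_req.header_items())
print(post_req.get_method(), post_req.data)
```

Passing either object to `urllib.request.urlopen` would actually send it. With requests, the same dictionary can be passed as `headers=HEADERS` to `requests.get` or `requests.post`.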

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/3/1 17:19
# @Author  : StalloneYang
# @File    : day01.py
# @desc:

import requests

# Request URL
url = "https://www.baidu.com"

# Request parameters
payload = "ie=utf-8&mod=1&isbd=1&isid=8fd52c1300006e78&ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=python%E5%A2%9E%E5%8A%A0%E8%AF%B7%E6%B1%82%E5%A4%B4&oq=python%25E5%25A2%259E%25E5%258A%25A0%25E8%25AF%25B7%25E6%25B1%2582%25E5%25A4%25B4&rsv_pq=8fd52c1300006e78&rsv_t=5edbtJG6xckLrDX02yHBZbiWfInPpROwkhHY1oZThB8GQ0vbRqZNuGpqZXw&rqlang&