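When a browser accesses a web server, the server receives a request message. A GET request looks roughly like this.
Take any request message as an example; the part highlighted in blue is what we want to extract: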
GET /index.html HTTP/1.1
Host: www.baidu.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3528.4 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
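
To make the structure concrete: the first line of the message (the request line) holds the method, the request path, and the protocol version, separated by single spaces. Below is a minimal sketch that assumes a well-formed request line; the variable names `sample_request_line`, `method`, `path`, and `version` are illustrative and do not come from the original post.

```python
# Illustrative only: these variable names are assumptions, not part of the original post.
sample_request_line = "GET /index.html HTTP/1.1"

# A well-formed request line has exactly three space-separated fields:
# the method, the request path, and the protocol version.
method, path, version = sample_request_line.split(" ")

print(method)   # GET
print(path)     # /index.html
print(version)  # HTTP/1.1
```

If the line does not contain exactly three fields, the unpacking raises a ValueError, so real server code would need to validate the input first.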
Method 1: use the match method of the regular expression module
import re

request = """GET /index.html HTTP/1.1
Host: www.baidu.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3528.4 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
"""

# Split the request message above into a list of lines
request_lines = request.splitlines()

# Method 1: use the match method of the re module
#