UserAgent
- UserAgent: the user agent, UA for short. It is part of the request headers, and the server uses it to identify what kind of client is making the request.
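To see why setting the UA matters, it helps to check what `urllib` sends when no UA is specified. The sketch below is only illustrative: the helper name `default_user_agent` is made up for this example, and it reads the default header from an opener without making any network request.

```python
from urllib import request


def default_user_agent():
    """Return the User-Agent that urllib's default opener would send."""
    opener = request.build_opener()
    # addheaders is a list of (name, value) tuples added to every request
    return dict(opener.addheaders).get("User-agent")


# Prints a value of the form "Python-urllib/<major>.<minor>"
print(default_user_agent())
```

A server that inspects the UA can easily recognize this default value as a script rather than a browser, which is the reason for replacing it with one of the browser strings.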
Some common UA values are listed below. You can copy them directly, or capture the one your own browser sends by inspecting its traffic.
Android
- Mozilla/5.0 (Linux; Android 4.1.1; Nexus 7 Build/JRO03D) AppleWebKit/535.19 (KHTML, )
- Mozilla/5.0 (Linux; U; Android 4.0.4; en-gb; GT-I9300 Build/IMM76D) AppleWebKit/534 (KHTML, )
- Mozilla/5.0 (Linux; U; Android 4.2; en-gb; GT-P1000 Build/FROYO) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
Firefox
- Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0
- Mozilla/5.0 (Android; Mobile; rv:14.0) Gecko/14.0 Firefox/14.0
Google Chrome
- Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36
- Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/537.19 (KHTML, like Gecko) Chrome/18.0.1025.133 Mobile Safari/535.19
iOS
- Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3
- Mozilla/5.0 (iPod; U; CPU like Mac OS X; en) AppleWebKit/420.1 (KHTML, like Gecko) Version/3.0 Mobile/3A101a Safari/419.3
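The UA can be set in two ways:
- headers: pass a headers dict when constructing the Request
- add_header: call add_header on an existing Request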
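The two approaches listed above can be compared without touching the network. This is a minimal sketch, not a definitive implementation: the helper names `build_with_headers` and `build_with_add_header` are hypothetical, and the UA string is simply one of the Firefox values from the list.

```python
from urllib import request

UA = "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0"
URL = "http://www.baidu.com"


def build_with_headers(url, ua):
    """Option 1: pass the UA in the headers dict of the constructor."""
    return request.Request(url, headers={"User-Agent": ua})


def build_with_add_header(url, ua):
    """Option 2: create the Request first, then call add_header."""
    req = request.Request(url)
    req.add_header("User-Agent", ua)
    return req


req_a = build_with_headers(URL, UA)
req_b = build_with_add_header(URL, UA)

# urllib stores header names with str.capitalize(), so the key to look up
# is "User-agent", not "User-Agent".
print(req_a.get_header("User-agent") == req_b.get_header("User-agent"))
```

Both objects end up carrying the same header, so the choice is mostly a matter of style: the dict form is convenient when several headers are set at once, while `add_header` suits adding one header to a request that already exists.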
Example
from urllib import request, error


def main():
    url = "http://www.baidu.com"
    try:
        # Option 1: pass a headers dict to Request to spoof the UA
        # headers = {}
        # headers['User-Agent'] = 'Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3'
        # req = request.Request(url, headers=headers)

        # Option 2: use the add_header method
        req = request.Request(url)
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36')

        # Send the request as usual
        rsp = request.urlopen(req)
        html = rsp.read().decode()
        print(html)
    except error.HTTPError as e:
        # The server answered with an error status code
        print('HTTPError: {0}'.format(e.reason))
    except error.URLError as e:
        # The server could not be reached at all
        print('URLError: {0}'.format(e.reason))
    except Exception as e:
        print(e)

    print("do.......")


if __name__ == '__main__':
    main()