Disguising the request as a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'
}
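To check that the custom header is really attached, the request can be prepared and inspected without sending anything. This is only a sketch: the URL `http://example.com/` and the helper name `build_request` are placeholders that do not come from the original notes.

```python
import requests

# Browser-like User-Agent, same string as in the headers dict of this section.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'
}


def build_request(url):
    """Prepare a GET request carrying the headers above, without sending it."""
    return requests.Request('GET', url, headers=headers).prepare()


# Placeholder URL (assumption); nothing goes over the network here.
prepared = build_request('http://example.com/')
print(prepared.headers['User-Agent'])

# To actually send the request:
# response = requests.get('http://example.com/', headers=headers, timeout=10)
```

Without the `headers` argument, requests identifies itself with a `python-requests/...` User-Agent, which some sites reject.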
Setting cookies and sessions
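import requests

def login(session):
    # Login endpoint
    url = 'http://www.renren.com/PLogin.do'
    headers = {
        'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0;'
    }
    # Account name and password to submit
    data = {
        'email': '111111',
        'password': '11118'
    }
    # POST the credentials; the session keeps the cookies the server sets.
    # data and headers must be passed by keyword: the third positional
    # parameter of post() is json, not headers.
    response = session.post(url, data=data, headers=headers)
    # Request a page that requires login, reusing the same session
    url2 = 'http://www.renren.com/971983932'
    content = session.get(url2, headers=headers).content.decode('utf-8')
    # The 人人网 directory must already exist
    with open('人人网/renren.htm', 'w', encoding='utf-8') as fp:
        fp.write(content)

if __name__ == '__main__':
    session = requests.Session()
    login(session)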
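The reason a session is used for the login flow is that it stores cookies and sends them again on later requests. The sketch below shows this offline: instead of a real login, a cookie is placed in the session by hand (the name `session_id` and value `abc123` are made up for illustration), and the next request is prepared but not sent.

```python
import requests

session = requests.Session()

# Stand-in for a cookie that a successful login response would set.
# The cookie name and value are hypothetical.
session.cookies.set('session_id', 'abc123', domain='www.renren.com', path='/')

# prepare_request() merges the session's cookies into the outgoing request.
request = requests.Request('GET', 'http://www.renren.com/971983932')
prepared = session.prepare_request(request)

# The Cookie header now carries the stored cookie.
print(prepared.headers.get('Cookie'))
```

A plain `requests.get()` call starts with an empty cookie jar each time, so the second request in the login flow would not be recognised as logged in.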
SSL certificate verification is enabled by default (verify=True); pass verify=False to skip verification.
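A minimal sketch of both ways to control verification, per session and per request. The URL in the commented-out call is a placeholder, and no request is actually sent here.

```python
import requests

session = requests.Session()
default_verify = session.verify  # certificate verification is on by default
print(default_verify)

# Turn verification off for every request made through this session.
# Only sensible for hosts you trust, e.g. a test server with a self-signed certificate.
session.verify = False

# Per-request equivalent (placeholder URL, not executed):
# response = requests.get('https://example.com/', verify=False, timeout=10)
```

With verification disabled, urllib3 emits an `InsecureRequestWarning` on each HTTPS request as a reminder that the server's identity is not being checked.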
Setting a proxy
proxies = {
    'http': 'http://123.207.11.119:1080',
    # 'https': 'https://112.85.165.80:9999'
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like