因为很多网站都增加了登录验证,所以需要添加一段利用cookies跳过登陆验证码的操作
import pandas as pd
import requests
from lxml import etree
# 通过Chrome浏览器F12来获取cookies,agent,headers
cookies ={'ssxmod_itna2':'eqfx0DgQGQ0QG=DC8DXxxxxx',
'ssxmod_itna':'euitGKD5iIgGxxxxx'}
agent ='Mozilla/5.0 (Windows NT 10.0; Win64; x64)xxxxxxx'
headers = {
'User-Agent' : agent,
'Host':'www.xxx.com',
'Referer':'https://www.xxx.com/'
}
#建立会话
session = requests.session()
session.headers = headers
cookies获取方式
chrmoe浏览器,F12,把name和value填入cookies
agent获取方式
任意点击一条网络资源,右侧headers往下翻到底