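Prerequisites for using lxml:
1. Basic Python is a must;
2. The browser's Inspect Element tool (shortcut F12), to understand how the page is structured;
3. XPath syntax; tutorials can be found on CSDN or with a quick Baidu search.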
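To make the third point concrete, here is a minimal sketch of XPath queries with lxml. The HTML snippet, the class names (`car-list`, `price`), and the values are all made up for illustration and do not come from any real site; on a real page you would read the structure from the F12 panel first.

```python
from lxml import etree

# Hypothetical HTML, standing in for a page you inspected with F12
html = """
<html><body>
  <ul class="car-list">
    <li><a href="/car/1">Car A</a><span class="price">10.5</span></li>
    <li><a href="/car/2">Car B</a><span class="price">8.2</span></li>
  </ul>
</body></html>
"""

# Parse the string into an element tree (etree.HTML tolerates messy markup)
tree = etree.HTML(html)

# text() selects text nodes, @href selects attribute values
titles = tree.xpath('//ul[@class="car-list"]/li/a/text()')
links = tree.xpath('//ul[@class="car-list"]/li/a/@href')
prices = [float(p) for p in tree.xpath('//span[@class="price"]/text()')]

for title, link, price in zip(titles, links, prices):
    print(title, link, price)
```

Each `xpath()` call returns a plain Python list, so the results can be zipped, filtered, or written to a file like any other list.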
The code, tested and working (requires the requests and lxml libraries):
# -*- coding: utf-8 -*-
# Note: this code is for learning and discussion only
import requests
from lxml import etree
import time
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
# This is a personal cookie; replace it with your own when logging in with your own account
'Cookie':'antipas=18148a4j1k55Qy16346M370; uuid=a4c73efd-c5eb-4235-d0da-ccf826720024; ganji_uuid=2249827056567416173723; lg=1; financeCityDomain=sjz; a4c73efd-c5eb-4235-d0da-ccf826720024_views=1; 424719c5-85c6-42a5-ec46-c46b76518b72_views=1; Hm_lvt_e6e64ec34653ff98b12aab73ad895002=1591934543; Hm_lvt_936a6d5df3f3d309bda39e92da3dd52f=1591934529,1591934556; cityDoma