[Help] The Zhaopin links I scrape with Python are incomplete
I am extracting the @href attribute.
Part of the Python code:
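import requests
from lxml import etree

# url, headers_1 and url_lists are assumed to be defined earlier in the script (not shown)
html = requests.get(url, headers=headers_1)
selector = etree.HTML(html.text)
infos = selector.xpath('//div[@class="joblist-box__item clearfix"]')
for info in infos:
    # xpath() returns a list, so each append adds a one-element list
    hrefs = info.xpath('a/@href')
    url_lists.append(hrefs)
    print(len(url_lists))
    job_name = info.xpath('./a/div[1]/div[1]/span[1]/span/text()')
    print(job_name)
    print('*' * 50)
print(url_lists)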
Output:
[['http://jobs.zhaopin.com/CC318353680J40161893401.htm?refcode=4019&srccode=&preactionid='], ['http://jobs.zhaopin.com/CC711995980J40155443811.htm?refcode=4019&srccode=&prea
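One way to check whether the links are really cut off in the data, rather than only in the printed output, is to flatten the nested list and print each URL on its own line together with its length. The sketch below uses only the standard library and assumes the same list-of-lists shape shown in the output above. `BASE_URL`, `flatten_links` and the `/example-job.htm` entry are made up for illustration and do not come from the original script.

```python
from urllib.parse import urljoin

# Assumption: base used to resolve relative hrefs, if the page contains any.
BASE_URL = "http://jobs.zhaopin.com/"

# Same nested shape as the printed output: one inner list per job item.
nested = [
    ["http://jobs.zhaopin.com/CC318353680J40161893401.htm?refcode=4019&srccode=&preactionid="],
    ["/example-job.htm"],  # hypothetical relative link, for illustration only
    [],                    # an item whose <a> element had no href
]


def flatten_links(nested_links, base_url=BASE_URL):
    """Flatten a list of href lists into one list of absolute URLs."""
    flat = []
    for hrefs in nested_links:
        for href in hrefs:
            # urljoin leaves absolute URLs unchanged and resolves relative ones
            flat.append(urljoin(base_url, href.strip()))
    return flat


links = flatten_links(nested)
for link in links:
    # One URL per line with its length makes real truncation easy to spot
    print(len(link), link)
```

If every URL printed this way ends where it should, the data is intact and only the display of the long nested list was cut short. In the scraping loop itself, calling `url_lists.extend(...)` instead of `url_lists.append(...)` would produce the flat list directly.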