Crawling and downloading pictures of beautiful women with a Python process pool (XPath) --lowbiprogrammer

Latest recommended article published on 2020-12-10 21:12:58

Xcsg

Category: python  Tags: python xpath

Article link: https://blog.csdn.net/qq_38450402/article/details/85077855

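# -*- coding: utf-8 -*-

import multiprocessing
import os

import requests
from lxml import etree
from retrying import retry

# Build the pool of list-page URLs: categories c16 to c25, pages p0 to p24
urllist = ["http://www.zhuangxiule.cn/c{}p{}/".format(i, x) for i in range(16, 26) for x in range(0, 25)]


@retry(stop_max_attempt_number=3)  # retry the whole page up to 3 times if anything raises
def get_data(url):
    response = requests.get(url, timeout=3)
    data = response.content
    html = etree.HTML(data)

    # Use XPath to match each entry's title and detail-page URL on the list page
    mes = html.xpath("//div[@class='main']/dl[@class='list-left public-box']/*")
    for i in mes:
        title = i.xpath("./a/span/text()")
        if title:
            hrefs = i.xpath("./a/@href")
            poto_url = hrefs[0] if len(hrefs) > 0 else None
            if poto_url is None:
                # No link to follow; requests.get(None) would raise
                continue
            print(title)
            poto = requests.get(poto_url, timeout=3)

            # Collect the image URLs from the detail page
            # (renamed so they no longer shadow the list-page html/mes)
            detail_html = etree.HTML(poto.content)
            img_urls = detail_html.xpath("//img/@src")

            # Create the download directory and write the images into it
            path = "f:/img/"
            # exist_ok avoids a race when several worker processes create it at once
            os.makedirs(path, exist_ok=True)
            for photo in img_urls:
                potomes = requests.get(photo, timeout=3)
                filename = photo.split("/")[-1]
                with open(path + filename, "wb") as f:
                    f.write(potomes.content)


if __name__ == '__main__':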
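
The listing stops at the `__main__` guard, so the part that actually starts the process pool is missing. Given the title and the `multiprocessing` import, one plausible way to finish it is to map the worker over the URL list with `multiprocessing.Pool`. What follows is only a sketch under that assumption, not the author's code: `build_urls` and `run_pool` are hypothetical helper names, and the pool size of 4 is arbitrary.

```python
import multiprocessing


def build_urls(first_cat=16, last_cat=25, pages=25):
    """Rebuild the list-page URLs used by the script (hypothetical helper)."""
    template = "http://www.zhuangxiule.cn/c{}p{}/"
    return [template.format(c, p)
            for c in range(first_cat, last_cat + 1)
            for p in range(pages)]


def run_pool(worker, urls, processes=4):
    """Apply worker to every URL using a pool of worker processes.

    worker must be a module-level function so it can be pickled;
    processes=4 is an assumed pool size, not a value from the post.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        # map() blocks until every URL has been handled and
        # re-raises an exception raised inside a worker.
        return pool.map(worker, urls)
```

In the script this would sit under the guard as `run_pool(get_data, urllist)`. The guard matters on platforms that start workers by spawning a fresh interpreter (Windows, which the `f:/img/` path suggests), because each worker re-imports the module and must not start a pool of its own.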