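General steps
1. Open the web page and press F12 to find the URL of the image you want to fetch.
2. Based on that image URL, write a regular expression that matches it.
Example: fetch the images from one area of this page: https://blog.csdn.net/julielele?spm=3001.5343
Press F12 to inspect the image links.
The resulting regular expression: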
format = r'src="(.*?)\.png\?x-oss-process=image/resize,m_fixed,h_64,w_64" alt'
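Before pointing the pattern at a live page, it can be checked against a small HTML fragment. The fragment below is made up for illustration (the example.com URLs are placeholders, not taken from the CSDN page), and the pattern is written with an escaped dot and a non-greedy group so that two images on the same line are captured separately.

```python
import re

# "\." matches a literal dot; "(.*?)" stops at the first ".png" it can
pattern = r'src="(.*?)\.png\?x-oss-process=image/resize,m_fixed,h_64,w_64" alt'

# Hypothetical HTML fragment: two resized images on a single line
sample_html = (
    '<img src="https://example.com/a/1.png?x-oss-process=image/resize,m_fixed,h_64,w_64" alt="">'
    '<img src="https://example.com/b/2.png?x-oss-process=image/resize,m_fixed,h_64,w_64" alt="">'
)

# findall returns only the captured group, i.e. each URL without ".png"
prefixes = re.findall(pattern, sample_html)

# Put the suffix back to get downloadable links
image_urls = [p + ".png" for p in prefixes]
print(image_urls)
```

With a greedy `(.*)` the same fragment would yield a single match running from the first `src="` to the last `.png`, which is why the non-greedy form is used here.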
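Code example
import os
import re
import time
import urllib.request


def getImage(format, url, filePath):
    '''
    Download every image on a page whose link matches a regular expression.
    :param format: regular expression; its capture group is the image URL without ".png"
    :param url: address of the page to scan for images
    :param filePath: folder in which a timestamped image folder is created
    :return: None
    '''
    response = urllib.request.urlopen(url)
    buf = response.read().decode('utf-8')
    # Collect the captured part of every matching image link
    listurl = re.findall(format, buf)
    print(listurl)
    # Restore the ".png" suffix that the capture group leaves out
    res = []
    for link in listurl:
        res.append(link + ".png")
    # Save into a new folder named after the current time: img<year-month-day-hour-minute-second>
    timestr = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
    path = os.path.join(filePath, "img" + timestr)
    os.makedirs(path, exist_ok=True)
    index = 0
    for link in res:
        print(link)
        try:
            data = urllib.request.urlopen(link).read()
            # The with block closes the file even if the write fails
            with open(os.path.join(path, str(index) + '.png'), 'wb') as f:
                f.write(data)
        except Exception:
            # Skip images that cannot be downloaded or saved
            continue
        index = index + 1


url = "https://blog.csdn.net/julielele?spm=3001.5343"
# Alternative pattern: capture the text that starts after url(' and ends before .png
# format = r'url\(\'(.*)\.png'
format = r'src="(.*?)\.png\?x-oss-process=image/resize,m_fixed,h_64,w_64" alt'
filePath = r"d:\img"
getImage(format, url, filePath)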
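Some sites reject requests that carry urllib's default User-Agent, in which case urlopen raises an HTTP error before any image is found. Whether the page used here behaves that way is not covered in this post; the sketch below only shows how a browser-like header could be attached. The helper name build_request and the header string are assumptions chosen for illustration.

```python
import urllib.request

# Hypothetical browser-like identifier; any realistic value could be used
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url):
    """Wrap a URL in a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


req = build_request("https://blog.csdn.net/julielele?spm=3001.5343")
# urllib stores header names in capitalized form, so the lookup key is "User-agent"
print(req.get_header("User-agent"))

# The Request object can be passed to urlopen in place of the plain URL:
# html = urllib.request.urlopen(req).read().decode("utf-8")
```

Inside getImage, both urlopen calls could take such a Request object instead of the bare string.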
Result of running getImage: