**[Study Notes] A Beginner's Notes on Learning Python Web Scraping**
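Notes on scraping images from a website with Requests and BeautifulSoup
This is the first website scraper I have written. It is not very concise, and parts of it do not follow conventions, so I hope to keep pushing myself to write better code. It also serves as a reference for my own future study.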
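import os
import warnings

import requests
from bs4 import BeautifulSoup  # the 'lxml' parser used below needs the lxml package installed

warnings.filterwarnings('ignore')  # silence library warnings for this quick script
os.makedirs('./abcd', exist_ok=True)  # output directory for the downloaded images

url = 'https://www.mzitu.com/'


def get_img(page):
    """Download every image that has a data-original attribute on one listing page."""
    head = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36',
        'Referer': url,  # sent with the page request and with every image request
    }
    response = requests.get(f'{url}page/{page}', headers=head, timeout=10)
    response.encoding = 'UTF-8'
    # response.text is already decoded, so no encoding hint is passed to BeautifulSoup
    soup = BeautifulSoup(response.text, 'lxml')
    # print(soup.select('a'))  # debug: dump every link on the page
    for imgu in soup.find_all('img'):
        gg = imgu.get('data-original')  # image URL stored in the data-original attribute
        if gg is None:
            continue  # this <img> has no data-original attribute
        r = requests.get(gg, headers=head, timeout=10)
        imgname = gg.split('/')[-1]  # last path segment becomes the file name
        print(imgname)
        with open(f'./abcd/{imgname}', 'wb') as fd:
            for rr in r.iter_content(256):  # write the body in 256-byte chunks
                fd.write(rr)


for page in range(129, 150):  # listing pages 129 through 149
    get_img(page)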
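
The two parsing steps in the script, picking out the `data-original` attribute of each `img` tag and turning that URL into a file name, can be tried offline without hitting the site. Below is a minimal sketch of that idea. To keep it runnable with no third-party packages, it swaps BeautifulSoup for the standard-library `html.parser`. The names `LazyImageParser`, `extract_lazy_image_urls`, and `filename_from_url`, as well as the sample HTML and the `img.example.com` URL, are made up for illustration and do not come from the script above.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


class LazyImageParser(HTMLParser):
    """Collect the data-original attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("data-original")
        if src:  # skip images that have no data-original attribute
            self.urls.append(src)


def extract_lazy_image_urls(html):
    """Return the data-original URLs found in an HTML string, in document order."""
    parser = LazyImageParser()
    parser.feed(html)
    return parser.urls


def filename_from_url(url):
    """Build a file name from the URL path only, ignoring any query string or fragment."""
    path = urlsplit(url).path
    return PurePosixPath(unquote(path)).name


# Made-up sample markup: one image with data-original, one without.
sample = (
    '<ul>'
    '<li><img class="lazy" src="placeholder.png" '
    'data-original="https://img.example.com/2019/05/a01.jpg?v=2"></li>'
    '<li><img src="logo.png"></li>'
    '</ul>'
)

urls = extract_lazy_image_urls(sample)
names = [filename_from_url(u) for u in urls]
print(urls)
print(names)
```

Compared with splitting the whole URL on `/`, going through `urlsplit` first keeps a trailing `?v=2` out of the saved file name. Whether the real site ever appends query strings to its image URLs is not something these notes establish, so treat this as a defensive habit rather than a required fix.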