Python crawler: downloading images from a web page and fixing the runtime error

Latest recommended article published on 2022-09-18 19:16:36

CodeMan杰瑞

Category: Python. Tags: python, ubuntu, crawler

Article link: https://blog.csdn.net/qq_18144747/article/details/78511587


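#!/usr/bin/python
# Original version: urllib.urlopen and urllib.urlretrieve exist only in Python 2.
import re
import urllib

def getHtml(url):
    # Fetch the page and return its raw content.
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImage(html):
    # Match .jpg URLs inside src="..." that are followed by a title attribute.
    reg = r'src="(.*?\.jpg)" title'
    image = re.compile(reg)
    imglist = re.findall(image, html)
    x = 0
    for imgurl in imglist:
        # Save the images as 0.jpg, 1.jpg, ... in the current directory.
        urllib.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    return x  # number of images saved, so the print below shows something useful

html = getHtml("http://desk.zol.com.cn/tiyu/1920x1080/")
print(getImage(html))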

 
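Running this under Python 3 fails with:

“AttributeError: 'module' object has no attribute 'urlopen'”

The cause is that the urllib module was reorganized in Python 3: urlopen and urlretrieve now live in urllib.request, so every urllib call here should become urllib.request. In addition, page.read() returns bytes in Python 3, so the page must be decoded to str (the corrected script decodes it as GBK) before a str regular expression can be applied to it.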
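If the same script needs to run under both Python 2 and Python 3, one common pattern is to try the Python 3 import first and fall back to the Python 2 one. The following is only a minimal sketch of that idea, not part of the original script; the rest of the code would then call urlopen and urlretrieve directly instead of going through the module name.

```python
# Try the Python 3 location first; fall back to the Python 2 module.
try:
    from urllib.request import urlopen, urlretrieve  # Python 3
except ImportError:
    from urllib import urlopen, urlretrieve  # Python 2

# From here on the code can call urlopen(url) and urlretrieve(url, filename)
# without caring which interpreter is running it.
```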

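#!/usr/bin/python
# Python 3 version: the urllib functions now live in urllib.request.
import re
import urllib.request

def getHtml(url):
    # Fetch the page; in Python 3, read() returns bytes.
    page = urllib.request.urlopen(url)
    html = page.read()
    return html

def getImage(html):
    # Match .jpg URLs inside src="..." that are followed by a title attribute.
    reg = r'src="(.*?\.jpg)" title'
    image = re.compile(reg)
    # Decode the bytes to str so the str pattern can be applied.
    html = html.decode('GBK')
    imglist = re.findall(image, html)
    x = 0
    for imgurl in imglist:
        # Save the images as 0.jpg, 1.jpg, ... in the current directory.
        urllib.request.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    return x  # number of images saved, so the print below shows something useful

html = getHtml("http://desk.zol.com.cn/tiyu/1920x1080/")
print(getImage(html))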
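The html.decode('GBK') line matters because a str pattern cannot be applied to bytes in Python 3. The sketch below shows this without touching the network. The helper name extract_jpg_urls, the sample HTML, the example.com URLs, and the errors='replace' argument are all assumptions made for illustration and do not come from the original script.

```python
import re

# Same pattern as the script above: a .jpg URL in src="..." followed by a title attribute.
IMG_PATTERN = re.compile(r'src="(.*?\.jpg)" title')

def extract_jpg_urls(raw, encoding='GBK'):
    """Decode raw page bytes and return every matching .jpg URL (hypothetical helper)."""
    # errors='replace' is an assumption: it avoids a crash if a few bytes are not valid GBK.
    text = raw.decode(encoding, errors='replace')
    return IMG_PATTERN.findall(text)

# Made-up sample page encoded as GBK, standing in for what page.read() returns.
sample = ('<img src="http://example.com/a.jpg" title="壁纸">'
          '<img src="http://example.com/b.jpg" title="x">').encode('GBK')

# Applying the str pattern directly to bytes raises TypeError in Python 3.
try:
    IMG_PATTERN.findall(sample)
    bytes_ok = True
except TypeError:
    bytes_ok = False

urls = extract_jpg_urls(sample)
```

Keeping the extraction separate from the download step also makes the regular expression easy to check against a saved copy of the page before any files are fetched.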