A Simple Python Crawler

Latest recommended article published 2018-05-11 13:43:22

freelamb

Category: Python

Article link: https://blog.csdn.net/yybmec/article/details/40478329

A simple Python crawler that downloads the images found on a web page.

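# coding=utf-8
# Requires Python 3

import re
import urllib.request


def getHtml(url):
    # Fetch the page and decode it to text. UTF-8 is assumed; the image URLs
    # are ASCII, so any undecodable bytes are simply replaced.
    page = urllib.request.urlopen(url)
    html = page.read().decode('utf-8', errors='replace')
    return html


def getImg(html):
    # Capture the src of every .jpg image whose src is followed by pic_ext
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)
    return imglist


# Site URL
url = "http://tieba.baidu.com/p/3368048910?pn=2"
html = getHtml(url)

listimg = getImg(html)
# Save the images as image0.jpg, image1.jpg, ...
for x, imgAddress in enumerate(listimg):
    print(imgAddress)
    urllib.request.urlretrieve(imgAddress, 'image%s.jpg' % x)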
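
To see what the regular expression in getImg actually picks up, it can help to run it on a small string without touching the network. The sketch below is only an illustration: the markup in sample_html and the example.com URLs are made up, and STRICT_PATTERN is a hypothetical variant that is not part of the script above. It replaces `.` with `[^"]` so that a match can never run past the closing quote of the src attribute, which matters when several img tags sit on one line.

```python
import re

# Same pattern as in getImg
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')

# Hypothetical stricter variant (an assumption, not from the script above):
# [^"] stops the capture at the closing quote of the src attribute
STRICT_PATTERN = re.compile(r'src="([^"]+?\.jpg)" pic_ext')

# Made-up markup for illustration: two post images and one image without pic_ext
sample_html = (
    '<img class="BDE_Image" src="http://example.com/a.jpg" pic_ext="jpeg">\n'
    '<img src="http://example.com/logo.jpg">\n'
    '<img class="BDE_Image" src="http://example.com/b.jpg" pic_ext="jpeg">\n'
)

# With one tag per line, both patterns skip the logo and return the two post images
matches = IMG_PATTERN.findall(sample_html)
strict_matches = STRICT_PATTERN.findall(sample_html)
print(matches)
print(strict_matches)

# With all tags on a single line, the original pattern can run across tags,
# while the stricter variant still returns only the two post images
one_line = sample_html.replace('\n', '')
print(IMG_PATTERN.findall(one_line))
print(STRICT_PATTERN.findall(one_line))
```

If the real page puts many tags on one line, swapping the stricter pattern into getImg may give cleaner URLs, but that depends on markup this post does not show.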