A Simple Python Crawler Program

Latest recommended article published 2024-08-14 10:14:38

远去的列车1993

Category: python    Tags: python, crawler

Article link: https://blog.csdn.net/u011761393/article/details/50467843

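# -*- coding: utf-8 -*-
import os
import re
import urllib.request


def getHtml(url):
    # Download the page and decode it to text so the regex can run on a str.
    page = urllib.request.urlopen(url)
    html = page.read().decode('utf-8', errors='ignore')
    return html


def getImg(html, save_dir):
    # Capture the address that follows objURL":" up to the first .jpg",
    reg = r'objURL":"(.+?\.jpg)",'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)
    x = 0
    for imgurl in imglist:
        # Save the images as 0.jpg, 1.jpg, ... inside save_dir.
        urllib.request.urlretrieve(imgurl, os.path.join(save_dir, '%s.jpg' % x))
        x += 1
    # Return the number of images saved so the caller has something to print.
    return x


url = "http://image.baidu.com/search/index?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq=1452063972940_R&pv=&ic=0&nc=1&z=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&ie=utf-8&word="
search = "leihou"

# Create E:\python\<search> (and any missing parent folder) before downloading.
save_dir = 'E:\\python\\%s' % search
os.makedirs(save_dir, exist_ok=True)

real_url = url + search
html = getHtml(real_url)

print(getImg(html, save_dir))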
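
The extraction step in getImg can be tried on its own, without touching the network or the E: drive. The following is a minimal sketch under assumptions: the helper name extract_img_urls and the sample string are hypothetical and made up for illustration, and a real search result page may be laid out differently. Only the regular expression is taken from the program above.

```python
import re

# Same pattern as in getImg: capture the address after objURL":" up to .jpg",
IMG_RE = re.compile(r'objURL":"(.+?\.jpg)",')


def extract_img_urls(html):
    """Return every .jpg address that follows an objURL key in html."""
    return IMG_RE.findall(html)


# Hypothetical page fragment (an assumption, not copied from a real response).
sample = (
    '{"thumbURL":"http://example.com/t/1.jpg",'
    '"objURL":"http://example.com/a/1.jpg","fromURL":"x"},'
    '{"thumbURL":"http://example.com/t/2.jpg",'
    '"objURL":"http://example.com/b/2.jpg","fromURL":"y"}'
)

for address in extract_img_urls(sample):
    print(address)
```

Because the group is the lazy `.+?`, each match stops at the first `.jpg",` it meets. If an objURL entry pointed at a non-jpg file, the match would run on into the next entry, so it may be worth checking the captured addresses before downloading them.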