Python crawler for Taobao price comparison

Latest recommended article published 2023-12-17 19:00:00

VerdureChen

Category: Python learning

Article link: https://blog.csdn.net/MarrieChen/article/details/77885308

First, the source code:

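import re
import requests


def getHTMLText(url):
    """Fetch a page and return its text, or "" if the request fails."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        # Guess the encoding from the body instead of trusting the headers.
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def parsePage(ilt, html):
    """Append [price, title] pairs found in the page source to ilt."""
    # The capture groups return the quoted values directly, so no eval() is
    # needed and a colon inside a title cannot break the parsing.
    plt = re.findall(r'"view_price":"([\d.]*)"', html)
    tlt = re.findall(r'"raw_title":"(.*?)"', html)
    # zip() stops at the shorter list, so a length mismatch cannot raise.
    for price, title in zip(plt, tlt):
        ilt.append([price, title])


def printGoodsList(ilt):
    """Print the collected goods as a numbered, tab-separated table."""
    tplt = "{:4}\t{:8}\t{:50}"
    print(tplt.format("No.", "Price", "Title"))
    for count, g in enumerate(ilt, start=1):
        print(tplt.format(count, g[0], g[1]))


def main():
    goods = '枕头'  # search keyword ("pillow")
    depth = 2  # number of result pages to crawl
    start_url = "https://s.taobao.com/search?q=" + goods
    infoList = []
    for i in range(depth):
        # The s parameter advances by 44 for each further result page.
        url = start_url + '&s=' + str(44 * i)
        html = getHTMLText(url)
        parsePage(infoList, html)
    printGoodsList(infoList)


if __name__ == "__main__":
    main()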
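To see what the two regular expressions in parsePage pick up, you can try them on a small hand-written string without touching the network. The snippet below is only a sketch: sample_html is a made-up fragment rather than real Taobao output, extract_goods is a hypothetical helper name, and the values are read with capture groups.

```python
import re

# Hypothetical fragment of page source; a real page is far larger.
sample_html = (
    '{"raw_title":"Memory foam pillow","view_price":"59.00"},'
    '{"raw_title":"Kids pillow: small","view_price":"128.50"}'
)


def extract_goods(html):
    """Return [price, title] pairs found in the given page source."""
    prices = re.findall(r'"view_price":"([\d.]*)"', html)
    titles = re.findall(r'"raw_title":"(.*?)"', html)
    # Pair the n-th price with the n-th title.
    return [[p, t] for p, t in zip(prices, titles)]


goods = extract_goods(sample_html)
for price, title in goods:
    print(price, title)
```

The second title contains a colon on purpose: because the value comes from the capture group, the colon stays part of the title.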

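I also ran into a small problem today while installing the requests library: pip had worked before, but today it would not run. After some searching and several attempts, I found the cause. Because I had installed both Python 2.7 and Python 3.6, I had renamed Python's EXE file. Once I changed the EXE file name back, pip ran again.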
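One way to avoid depending on the interpreter's file name is to run pip as a module of the interpreter itself, instead of going through the separate pip launcher. The sketch below assumes that the interpreter running the script is the one that should receive the package. pip_command and run_pip are hypothetical helper names, not part of pip.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command that runs pip as a module of the current interpreter."""
    # sys.executable is the full path of the running interpreter, whatever
    # its file name is, so nothing here relies on a file called python.exe.
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip with the given arguments and return its exit code."""
    return subprocess.run(pip_command(*args)).returncode
```

For example, run_pip("install", "requests") would install requests for the interpreter that runs the script. The command-line equivalent is to call the interpreter with -m pip install requests.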