Scraping Dianping with PHP: A Dianping Page-Scraping Example

Latest recommended article published on 2024-06-12 13:55:46

晨雨竹


Article tags: scraping Dianping with PHP

# coding: utf-8

import re

from bs4 import BeautifulSoup as bs

# Load a locally saved copy of a Dianping listing page.
with open('dianping.html', 'rb') as f:
    html = f.read().decode('utf-8')

dianping = bs(html, 'lxml')

# The shop list is the first div whose class attribute is exactly this string.
allshops = dianping.find_all('div', attrs={'class': 'shop-list J_shop-list shop-all-list'})[0]
shops = allshops.find_all('li')

for eachshop in shops:
    name = eachshop.h4.string
    shopurl = eachshop.a["href"]

    # Collect the <span> and <b> tags once per shop instead of on every lookup.
    spans = eachshop.find_all('span')
    bolds = eachshop.find_all('b')

    # Star rating: read from the title attribute of the first <span>.
    try:
        star = re.findall('title="(.*)">', str(spans[0]))[0]
    except IndexError:
        star = ''

    # Category, area and address are read from the <span> tags.
    # The pattern '(.*?)' is kept as it appears in the source; without the
    # surrounding tag text it matches only empty strings.
    try:
        cls = re.findall('(.*?)', str(spans))[0]
    except IndexError:
        cls = ''
    try:
        area = re.findall('(.*?)', str(spans))[1]
    except IndexError:
        area = ''
    try:
        addr = re.findall('(.*?)', str(spans))[0]
    except IndexError:
        addr = ''

    # The <b> tags hold, in order: comment count, mean price, taste,
    # environment and service. Each str(tag) has the form '<b>...</b>',
    # so the text between the tags is captured; a missing tag yields ''.
    fields = []
    for i in range(5):
        try:
            fields.append(re.findall('<b>(.*?)</b>', str(bolds[i]))[0])
        except IndexError:
            fields.append('')
    comments, mean, taste, envior, service = fields

    print(name, shopurl, star, cls, area, addr, mean, taste, envior, service, comments)
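
The repeated try/except blocks above all follow one pattern: run a regex over a stringified tag, take one match, and fall back to an empty string. A minimal sketch of folding that pattern into a single helper follows. The `extract` function and the sample strings are hypothetical names introduced here for illustration; they are not part of the original script, and the samples are not real Dianping markup.

```python
import re


def extract(pattern, text, index=0, default=''):
    """Return the index-th regex match in text, or default if it is missing.

    index is assumed to be non-negative.
    """
    matches = re.findall(pattern, text)
    return matches[index] if index < len(matches) else default


# Hypothetical fragments standing in for the output of str(tag).
sample_b = '<b>123</b>'
sample_span = '<span title="Four stars"></span>'

print(extract(r'<b>(.*?)</b>', sample_b))           # text between the <b> tags
print(extract(r'title="(.*)">', sample_span))       # value of the title attribute
print(repr(extract(r'<b>(.*?)</b>', sample_span)))  # no match, so the default ''
```

With a helper like this, each field becomes a single call, and checking the length of the match list replaces the need to catch an exception for a missing match.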
