Scraping Dianping: a Dianping page-scraping example in Python

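# coding: utf-8
import re

from bs4 import BeautifulSoup as bs

# Parse a locally saved copy of the Dianping listing page.
with open('dianping.html', 'rb') as f:
    html = f.read().decode()

dianping = bs(html, 'lxml')

# The container that holds every shop entry on the page.
allshops = dianping.find_all('div', attrs={'class': 'shop-list J_shop-list shop-all-list'})[0]
shops = allshops.find_all('li')

for eachshop in shops:
    name = eachshop.h4.string
    shopurl = eachshop.a["href"]
    spans = str(eachshop.find_all('span'))
    bolds = eachshop.find_all('b')

    # Star rating is stored in the title attribute of the first <span>.
    try:
        star = re.findall('title="(.*)">', str(eachshop.find_all('span')[0]))[0]
    except IndexError:
        star = ''

    # NOTE: the HTML tags inside the patterns below were stripped from the
    # source text; '<span class="tag">', '<span class="addr">' and '<b>' are
    # assumed reconstructions and may need adjusting to the actual markup.
    try:
        cls = re.findall('<span class="tag">(.*?)</span>', spans)[0]    # category
    except IndexError:
        cls = ''
    try:
        area = re.findall('<span class="tag">(.*?)</span>', spans)[1]   # district
    except IndexError:
        area = ''
    try:
        addr = re.findall('<span class="addr">(.*?)</span>', spans)[0]  # street address
    except IndexError:
        addr = ''

    # The <b> tags hold, in order: review count, average price,
    # taste score, environment score, service score.
    try:
        comments = re.findall('<b>(.*?)</b>', str(bolds[0]))[0]
    except IndexError:
        comments = ''
    try:
        mean = re.findall('<b>(.*?)</b>', str(bolds[1]))[0]
    except IndexError:
        mean = ''
    try:
        taste = re.findall('<b>(.*?)</b>', str(bolds[2]))[0]
    except IndexError:
        taste = ''
    try:
        envior = re.findall('<b>(.*?)</b>', str(bolds[3]))[0]
    except IndexError:
        envior = ''
    try:
        service = re.findall('<b>(.*?)</b>', str(bolds[4]))[0]
    except IndexError:
        service = ''

    print(name, shopurl, star, cls, area, addr, mean, taste, envior, service, comments)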
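
The repeated try/except blocks above all follow one shape: run a regex, take the n-th match, and fall back to an empty string if it is missing. A minimal sketch of how that could be folded into one helper is shown below. The function name `nth_match` and the sample HTML fragment are hypothetical, made up for illustration; they are not part of the original script, and the real Dianping markup may differ.

```python
import re


def nth_match(pattern, text, index=0, default=''):
    """Return the index-th regex capture found in text, or default if there are too few matches."""
    matches = re.findall(pattern, text, flags=re.S)
    try:
        return matches[index]
    except IndexError:
        return default


# Hypothetical fragment for illustration only; not real Dianping markup.
sample = (
    '<span class="tag">Sichuan</span>'
    '<span class="tag">Downtown</span>'
    '<b>1234</b><b>85</b>'
)

cls = nth_match(r'<span class="tag">(.*?)</span>', sample, 0)
area = nth_match(r'<span class="tag">(.*?)</span>', sample, 1)
comments = nth_match(r'<b>(.*?)</b>', sample, 0)
missing = nth_match(r'<b>(.*?)</b>', sample, 5)  # fewer than six <b> tags, so the default is returned

print(cls, area, comments, repr(missing))
```

With a helper like this, each field in the loop would need one line instead of four, and only a missing match is turned into an empty string rather than every possible exception.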
