Python Web Scraping: Fetching Second-Hand Housing Listings in One Step

This article is only a study note. If it infringes on any rights, it will be removed immediately.

Case 1:

Scrape the name, address, postal code, and phone number of each Red Bull branch office:

import requests
import re  # only needed for the commented-out regex alternative below
import bs4
import pandas as pd

url = "http://redbull.com.cn/about/branch"
response = requests.get(url)
# Name the parser explicitly so bs4 does not have to guess one
soup = bs4.BeautifulSoup(response.text, 'html.parser')

# Regex alternative to the BeautifulSoup lookups below:
# company = re.findall('<h2>(.*?)</h2>', response.text)
# add = re.findall('<p class=\'mapIco\'>(.*?)</p>', response.text)

# Branch names sit in <h2> tags; the details sit in <p> tags with icon classes
company = [i.text for i in soup.find_all(name='h2')]
add = [i.text for i in soup.find_all(name='p', attrs={'class': 'mapIco'})]    # address
mail = [i.text for i in soup.find_all(name='p', attrs={'class': 'mailIco'})]  # postal code
tel = [i.text for i in soup.find_all(name='p', attrs={'class': 'telIco'})]    # phone

df = pd.DataFrame({'company': company, 'add': add, 'mail': mail, 'tel': tel})
print(df)
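The commented-out `re.findall` lines in the listing above point to a regex-based alternative to BeautifulSoup. The following is a minimal sketch of that idea, run on a small HTML fragment rather than the live page. The fragment, its values, and the helper name `extract_by_class` are invented for illustration, and the real page markup may differ. The pattern accepts either single or double quotes around the class attribute, since the quoting used by the actual page is not known here.

```python
import re

# Hypothetical HTML fragment, invented for illustration only.
SAMPLE_HTML = """
<li><h2>Example Branch A</h2>
<p class="mapIco">1 Example Road</p>
<p class="mailIco">100000</p>
<p class="telIco">010-00000000</p></li>
<li><h2>Example Branch B</h2>
<p class='mapIco'>2 Sample Street</p>
<p class='mailIco'>200000</p>
<p class='telIco'>021-00000000</p></li>
"""


def extract_by_class(html, css_class):
    """Return the inner text of every <p> whose class attribute equals css_class."""
    # Allow single or double quotes around the class value.
    pattern = r"<p\s+class=['\"]" + re.escape(css_class) + r"['\"]\s*>(.*?)</p>"
    return [m.strip() for m in re.findall(pattern, html, flags=re.S)]


company = [m.strip() for m in re.findall(r"<h2>(.*?)</h2>", SAMPLE_HTML, flags=re.S)]
add = extract_by_class(SAMPLE_HTML, "mapIco")
mail = extract_by_class(SAMPLE_HTML, "mailIco")
tel = extract_by_class(SAMPLE_HTML, "telIco")

# One tuple per branch: (company, address, postal code, phone)
rows = list(zip(company, add, mail, tel))
for row in rows:
    print(row)
```

Regexes like these break easily when tags gain extra attributes or nest other tags, which is one reason the BeautifulSoup version is usually the safer choice for real pages.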

Case 2:

Lianjia second-hand housing listings: scrape them in one step and save the result as a CSV file

import requests
import bs4
import pandas as pd

huxing = []  # layout, e.g. '4室2厅'
area = []    # floor area, e.g. '124.66平米'
style = []   # decoration level, e.g. '精装'
name = []    # residential community name
price = []   # total price

for page in range(1, 3):  # pages 1 and 2
    url = 'https://su.lianjia.com/ershoufang/pg' + str(page) + '/'
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    # Each houseInfo <div> holds one '|'-separated string, for example:
    #     '4室2厅 | 124.66平米 | 南 北 | 精装 | 低楼层(共15层)  | 板楼'
    #     '2室2厅 | 99.28平米 | 南 | 精装 | 中楼层(共25层)  | 板楼'
    # so the individual fields have to be split out by position.
    for div in soup.find_all(name='div', attrs={'class': 'houseInfo'}):
        fields = [f.strip() for f in div.text.split('|')]
        huxing.append(fields[0])
        area.append(fields[1])
        style.append(fields[3])

    name.extend([a.text for a in soup.find_all(name='a', attrs={'data-el': 'region'})])
    price.extend([d.text for d in soup.find_all(name='div', attrs={'class': 'totalPrice'})])

df = pd.DataFrame({'name': name, 'huxing': huxing, 'area': area, 'style': style, 'price': price})
# utf_8_sig prepends a BOM so that spreadsheet tools such as Excel detect the encoding
df.to_csv('ershoufang_3.csv', encoding="utf_8_sig")
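The positional split of the `houseInfo` string is the fragile part of the script above: a listing with fewer `|`-separated fields would raise an `IndexError`. Below is a minimal sketch of a more defensive split, using the two sample strings quoted in the code comments. The helper names `parse_house_info` and `area_to_float` are hypothetical, and the assumption that the area always ends in `平米` comes only from those two samples.

```python
def parse_house_info(text):
    """Split a houseInfo string on '|' and pick layout, area, and decoration.

    A field whose position is missing or empty is returned as None
    instead of raising IndexError.
    """
    parts = [p.strip() for p in text.split('|')]

    def pick(idx):
        return parts[idx] if idx < len(parts) and parts[idx] else None

    return {'huxing': pick(0), 'area': pick(1), 'style': pick(3)}


def area_to_float(area_text):
    """Convert a string such as '124.66平米' to a float number of square metres."""
    return float(area_text.replace('平米', '').strip())


# The two sample strings quoted in the comments of the script above.
samples = [
    '4室2厅 | 124.66平米 | 南 北 | 精装 | 低楼层(共15层)  | 板楼',
    '2室2厅 | 99.28平米 | 南 | 精装 | 中楼层(共25层)  | 板楼',
]

records = [parse_house_info(s) for s in samples]
for record in records:
    print(record, area_to_float(record['area']))
```

Returning `None` for a missing field keeps the five result lists the same length, which `pd.DataFrame` needs when it is built from a dictionary of lists.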