Scraping the first page of Anjuke listings with Python's BeautifulSoup library

拾清心

Published 2021-10-21 20:34:35

Category: Web Scraping. Tags: python, web scraping, programming languages

Article link: https://blog.csdn.net/my_daily_life/article/details/120894318

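Task:
The URL is https://beijing.anjuke.com/sale/.
Using the BeautifulSoup library, scrape the information on page 1. Specifically: visit each listing's page, scrape the community name, reference budget, publication time, and core selling points, and print them. (I have only just started learning web scraping, so corrections are welcome.)
The code is as follows: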

import requests
from bs4 import BeautifulSoup
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36 Edg/94.0.992.50'
}

info_lists = []

# Download the first page of listings; the timeout keeps the script from hanging.
house = requests.get("https://beijing.anjuke.com/sale/", headers=headers, timeout=10)
house.raise_for_status()  # stop on an HTTP error instead of parsing an error page
soup = BeautifulSoup(house.text, "lxml")

# Each select() call returns one page-wide list per field.
names = soup.select("h3")
positions = soup.select("p.property-content-info-comm-name")
moneys = soup.select("div.property-price > p.property-price-total > span.property-price-total-num")
years = soup.select("div.property-content > div.property-content-detail > section > div:nth-of-type(1) > p:nth-of-type(5)")
points = soup.select("div.property-content > div.property-content-detail > section > div:nth-of-type(3)")



for name,position,money,year,point in zip(names,positions,moneys,years,points):
    info = {
        'name':name.get_text().strip(),
        'position':position.get_text().strip(),
        'money':money.get_text().strip(),
        'year':year.get_text().strip(),
        'point':point.get_text().strip()
    }
    info_lists.append(info)
    
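# Append every record to the output file. Opening the file once with an explicit
# UTF-8 encoding avoids the UnicodeEncodeError that the platform default encoding
# can raise, and the with-block closes the file even if a write fails.
with open(r'C:\Users\23993\Desktop\house_info.txt', 'a+', encoding='utf-8') as f:
    for info_list in info_lists:
        # '万' is the price unit (10,000 yuan).
        fields = [info_list["name"], info_list["position"], info_list["money"] + '万',
                  info_list["year"], info_list["point"]]
        f.write('  '.join(fields) + '\n')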
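One weakness of the script above is that it builds five separate page-wide lists and pairs them with zip. If any listing lacks a field, or if the page has an h3 that is not a listing title, the lists can shift and rows may mix data from different listings. A possible alternative is to select each listing container first and then look up the fields inside it. The sketch below illustrates the idea on a small inline HTML sample. The sample markup, the helper names text_or_empty and parse_cards, and the assumption that one div.property-content wraps exactly one listing are mine and have not been checked against the live page. It uses the built-in "html.parser" so that the sample runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup that reuses the class names from the
# selectors above; the real page is larger and may be structured differently.
SAMPLE_HTML = """
<div class="property-content">
  <h3>Bright two-bedroom near subway</h3>
  <p class="property-content-info-comm-name">Sample Garden</p>
  <span class="property-price-total-num">520</span>
</div>
<div class="property-content">
  <h3>Renovated three-bedroom</h3>
  <p class="property-content-info-comm-name">Example Court</p>
</div>
"""


def text_or_empty(card, selector):
    """Return the stripped text of the first match inside card, or ''."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else ""


def parse_cards(html):
    """Build one dict per listing container, so fields cannot shift between rows."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.property-content"):
        rows.append({
            "name": text_or_empty(card, "h3"),
            "position": text_or_empty(card, "p.property-content-info-comm-name"),
            "money": text_or_empty(card, "span.property-price-total-num"),
        })
    return rows


rows = parse_cards(SAMPLE_HTML)
for row in rows:
    print(row)
```

In the sample, the second listing has no price element, so its "money" field comes back as an empty string while its name and community stay on the same row.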

Screenshot of partial results:
[image]
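The script writes each record as one line of text separated by two spaces, which is hard to parse again if a selling point itself contains spaces. As an optional variation, the records could be saved as CSV with the standard csv module. The sketch below assumes records shaped like the info_lists entries. The sample values, the save_csv helper, and the temporary output path are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical records in the same shape as the entries of info_lists.
info_lists = [
    {"name": "Listing A", "position": "Sample Garden", "money": "520",
     "year": "2010", "point": "Near subway"},
    {"name": "Listing B", "position": "Example Court", "money": "688",
     "year": "2015", "point": "South-facing, renovated"},
]

FIELDS = ["name", "position", "money", "year", "point"]


def save_csv(records, path):
    """Write the records to a CSV file with a header row."""
    # newline="" lets the csv module control line endings;
    # utf-8-sig adds a BOM so that Excel detects the encoding.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)


out_path = os.path.join(tempfile.gettempdir(), "house_info.csv")
save_csv(info_lists, out_path)

# Read the file back to check that fields containing commas survive.
with open(out_path, newline="", encoding="utf-8-sig") as f:
    loaded = list(csv.DictReader(f))
print(loaded[1]["point"])
```

The csv module quotes fields that contain the delimiter, so the comma inside the second selling point does not split the column when the file is read back.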
