Scraping Web Data with Jupyter Notebook: Example 7

Latest recommended article published 2021-08-10 20:00:35

HongMeng07

Category: Learning Examples

Article link: https://blog.csdn.net/HongMeng07/article/details/109794215

Scraping the Taiyuan Lianjia site with Selenium

Without further ado, here is the code.

# Import selenium and pandas; openpyxl must be installed, since pandas uses it to write .xlsx files
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import pandas as pd
# Lists that collect one value per listing
q = []   # district
sq = []  # business area
xq = []  # residential community
mj = []  # floor area
cx = []  # orientation
hx = []  # layout
yz = []  # monthly rent
# Open one browser and reuse it for every page, instead of launching a new one per page
browser = webdriver.Chrome()
try:
    # Load listing pages 1 to 100
    for page in range(1, 101):
        url = 'https://ty.lianjia.com/zufang/pg' + str(page)
        browser.get(url)
        # Parse the rendered page and extract the fields we need
        for item in browser.find_elements(By.CLASS_NAME, 'content__list--item--main'):
            try:
                des = item.find_element(By.CLASS_NAME, 'content__list--item--des')
                links = des.find_elements(By.TAG_NAME, 'a')
                parts = des.text.replace("\n", "").replace(" ", "").split("/")
                row = (links[0].text, links[1].text, links[2].text,
                       parts[1], parts[2], parts[3],
                       item.find_element(By.CLASS_NAME, 'content__list--item-price').text)
            except (NoSuchElementException, IndexError):
                # Skip listings that lack the expected fields, so all lists stay the same length
                continue
            for column, value in zip((q, sq, xq, mj, cx, hx, yz), row):
                column.append(value)
finally:
    # Always close the browser, even if a page fails to load
    browser.quit()
# Build the table and write it to an Excel file
data = pd.DataFrame({'District': q, 'Business area': sq, 'Community': xq, 'Area': mj,
                     'Orientation': cx, 'Layout': hx, 'Monthly rent': yz})
data.to_excel('s-lianjia.xlsx', sheet_name='Scraped data')
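
The scraper above gets the area, orientation, and layout by stripping newlines and spaces from the description text and splitting it on "/". The sketch below pulls that step into a small function so it can be tried without opening a browser. The function name `parse_description` and the sample string are hypothetical and only illustrate the shape of text the scraper assumes; the real page text may differ.

```python
def parse_description(text):
    """Split a listing description into (area, orientation, layout).

    Mirrors the cleanup used in the scraper: drop newlines and spaces,
    then split on "/". Returns None when fewer than four parts are found,
    so the caller can skip the listing instead of raising IndexError.
    """
    parts = text.replace("\n", "").replace(" ", "").split("/")
    if len(parts) < 4:
        return None
    return parts[1], parts[2], parts[3]


# Made-up sample in the shape the scraper assumes; not real scraped data
sample = "District-Area-Community /\n 89㎡ / South / 2室1厅1卫"
print(parse_description(sample))
```

Returning None for a short description lets the caller skip that listing before anything is appended, which keeps every column the same length.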

Scraping results
