R Web Scraping: Crawling Lianjia Rental Listings

Trying a single page
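library(rvest)  # provides read_html(), html_nodes(), html_text() and the %>% pipe
lianjia_url <- "https://bj.lianjia.com/zufang/pg1/"
lianjia_web <- read_html(lianjia_url, encoding = "UTF-8")
# Each CSS class selector pulls one field from every listing on the page
where <- html_nodes(lianjia_web, ".where") %>% html_text()
other <- html_nodes(lianjia_web, ".other") %>% html_text()
chanquan <- html_nodes(lianjia_web, ".chanquan") %>% html_text()
price <- html_nodes(lianjia_web, ".price") %>% html_text()
# One row per listing; this requires all four vectors to have the same length
data1 <- data.frame(where, other, price, chanquan)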
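To see how the class selectors behave without hitting the live site, here is a minimal sketch that parses an inline HTML snippet. The markup is hypothetical: only the class names `.where` and `.price` are taken from the code above, and the real page layout may well differ.

```r
library(rvest)

# Hypothetical markup: the class names match the selectors used in the post,
# but the structure of the real Lianjia page is an assumption.
sample_html <- '
<ul>
  <li><span class="where">Community A 2 rooms 60 sqm</span><span class="price">5000</span></li>
  <li><span class="where">Community B 1 room 40 sqm</span><span class="price">3800</span></li>
</ul>'

page <- read_html(sample_html)

# html_nodes() returns every element matching the CSS selector;
# html_text() extracts the text content of each one.
where <- page %>% html_nodes(".where") %>% html_text()
price <- page %>% html_nodes(".price") %>% html_text()

# data.frame() needs vectors of equal length, so check before combining
stopifnot(length(where) == length(price))
listings <- data.frame(where, price, stringsAsFactors = FALSE)
print(listings)
```

If a selector matches a different number of elements than the others (for example, when a listing lacks a field), the length check fails early instead of producing a misaligned table.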

Crawling all the data in a loop
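# Assumes rvest is already loaded with library(rvest)
# Empty data frame that collects every page; quyu holds the district name
lianjia_data <- data.frame(where = character(), other = character(), price = character(),
                           chanquan = character(), quyu = character(), stringsAsFactors = FALSE)
# District slugs as they appear in the URL
area_list <- c("dongcheng","xicheng","chaoyang","haidian","fengtai","shijingshan","tongzhou","changping","daxing","yizhuangkaifaqu","shunyi","fangshan","mentougou","yanjiao")
# Number of result pages to fetch per district, in the same order as area_list
page_list <- c(23,37,100,57,53,13,38,28,29,12,22,15,19,100)
for (i in seq_along(area_list)) {
  for (m in 1:page_list[i]) {
    lianjia_url <- paste0("https://bj.lianjia.com/zufang/", area_list[i], "/pg", m, "/")
    lianjia_web <- read_html(lianjia_url, encoding = "UTF-8")
    where <- html_nodes(lianjia_web, ".where") %>% html_text()
    other <- html_nodes(lianjia_web, ".other") %>% html_text()
    chanquan <- html_nodes(lianjia_web, ".chanquan") %>% html_text()
    price <- html_nodes(lianjia_web, ".price") %>% html_text()
    # Tag every row with its district, then append the page to the result
    data1 <- data.frame(where, other, price, chanquan, quyu = area_list[i], stringsAsFactors = FALSE)
    lianjia_data <- rbind(lianjia_data, data1)
    print(c(area_list[i], m))  # progress: district and page just fetched
  }
}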
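The loop above calls rbind() once per page, which copies the growing data frame every time, and a single failed request stops the whole crawl. One possible variation, sketched here under assumptions, collects each page in a list, skips pages that raise an error, and binds everything once at the end. The function `fetch_page` is a hypothetical stand-in for the read_html() and html_nodes() steps; it returns made-up rows so the sketch runs without network access.

```r
# Hypothetical stand-in for the read_html() + html_nodes() steps:
# returns a one-row data frame per page without touching the network.
fetch_page <- function(area, page) {
  data.frame(
    where = paste0(area, "-listing-", page),
    price = as.character(1000 * page),
    quyu = area,
    stringsAsFactors = FALSE
  )
}

# Shortened lists for illustration only
area_list <- c("dongcheng", "xicheng")
page_list <- c(2, 3)

pages <- list()
for (i in seq_along(area_list)) {
  for (m in seq_len(page_list[i])) {
    # tryCatch keeps one failed page from aborting the whole crawl
    page_data <- tryCatch(fetch_page(area_list[i], m), error = function(e) NULL)
    if (!is.null(page_data)) {
      pages[[length(pages) + 1]] <- page_data
    }
    # Sys.sleep(1)  # in a real crawl, pause between requests
  }
}

# Bind all collected pages in a single step
all_data <- do.call(rbind, pages)
print(nrow(all_data))
```

In a real crawl, the body of `fetch_page` would hold the scraping calls from the post, and the commented-out Sys.sleep() would space out the requests.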

Download the scraped results: https://pan.baidu.com/s/1qZz9WhYALJYJOQH7_TpfyA
