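As the title says: I ran into the following problem while scraping web data with the rvest package in R. The URL is below (UEFA Champions League, group stage):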
http://odds.cp.360.cn/liansai/scorerank?r_a=rQBzUn&leaid=103&season=2015-2016&subseason=%B7%D6%D7%E9%C8%FC
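With rvest I can only scrape the Group A matches; I cannot get the matches for Groups B through H. My code is below. I'm a beginner, so any pointers would be appreciated.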
--------------------------------------------------------------------------------------------------------------
library(rvest)
url="http://odds.cp.360.cn/liansai/scorerank?r_a=rQBzUn&leaid=103&season=2015-2016&subseason=%B7%D6%D7%E9%C8%FC"
web=read_html(url,encoding="GBK")
# Match time (TimeID keeps only the digits)
time=web%>%html_nodes(xpath="//tbody/tr/td[@class='gray999']")%>%html_text()%>%.[1:12]
TimeID=gsub("\\D", "", time)
# Home team
Home=web%>%html_nodes(xpath="//tbody/tr/td[2]/a")%>%html_text()%>%.[1:12]
# Away team
Away=web%>%html_nodes(xpath="//tbody/tr/td[4]/a")%>%html_text()%>%.[1:12]
# Score (first and second <em> in the score cell)
TBF=web%>%html_nodes(xpath="//tbody/tr/td[3]/em[1]")%>%html_text()%>%.[1:12]
HBF=web%>%html_nodes(xpath="//tbody/tr/td[3]/em[2]")%>%html_text()%>%.[1:12]
# Match result
SG=web%>%html_nodes(xpath="//tbody/tr/td[5]")%>%html_text()%>%.[1:12]
# Combine the basic fields into one data frame
match=data.frame(TimeID,Home,Away,TBF,HBF,SG)
match
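
One thing that may be worth checking, offered as a guess rather than a diagnosis: every extraction above ends in `.[1:12]`, which keeps only the first 12 values no matter how many rows the parsed page contains. The sketch below drops that truncation and pulls each field relative to its own row, so the row count is whatever the HTML actually holds. It assumes (unverified) that the rows for all groups are present in the static HTML with the same column layout. The inline HTML, the team names, and the names `web_demo`, `field`, and `match_demo` are made-up stand-ins so the example runs without the network.

```r
library(rvest)

# Made-up stand-in for the real page: two groups, one match each.
html <- '<html><body>
<table><tbody>
  <tr><td class="gray999">2015-09-15</td><td><a>Home A</a></td>
      <td><em>2</em>:<em>1</em></td><td><a>Away A</a></td><td>W</td></tr>
</tbody></table>
<table><tbody>
  <tr><td class="gray999">2015-09-16</td><td><a>Home B</a></td>
      <td><em>0</em>:<em>0</em></td><td><a>Away B</a></td><td>D</td></tr>
</tbody></table>
</body></html>'
web_demo <- read_html(html)

# Select every match row once, with no [1:12] truncation.
rows <- html_nodes(web_demo, xpath = "//tbody/tr")

# Hypothetical helper: html_node() returns one result per row,
# so the columns stay aligned even if a cell is missing in some row.
field <- function(xp) html_text(html_node(rows, xpath = xp))

match_demo <- data.frame(
  TimeID = gsub("\\D", "", field("./td[@class='gray999']")),
  Home   = field("./td[2]/a"),
  Away   = field("./td[4]/a"),
  TBF    = field("./td[3]/em[1]"),
  HBF    = field("./td[3]/em[2]"),
  SG     = field("./td[5]")
)

# Number of rows actually found in the parsed HTML.
nrow(match_demo)
```

If `length(rows)` on the real page still comes out as 12, the other groups are probably not in the HTML that `read_html` receives (for example, they may be loaded by a separate request), and removing the truncation alone would not help.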