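My first exposure to web scraping was writing scripts in Java. Later I taught myself scraping with Python as a way in, and learned to use Scrapy. Recently I tried R and scraped a few different types of fields, mostly for fun.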
Content to scrape
R code
library(XML)
library(RCurl)
library(stringr)
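# Extract the novel title from the parsed page's root node.
giveNovel_name = function(rootNode){
  novel_name <- xpathSApply(rootNode,"//div[@class='title']/h1/text()",xmlValue)
  # Strip carriage returns, newlines, and spaces, then return the cleaned title.
  gsub("[\r\n ]","",novel_name)
}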
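# A minimal sketch of how the title XPath above behaves, run on an inline
# HTML snippet instead of a live page. The snippet and the names sample_html,
# sample_root, and sample_title are made up for illustration; the real page
# structure may differ.
library(XML)
sample_html <- "<html><body><div class='title'><h1>\r\n  ExampleNovel  </h1></div></body></html>"
# Parse the string as HTML and take the root node of the document.
sample_root <- xmlRoot(htmlParse(sample_html, asText = TRUE))
# Same XPath and whitespace cleanup as in giveNovel_name.
sample_title <- xpathSApply(sample_root, "//div[@class='title']/h1/text()", xmlValue)
sample_title <- gsub("[\r\n ]", "", sample_title)
# Note: this cleanup also removes spaces inside a title, which is harmless for
# Chinese titles but would merge the words of an English one.
print(sample_title)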
giveAuthor_name =