Looking Up IP Location Data with R

维格堂406小队

Article link: https://blog.csdn.net/wendaomudong_l2d4/article/details/78253793

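This uses Taobao's IP lookup API.
It seems to throttle or block by IP, though: by the third lookup, each request takes around 20 seconds. I don't know how to get around professional anti-scraping measures, so I just leave the job running on a server overnight. It is good enough for our business needs.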

# Get_areadata_by_IP("60.191.4.194")
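## Takes an IP address and returns its location data as a one-row data frame
library(RCurl)  # provides getURL(); JSON parsing uses rjson::fromJSON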
Get_areadata_by_IP <- function(ip_element) {
  tryCatch(
    expr = {
      # ip_element <- "39.186.159.103"
      if (!is.na(ip_element)) {
        url <- 'http://ip.taobao.com/service/getIpInfo.php?ip='
        Ip_Postition <- getURL(paste(url, ip_element, sep = ''))
        json_list <- rjson::fromJSON(Ip_Postition)$data
        # Fall back to "default" when a field is missing from the response
        field_or_default <- function(x) if (is.null(x)) "default" else x

        # Country
        country <- field_or_default(json_list$country)
        # Area (extracted, but not included in the returned data frame)
        area <- field_or_default(json_list$area)
        # Province
        province <- field_or_default(json_list$region)
        # City
        city <- field_or_default(json_list$city)
        # Carrier (ISP)
        carrier_operator <- field_or_default(json_list$isp)
        return(
          Ip_Postition_Data = data.frame(
            Ip = ip_element,
            Ip_country = country,
            Ip_province = province,
            Ip_city = city,
            Ip_carrier_operator = carrier_operator
          )
        )
      } else{
        return(
          Ip_Postition_Data = data.frame(
            Ip = "no_data",
            Ip_country = "no_data",
            Ip_province = "no_data",
            Ip_city = "no_data",
            Ip_carrier_operator = "no_data"
          )
        )
      }
      
    },
    error = function(e) {
      return(
        Ip_Postition_Data = data.frame(
          Ip = ip_element,
          Ip_country = "error",
          Ip_province = "error",
          Ip_city = "error",
          Ip_carrier_operator = "error"
        )
      )
    }
  )
}
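
Since the function returns one row per IP and the API slows down after a few requests, a batch run has to loop over the addresses, pause between calls, and stack the rows. Here is a minimal sketch of that. The helper name `lookup_ips` and its `pause_seconds` parameter are hypothetical and not part of the original script, and the right pause length for this API is not known.

```r
# Hypothetical helper (not from the original script): look up many IPs,
# pausing between requests, and stack the one-row results into one data frame.
lookup_ips <- function(ips, lookup_fun, pause_seconds = 1) {
  results <- vector("list", length(ips))
  for (i in seq_along(ips)) {
    results[[i]] <- lookup_fun(ips[i])
    # Pause between requests, but not after the last one
    if (i < length(ips)) Sys.sleep(pause_seconds)
  }
  do.call(rbind, results)
}

# Usage (needs network access and Get_areadata_by_IP defined as above):
# ip_data <- lookup_ips(c("60.191.4.194", "39.186.159.103"),
#                       Get_areadata_by_IP, pause_seconds = 2)
```

Because `Get_areadata_by_IP` returns an "error" row instead of stopping when a lookup fails, one bad request does not abort the whole batch. Passing the lookup function as an argument also lets the loop be tried with a stub before hitting the real API.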

2017-10-16, Hangzhou