Error:
Error:(29, 32) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
The code is as follows:
val processRdd = source.map(row => {
  // NOTE: these lookups are rebuilt for every row; in production code they
  // should be created once per partition (e.g. via mapPartitions) instead.
  val dbSearcher = new DbSearcher(new DbConfig(), "dataset/ip2region.db")
  val lookupService = new LookupService("dataset/GeoLiteCity.dat")

  val ip = row.getAs[String]("ip")
  // ip2region returns a "|"-delimited region string; index 2 is the
  // region/province, index 3 is the city
  val regionParts = dbSearcher.binarySearch(ip).getRegion.split("\\|")
  val region = regionParts(2)
  val city = regionParts(3)
  val location = lookupService.getLocation(ip)
  val longitude = location.longitude.toDouble
  val latitude = location.latitude.toDouble
  Row(ip, region, city, longitude, latitude)
})
The problem is the type returned on the last line. A generic Row carries no compile-time schema, so Spark's reflection-based encoder derivation cannot handle it; the function passed to map must return a primitive type or a case class (a Product type), for which spark.implicits._ can supply an encoder.
Fix: wrap the result of the last line in a case class:

ProcessIp(ip, region, city, longitude, latitude)

case class ProcessIp(ip: String, region: String, city: String, longitude: Double, latitude: Double)

Note that the case class must be defined outside the method that calls map (at the top level of the file or in a companion object); defining it inside a method also breaks Spark's reflection-based encoder lookup.
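Putting it together, a minimal sketch of the corrected pipeline. This assumes a SparkSession named spark and that source is a DataFrame with an "ip" column; DbSearcher, DbConfig, and LookupService are the ip2region and GeoIP classes from the original code and their method names are taken as-is from it:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Defined at top level, not inside a method, so Spark's
// reflection-based encoder derivation can see it.
case class ProcessIp(ip: String, region: String, city: String,
                     longitude: Double, latitude: Double)

object IpEnrichment {
  def process(spark: SparkSession, source: DataFrame): Dataset[ProcessIp] = {
    // Brings the implicit Encoder[ProcessIp] into scope for map
    import spark.implicits._

    source.map { row =>
      val dbSearcher = new DbSearcher(new DbConfig(), "dataset/ip2region.db")
      val lookupService = new LookupService("dataset/GeoLiteCity.dat")

      val ip = row.getAs[String]("ip")
      val parts = dbSearcher.binarySearch(ip).getRegion.split("\\|")
      val location = lookupService.getLocation(ip)
      // Returning a case class instead of Row is what resolves the
      // "Unable to find encoder" error
      ProcessIp(ip, parts(2), parts(3),
        location.longitude.toDouble, location.latitude.toDouble)
    }
  }
}
```

Because the lambda returns ProcessIp, the result is a typed Dataset[ProcessIp] rather than an untyped collection of Rows, and the encoder is resolved at compile time via spark.implicits._. Running this still requires a Spark runtime plus the ip2region.db and GeoLiteCity.dat data files, so it is a sketch rather than a standalone program.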