Notes:
1. Both UDF and UDAF functions are created through sqlContext. If a Scala UDF needs to call a Java method or function, wrap it first: write a Scala method that delegates to the Java call and returns its result.
2. Registering a UDF in Scala: spark.sqlContext.udf.register("date_splits", date_splits _)
3. To use a UDTF, create a SparkSession object and have it execute the SQL statement CREATE TEMPORARY FUNCTION myUDTF AS '<location of your own UDTF implementation>'
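The wrapper pattern from note 1 and the registration calls from notes 2 and 3 can be sketched as follows. The body of date_splits is hypothetical (the source only shows its registration); the Spark calls are shown as comments so the snippet stands alone:

```scala
// Hypothetical wrapper (note 1): a Scala method delegating to a Java API
// (java.time.LocalDate here) and returning the Java call's result, so it
// can be registered as a Spark SQL UDF.
object UdfHelpers {
  // Illustrative implementation only -- the real date_splits logic is not
  // shown in the source notes.
  def date_splits(s: String): String =
    java.time.LocalDate.parse(s).getYear.toString
}

// Registration, per notes 2 and 3 (requires a live SparkSession `spark`):
// spark.sqlContext.udf.register("date_splits", UdfHelpers.date_splits _)
// spark.sql("CREATE TEMPORARY FUNCTION myUDTF AS '<your UDTF class>'")
```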
Test data
1-174,"121.31583075,30.67559298","121.31583075,30.67784745","121.31848407,30.67784745","121.31848407,30.67559298"
1-175,"121.31848407,30.67559298","121.31848407,30.67784745","121.32113740000001,30.67784745","121.32113740000001,30.67559298"
1-176,"121.32113740000001,30.67559298","121.32113740000001,30.67784745","121.32379073,30.67784745","121.32379073,30.67559298"
1-177,"121.32379073,30.67559298","121.32379073,30.67784745","121.32644406,30.67784745","121.32644406,30.67559298"
1-178,"121.32644406,30.67559298","121.32644406,30.67784745","121.32909739,30.67784745","121.32909739,30.67559298"
1-179,"121.32909739,30.67559298","121.32909739,30.67784745","121.33175072,30.67784745","121.33175072,30.67559298"
1-180,"121.33175072,30.67559298","121.33175072,30.67784745","121.33440404,30.67784745","121.33440404,30.67559298"
1-181,"121.33440404,30.67559298","121.33440404,30.67784745","121.33705737,30.67784745","121.33705737,30.67559298"
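Each test line carries an id plus four quoted "lng,lat" corner points, matching the nine fields of the LngLatSH1 case class in the local test code below. A minimal parsing sketch; the quote-stripping logic is an assumption about how the file would be read outside of Spark's CSV reader:

```scala
// Parse one test-data line of the form:
//   id,"lng1,lat1","lng2,lat2","lng3,lat3","lng4,lat4"
// into the id and the four (lng, lat) pairs.
object LineParser {
  def parseLine(line: String): (String, Array[(Double, Double)]) = {
    // Split on `,"` so commas inside the quoted pairs are preserved,
    // then drop any remaining quote characters.
    val parts = line.split(",\"").map(_.replace("\"", ""))
    val id = parts.head
    val coords = parts.tail.map { p =>
      val xy = p.split(",")
      (xy(0).toDouble, xy(1).toDouble)
    }
    (id, coords)
  }
}
```

The nine parsed values (id plus four pairs) line up with the LngLatSH1 fields (id, lng1, lat1, ..., lng4, lat4).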
Local test code
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

class BaseJob { }

// One field per value in a test-data line: the id plus four lng/lat corner points.
case class LngLatSH1(id: String, lng1: Double, lat1: Double, lng2: Double, lat2: Double,
                     lng3: Double, lat3: Double, lng4: Double, lat4: Double)

object BaseJob {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession
      .builder()
      .appName("base_job")
      .enableHiveSupport()
      .master("local[2]")
      .config("spark.sql.warehouse.dir", "/user/hive/warehouse")
      .config("spark.sql.shuffle.partitions", 100)
      .getOrCreate()
    val sc: SparkContext = spark.sparkContext