1. Use filter to keep values that contain a given string
val a = lines.filter(x => x.contains("python")) // keep only the lines that contain "python"
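The note does not say whether `lines` is an RDD or a Dataset, and the same lambda works on both. The sketch below is a minimal illustration under assumptions: the local `SparkSession`, the sample strings, and the names `sample`, `rddMatches`, and `dsMatches` are all hypothetical and not part of the original note.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("line-filter").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for `lines`.
val sample = Seq("python is popular", "scala runs on the JVM", "pyspark wraps python")

// RDD[String]: filter takes a plain Scala predicate.
val rddMatches = spark.sparkContext.parallelize(sample).filter(x => x.contains("python"))

// Dataset[String]: the same predicate works on the typed API.
val dsMatches = sample.toDS().filter(x => x.contains("python"))

rddMatches.collect().foreach(println)
dsMatches.show(truncate = false)
```

`contains` is case-sensitive, so a line with "Python" would not be kept unless the text is lower-cased first.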
2. DataFrame filter: regular-expression matching with SQL syntax
val a = df.filter("columnName rlike 'regex'") // replace regex with the pattern to match
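As a rough illustration of the `rlike` filter, the sketch below runs the SQL-string form next to the equivalent `Column.rlike` call. The session setup, the column name `name`, the sample values, and the pattern `^py` are assumptions made for the example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("rlike-filter").getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq("python3", "scala", "py", "cpython").toDF("name")

// SQL-expression form, as in the note above.
val viaSql = df.filter("name rlike '^py'")

// Equivalent Column API form.
val viaColumn = df.filter($"name".rlike("^py"))

viaSql.show()
viaColumn.show()
```

`rlike` succeeds when the pattern is found anywhere in the value, so anchors such as `^` and `$` are needed for a whole-value match. Patterns with backslashes need extra care inside a SQL string literal, where the Column form is usually easier to read.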
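3. DataFrame filter: keep rows whose value is in a list
val dateList = List(1,2,3,4,5)
// The 'dt symbol syntax needs import spark.implicits._; isInCollection requires Spark 2.4+
val a = df.filter('dt.isInCollection(dateList))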
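A minimal sketch of list-based filtering, assuming a local session and a hypothetical one-column DataFrame. It uses `col("dt")` rather than the symbol syntax, and also shows the varargs `isin` variant and the negated filter.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("list-filter").getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq(1, 3, 7, 9).toDF("dt")
val dateList = List(1, 2, 3, 4, 5)

// Keep rows whose dt is in the list (isInCollection accepts any Iterable, Spark 2.4+).
val kept = df.filter(col("dt").isInCollection(dateList))

// Same result with isin, which takes varargs, so the list is expanded with `: _*`.
val keptIsin = df.filter(col("dt").isin(dateList: _*))

// Negate the condition to drop the listed values instead.
val dropped = df.filter(!col("dt").isInCollection(dateList))

kept.show()
dropped.show()
```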
4. Filter against a list of values with SQL syntax (keep only the specified channels)
df.filter("channel in (2,121,53,129)")
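When the channel ids live in a Scala collection rather than being typed by hand, the `in (...)` clause can be assembled with `mkString`. This is only a sketch: the session, the sample rows, and the names `channels` and `inClause` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("sql-in-filter").getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq((2, "a"), (7, "b"), (121, "c")).toDF("channel", "uin")

// Build the SQL list from a Scala collection of numeric ids.
val channels = List(2, 121, 53, 129)
val inClause = channels.mkString(", ")

// Evaluates the expression: channel in (2, 121, 53, 129)
val selected = df.filter(s"channel in ($inClause)")
selected.show()
```

String interpolation like this is reasonable only for trusted numeric values. String values would need quoting and escaping, and `isin` or `isInCollection` avoids that problem.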
5. Create an empty table with empty columns
spark.emptyDataFrame.toDF().
selectExpr("'0' as channel", "'0' as time", "'0' as ip", "'0' as long", "'0' as uin", "'0' as version", "'0' as country", "'0' as language", "'0' as id", "'0' as platform")
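An alternative way to get the same zero-row table is to state the schema explicitly and pair it with an empty RDD. This is a sketch of that variant, not the approach used in the note. It assumes every column should be a nullable string, and the names `columnNames`, `schema`, and `emptyDf` are hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("empty-df").getOrCreate()

// Same column names as in the selectExpr call above.
val columnNames = Seq("channel", "time", "ip", "long", "uin",
  "version", "country", "language", "id", "platform")

// Assumption: all columns are nullable strings.
val schema = StructType(columnNames.map(name => StructField(name, StringType, nullable = true)))

// An empty RDD[Row] plus a schema yields a DataFrame with columns but no rows.
val emptyDf = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
emptyDf.printSchema()
```

Building the schema from a list keeps the column names in one place and makes it easy to switch individual columns to another type later.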