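In a Spark SQL query built with Scala's s-interpolator,
$name inserts the value of the variable into the query text, and wrapping it in single quotes makes that value a SQL string literal.
For example: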
var q = 111
spark.sql(s"select '$q' as t").show()
+---+
|  t|
+---+
|111|
+---+
Without the single quotes:
spark.sql(s"select $q as t").show()
+---+
|  t|
+---+
|111|
+---+
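In this case the two queries display the same thing: the first selects the string '111' and the second selects the integer 111.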
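Why both queries print 111 is easier to see from the SQL text that Scala builds before Spark parses anything. A minimal sketch in plain Scala, with no Spark required (the object name InterpolationDemo is made up for illustration):

```scala
object InterpolationDemo {
  val q = 111

  // The s-interpolator substitutes $q on the Scala side;
  // the single quotes are just characters in the resulting SQL text.
  val quoted: String = s"select '$q' as t"   // SQL string literal
  val unquoted: String = s"select $q as t"   // SQL integer literal

  def main(args: Array[String]): Unit = {
    println(quoted)   // select '111' as t
    println(unquoted) // select 111 as t
  }
}
```

Spark therefore receives a string literal in the first query and an integer literal in the second; show() happens to render both as 111.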
However,
when the interpolated value is passed as a UDF argument, the results can differ.
For example:
// Filter function: returns false if the part of feaName before "@" appears in deleteBeginFeaStr,
// or the part after "@" appears in deleteEndFeaStr (both are "-"-separated lists); otherwise returns true.
def DeleRelFea(feaName: String, deleteBeginFeaStr: String, deleteEndFeaStr: String): Boolean = {
  val feaArray = feaName.split("@")
  !(deleteBeginFeaStr.split("-").contains(feaArray(0)) || deleteEndFeaStr.split("-").contains(feaArray(1)))
}
spark.udf.register("DeleRelFea", DeleRelFea _)
DeleRelFea("3.0@7.9", "3.0-1.4","10.0-14.0")
Output:
DeleRelFea: (feaName: String, deleteBeginFeaStr: String, deleteEndFeaStr: String)Boolean
res1496: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function3>,BooleanType,Some(List(StringType, StringType, StringType)))
res1497: Boolean = false
In Spark SQL, however:
- With single quotes
var deleteBeginFeaStr = "3.0-1.4"
var deleteEndFeaStr = "10.0-14.0"
println(deleteBeginFeaStr,"======",deleteEndFeaStr)
spark.sql(
  s"select *, DeleRelFea(feaName, '$deleteBeginFeaStr', '$deleteEndFeaStr') as ps from data0327"
).show()
Output:
deleteBeginFeaStr: String = 3.0-1.4
deleteEndFeaStr: String = 10.0-14.0
(3.0-1.4,======,10.0-14.0)
+--------+-----------+------+----+----+-----+
| feaName|   labelStr|Pclass|nums|name|   ps|
+--------+-----------+------+----+----+-----+
| 3.0@7.9|1.0-0.0_1.0|   196|   1| bai|false|
|4.7@10.0|1.0-0.0_1.0|   196|   1| bai|false|
+--------+-----------+------+----+----+-----+
- Without single quotes
spark.sql(
  s"select *, DeleRelFea(feaName, $deleteBeginFeaStr, $deleteEndFeaStr) as ps from data0327"
).show()
Output:
+--------+-----------+------+----+----+----+
| feaName|   labelStr|Pclass|nums|name|  ps|
+--------+-----------+------+----+----+----+
| 3.0@7.9|1.0-0.0_1.0|   196|   1| bai|true|
|4.7@10.0|1.0-0.0_1.0|   196|   1| bai|true|
+--------+-----------+------+----+----+----+
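The two results are completely different.
With the single quotes, the interpolated values reach the UDF as the string literals '3.0-1.4' and '10.0-14.0', which is what the function expects.
Without the single quotes, the query text becomes DeleRelFea(feaName, 3.0-1.4, 10.0-14.0), so Spark parses each argument as a subtraction and passes the numeric results (about 1.6 and -4.0, cast to strings) instead of the original text. Neither result matches any part of feaName, so the function returns true. The arguments are not treated as column names here: an unquoted identifier that matched no column would make the query fail with an analysis error rather than pass an empty string.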
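The two outputs can be reproduced outside Spark. The sketch below copies the function body and calls it with the arguments each query ends up passing. It assumes that Spark evaluates the unquoted 3.0-1.4 and 10.0-14.0 as decimal subtraction and casts the results to the strings "1.6" and "-4.0"; the exact string form is an assumption, and the names DeleRelFeaDemo and deleRelFea are hypothetical.

```scala
object DeleRelFeaDemo {
  // Same logic as the DeleRelFea function registered as a UDF above.
  def deleRelFea(feaName: String, beginStr: String, endStr: String): Boolean = {
    val feaArray = feaName.split("@")
    !(beginStr.split("-").contains(feaArray(0)) || endStr.split("-").contains(feaArray(1)))
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq("3.0@7.9", "4.7@10.0")

    // Quoted in SQL: the function receives the original text.
    rows.foreach(r => println(deleRelFea(r, "3.0-1.4", "10.0-14.0"))) // false, false

    // Unquoted in SQL (assumed): the function receives the subtraction results as strings.
    rows.foreach(r => println(deleRelFea(r, "1.6", "-4.0")))          // true, true
  }
}
```

Printing the interpolated query string before passing it to spark.sql is a quick way to catch this kind of mistake.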