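When comparing a Spark Column with a value in Scala, we must use === rather than = or ==.
Let's look at an example.
Suppose we have the following table and want to run a conditional query on it: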
+---+-----+---+----+-------+
| id| name|age|addr| salary|
+---+-----+---+----+-------+
|  1|zhang| 49|  bj|  10000|
|  2| wang| 34|  sh|   1000|
|  3|   li| 28|  sz|   5000|
+---+-----+---+----+-------+
(1)===
df2.select($"name",$"addr").where($"name" === "li").show()
Result:
+----+----+
|name|addr|
+----+----+
|  li|  sz|
+----+----+
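For readers who want to reproduce the query outside the shell, here is a minimal sketch as a standalone program. The object name TripleEqualsExample, the helper sampleData, the app name, and the local master setting are assumptions made for illustration; only the table contents and the query itself come from the example above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object TripleEqualsExample {
  // Hypothetical helper: rebuilds the sample table shown earlier.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._ // provides .toDF on local collections
    Seq(
      (1, "zhang", 49, "bj", 10000),
      (2, "wang", 34, "sh", 1000),
      (3, "li", 28, "sz", 5000)
    ).toDF("id", "name", "age", "addr", "salary")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("triple-equals-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // provides the $"col" syntax

    val df2 = sampleData(spark)

    // === is a method on Column that returns another Column,
    // which is the argument type where() expects.
    df2.select($"name", $"addr").where($"name" === "li").show()

    spark.stop()
  }
}
```

In spark-shell the session and the implicits import already exist, so only the Seq(...).toDF(...) definition and the query line are needed there.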
(2)==
scala> df2.select($"name",$"addr").where($"name" == "li").show()
<console>:29: error: overloaded method value where with alternatives:
(conditionExpr: String)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(condition: org.apache.spark.sql.Column)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
cannot be applied to (Boolean)
df2.select($"name",$"addr").where($"name" == "li").show()
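The error message says where() received a Boolean. That is because == is Scala's ordinary equality, available on every object: it compares the Column object itself with the String "li" and yields a plain Boolean, while where() only accepts a String or a Column. The sketch below makes the result types explicit. The object name EqualsTypes and the val names are assumptions for illustration; col, ===, equalTo and =!= are existing Spark API (=!= is the inequality operator in Spark 2.0 and later).

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col

object EqualsTypes {
  val name: Column = col("name")

  // Scala's built-in equality: compares a Column object with a String.
  // The result is a Boolean, which where() cannot accept.
  val asBoolean: Boolean = name == "li"

  // Column's own operator: builds an equality expression that Spark
  // evaluates per row when the query runs.
  val asColumn: Column = name === "li"

  // Method form of ===, sometimes easier to read or to call from Java.
  val viaEqualTo: Column = name.equalTo("li")

  // Inequality counterpart of ===.
  val notEqual: Column = name =!= "li"

  def main(args: Array[String]): Unit = {
    println(asBoolean) // a Column is never equal to a String
    println(asColumn)
    println(viaEqualTo)
    println(notEqual)
  }
}
```

No SparkSession is needed here, because these lines only build expressions and never execute a query.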
(3)=
scala> df2.select($"name",$"addr").where($"name" = "li").show()
<console>:29: error: missing argument list for method $ in class StringToColumn
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `$ _` or `$(_)` instead of `$`.
df2.select($"name",$"addr").where($"name" = "li").show()
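In Scala, = is assignment rather than comparison, so this expression never produces a Column and the compiler fails on the $ method instead. A single = is valid only inside a SQL expression string, which is what the where(conditionExpr: String) overload listed in the earlier error message accepts. The following is a minimal sketch of that alternative; the object name SqlStringCondition, the helper sampleData, the app name, and the local master setting are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.expr

object SqlStringCondition {
  // Hypothetical helper: rebuilds the sample table shown earlier.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1, "zhang", 49, "bj", 10000),
      (2, "wang", 34, "sh", 1000),
      (3, "li", 28, "sz", 5000)
    ).toDF("id", "name", "age", "addr", "salary")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("sql-string-condition")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df2 = sampleData(spark)

    // The condition is a SQL string, so SQL syntax applies:
    // a single = for equality and single quotes around the literal.
    df2.select($"name", $"addr").where("name = 'li'").show()

    // expr() parses the same SQL text into a Column, for places
    // that require a Column rather than a String.
    df2.select($"name", $"addr").where(expr("name = 'li'")).show()

    spark.stop()
  }
}
```

Both queries should return the same single row as the === version; the trade-off is that mistakes inside a SQL string are only caught when the query is analyzed at runtime, not by the Scala compiler.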