## The role of watermarks
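In short, the data drives the watermark: it advances as records with later event times arrive, trailing the latest event time seen by a fixed delay. Once the watermark has passed the end of a window, that window is "submerged": its result is final, its state is discarded, and records that arrive for it afterwards may be dropped as too late.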
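The rule above can be simulated in plain Scala without Spark. This is only a minimal sketch: the names `WatermarkSketch`, `delayMs`, and `windowMs` are illustrative, the windows are tumbling for simplicity, and it assumes a window counts as closed once the watermark reaches its end. It also ignores the fact that Spark only moves the watermark between triggers.

```scala
// Plain-Scala simulation of the watermark rule; no Spark required.
// All names here are hypothetical and exist only for illustration.
object WatermarkSketch {
  val delayMs  = 10000L // allowed lateness, comparable to a "10 seconds" watermark delay
  val windowMs = 10000L // tumbling window length in milliseconds

  // Start of the tumbling window that contains the event time `ts` (ts >= 0).
  def windowStart(ts: Long): Long = ts - (ts % windowMs)

  // Watermark after seeing these event times: latest event time minus the delay.
  def watermark(eventTimes: Seq[Long]): Long = eventTimes.max - delayMs

  // Assumption: a window is "submerged" once the watermark has reached its end.
  def isClosed(start: Long, wm: Long): Boolean = start + windowMs <= wm

  def main(args: Array[String]): Unit = {
    val seen = Seq(1000L, 12000L, 25000L) // event times in milliseconds
    val wm = watermark(seen)              // 25000 - 10000 = 15000
    seen.map(windowStart).distinct.foreach { s =>
      println(s"window [$s, ${s + windowMs}) closed=${isClosed(s, wm)}")
    }
  }
}
```

With these three events the watermark sits at 15000 ms, so only the window `[0, 10000)` is closed; a record with event time 3000 arriving now would belong to a window that has already been discarded.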
A complete Spark Structured Streaming example:
```
import java.sql.Timestamp

import org.apache.spark.sql.SparkSession

object HandlingWatermarkSss {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("Handling watermark sss")
      .getOrCreate()
    sparkSession.sparkContext.setLogLevel("ERROR")

    // Read text lines of the form "word,epochMillis" from a socket
    val dsr = sparkSession
      .readStream
      .format("socket")
      .option("host", "Spark")
      .option("port", 4444)
      .load()

    import sparkSession.implicits._
    // Parse each line into (word, event-time timestamp)
    val dataFrame = dsr
      .as[String]
      .map(t => {
        val strings = t.split(",")
        val word = strings(0)
        val timestamp = strings(1)
        (word, new Timestamp(timestamp.toLong))
      })
      .toDF("word", "timestamp")

    import org.apache.spark.sql.functions._
    dataFrame
      .withWatermark("timestamp", "10 seconds") // watermark = latest event time seen minus 10 seconds
      .groupBy(window($"timestamp", "10 seconds", "5 seconds"), $"word") // group by sliding window (10 s long, every 5 s) and word
      .count()
      //.printSchema() // print the schema
      // Flatten the window struct into (window start, window end, word, count)
      .map(t => (t.getStruct(0).getTimestamp(0), t.getStruct(0).getTimestamp(1), t.getString(1), t.getLong(2)))
      .withColumnRenamed("_1", "start time")
      .withColumnRenamed("_2", "end time")
      .withColumnRenamed("_3", "word")
      .withColumnRenamed("_4", "count")
      .writeStream
      .format("console")
      .outputMode("update")
      .start()
      .awaitTermination()
  }
}
```
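Because the example uses a 10-second window that slides every 5 seconds, each record is counted in two overlapping windows. The following sketch shows that assignment in plain Scala. `SlidingWindowSketch` and `windowsFor` are hypothetical names, and the code assumes windows are aligned to the epoch, the window length is a multiple of the slide, and timestamps are non-negative.

```scala
// Plain-Scala sketch of sliding-window assignment; no Spark required.
// Names are hypothetical and exist only for illustration.
object SlidingWindowSketch {
  // All windows [start, start + lengthMs) that contain the event time `ts`.
  // Assumes epoch-aligned windows and lengthMs being a multiple of slideMs.
  def windowsFor(ts: Long, lengthMs: Long, slideMs: Long): Seq[(Long, Long)] = {
    val lastStart = ts - (ts % slideMs) // latest window start that is <= ts
    (0L until lengthMs / slideMs)
      .map(i => lastStart - i * slideMs) // step back one slide at a time
      .map(start => (start, start + lengthMs))
  }

  def main(args: Array[String]): Unit = {
    // An input line such as "hello,12000" has event time 12000 ms.
    windowsFor(12000L, 10000L, 5000L).foreach { case (start, end) =>
      println(s"[$start, $end)")
    }
  }
}
```

For an event time of 12000 ms this yields the windows `[10000, 20000)` and `[5000, 15000)`, which is why a single input line can produce two rows in the console output.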